Monday, June 4, 2012

linux gone wrong


I've been hacking on the Linux kernel for embedded hardware (mainly PDAs)
for quite a while now, and I'm sick to death of the bullshit that
OEMs keep pulling. That's why I decided to put up this rant about some
common failures found in OEM code, plus some advice to OEM programmers/managers
on how to reduce development costs and improve the quality of their code.

Short version: don't suffer from NIH syndrome and be friendly with the community.
Long version is below. It will probably get edited to become shorter, and I'll add more points to it.

Ok, there are two levels of problems.
The first level is the people at
SoC manufacturers who write support code for the CPU and its built-in peripherals.
The second level, much wilder and less educated, is the coders (well, actually,
most of them are electrical engineers) who write the support code for the
finished products: the handsets you can go grab at a local store.

The first level is generally friendly and reasonable, but that's not a
universal rule. Most problems here are caused by managers, lawyers and
other fucktards who're trying to conceal their proprietary shit or
'intellectual property' as you would say in a better company.

Let's take Qualcomm as a good (bad) example. They did their best
to hide as many details about their hardware as possible. As a result,
audio, camera, GPS and almost all other drivers are implemented as closed-source
bits, and the kernel-side drivers are just stubs that export kernel internals to userland.
Naturally, this gets rejected by all sane developers. Luckily, the mainline
Linux kernel is one of the few places where sanity and technical merit
are still valued.
What's the result of this? Well, the MSM/QSD kernel tree severely lags behind
both the vanilla kernel and the Google trees, and up until recently (when the architecture
was redesigned to be more open and standard APIs like ALSA replaced some
proprietary bits) any major Android release (like Gingerbread) meant that
HTC, Sony Ericsson and other unhappy souls haunted by Qualcomm had
to waste almost a year of development effort just to forward-port their crap.

On the other hand, there are good examples like Samsung and Texas Instruments
whose SoCs have all the code mainlined, including V4L2 drivers for camera and
ALSA for audio. And there are those who started out like Qualcomm, rolling out
a mess of crappy BSP code, but then realized that it's better to play by the
rules than to invest money in doing everything the other way round (NVIDIA).

------
Ok, let's come to the second level - finished products and OEM code.
Judging by the code that has been released publicly,
there are no code conventions and no source control. There are tons of #ifdeffery
and duplicate code. That's not the problem per se, it's just the result
of a lack of development culture.

Some notable examples that make me mad.
Qualcomm/HTC: totally ignoring the existing interfaces for power supply and
regulators, and reimplementing existing code (mostly by copy-pasting). This leads
to maintainability problems.

Asus [TF101]: instead of using platform data, they decided to hack up
the existing drivers and add code directly to gpio/mmc/sound drivers.

Samsung [Galaxy S2]: a lot of duplicate drivers for the same piece of
hardware, broken makefiles and kconfigs, hacks in drivers.
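
For reference, the platform data approach mentioned in the TF101 case boils down to this: the driver stays generic and the board file passes in a small struct describing the wiring. Below is a minimal userspace sketch of the pattern, not actual kernel code. Every name in it (demo_mmc_pdata, demo_gpio_read, the GPIO numbers) is made up for illustration, and demo_gpio_read merely stands in for a real GPIO read.

```c
/* Hypothetical board-specific wiring for an SD/MMC slot. In a real kernel
 * tree this struct would live in a header shared by driver and board file. */
struct demo_mmc_pdata {
    int cd_gpio;        /* card-detect GPIO number, -1 if not wired */
    int cd_active_low;  /* nonzero if the card-detect line is inverted */
};

/* Stand-in for a real GPIO read: pretend only GPIO 42 is driven high. */
static int demo_gpio_read(int gpio)
{
    return gpio == 42;
}

/* Generic driver code: no board names, no #ifdef, only what pdata says. */
static int demo_mmc_card_present(const struct demo_mmc_pdata *pdata)
{
    int level;

    if (!pdata || pdata->cd_gpio < 0)
        return 1;  /* no detect line: assume a card is always present */

    level = demo_gpio_read(pdata->cd_gpio);
    return pdata->cd_active_low ? !level : level;
}

/* Board files: the only place that knows how each device is wired. */
static const struct demo_mmc_pdata demo_board_a_mmc = {
    .cd_gpio = 42, .cd_active_low = 0,
};
static const struct demo_mmc_pdata demo_board_b_mmc = {
    .cd_gpio = 42, .cd_active_low = 1,
};
```

Supporting a new board then means adding one more struct in its board file, and the driver itself stays untouched and mergeable.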

------
Ranting must be productive. Therefore, below I will write a list of
recommendations on how to write and maintain code for certain subsystems and give
a brief rationale for each, so that it doesn't sound like the moaning of a hardware geek.

General
-- Release early, release often, push your code upstream
Rationale: this way your code will get reviewed and potential
problems will get detected at earlier stages. You will not end up
with several mutually incompatible code versions.

-- Do not use the same machine type for all boards. Register one with
the arm-linux website.
Rationale: the machine type was introduced to allow one kernel
binary to support multiple boards. When a vendor hardcodes the same
machine type for a range of devices, it becomes impossible to build
a single kernel for these devices, and maintaining the kernel gets harder.
Therefore, such code is not accepted upstream and supporting new releases
costs more development effort.
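
To see why unique machine types matter, here is a simplified userspace sketch, loosely modeled on the way the ARM kernel matches the machine number supplied by the bootloader against the board descriptors compiled into the image. The struct, the board names and the numbers are all hypothetical. Real numbers come from the machine registry.

```c
#include <stddef.h>

/* Hypothetical machine numbers, for illustration only. */
#define DEMO_MACH_PHONE_A   9001u
#define DEMO_MACH_TABLET_B  9002u

/* Simplified stand-in for a per-board machine descriptor. */
struct demo_machine_desc {
    unsigned int nr;      /* number the bootloader hands to the kernel */
    const char *name;     /* human-readable board name */
};

/* One kernel image can carry descriptors for many boards... */
static const struct demo_machine_desc demo_machines[] = {
    { DEMO_MACH_PHONE_A,  "Demo Phone A"  },
    { DEMO_MACH_TABLET_B, "Demo Tablet B" },
};

/* ...but only if each board has its own number: the lookup returns the
 * first match, so two boards sharing a number cannot be told apart. */
static const struct demo_machine_desc *demo_find_machine(unsigned int nr)
{
    size_t i;

    for (i = 0; i < sizeof(demo_machines) / sizeof(demo_machines[0]); i++)
        if (demo_machines[i].nr == nr)
            return &demo_machines[i];
    return NULL;  /* unknown board */
}
```

If both boards were registered under 9001, the tablet's descriptor would never be selected, which is exactly the situation a hardcoded vendor-wide machine type creates.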

-- Avoid compile-time #ifdef, especially in board code
Rationale: they prevent multiple boards from being built into a single
kernel image. Also, the code in the rarely used #ifdef branches
tends to get broken. Therefore, use machine type or system revision
to select the code path at runtime.
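
As a rough illustration of the runtime approach, the sketch below picks a display configuration from a revision number instead of a config option. It is plain userspace C with invented names and panel sizes. In an actual ARM board file the decision would typically hinge on something like system_rev or a machine_is_*() check.

```c
/* Hypothetical display parameters for two hardware revisions. */
struct demo_panel {
    int width;
    int height;
};

static const struct demo_panel demo_panel_rev1 = { 480, 800 };
static const struct demo_panel demo_panel_rev2 = { 540, 960 };

/*
 * The compile-time version would look like:
 *
 *   #ifdef CONFIG_DEMO_BOARD_REV2
 *       panel = &demo_panel_rev2;
 *   #else
 *       panel = &demo_panel_rev1;
 *   #endif
 *
 * which forces one kernel build per revision. Here both variants are
 * always compiled (so neither can silently rot) and the choice is made
 * when the board boots.
 */
static const struct demo_panel *demo_select_panel(unsigned int system_rev)
{
    return system_rev >= 2 ? &demo_panel_rev2 : &demo_panel_rev1;
}
```

The cost is a few extra bytes for the unused variant, which is usually a fair trade for a single image that boots on every revision.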

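-- Do not call into other drivers through ad hoc 'extern' declarations.
If a cross-driver call is strictly needed, declare the function in a shared
header and use EXPORT_SYMBOL.
Rationale: extern declarations scattered across .c files bypass the type
checking a header provides and break when the code is built as a module.
An exported symbol makes the dependency explicit, which improves stability.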

-- Do not reinvent what is already implemented
Rationale: your code will not get accepted upstream. Your code will
not receive bug and security fixes when someone fixes upstream faults.
You will waste time porting it between kernel versions. I will write
a rant about you. If you need some functionality missing upstream,
please submit a patch for it to the mailing list and fix upstream instead
of copy-pasting.

-- Do not pollute the source code with dead #ifdef code or comments
left over from ancient proprietary VCSes. Do follow code conventions.
Rationale: this makes your code easier to understand and maintain.

-- Use public version control repositories.
Rationale: this allows third-party coders to find patches for specific
bugfixes and submit their improvements, and it generally reduces the complexity
of keeping up with upstream and improves code quality.


-- Do not try to hide the code.
Rationale: you will end up with a driver that never gets
upstream, enthusiasts will blame you for every deadly sin,
and you will have to fix your crap every release.

Some notes on drivers

ISP driver (image capture pipeline, video camera)
-- Use V4L2 API
Rationale: you can push the driver upstream, reuse camera sensor
drivers already upstream, reuse user-space libraries and video
processing applications.

Power supply (battery, chargers)
-- Use the pda_power driver.
Rationale: it provides most of the functionality custom charger drivers
reimplement, including monitoring usb/ac power supply and notifying battery
drivers.

Sound drivers
-- Use ALSA. Do not use custom ioctls.
Rationale: you will be able to reuse existing userland sound libraries and
kernel-level codec drivers.

So, what I want to see is a finished device, like a phone or a tablet, that
can work with the kernel from kernel.org without hackery and additional patches.