Sunday, September 7, 2014

GSoC results or how I nearly wasted summer

This year I got a chance to participate in the Google Summer of Code with the FreeBSD project.
I have long wanted to learn about the FreeBSD kernel internals and port it to some ARM board but was getting sidetracked whenever I tried to do it. So it looked like a perfect chance to stop procrastinating.

Feel free to scroll down to the "Other stuff" section if you don't feel like reading three paragraphs of the typical whining associated with debugging C code.

Initially I wanted to bring up some real piece of hardware. My choice was the Nexus 5 phone, which has a Qualcomm SoC. However, some of the FreeBSD developers suggested that I port the kernel to Android Emulator instead. It sounded like a sane idea: Android Emulator is available for the major OSes and is easy to set up, which makes it possible to expose more people to the project. Besides, unlike QEMU, it can emulate some peripherals specific to a mobile device, such as sensors.

What is Android Emulator? In fact, it is an emulator based on QEMU which emulates a virtual board called "Goldfish". The board includes a set of devices such as an interrupt controller, a timer and input devices. It is worth noting that Android Emulator is based on an ancient version of QEMU, though starting with Android L one can obtain emulator binaries from git that are based on a more recent version.


I started by implementing the basics: the interrupt controller, the timer and the UART driver. That allowed me to see the boot console, but then I got stuck for literally months with the kernel crashing at various points in what seemed like a totally random fashion. I spent nights single-stepping the kernel, with my eyes slowly turning from barely red to emitting hard radiation. It was definitely a problem with caching, but ultimately I gave up trying to fix it. Since I knew that the FreeBSD kernel worked fine on ARM in a recent version of QEMU, it was clearly a bug in Android Emulator, even though it did not manifest itself in Linux. Since Android Emulator is eventually going to be updated to a new QEMU base and I was running out of time, I decided to just use a nightly build of the emulator instead of the one shipped with the Android SDK, and that fixed this particular issue.
But hey! At least I've learnt about the FreeBSD virtual memory management and the UMA allocator the hard way.

Next up was fighting the timer and random freezes while initializing the random number generator (sic!). That was easy - it turns out I just forgot to call the function. Anyway, it would make sense to move that call out of the board files and make it for every board that does not define the "NO_EVENTTIMERS" option in the kernel config.

Fast forward to the start of August, when there were only a couple of days left and I still didn't have a working block device driver to boot the rootfs! Writing an MMC driver turned out to be very easy, and I started celebrating when I got the SD card image from the Raspberry Pi booting in Android Emulator and showing the login prompt.

It worked, but with a huge kludge in the OpenFirmware bindings. Somehow one of the functions which should have returned the driver bus name returned (seemingly) random crap, so instead of a NULL pointer check I had to add a check that the address points into the kernel VM range. I then tracked down the real source of the problem.
I was basing my MMC driver on the LPC SoC driver. The LPC driver itself is suspicious: for example, the place where DMA memory is allocated says "allocating Framebuffer memory", which may be an indicator that it was written in a hurry and possibly only barely tested. In the MMC bus attach routine, it was calling the "device_set_ivars" function and setting the ivars pointer to the MMC host structure. I have observed a similar pattern in some other drivers. In the OpenFirmware code, the ivars pointer was dereferenced as a completely different structure, and a string was taken from a member of that structure.

How the hell did it happen in a world where people are using the recent Clang compiler and compile with "-Wall" and "-Werror"? Uhh, well, if you're dealing with some kind of OOP in plain C, casting to void pointer and back and other ill-typed wicked tricks are inevitable. Just look at that prototype and cry with me:

>> void device_set_ivars(device_t dev, void *ivar);
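
To make the failure mode concrete, here is a minimal user-space sketch, not FreeBSD code. Every name in it (fake_device, ivar_header, fake_ofw_get_name and so on) is made up for illustration, and the tagging scheme is only one possible way to make such a mix-up fail loudly; it is an assumption, not something the FreeBSD tree does.

```c
#include <assert.h>
#include <stddef.h>

/* Everything below is a hypothetical user-space model, not FreeBSD code. */

/* Tag stored at the start of every structure that goes into the ivars slot. */
enum ivar_kind {
    IVAR_OFW_DEVINFO = 1,
    IVAR_MMC_HOST = 2
};

struct ivar_header {
    enum ivar_kind kind;
};

/* What the (imaginary) bus code expects to find in ivars. */
struct fake_ofw_devinfo {
    struct ivar_header hdr;   /* must stay the first member */
    const char *name;
};

/* What the (imaginary) MMC driver stores there instead. */
struct fake_mmc_host {
    struct ivar_header hdr;   /* must stay the first member */
    unsigned int f_min;
    unsigned int f_max;
};

/* A device with an untyped ivars slot, like the void * in the prototype. */
struct fake_device {
    void *ivars;
};

static void
fake_set_ivars(struct fake_device *dev, void *ivars)
{
    assert(dev != NULL);
    dev->ivars = ivars;
}

/*
 * Return the bus name, or NULL when ivars is unset or holds some other
 * structure. Without the tag check, the cast below would happily
 * reinterpret an MMC host as a devinfo and return garbage.
 */
static const char *
fake_ofw_get_name(const struct fake_device *dev)
{
    const struct ivar_header *hdr = dev->ivars;

    if (hdr == NULL || hdr->kind != IVAR_OFW_DEVINFO)
        return NULL;
    return ((const struct fake_ofw_devinfo *)hdr)->name;
}
```

With the tag in place, storing a fake_mmc_host with fake_set_ivars and then calling fake_ofw_get_name yields NULL instead of a wild pointer, so an ordinary NULL check is enough again. The compiler still cannot catch the mistake, since the slot remains a void pointer; the check only moves the failure to a place where it is easy to see.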

For now, I'm pushing the changes to the following branch. If you're willing to test, please note that I'm rebasing against master and force-pushing in the process.
https://github.com/astarasikov/freebsd/tree/android_goldfish_arm_master

So, what has been achieved and what am I still going to do? Well, debugging the MMU issues and the OpenFirmware bug delayed me substantially, so I have not done everything I had planned. Still:
  • Basic hardware works in Goldfish: Timer, IRQs, Ethernet, MMC, UART
  • It is using NEWCONS for Framebuffer Console (which is cool!)
  • I've verified that Android Emulator works in FreeBSD with linuxulator
  • I have fixed Android Emulator to compile natively on FreeBSD and it mostly works (at least the console output does), except for graphics, which I think should be an easy fix: there are a lot of places where "ifdef __linux__" is completely unjustified and should rather be enabled for all Unices

How to test and contribute?
You can run the Linux build of Android Emulator using Linuxulator. Unfortunately, Linuxulator only supports 32-bit binaries, so we cannot run the nightly binaries which have the MMU issues fixed.
Therefore, I have fixed the emulator to compile natively on FreeBSD! I've pushed a branch named "l-preview-freebsd" to GitHub. You will also need gtest: I cloned the repo from CyanogenMod and pushed a tag named "l-preview-freebsd", which is actually "cm-11.0".

Compiling the emulator:
>> git clone https://github.com/astarasikov/qemu.git
>> git clone https://github.com/astarasikov/android_external_gtest
>> cd android_external_gtest
>> git checkout cm-11.0
>> cd ../qemu
>> git checkout l-preview-freebsd
>> bash android-build-freebsd.sh


Please refer to the README file in git for detailed instructions on building the kernel.

The only thing that really bothers me about low-level and embedded stuff is that it is extremely difficult to estimate how long it may take to implement a certain feature, because every time you end up debugging obscure stuff and the "useful stuff you learn or do" / "time wasted" ratio is very small. On the bright side, while the *BSDs lag a bit behind Linux in terms of device drivers and performance optimizations, reading and debugging *BSD code is much easier than doing the same with GNU projects like EGLIBC, GCC and GLib.

Other stuff.
Besides GSoC, at the start of summer I had a chance to work with Chris Wade (who is an amazing hacker, by the way). We started working on an ARM virtualization project and spent some nice days debugging caching issues and instruction decoding while trying to emulate a particular Cortex-A8 SoC on a Cortex-A15 chip. Unfortunately, working on GSoC, going to a day job and doing this project simultaneously turned out to be surprisingly difficult, and I had to quit at least one of the activities. Still, I wish Chris good luck with this project, and if you're interested in virtualization and iPhones, sign up for an early demo at virtu.al

Meanwhile, I'm planning to learn the ARMv8 ISA. It's a pity there is no hackable hardware available at reasonable prices yet. QEMU works fine with VirtIO peripherals, though. But I'm getting increasingly worried about devkits costing more and more, which essentially makes it more difficult for a novice to become an embedded software engineer.