I hate software: March 2014

I've always wanted to try writing a toy compiler, but have not made myself actually learn the theory of parsing (I plan to do it and post some notes into the blog soon though). However, recently I've been playing with Clang and LLVM. I've not yet used it for compiling, but I want to share my experience of using and extending Clang's error detection tools.

LLVM is a specification of platform-independent bytecode and a set of tools aimed to make the development of JITed interpreters and portable native binaries easier. A significant portion of work for LLVM was done by Apple and nowadays it is widely used in industry. For example, NVIDIA uses it for compiling CUDA code, AMD uses it to generate shaders in its open-source driver. Clang is a parser and a compiler for a set of C-like languages, which includes C, C++ and Objective-C. This compiler has several properties that may be interesting for developers:

It allows you to traverse the parsed AST and transform it. For example, add or remove the curly brackets around if-else conditionals to troll your colleagues.
It allows you to define custom annotations via the __attribute__ extension which again can be intercepted after the AST is generated but is not yet compiled.
It supports nearly all the features of all revisions of C and C++ and is compatible with the majority of GCC compiler options which allows to use it as a drop-in replacement. By the way, FreeBSD has switched to Clang, and on Apple OS X gcc is actually a symlink to clang!

So, why should you care? Imagine how many times you wanted to add some cool feature but realized macros were not enough. Or you wanted to enforce some code convention? With clang one could easily write a static code analyzer that would catch the notorious Apple "double fail" bug. Which makes me wonder why they did not use their own technology :)

LLVM provides several frameworks for finding bugs at runtime. For example, AddressSanitizer and MemorySanitizer to catch access to uninitialized or unallocated memory.

I was given the following interesting problem at work: build some solution that would allow to detect where the application is leaking memory. Sounds like a common problem, with no satisfying answer.

Using Valgrind is prohibitively slow - a program running under it can be easily 50 times slower than without it.
Using named SLAB areas (like linux kernel does) is not an option. First of, in the worst case using SLAB means only half of the memory is available for the allocation. Secondly, such approach allows to know objects of what class are occupying the memory, but now where and why they were allocated
Using TCMalloc which hooks malloc/free calls also turned out to be slow enough to cause different behaviour in release and debugging environment, so some lightweight solution had to be designed.

Anyway, while thinking of a good way to do it, I found out that Clang 3.4 has something called LeakSanitizer (also lsan and liblsan) which is already ported to GCC 4.9. In short, it is a lightweight version of tcmalloc used in Google Perftools. It collects the information about memory allocations and prints leak locations when the application exits. It can use the LLVM symbolizer or GCC libbacktrace to print human-readable locations instead of addresses. However, it has some issues:

It has an explicit check in the __lsan::DoLeakCheck() function which disallows it to be called twice. Therefore, we cannot use it to print leaks at runtime without shutting down the process
Leak detection cannot be turned off when it is not needed. Hooked malloc/memalign functions are always used, and the __lsan_enable/disable function pair only controls whether statistics should be ignored or not.

The first idea was to patch the PLT/GOT tables in ELF to dynamically choose between the functions from libc and lsan. It is a very dirty approach, though it will work. You can find a code example at https://gist.github.com/astarasikov/9547918.

However, patching GOT we only divert the functions for a single binary, and we'd have to patch the GOT for each loaded shared library which is, well, boring. So, I decided to patch liblsan instead. I had to patch it either way, to remove the dreaded limitation in DoLeakCheck. I figured it should be safe to do. Though there is a potential deadlock while parsing ELF header (as indicated by a comment in lsan source), you can work around it by disabling leak checking in global variables.

What I did was to set up a number of function pointers to the hooked functions, initialized with lsan wrappers (to avoid false positives for memory allocation during libc constructors) and add two functions, __lsan_enable_interceptors and __lsan_disable_interceptors to switch between libc and lsan implementations. This should allow to use leak detection for both our code and third-party loadable shared libraries. Since lsan does not have extra dependencies on clang/gcc it was enough to stick a new CMakeLists.txt and it can now be built standalone. So now one can load the library with LD_PRELOAD and query the new functions with "dlsym". If they're present - it is possible to selectively enable/disable leak detection, if not - the application is probably using vanilla lsan from clang.

There are some issues, though

LSAN may have some considerable memory overhead. It looks like it doesn't make much sense to disable leak checking since the memory consumed by LSAN won't be reclaimed until the process exits. On the other hand, we can disable leak detection at application startup and only enable it when we need to trace a leak (for example, an app has been running continuously for a long time, and we don't want to stop it to relaunch in a debug configuration).
We need to ensure that calling a non-hooked free() on a hooked malloc() and vice-versa does not lead to memory corruption. This needs to be looked into, but it seems that both lsan and libc just print a warning in that case, and corruption does not happen (but a memory leak does, therefore it is impractical to repeatedly turn leak detection on and off)

We plan to release the patched library once we perform some evaluation and understand whether it is a viable approach.

Some ideas worth looking into may be:

Add a module to annotate and type-check inline assembly. Would be good for Linux kernel
Add a module to trace all pointer puns. For example, in Linux kernel and many other pieces of C code, casting to a void pointer and using the container_of macro is often used to emulate OOP. Now, using clang, one could possibly allow to check the types when some data is registered during initialization, casted to void and then used in some other function and casted back or even generate the intermediate code programmatically.
Automatically replace shared variables/function pointer calls with IPC messages. That is interesting if one would like to experiment with porting Linux code to other systems or turning Linux into a microkernel

I hate software

Tuesday, March 18, 2014

A story about clang tools