laa-laa | 12 November, 2008 10:36
A while ago my Symbian OS-based project needed to process RDF data: parse it, store it, query it and serialize it. For that purpose I decided to port a readily available library, and after some browsing around I ended up with the Redland RDF libraries. Redland had the features I needed, its LGPL 2.1/Apache 2.0 dual license was generous enough, and it was written in C without any complicated dependencies.
This post describes how I tested and fixed out-of-memory (OOM) robustness issues and memory leaks. I won't describe the porting process itself since it was relatively straightforward with P.I.P.S. libraries and there are already plenty of examples about porting code using P.I.P.S. and Open C.
Why test for OOM?
Redland, like many open source projects, grew up in unixland where memory is plentiful and supplemented by virtual memory. Programs are combined in shell scripting style, where individual programs run only for a short while and the process resources are then automatically freed. Devices based on Symbian OS, on the other hand, are still rather resource-constrained. Compared to desktop systems, there is only a little RAM and no virtual memory, so OOM errors are more likely to happen. In addition, the scripting architectural style is rarely used in Symbian OS, so program lifecycles are different from unixland. For example, in my project I planned to use Redland in long-running background servers. Leaking memory in a long-running background server is a sure route to memory allocation failures and all kinds of errors.
My target was therefore to make the library resilient to OOMs: when OOMs do occur, they are handled gracefully. No crashes. No incorrect results. No memory leaks.
OOM loop
There's already a rather well-established OOM testing technique on Symbian OS called the OOM loop. The basic idea is to inject allocation failures using the __UHEAP_SETFAIL() heap failure macro, which in turn uses User::__DbgSetAllocFail(), and then see how the code deals with the failures. John Pagonis writes extensively about the OOM loop construct in his Symbian Developer Network technical paper.
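To make the idea concrete, here is a minimal sketch of an OOM loop in portable C++. It is not Symbian code: test_alloc(), test_free(), set_fail_every(), concat_copy() and run_oom_loop() are all hypothetical names invented for this illustration. The failing allocator stands in for the __UHEAP_SETFAIL() mechanism, and the live-cell counter stands in for a heap leak check.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the heap failure tool.
static int g_fail_every = 0;   // fail every k-th allocation; 0 disables
static int g_alloc_count = 0;  // allocations attempted since the last reset
static int g_live_cells = 0;   // outstanding allocations, used as a leak check

static void set_fail_every(int k) { g_fail_every = k; g_alloc_count = 0; }

static void* test_alloc(std::size_t size) {
    ++g_alloc_count;
    if (g_fail_every > 0 && g_alloc_count % g_fail_every == 0) {
        return nullptr;  // simulated allocation failure
    }
    void* p = std::malloc(size);
    if (p) ++g_live_cells;
    return p;
}

static void test_free(void* p) {
    if (p) { --g_live_cells; std::free(p); }
}

// Hypothetical code under test: two allocations, both checked, and the
// first one released if the second one fails.
static char* concat_copy(const char* a, const char* b) {
    std::size_t la = std::strlen(a), lb = std::strlen(b);
    char* tmp = static_cast<char*>(test_alloc(la + 1));
    if (!tmp) return nullptr;
    std::memcpy(tmp, a, la + 1);
    char* result = static_cast<char*>(test_alloc(la + lb + 1));
    if (!result) { test_free(tmp); return nullptr; }
    std::memcpy(result, tmp, la);
    std::memcpy(result + la, b, lb + 1);
    test_free(tmp);
    return result;
}

// The OOM loop: run the operation once per failure rate k. Returns the
// number of iterations that reported OOM, or -1 if any iteration leaked
// memory or produced a wrong result.
static int run_oom_loop(int max_k) {
    int oom_iterations = 0;
    for (int k = 1; k <= max_k; ++k) {
        set_fail_every(k);
        char* r = concat_copy("foo", "bar");
        set_fail_every(0);
        if (r == nullptr) {
            ++oom_iterations;          // graceful failure is acceptable
        } else if (std::strcmp(r, "foobar") != 0) {
            test_free(r);
            return -1;                 // "almost correct" result is a bug
        }
        test_free(r);
        if (g_live_cells != 0) return -1;  // leak on this path
    }
    return oom_iterations;
}
```

Each iteration moves the injected failure to a later allocation, so every allocation site in the operation eventually gets its turn to fail.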
I started by implementing some integration test cases that exercised the Redland libraries in much the same way I was planning to use them in my real program. I then ran the test functions in an OOM loop with lots of iterations, for example using the EDeterministic heap failure mode to fail every kth allocation for k=1..2000. One issue I soon discovered was that the library code used abort() extensively for handling fatal errors such as memory allocation failures. For my purposes, terminating the process was not the kind of recovery strategy I was looking for, so I replaced the standard C library implementation of abort() with my own version that throws an exception with User::Leave(). These leaves could be caught in the library caller code and dealt with properly.
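The abort() replacement can be sketched roughly as follows. This is portable C++ rather than the actual porting layer: a plain C++ exception stands in for the leave raised by User::Leave(), and port_abort(), FatalLibraryError, lib_alloc_or_abort() and call_library_safely() are hypothetical names made up for the example.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical exception type standing in for a Symbian leave code.
struct FatalLibraryError {
    int code;
};

// Hypothetical error code for the sketch (Symbian's KErrNoMemory is -4).
const int kErrNoMemory = -4;

// Replacement for abort(): instead of terminating the process, unwind to
// whoever called into the library.
[[noreturn]] void port_abort() {
    throw FatalLibraryError{kErrNoMemory};
}

// Hypothetical library routine that treats allocation failure as fatal,
// the way the original code did by calling abort().
void* lib_alloc_or_abort(std::size_t size, bool simulate_failure) {
    void* p = simulate_failure ? nullptr : std::malloc(size);
    if (!p) port_abort();
    return p;
}

// Caller side: the fatal error is caught and turned into a normal
// failure result instead of a dead process.
bool call_library_safely(bool simulate_failure) {
    try {
        void* p = lib_alloc_or_abort(16, simulate_failure);
        std::free(p);
        return true;
    } catch (const FatalLibraryError&) {
        return false;
    }
}
```

Note that unwinding out of the middle of library code skips whatever cleanup the library would otherwise have done, which is one reason such paths tend to show up as leaks in the OOM loop.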
This enabled me to enter the following, somewhat test-driven bug fixing loop:
This way I was able to discover and fix literally hundreds of bugs in the libraries. Most of them were relatively simple failures to check the return code of some potentially failing function, simple memory leaks and so on. Some bugs were a little more complicated, for example, requiring design-level clarifications to object ownership passing rules.
Improved heap failure tool
The OOM loop approach described above also had its issues:
To counter these issues, I decided not to use the Symbian OS heap failure tool but to write my own. Fortunately for me, the Redland libraries only used a small set of memory management functions: malloc(), calloc(), realloc() and free(). No other allocating functions such as strdup() were used. This made it easy to implement my own versions of these functions in the porting layer DLL that already contained the abort() replacement. As the underlying implementations, I used User::Alloc(), User::AllocZ(), User::ReAlloc() and User::Free().
I decided to store the heap failure tool state information (failure mode, allocation counter, pseudorandom seed) in the DLL thread-local storage. I also added an OOM counter that keeps track of all allocation failures, simulated and real, together with some DLL API functions to set the heap failure parameters and to query or reset the OOM counter. For heap failure simulation, I didn't feel the need to implement all the heap failure modes supported by the native heap failure tool. I was happy with just the deterministic (EDeterministic) and pseudorandom (ERandom) modes.
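A rough sketch of such a tool, again in portable C++ and under several assumptions: all names (HeapFailState, heapfail_set(), port_malloc() and so on) are hypothetical, the C++ thread_local keyword stands in for DLL thread-local storage, std::malloc() and friends stand in for the User:: allocation functions, and the pseudorandom mode uses a simple linear congruential generator that was picked only for illustration.

```cpp
#include <cstddef>
#include <cstdlib>

enum FailMode { kFailNone, kFailDeterministic, kFailRandom };

// Per-thread tool state: failure mode, rate, allocation counter,
// pseudorandom seed and the OOM counter.
struct HeapFailState {
    FailMode mode = kFailNone;
    unsigned rate = 0;         // fail roughly one allocation in `rate`
    unsigned alloc_count = 0;
    unsigned seed = 1;
    unsigned oom_count = 0;    // simulated and real failures alike
};

static thread_local HeapFailState g_state;

// API for test code: configure failures, query and reset the OOM counter.
void heapfail_set(FailMode mode, unsigned rate, unsigned seed) {
    g_state.mode = mode;
    g_state.rate = rate;
    g_state.alloc_count = 0;
    g_state.seed = seed;
}
unsigned heapfail_oom_count() { return g_state.oom_count; }
void heapfail_reset_oom_count() { g_state.oom_count = 0; }

// Decide whether the next allocation should be failed artificially.
static bool should_fail() {
    if (g_state.rate == 0) return false;
    switch (g_state.mode) {
    case kFailDeterministic:   // every rate-th allocation fails
        return ++g_state.alloc_count % g_state.rate == 0;
    case kFailRandom:          // same seed gives the same failure pattern
        g_state.seed = g_state.seed * 1103515245u + 12345u;
        return (g_state.seed >> 16) % g_state.rate == 0;
    default:
        return false;
    }
}

// Count a failure whenever a non-empty request comes back as null.
static void* track(void* p, std::size_t requested) {
    if (!p && requested != 0) ++g_state.oom_count;
    return p;
}

// Replacements for the four functions the library uses.
void* port_malloc(std::size_t size) {
    return track(should_fail() ? nullptr : std::malloc(size), size);
}
void* port_calloc(std::size_t n, std::size_t size) {
    return track(should_fail() ? nullptr : std::calloc(n, size), n * size);
}
void* port_realloc(void* p, std::size_t size) {
    // On simulated failure the original block is left untouched,
    // matching realloc() semantics.
    return track(should_fail() ? nullptr : std::realloc(p, size), size);
}
void port_free(void* p) { std::free(p); }
```

Because real failures pass through the same track() call as simulated ones, the counter stays meaningful even when failure injection is switched off.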
This setup fully addresses the issues I had:
The OOM counter was also useful outside of testing. I could use it to invalidate the results of any operation during which the counter had changed, to make sure the system was not running in an inconsistent state and producing "almost correct" results.
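One way to use the counter like this is to take a snapshot before an operation and compare it afterwards. The sketch below assumes that pattern; oom_count(), note_oom(), count_rows() and checked_count_rows() are hypothetical names, and the global counter is a stand-in for the one kept by the porting layer.

```cpp
// Stand-in for the porting layer's OOM counter.
static unsigned g_oom_count = 0;
unsigned oom_count() { return g_oom_count; }
void note_oom() { ++g_oom_count; }

struct QueryResult {
    bool valid;
    int row_count;
};

// Hypothetical operation that swallows an allocation failure internally
// and returns a partial, "almost correct" answer.
// fail_at_row < 0 means no failure occurs.
int count_rows(int rows, int fail_at_row) {
    int counted = 0;
    for (int i = 0; i < rows; ++i) {
        if (i == fail_at_row) {
            note_oom();   // allocation failed while handling this row
            break;
        }
        ++counted;
    }
    return counted;
}

// Guarded call: if the counter moved during the operation, the result
// is discarded no matter what the operation itself reported.
QueryResult checked_count_rows(int rows, int fail_at_row) {
    unsigned before = oom_count();
    int n = count_rows(rows, fail_at_row);
    if (oom_count() != before) return {false, 0};
    return {true, n};
}
```

The check is deliberately coarse: it cannot tell which allocation failed, only that the result should not be trusted.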
Of course, all the bug fixes have been submitted back to the open source project to benefit the whole community.