<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="http://blogs.forum.nokia.com/styles/rss.css" type="text/css"?>
<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xmlns="http://my.netscape.com/rdf/simple/0.9/"
>

 <channel>
  <title>Lauri Aalto&#039;s Forum Nokia Blog</title>
  <link>http://blogs.forum.nokia.com/blog/lauri-aaltos-forum-nokia-blog</link>
  <description>A Forum Nokia Blog</description>
 </channel>
    <item>
   <title>Fixing out-of-memory issues in Redland RDF libraries</title>
   <description>&lt;p&gt;
A while ago my Symbian OS based project needed to process &lt;a href=&quot;http://en.wikipedia.org/wiki/Resource_Description_Framework&quot;&gt;RDF&lt;/a&gt; data: parse it, store it, query it and serialize it. For that purpose I decided to port a readily available library, and after some browsing around I settled on the &lt;a href=&quot;http://www.librdf.org/&quot;&gt;Redland RDF libraries&lt;/a&gt;. Redland had the features I needed, its LGPL 2.1/Apache 2.0 dual license was generous enough, and it was written in C without any complicated dependencies.
&lt;/p&gt;
&lt;p&gt;
This post describes how I tested and fixed out-of-memory (OOM) robustness issues and memory leaks. I won&#039;t describe the porting process itself, since it was relatively straightforward with the &lt;a href=&quot;http://wiki.forum.nokia.com/index.php/P.I.P.S&quot;&gt;P.I.P.S.&lt;/a&gt; libraries and there are already plenty of examples of porting code with P.I.P.S. and Open C.
&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;Why test for OOM?&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;
Redland, like many open source projects, grew up in unixland where memory is plentiful and supplemented by virtual memory. Programs are combined in shell scripting style, where each program runs only for a short while and its process resources are then freed automatically. Devices based on Symbian OS, on the other hand, are still rather resource-constrained. Compared to desktop systems, there&amp;rsquo;s only a little RAM and no virtual memory, so OOM errors are more likely to happen. The scripting architectural style is also rarely used on Symbian OS, so program lifecycles differ from those in unixland. For example, in my project I planned to use Redland in long-running background servers. Leaking memory in a long-running background server is a sure path to memory allocation failures and all kinds of errors.
&lt;/p&gt;
&lt;p&gt;
My target was therefore to make the library resilient to OOMs: when OOMs do occur, they are handled gracefully. No crashes. No incorrect results. No memory leaks.
&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;OOM loop&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;
There&#039;s already a rather well established OOM testing technique on Symbian OS called the OOM loop. The basic idea is to &lt;a href=&quot;http://en.wikipedia.org/wiki/Fault_injection&quot;&gt;inject&lt;/a&gt; allocation failures using the &lt;a href=&quot;http://www.symbian.com/developer/techlib/v9.2docs/doc_source/reference/reference-cpp/E32_EKA2/__UHEAP_SETFAILDefine.html&quot;&gt;__UHEAP_SETFAIL()&lt;/a&gt; heap failure macro, which in turn uses &lt;a href=&quot;http://www.symbian.com/developer/techlib/v9.2docs/doc_source/reference/reference-cpp/E32_EKA2/UserClass.html#%3a%3aUser%3a%3a__DbgSetAllocFail()&quot;&gt;User::__DbgSetAllocFail()&lt;/a&gt;, and then see how the code deals with them. John Pagonis &lt;a href=&quot;http://developer.symbian.com/wiki/pages/viewpage.action?pageId=432#EliminatingeMemoryLeaksinSymbianOSC%2B%2BProjects-2.BeingProactive&quot;&gt;writes extensively about the OOM loop construct&lt;/a&gt; in his Symbian Developer Network technical paper.
&lt;/p&gt;
&lt;p&gt;
I started by implementing some integration test cases that exercised the Redland libraries in much the same way I planned to use them in my real program. I then tried running the test functions in an OOM loop with lots of iterations, for example using the EDeterministic heap failure mode to fail every &lt;em&gt;k&lt;/em&gt;th allocation for &lt;em&gt;k&lt;/em&gt;=1..2000. One issue I soon discovered was that the library code used abort() extensively to handle fatal errors such as memory allocation failures. Terminating the process was not the kind of recovery strategy I was looking for, so I &lt;a href=&quot;http://en.wikipedia.org/wiki/Method_stub&quot;&gt;stubbed&lt;/a&gt; the standard C library implementation of abort() with my own version that throws an exception with &lt;a href=&quot;http://www.symbian.com/developer/techlib/v9.2docs/doc_source/reference/reference-cpp/E32_EKA2/UserClass.html#%3a%3aUser%3a%3aLeave()&quot;&gt;User::Leave()&lt;/a&gt;. These leaves could be caught in library caller code and dealt with properly.
&lt;/p&gt;
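&lt;p&gt;
Outside Symbian OS, a rough analogue of turning abort() into a catchable leave can be sketched in standard C with setjmp() and longjmp(). This is only an illustration under stated assumptions, not the actual porting layer code: the names porting_abort(), library_make_buffer() and call_library() are made up, and a single global recovery point is assumed where real code would need one per thread.
&lt;/p&gt;
<![CDATA[
```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: one global recovery point; real code would keep one per thread. */
static jmp_buf recovery_point;

/* Hypothetical abort() replacement: unwind to the caller instead of
   terminating the whole process. */
static void porting_abort(void) {
    longjmp(recovery_point, 1);
}

/* Stand-in for library code that treats an allocation failure as fatal. */
static char *library_make_buffer(size_t size, int simulate_failure) {
    char *p = simulate_failure ? NULL : malloc(size);
    if (p == NULL)
        porting_abort();   /* the original code would have called abort() here */
    return p;
}

/* Caller-side wrapper: returns 0 on success, -1 if the library "aborted". */
int call_library(int simulate_failure) {
    if (setjmp(recovery_point) != 0)
        return -1;         /* we arrive here through porting_abort() */
    char *buf = library_make_buffer(16, simulate_failure);
    free(buf);
    return 0;
}
```
]]>
&lt;p&gt;
Note that longjmp() runs no cleanup code on the way out, so anything the library allocated before the failure leaks unless it is tracked elsewhere.
&lt;/p&gt;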
&lt;p&gt;
This enabled me to enter the following, somewhat &lt;a href=&quot;http://en.wikipedia.org/wiki/Test-driven_development&quot;&gt;test-driven&lt;/a&gt; bug fixing loop:
&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;Write new test code or extend old tests. Run all tests. Repeat until some of the tests fail.&lt;/li&gt;
	&lt;li&gt;Fix any problems discovered.&lt;/li&gt;
	&lt;li&gt;Go back to step 1.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;
This way I was able to discover and fix literally hundreds of bugs in the libraries. Most of them were relatively simple failures to check the return code of some potentially failing function, simple memory leaks and so on. Some bugs were a little more complicated, for example, requiring design-level clarifications to object ownership passing rules.
&lt;/p&gt;
&lt;p&gt;
&lt;strong&gt;Improved heap failure tool&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;
The OOM loop approach described above also had its issues:
&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;It was hard to determine where to set heap failure limits (the maximum value of &lt;em&gt;k&lt;/em&gt;).&lt;/li&gt;
	&lt;li&gt;Not all allocation failures resulted in observable bugs, yet they could still leave the system under test running in a slightly inconsistent state.&lt;/li&gt;
	&lt;li&gt;Some complicated bugs were hard to debug because the allocation failure and the observed error were far removed from each other.&lt;/li&gt;
	&lt;li&gt;Some integration test cases would detect errors in dependent libraries (e.g. &lt;a href=&quot;http://www.sqlite.org/&quot;&gt;sqlite&lt;/a&gt; database or &lt;a href=&quot;http://xmlsoft.org/&quot;&gt;libxml2&lt;/a&gt; parser) that I was not interested in fixing.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;
To counter these issues, I decided not to use the Symbian OS heap failure tool but to write my own. Fortunately for me, the Redland libraries only used a small set of memory management functions: malloc(), calloc(), realloc() and free(). No other allocating functions such as strdup() were used. This made it easy to implement my own versions of these functions in the porting layer DLL that already contained the abort() replacement. As replacements, I used &lt;a href=&quot;http://www.symbian.com/developer/techlib/v9.2docs/doc_source/reference/reference-cpp/E32_EKA2/UserClass.html#%3a%3aUser%3a%3aAlloc%28%29&quot;&gt;User::Alloc()&lt;/a&gt;, &lt;a href=&quot;http://www.symbian.com/developer/techlib/v9.2docs/doc_source/reference/reference-cpp/E32_EKA2/UserClass.html#%3a%3aUser%3a%3aAllocZ%28%29&quot;&gt;User::AllocZ()&lt;/a&gt;, &lt;a href=&quot;http://www.symbian.com/developer/techlib/v9.2docs/doc_source/reference/reference-cpp/E32_EKA2/UserClass.html#%3a%3aUser%3a%3aReAlloc%28%29&quot;&gt;User::ReAlloc()&lt;/a&gt; and &lt;a href=&quot;http://www.symbian.com/developer/techlib/v9.2docs/doc_source/reference/reference-cpp/E32_EKA2/UserClass.html#%3a%3aUser%3a%3aFree%28%29&quot;&gt;User::Free()&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
I decided to store the heap failure tool state (failure mode, allocation counter, pseudorandom seed) in the DLL thread-local storage. I added an OOM counter that keeps track of all allocation failures, simulated and real, along with DLL API functions to set the heap failure parameters and to query or reset the OOM counter. For heap failure simulation, I didn&#039;t feel the need to implement all heap failure modes supported by the native heap failure tool. I was happy with just the deterministic (EDeterministic) and pseudorandom (ERandom) modes.
&lt;/p&gt;
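&lt;p&gt;
The tool described above can be approximated in portable C as follows. This is a minimal sketch rather than the real porting layer: the standard allocator stands in for the User:: functions, C11 _Thread_local stands in for DLL thread-local storage, and every name (oom_set_fail(), oom_count(), oom_reset_count(), the port_ prefixed wrappers, the OOM_MODE_ constants) is hypothetical.
&lt;/p&gt;
<![CDATA[
```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical failure modes, loosely mirroring EDeterministic and ERandom. */
typedef enum { OOM_MODE_NONE, OOM_MODE_DETERMINISTIC, OOM_MODE_RANDOM } oom_mode;

/* Per-thread tool state; _Thread_local stands in for DLL thread-local storage. */
static _Thread_local oom_mode g_mode = OOM_MODE_NONE;
static _Thread_local unsigned long g_rate = 0;        /* k: fail every k-th, or 1 in k */
static _Thread_local unsigned long g_alloc_count = 0; /* allocations since last setup */
static _Thread_local unsigned long g_oom_count = 0;   /* failures, simulated and real */
static _Thread_local unsigned long g_seed = 1;        /* pseudorandom state */

/* Set the failure parameters and restart the allocation counter. */
void oom_set_fail(oom_mode mode, unsigned long rate, unsigned long seed) {
    g_mode = mode;
    g_rate = rate;
    g_seed = seed;
    g_alloc_count = 0;
}

unsigned long oom_count(void) { return g_oom_count; }
void oom_reset_count(void) { g_oom_count = 0; }

/* Small linear congruential generator, so runs are reproducible from the seed. */
static unsigned long next_random(void) {
    g_seed = g_seed * 1103515245UL + 12345UL;
    return (g_seed >> 16) & 0x7fffUL;
}

/* Decide whether the current allocation should be failed artificially. */
static int should_fail(void) {
    g_alloc_count++;
    if (g_rate == 0)
        return 0;
    switch (g_mode) {
    case OOM_MODE_DETERMINISTIC: return g_alloc_count % g_rate == 0;
    case OOM_MODE_RANDOM:        return next_random() % g_rate == 0;
    default:                     return 0;
    }
}

/* Count every failure, simulated or real, and pass the pointer through.
   A NULL from a zero-size request is counted too; a real tool may want
   to special-case that. */
static void *note_result(void *p) {
    if (p == NULL)
        g_oom_count++;
    return p;
}

void *port_malloc(size_t size) {
    return note_result(should_fail() ? NULL : malloc(size));
}

void *port_calloc(size_t n, size_t size) {
    return note_result(should_fail() ? NULL : calloc(n, size));
}

void *port_realloc(void *p, size_t size) {
    return note_result(should_fail() ? NULL : realloc(p, size));
}

void port_free(void *p) { free(p); }
```
]]>
&lt;p&gt;
With oom_set_fail(OOM_MODE_DETERMINISTIC, 3, 0), for instance, the third, sixth, ninth and so on allocation made through the wrappers returns NULL, and oom_count() reports how many failures have been seen since the last reset.
&lt;/p&gt;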
&lt;p&gt;
This setup addressed all of the issues I had:
&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;A suitable upper bound for &lt;em&gt;k&lt;/em&gt; was reached in deterministic failure mode when the test function ran successfully, produced correct results and no OOMs were registered.&lt;/li&gt;
	&lt;li&gt;I could query the porting layer state for OOM failure count and see whether there had been any (undetected) OOM errors while running the test code.&lt;/li&gt;
	&lt;li&gt;On a simulated allocation failure, I could set the heap failure tool to issue a debugger breakpoint using the __BREAKPOINT() macro, i.e. the &amp;quot;int 3&amp;quot; assembly instruction on the x86/WINSCW emulator. This way I could quickly discover the root causes of errors occurring much later in the test case.&lt;/li&gt;
	&lt;li&gt;Dependent libraries were not affected since they were not using my heap failure tool.&lt;/li&gt;
&lt;/ol&gt;
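&lt;p&gt;
The first point in the list above, finding the upper bound for &lt;em&gt;k&lt;/em&gt;, can be sketched as a loop like the one below. It is an illustration only: test_malloc() is a cut-down stand-in for the heap failure tool, make_table() is a made-up function under test, and find_oom_bound() is a hypothetical driver, none of which come from the actual test suite.
&lt;/p&gt;
<![CDATA[
```c
#include <stddef.h>
#include <stdlib.h>

/* Cut-down stand-in for the heap failure tool: fail every k-th allocation. */
static unsigned long g_rate = 0, g_alloc_count = 0, g_oom_count = 0;

static void *test_malloc(size_t size) {
    void *p = NULL;
    g_alloc_count++;
    if (g_rate == 0 || g_alloc_count % g_rate != 0)
        p = malloc(size);
    if (p == NULL)
        g_oom_count++;     /* counts simulated and real failures alike */
    return p;
}

/* Hypothetical function under test: allocates a table of n rows.
   On any allocation failure it frees what it built and returns NULL. */
static char **make_table(size_t n) {
    char **rows = test_malloc(n * sizeof *rows);
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        rows[i] = test_malloc(16);
        if (rows[i] == NULL) {
            while (i > 0)
                free(rows[--i]);
            free(rows);
            return NULL;
        }
    }
    return rows;
}

static void free_table(char **rows, size_t n) {
    for (size_t i = 0; i < n; i++)
        free(rows[i]);
    free(rows);
}

/* OOM loop: returns the first k for which the test succeeds with no OOM
   registered. Returns 0 if max_k is reached first, or if the test
   "succeeded" although an OOM was registered (a swallowed failure). */
unsigned long find_oom_bound(size_t n, unsigned long max_k) {
    for (unsigned long k = 1; k <= max_k; k++) {
        g_rate = k;
        g_alloc_count = 0;
        g_oom_count = 0;
        char **rows = make_table(n);
        g_rate = 0;        /* stop injecting failures */
        if (rows != NULL) {
            unsigned long ooms = g_oom_count;
            free_table(rows, n);
            return ooms == 0 ? k : 0;
        }
    }
    return 0;
}
```
]]>
&lt;p&gt;
For a table of four rows the function under test makes five allocations, so the loop stops at &lt;em&gt;k&lt;/em&gt;=6, the first value at which no allocation is failed.
&lt;/p&gt;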
&lt;p&gt;
The OOM counter was also useful outside testing. I could use it to invalidate the results of any operation during which an OOM had been registered, to make sure the system was not running in an inconsistent state and producing &amp;quot;almost correct&amp;quot; results.
&lt;/p&gt;
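&lt;p&gt;
A caller-side guard built on the counter might look roughly like this. Again this is a sketch under assumptions: counted_malloc(), count_copied_items() and guarded_count() are invented for the example, and the g_fail_next flag exists only so that a failure can be simulated.
&lt;/p&gt;
<![CDATA[
```c
#include <stddef.h>
#include <stdlib.h>

static unsigned long g_oom_count = 0;  /* all allocation failures seen so far */
static int g_fail_next = 0;            /* demo-only hook: fail the next allocation */

static void *counted_malloc(size_t size) {
    void *p = NULL;
    if (g_fail_next)
        g_fail_next = 0;               /* simulate one failure */
    else
        p = malloc(size);
    if (p == NULL)
        g_oom_count++;
    return p;
}

/* Hypothetical library call with a latent bug: when an allocation fails it
   silently skips that item and returns an "almost correct" count. */
static int count_copied_items(const int *items, int n) {
    int copied = 0;
    for (int i = 0; i < n; i++) {
        int *copy = counted_malloc(sizeof *copy);
        if (copy == NULL)
            continue;                  /* failure swallowed here */
        *copy = items[i];
        copied++;
        free(copy);
    }
    return copied;
}

/* Guard: accept the result only if no OOM was registered during the call.
   Returns 0 and stores the result on success; returns -1 and leaves
   *result untouched otherwise. */
int guarded_count(const int *items, int n, int *result) {
    unsigned long before = g_oom_count;
    int r = count_copied_items(items, n);
    if (g_oom_count != before)
        return -1;                     /* result may be inconsistent: discard it */
    *result = r;
    return 0;
}
```
]]>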
&lt;p&gt;
Of course, all the bug fixes have been submitted back to the open source project to benefit the whole community.
&lt;/p&gt;</description>
   <link>http://blogs.forum.nokia.com/blog/lauri-aaltos-forum-nokia-blog/2008/11/12/fixing-out-of-memory-issues-in-redland-rdf-libraries</link>
      <pubDate>Wed, 12 Nov 2008 10:36:01 +0200</pubDate>   
  </item>
  </rdf:RDF>

