Using Dharma to rediscover Node.js out-of-band write in UTF8 decoder

A month ago, Node.js released a security update for a bug in V8's utf-8 decoder affecting Buffer to String conversions. Since numerous native functions for networking and I/O are affected, a malicious user could deliver a crafted input to crash a remote Node.js process. A truncated four-bytes sequence can be used to create a misalignment in the WriteUtf16Slow function, resulting in a segmentation fault. For more details on the actual vulnerability, have a look at the V8 patch and the original bug report.

Just after the release of the patch, I started experimenting with this vulnerability to create a proof-of-concept:


Almost around the same time, I noticed that Christoph Diehl from Mozilla published a grammar-based fuzzer named Dharma. The tool parses formal grammar definitions and generates test cases. Although the concept is not new, Mozilla released a neat implementation with great efficiency.


Can we rediscover the same bug using Dharma? 

As an excuse to play with Dharma, I decided to try to replicate the same Buffer vulnerability. In this post, I will guide you through the setup and execution.

First, we need to create a grammar to define Node's Buffer functions. From the official API doc, I started classifying all APIs in three categories: definitionspermutations (from Buffer to Buffer) and operations (from Buffer to other types).

Based on this model, all test cases will resemble the following template:


The resulting buffer.dg grammar has been merged in the official Github repository.

With Dharma, we can now generate test cases with a simple command:


At this point, we just need to execute our test cases and wait for the results. After trying a few different solutions, I ended up using a very simple bash script:


After leaving the fuzzer alone for the night, I came back in the morning to discover a multitude of core dumps. Hidden among thousands of V8::FatalProcessOutOfMemory and SIGILL Illegal instruction errors, I finally discovered a sample that was triggering something interesting.

Looking at the backtrace,  we can confirm that we're triggering the same vulnerability. If you're interested, I've uploaded the auto-generated test case.


Now what?!

Node.js Buffer provides a very powerful API with raw memory allocation capabilities. Ilja van Sprundel outlined some of the risks during a recent webcast, and the latest vulnerability was a clear demonstration of the possible outcomes. Having already spent a few hours on building the grammar, I expanded this little fuzzing exercise with the goal of discovering similar vulnerabilities. After a few days of generation/execution and over 400,000 test cases, I have yet to triggered another segmentation fault in Node.js' Buffer. Although this exercise doesn't give us a definitive assurance, it is probably a good sign of the maturity of the API. Nonetheless, grammar-based fuzzing is fun and can lead to interesting results.

0 comments: