[Dailydave] The difference between block-based fuzzing and AFL

Andrew Ruef munin at mimisbrunnr.net
Tue Sep 13 11:56:49 EDT 2016


The benefit of a tool like AFL is that it’s black-box with respect to the input format: you don’t need a grammar, you don’t need a complicated, rich, and deep specification of a protocol like RPC that encapsulates checksums, encryption, etc.

 

AFL (and fuzzers like it) have a strategy to work around that lack of a deep specification, though: just recompile your application to skip checksums and turn off encryption.
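It’s usually just a couple of lines of preprocessor, too. Here’s a minimal sketch of the trick, with made-up names (parse_packet, verify_crc32, parse_body, FUZZING_BUILD) standing in for whatever the real target does:

/* Hypothetical target code, sketching the "skip the checksum" trick.
 * verify_crc32() and parse_body() are stand-ins for the real thing. */
#include <stdint.h>
#include <stddef.h>

int verify_crc32(const uint8_t *data, size_t len, const uint8_t *expected);
int parse_body(const uint8_t *data, size_t len);

int parse_packet(const uint8_t *buf, size_t len)
{
    if (len < 8)
        return -1;

#ifndef FUZZING_BUILD        /* build the fuzzing target with -DFUZZING_BUILD */
    /* In production, reject anything whose trailing CRC doesn't match. */
    if (!verify_crc32(buf, len - 4, buf + len - 4))
        return -1;
#endif

    /* The attack surface you actually care about lives here either way. */
    return parse_body(buf, len - 4);
}

Fuzz the binary built with -DFUZZING_BUILD; ship the one built without it.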

 

Augh! It’s so cheesy! The indignity! You spend months of your life writing hand-crafted protocol emulators that deal with every corner case of a 500-page specification so you can fuzz the crap out of something, and some dude with a coverage-guided mutation fuzzer written in 200 lines of C code that doesn’t know anything about any protocol comes up behind you and says, “yeah, just comment out the checksum, bro.”

 

It seems to work, though? The bugs you find with the checksums/encryption/”computation” stripped out should exist with it in place too, so why not analyze the software without it? To synthesize a finding into something real you’ll need to do some by-hand work to make the checksums come out right or re-apply the encryption, but it seems unlikely that a bug you find will require an input that can’t be checksummed. It would be bad if your surgery on the program to remove checksums or encryption also removed other checks, like length checks or a filter, or otherwise damaged the generality of your analysis; the fix that AFL-like tools offer for this is basically “don’t do that.” Maybe that is the thing that, in your mind, buckets AFL into some kind of “specialized tool” category.
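For what it’s worth, that by-hand work is usually small too. Assuming the same made-up layout as above (a body followed by a 4-byte little-endian CRC-32 trailer), patching a crashing input so the unmodified binary accepts it is a few lines with zlib:

/* Recompute the trailing CRC-32 of a crash file found with the check
 * compiled out, so the stock binary accepts it. The layout is hypothetical:
 * [body][4-byte little-endian CRC-32]. Link with -lz. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <zlib.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s crash-file\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb+");
    if (!f) { perror("fopen"); return 1; }

    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    if (len < 4) { fclose(f); return 1; }

    uint8_t *buf = malloc((size_t)len);
    fseek(f, 0, SEEK_SET);
    if (fread(buf, 1, (size_t)len, f) != (size_t)len) { fclose(f); return 1; }

    /* Checksum everything except the trailer, then overwrite the trailer. */
    uint32_t crc = (uint32_t)crc32(0L, buf, (uInt)(len - 4));
    uint8_t trailer[4] = { (uint8_t)crc, (uint8_t)(crc >> 8),
                           (uint8_t)(crc >> 16), (uint8_t)(crc >> 24) };
    fseek(f, len - 4, SEEK_SET);
    fwrite(trailer, 1, 4, f);

    fclose(f);
    free(buf);
    return 0;
}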

 

And in fact, AFL has found bugs in straight-up RPC protocols like Cap’n Proto.

 

From: dailydave-bounces at lists.immunityinc.com [mailto:dailydave-bounces at lists.immunityinc.com] On Behalf Of Dave Aitel
Sent: Tuesday, September 13, 2016 8:34 AM
To: dailydave at lists.immunityinc.com
Subject: [Dailydave] The difference between block-based fuzzing and AFL

 

So let's take a quick break from thinking about how messed up Wassenaar is, or what random annoying thing the EFF or ACLU said about 0day today, and talk about fuzzers. AFL has everyone's mindshare, but I have to point out that it is still a VERY specialized tool.

 

The process of taking a file, sending it into some processing unit, and then figuring out if it crashes sounds easy and generic. But in practice you have to carefully optimize how you do it to get any kind of speed and effectiveness out of it.
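To make that concrete, here is roughly what an AFL persistent-mode harness looks like; parse_packet() is a hypothetical stand-in for the code under test, and most of the speed comes from amortizing the per-input fork across a loop:

/* Sketch of an AFL persistent-mode harness (build with afl-clang-fast).
 * parse_packet() is a hypothetical stand-in for the code under test. */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

int parse_packet(const uint8_t *buf, size_t len);   /* stand-in target */

int main(void)
{
    static uint8_t buf[1 << 16];
    ssize_t len;

#ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();                      /* defer the forkserver to this point */
    while (__AFL_LOOP(1000)) {         /* ~1000 inputs per forked process */
        len = read(0, buf, sizeof(buf));
        if (len > 0)
            parse_packet(buf, (size_t)len);
    }
#else
    /* Single-shot fallback when built with a normal compiler. */
    len = read(0, buf, sizeof(buf));
    if (len > 0)
        parse_packet(buf, (size_t)len);
#endif
    return 0;
}

And even with that, you end up fighting global state, slow startup, and blocking I/O before you get useful executions per second.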

 

This is another thing about the Cyber Grand Challenge: I think they optimized the problem set for AFL-like fuzzers by using that limited-system-call VM. I'm just going to assume none of the problems were complex RPC-like protocols, because we would have seen zero people solve them, and DARPA knows that.

 

What I mean is this: it is very hard to optimize the block-based fuzzing technique for automation. But the two kinds of fuzzers solve completely different types of problems.

 

AFL-like fuzzers excel at files for one reason: files don't do computation. SPIKE-like fuzzers excel at protocols because they are built to handle challenge-responses, size fields, checksums, encryption, and the other things common in network protocols. There are also minor differences in how they handle mutation. And of course, in many cases a SPIKE-like fuzzer is EASIER to set up and use than something like AFL, with less problem-specific optimization needed to get valuable results.
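To illustrate, here is a toy sketch (not the actual SPIKE or PEACH API, and the wire format is invented) of the part a block-based fuzzer does that a file fuzzer can't: after each mutation it re-derives the fields the protocol computes from the data, so the target doesn't reject the message before the interesting code ever runs.

/* Toy block-based iteration: mutate a field, then fix up the size field and
 * checksum. Invented wire format: [4-byte LE length][body][4-byte LE CRC-32].
 * Link with -lz for crc32(). */
#include <stdint.h>
#include <stdlib.h>
#include <zlib.h>

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* msg points at [length][body][crc]; body_len is the current body size. */
static void mutate_and_fixup(uint8_t *msg, size_t body_len)
{
    uint8_t *body = msg + 4;

    /* Dumb one-byte mutation; a real tool walks a whole strategy here. */
    body[(size_t)rand() % body_len] ^= (uint8_t)(1u << (rand() % 8));

    /* Re-derive the computed fields so the message still parses. */
    put_le32(msg, (uint32_t)body_len);
    put_le32(body + body_len, (uint32_t)crc32(0L, body, (uInt)body_len));
}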

 

But still, no comparison of a file-fuzzer to a block-based or protocol fuzzer (PEACH/SPIKE/CODENOMICON) is going to be apples to apples. It's more like apples to dragons.

 

-dave

 
