[Dailydave] The difference between block-based fuzzing and AFL

Michal Zalewski lcamtuf at coredump.cx
Wed Sep 14 03:15:40 EDT 2016


> AFL-like fuzzers excel at files for one reason: Files don't do computation.

I don't look at it this way.

To put it bluntly, the overriding principle behind AFL is that it
intentionally takes away choice and forces you to simplify problems
instead of complicating the test suite.

Quite often, that's the right thing to do, even if it *feels*
insulting or wrong to a pro. There are fuzzing frameworks that are
incredibly flexible and expressive, allowing you to create complex
protocol specs, fiddle with dozens of knobs, accommodate dozens of
process and I/O models, and so forth. When used just right, they are
great - but the flexibility comes at a great cost:

1) The tools require a ton of thoughtful configuration, often to the
tune of days per job. This goes against the main selling point of
fuzzing: it's supposed to be cheap and accessible to anyone. On a
cosmic scale... an amazing fuzzer that can be competently used by 100
people in the world is worth a lot less than a barely-competent fuzzer
that any developer can get running in 30 minutes or so.

2) Even the experts almost always taint the configuration: they make
*tons* of flawed assumptions about the underlying protocol (or the
fuzzing algorithms) and have no practical feedback loop to correct
their mistakes.

FWIW, the AFL approach of taking away choice is surprisingly useful
even against network protocols - heck, it found stuff in BIND,
OpenSSL, OpenSSH, nginx, dnsmasq, ntpd, etc. But yeah, it's a pretty
clear trade-off: there are situations where having more choice is
necessary, in which case, more expressive tools are the way to go. But
I look at them as a necessary evil: quite simply, we're too dumb to use
them right.

/mz

PS. For what it's worth, I've seen people build interesting "hybrid"
designs, where a shim would take a part of the AFL-generated output as
a "protocol script" encoding some of the metadata about how to feed
the rest of the output into the targeted binary. That works, oddly
enough.
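
To make the "hybrid" idea a bit more concrete, here is a rough sketch of
what such a shim's parsing step might look like. The layout is entirely
made up for illustration (it is not something AFL defines, and
split_script is a hypothetical name): the first byte picks a chunk count,
the next few bytes give chunk lengths, and the remainder is the payload
that gets carved up and fed to the target piece by piece.

```python
def split_script(data: bytes, max_chunks: int = 8) -> list[bytes]:
    """Split a fuzzer-generated blob into chunks, guided by its own header.

    Assumed (hypothetical) layout:
      byte 0        -> chunk count, reduced to the range 1..max_chunks
      next n bytes  -> one length byte (0-255) per chunk
      remainder     -> payload, carved up according to those lengths
    """
    if not data:
        return []

    n = data[0] % max_chunks + 1      # never trust the raw count
    lengths = data[1:1 + n]           # may be shorter than n on tiny inputs
    payload = data[1 + n:]

    chunks = []
    pos = 0
    for length in lengths:
        # Slicing past the end just yields a short or empty chunk,
        # so malformed headers cannot crash the shim itself.
        chunks.append(payload[pos:pos + length])
        pos += length
    return chunks


if __name__ == "__main__":
    # Header says: 2 chunks, of 2 and 3 bytes; payload is b"hello".
    for chunk in split_script(b"\x01\x02\x03hello"):
        print(chunk)
```

A real shim would then write each chunk to the target as a separate
packet or write() call. The point is that the fuzzer ends up mutating
the "script" bytes along with the payload, so it can explore different
ways of feeding the data without anyone hand-writing a protocol spec.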

