Coverset[1]: Optimizing Seed Selection For Fuzzing
Randomly mutating well-formed program inputs, or simply fuzzing, is a highly
effective and widely used strategy to find bugs in software. Other than showing
fuzzers find bugs, there has been little systematic effort in understanding
the science of how to fuzz properly. In this paper, we focus on how to
mathematically formulate and reason about one critical aspect in fuzzing: how
best to pick seed files to maximize the total number of bugs found during a
fuzz campaign. We design and evaluate six different algorithms using over 650
CPU days on Amazon Elastic Compute Cloud (EC2) to provide ground truth data.
Overall, we find 240 bugs in 8 applications and show that the choice of
algorithm can greatly increase the number of bugs found. We also show that
current seed selection strategies as found in Peach may fare no better than
picking seeds at random. We make our data set and code publicly available.
Installation
Download the source code of Coverset as well as the baseline data from
http://security.ece.cmu.edu/coverset/coverset.tar.bz2
The file is gigabytes in size, so please be kind and do not download it repeatedly.
After downloading the source code, simply follow the next three steps:
- Download and install the latest online release of VirtualBox. If
you face problems running the VM, try enabling VT-x and make sure you can
invoke VirtualBox before proceeding.
- Download and install the latest online release of Vagrant.
- Download, extract, and set up the Coverset package:
$ tar xjf coverset.tar.bz2
$ cd coverset
$ vagrant up
Quick Start
To connect to the VM, run:
$ vagrant ssh
vagrant$ cd /vagrant
The Coverset package contains the source code of our simulation framework and our minset computation tool:
- simulation: The simulation framework.
- minset: The source code of the minset computation tool.
- bin: This folder contains the minset computation tool and a script to parse the BFF output.
Our experimental data is also available in the same folder:
- results: Our results. Running the simulation framework will update this directory.
- crashers: The directory contains the results of our 650 CPU-day fuzzing campaign.
- coverage: The directory contains the coverage tarballs computed by the coverage pintool.
- seeds: The directory contains the seeds used during the simulation.
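To give a feel for what a minset computation does, here is a minimal sketch of a greedy, coverage-based seed selection in Python. It is an illustration only, not the code shipped in the minset folder: the function name greedy_minset, the dictionary input format, and the sample file names and block IDs are all assumptions made for this example. The idea is to repeatedly pick the seed that covers the most not-yet-covered code blocks, and to stop once no seed adds new coverage.

```python
from typing import Dict, List, Set


def greedy_minset(coverage: Dict[str, Set[int]]) -> List[str]:
    """Greedily pick seeds until no remaining seed adds new coverage.

    `coverage` maps a seed file name to the set of code-block IDs it
    exercises. Both the name and the input format are hypothetical.
    """
    covered: Set[int] = set()
    chosen: List[str] = []
    remaining = dict(coverage)
    while remaining:
        # Pick the seed contributing the most uncovered blocks; iterating
        # in sorted order makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        gain = remaining.pop(best) - covered
        if not gain:
            break  # every remaining seed is redundant
        chosen.append(best)
        covered |= gain
    return chosen


# Hypothetical coverage data for four seed files.
example = {
    "a.pdf": {1, 2, 3},
    "b.pdf": {3, 4},
    "c.pdf": {1, 2},
    "d.pdf": {4, 5, 6, 7},
}
selected = greedy_minset(example)
print(selected)
```

In this toy input, two of the four seeds already reach every block that the full set reaches, so the other two would be dropped from the fuzzing campaign. Greedy selection is a standard approximation for set cover and does not guarantee the smallest possible set.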
Reference
[1] Alexandre Rebert, Sang Kil Cha, Thanassis Avgerinos, Jonathan Foote,
David Warren, Gustavo Grieco, and David Brumley. Optimizing Seed Selection
for Fuzzing. In Proc. of the 23rd USENIX Security Symposium, 2014.