Detecting Incorrect Build Rules
Automated build systems are routinely used by software engineers to minimize the number of objects that need to be recompiled after incremental changes to the source files of a project. In order to achieve efficient and correct builds, developers must provide the build tools with dependency information between the files and modules of a project, usually expressed in a macro language specific to each build tool. Most build systems offer good support for well-known languages and compilers, but as projects grow larger, engineers tend to include source files generated using custom tools. In order to guarantee correctness, the authors of these tools are responsible for enumerating all the files whose contents an output depends on. Unfortunately, this is a tedious process and not all dependencies are captured in practice, which leads to incorrect builds. We automatically uncover such missing dependencies through a novel method that we call build fuzzing. The correctness of build definitions is verified by modifying files in a project, triggering incremental builds and comparing the set of changed files to the set of expected changes. These sets are determined using a dependency graph inferred by tracing the system calls executed during a clean build. We evaluate our method by exhaustively testing build rules of open-source projects, uncovering issues leading to race conditions and faulty builds in 31 of them. We provide a discussion of the bugs we detect, identifying anti-patterns in the use of the macro languages. We fix some of the issues in projects where the features of build systems allow a clean solution.