C/C++ programs often use conditional compilation to implement variations of a program. While conditional compilation is extremely flexible and easy to use, it leads to code that is hard to maintain. Using examples from open-source systems, this post demonstrates why such code is often referred to as the »#ifdef Hell« and what can be done to keep conditional compilation in check.
C and C++ (like many other languages) feature a preprocessor that prepares the source code before it is handed to the actual compiler. This preprocessor offers features like file inclusion, macro expansion and conditional compilation. Conditional compilation makes it possible to exclude parts of a source code file from compilation by the C/C++ compiler, depending on whether a condition is met. To support this, the preprocessor provides a set of if-else directives.
An example is shown in the figure below (taken from the Firefox source code but actually an arithmetic module provided by IBM¹). The code between the #if and #endif directives will only be compiled if the macro DECSUBSET is defined, making the code use a special arithmetic subset defined in ANSI X3.274. Otherwise the code in the red frame will not be compiled (and, hence, never executed).
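For readers who cannot see the figure, the mechanism can be sketched in a few lines. This is not the IBM code; the macro `USE_SATURATION` and the function `add_values` are made up for illustration, and the limit of 100 is arbitrary.

```c
/* Hypothetical feature switch. In a real project it would usually be set by
 * the build, e.g. with the compiler option -DUSE_SATURATION=1. */
#ifndef USE_SATURATION
#define USE_SATURATION 0
#endif

/* Adds two values; the saturation step only exists in some variants. */
int add_values(int a, int b) {
    int sum = a + b;
#if USE_SATURATION
    /* Compiled only when USE_SATURATION is non-zero. In all other
     * variants the compiler never sees these lines. */
    if (sum > 100) {
        sum = 100;
    }
#endif
    return sum;
}
```

Built as is, `add_values(60, 70)` returns 130. Built with `-DUSE_SATURATION=1`, the same call returns 100, although the source file has not changed.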
In many systems this mechanism is used to express variations that cannot or should not be handled at runtime. Types of variation include:
Conditional compilation is extremely flexible and, hence, often used to implement variants. However, it creates several challenges in the code and in the architecture as well as for testing.
The code easiest to understand is code without any conditions; just one statement after the other. It is easy to understand because for each statement it is crystal clear under which condition it is executed: always! Conversely, every condition, e.g. an if or while statement, makes code harder to understand. This is true for »normal« conditions that are evaluated at runtime as well as for compile-time conditions used by conditional compilation.
However, compile-time conditions add a completely new layer of complexity as they are not part of the programming language, e.g. C++, but part of the preprocessor language. In particular, if runtime and compile-time conditions are intermixed (as in the example above), the reader of the code always has to keep track of the two language layers. Hence, the mental load for understanding code using conditional compilation is high.
As the preprocessor defines a language of its own, one can easily create source code artifacts that are valid w.r.t. the preprocessor but not w.r.t. the C language. This, however, is only discovered after the preprocessing step. Moreover, preprocessor conditions can be used literally everywhere, even within a runtime condition, which obviously makes the code more difficult to understand.
This is illustrated by Example #2, again taken from Firefox (inflate.c²), where a compile-time condition is used to »inject« a ternary expression into a runtime condition. Liebig and other researchers refer to this type of compile-time condition as an undisciplined preprocessor annotation.
Example #3 is taken from the Linux source code (atariNCR5380.c³) and shows how difficult things can become in code that uses lots of compile-time conditions.
In the simplest case a system doesn’t have any variation, i.e. all customers get exactly the same binaries. Even then, lifecycle management is non-trivial as usually different versions of these binaries float around. If you add variants, however, things quickly become a lot more complex because you are essentially maintaining not one but multiple systems.
This becomes most obvious in the area of testing as you have to test not one but multiple systems (see below). All this is still manageable as long as you are dealing with few, clearly defined variants. If you face a proliferation of variants, however, things can easily get out of control.
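To make the term more tangible, here is a small sketch of an undisciplined annotation next to a disciplined alternative. It is not the inflate.c code; `STRICT_LIMIT`, `in_range`, `below_limit` and `in_range_disciplined` are hypothetical names chosen for this illustration.

```c
/* Undisciplined: the directive cuts through the middle of an expression,
 * so neither branch is a complete statement or declaration on its own. */
int in_range(int value, int limit) {
    if (value >= 0 &&
#ifdef STRICT_LIMIT
        value < limit
#else
        value <= limit
#endif
       ) {
        return 1;
    }
    return 0;
}

/* Disciplined: each branch of the directive wraps a complete function. */
#ifdef STRICT_LIMIT
static int below_limit(int value, int limit) { return value < limit; }
#else
static int below_limit(int value, int limit) { return value <= limit; }
#endif

/* The runtime condition itself no longer contains any directive. */
int in_range_disciplined(int value, int limit) {
    return value >= 0 && below_limit(value, limit);
}
```

Both functions behave identically in every variant. The second form still uses conditional compilation, but the variation is confined to one small, syntactically complete unit.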
Hence, variation should not simply happen but be a central architectural concept that is as explicit as possible. Also, variation points should be few and limited to certain places in a system. Conditional compilation, however, can be used everywhere and, hence, is bound to grow in an uncontrolled manner. To illustrate this, the following treemap shows the source code of Firefox version 41.0.2. Each rectangle in the treemap symbolizes a .c or .cpp file. The size of the rectangle reflects the size of the file measured in lines of code. The colors in the treemap show if a file uses conditional compilation (red) or not (green)*.
The analyzed code comprises 9,608 files that contain about 5.6 million lines of code (MLOC). Of these, almost 3,000 files contain conditional compilation. These files contain about 3.5 MLOC, i.e. more than 60% of the code. As the treemap illustrates, conditional compilation is not limited to specific parts of the system but almost omnipresent.
I did not perform a historical study for the system and, hence, do not know if the amount of conditional compilation grew over time. However, my experience is that it is hard to concentrate conditional compilation in specific parts of the system. If a system uses conditional compilation to the extent the example above does, variation obviously becomes hard to handle as it becomes virtually impossible to actively manage the variants of the system.
Next to the effects on code understanding and architecture, conditional compilation also poses a major challenge for testing. If you attempt to cover all the paths in a piece of code that has compile-time conditions, you have to create a test setup that compiles the code for all combinations of the conditions (if there are finitely many). And for each combination you have to execute all the test cases that cover the runtime conditions. This makes testing a lot harder as the setup required for re-compiling the variants is complicated. Moreover, the additional compilation steps take time and thereby lengthen the overall test execution time.
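A rough sketch shows how quickly the number of builds grows. It assumes that all compile-time conditions are independent boolean switches; the three macro names are invented, and the functions are only meant as a starting point for a test driver that compiles each variant.

```c
#include <stdio.h>

/* Hypothetical, independent on/off switches of a system. */
static const char *const FLAGS[] = { "FEATURE_A", "FEATURE_B", "FEATURE_C" };
enum { FLAG_COUNT = sizeof FLAGS / sizeof FLAGS[0] };

/* n independent boolean switches yield 2^n configurations to build. */
unsigned long configuration_count(void) {
    return 1UL << FLAG_COUNT;
}

/* Writes the -D options of configuration `index` into `out` (size >= 1).
 * Bit i of `index` decides whether FLAGS[i] is switched on. */
char *configuration_options(unsigned long index, char *out, size_t size) {
    size_t used = 0;
    out[0] = '\0';
    for (int i = 0; i < FLAG_COUNT; i++) {
        int enabled = (int)((index >> i) & 1UL);
        int n = snprintf(out + used, size - used, "%s-D%s=%d",
                         i > 0 ? " " : "", FLAGS[i], enabled);
        if (n < 0 || (size_t)n >= size - used) {
            break; /* buffer too small: stop with a truncated string */
        }
        used += (size_t)n;
    }
    return out;
}
```

With three switches there are already 8 builds, with ten switches 1,024, and the complete runtime test suite has to run against each of them.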
While the list of problems associated with conditional compilation is already quite long, there’s more to come:
I hope the above convinced you that conditional compilation is a problem for software maintenance. But what to do about it? From my point of view, the answer strongly depends on your goals and the position of your system in its lifecycle. Hence, the following paragraphs outline strategies for avoiding, managing and removing conditional compilation.
When implementing new systems, I strongly advocate an implementation of variations without conditional compilation. This doesn’t mean that all variations have to be managed at runtime. If compile-time management of variations is required, however, it should be at a coarse-grained level, e.g. by including or excluding complete files. This can be managed with the build system or the now common dependency injection containers (which technically introduce load-time variants).
The big question, however, is how to deal with grown systems that already exhibit a fair amount of conditional compilation. In my experience, it is rarely possible to sit down and fully re-engineer the system for several reasons. First, there is usually no time and budget to do this. Second, this type of re-engineering is highly error-prone, particularly if there are no good automatic tests. Hence, I propose an approach to manage the existing conditional compilation. This management should cover the following aspects:
With such a management of conditional compilation, one cannot improve the quality of the code, but one can ensure that things don’t deteriorate further, which is a lot better than doing nothing about the problem at all. If the system is expected to live a lot longer, one should seriously think about complementing the management strategy with an actual improvement strategy. For this, I propose the following steps:
Conditional compilation is a technique that is widespread in C/C++ and other languages that feature a preprocessor. Nevertheless, it is known to make systems hard to maintain. To deal with it, one needs to define clear rules and must ensure that these rules are adhered to. Removing conditional compilation from systems is hard but achievable and worthwhile if the system is expected to be long-lived. If you are faced with maintaining a system infested with conditional compilation and don’t know how to go forward with it, please don’t hesitate to contact me. We at CQSE will not be able to magically solve your problem but we provide an in-depth analysis of your system and the right tools to manage conditional compilation in large and grown code bases. This may help you and your team to survive the #ifdef hell.
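As a minimal sketch of what file-level variation could look like, assume a checksum routine that differs between two product variants. All file and function names here are hypothetical. The block shows the content of several small files in one listing; the second implementation is shown as a comment because only one of the two files is ever linked into a given variant.

```c
#include <string.h>

/* ---- checksum.h: the single interface that every variant implements ---- */
unsigned checksum(const unsigned char *data, size_t length);

/* ---- checksum_sum.c: linked into variant 1 only ---- */
unsigned checksum(const unsigned char *data, size_t length) {
    unsigned result = 0;
    for (size_t i = 0; i < length; i++) {
        result += data[i];
    }
    return result;
}

/* ---- checksum_xor.c: linked into variant 2 instead of checksum_sum.c ----
 *
 * unsigned checksum(const unsigned char *data, size_t length) {
 *     unsigned result = 0;
 *     for (size_t i = 0; i < length; i++) {
 *         result ^= data[i];
 *     }
 *     return result;
 * }
 */

/* ---- message.c: client code, identical in all variants, no #ifdef ---- */
unsigned message_checksum(const char *text) {
    return checksum((const unsigned char *)text, strlen(text));
}
```

The build then decides which variant is produced, for example `cc message.c checksum_sum.c` for one variant and `cc message.c checksum_xor.c` for the other. The variation point is a single, explicit place in the build description rather than directives scattered over the code.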
Conditional compilation is a topic that has been well researched over the last couple of years. If you are interested in the scientific background, I can recommend the following papers that also served as inspiration when writing this post:
¹ Copyright © IBM Corporation, 2000–2012. All rights reserved. This software is made available under the terms of the ICU License – ICU 1.8.1 and later.
² Copyright © 1995–2012 Mark Adler
³ Copyright © 1993, Drew Eckhardt
* To determine the color, each source file was searched for the #endif preprocessor directive. The search was performed on the token stream, so comments containing the string »#endif« were excluded. If a file contains two or more #endifs, it is colored red. If it contains zero or one, it is colored green. Treating only files with at least two #endifs as containing conditional compilation is a simple analysis heuristic to exclude include guards from the analysis. Overall, 34,843 instances of the #endif preprocessor directive were found in the Firefox source code.
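The heuristic can be approximated with a few lines of C. Note that this is a simplified, line-based sketch and not the token-based analysis used for the treemap: it ignores line comments because they do not start with `#`, but it does not strip block comments. The function names are made up for this post.

```c
#include <ctype.h>
#include <string.h>

/* Counts lines whose first non-blank character is '#', followed by the
 * directive name "endif". Simplification: block comments are not removed. */
int count_endif_directives(const char *source) {
    int count = 0;
    const char *p = source;
    while (*p != '\0') {
        /* Skip leading blanks of the current line. */
        while (*p == ' ' || *p == '\t') p++;
        if (*p == '#') {
            p++;
            /* Blanks are allowed between '#' and the directive name. */
            while (*p == ' ' || *p == '\t') p++;
            if (strncmp(p, "endif", 5) == 0 &&
                !isalnum((unsigned char)p[5]) && p[5] != '_') {
                count++;
            }
        }
        /* Advance to the start of the next line. */
        while (*p != '\0' && *p != '\n') p++;
        if (*p == '\n') p++;
    }
    return count;
}

/* Red (1) for two or more #endifs, green (0) otherwise, so that a file
 * whose only #endif closes an include guard stays green. */
int uses_conditional_compilation(const char *source) {
    return count_endif_directives(source) >= 2;
}
```

A header that contains nothing but an include guard yields a count of one and is classified as green; any additional #if/#endif pair turns the file red.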