Compilation
In C, transforming source code into an executable program involves several distinct stages. Understanding this process helps you manage larger projects, debug effectively, and create more maintainable code.
Overview of the Compilation Processβ
When you compile a C program, it goes through three main phases:
- Preprocessing: Handles directives like
#include
,#define
, and conditional compilation - Translation/Compilation: Converts preprocessed code into object files
- Linking: Combines object files and libraries into an executable
Stage | Input | Output |
---|---|---|
Source file | file.c | β |
Preprocessing | Source + #include /macros | Translation unit (TU) |
Compilation | Translation unit | Object file (.o ) |
Archiving (opt.) | One or more object files | Library (.a , .so ) |
Linking | Object files + libraries | Executable |
The Preprocessorβ
The preprocessor is the first phase in the compilation process. It handles directives that begin with #
and manipulates the source code before actual compilation begins.
#include Directiveβ
The #include
directive tells the preprocessor to insert the contents of another file into the current file:
#include <stdio.h> // Search in standard system directories
#include "myfile.h" // Search first in current directory, then system directories
This mechanism allows you to reuse code across multiple files, making programs more maintainable.
Macro Definitions with #defineβ
The #define
directive creates macros, which are text substitutions performed before compilation:
// Simple constant macros
#define PI 3.14159
#define MAX_SIZE 100
// Parameterized macros (use with caution)
#define SQUARE(x) ((x) * (x))
When using parameterized macros, always wrap parameters in parentheses to avoid unexpected behavior:
// Bad: without parentheses
#define BAD_SQUARE(x) x * x
// Problem example:
int a = BAD_SQUARE(3 + 1); // Expands to 3 + 1 * 3 + 1, which equals 7, not 16!
// Good: with parentheses
#define GOOD_SQUARE(x) ((x) * (x))
// Works as expected:
int b = GOOD_SQUARE(3 + 1); // Expands to ((3 + 1) * (3 + 1)), which equals 16
Even with proper parentheses, macros can still cause issues with side effects:
int x = 5;
int y = GOOD_SQUARE(++x); // x is incremented twice!
For these reasons, inline functions are often preferable to parameterized macros in modern C code.
Conditional Compilationβ
The preprocessor allows you to include or exclude blocks of code based on conditions:
#if defined _WIN32
#include <windows.h>
// Windows-specific code
#elif defined __APPLE__
#include <MacOS.h>
// macOS-specific code
#elif defined __linux__
#include <linux/module.h>
// Linux-specific code
#else
#error "Unsupported platform"
#endif
Common directives for conditional compilation include:
#if
,#elif
,#else
,#endif
: General conditional blocks#ifdef
/#ifndef
: Shortcuts for#if defined
/#if !defined
#error
: Generates a compiler error with a message#pragma once
: Non-standard but widely supported directive for include guards
Include Guardsβ
Include guards prevent a header file from being included multiple times in a translation unit, which can cause redefinition errors:
/* file: myheader.h */
#ifndef MYHEADER_H
#define MYHEADER_H
// Header content goes here
struct Point {
double x;
double y;
};
// Function declarations
extern double distance(struct Point a, struct Point b);
#endif /* MYHEADER_H */
Always use include guards in header files. The naming convention is typically the filename in uppercase with dots replaced by underscores.
Separate Compilationβ
Separate compilation allows you to split a program into multiple source files that can be compiled independently and then linked together.
Declarations vs. Definitionsβ
In C, understanding the difference between declarations and definitions is crucial for separate compilation:
- Declaration: Introduces an identifier and its type without creating it or allocating storage
- Definition: Creates the entity and may allocate storage for it
// Declaration (no storage allocated)
extern int counter;
int calculate(int a, int b);
// Definition (storage allocated)
int counter = 0;
int calculate(int a, int b) {
return a + b;
}
Header and Source Filesβ
A common organization pattern in C projects:
- Header files (.h): Contain declarations, include guards, and possibly inline functions
- Source files (.c): Contain definitions and implement the functionality declared in headers
/* math_utils.h */
#ifndef MATH_UTILS_H
#define MATH_UTILS_H
extern double square_root(double x);
extern double power(double base, int exponent);
#endif /* MATH_UTILS_H */
/* math_utils.c */
#include "math_utils.h"
double square_root(double x) {
// Implementation
}
double power(double base, int exponent) {
// Implementation
}
Proper Header File Organizationβ
When creating and using header files:
- Always include the corresponding header at the top of its
.c
file - Use include guards in all header files
- Include only what is necessary
- Declare functions and variables in header files, define them in source files
/* Example of good header file organization */
/* geometry.h */
#ifndef GEOMETRY_H
#define GEOMETRY_H
// Include required headers
#include <stddef.h> // For size_t
// Constants
#define PI 3.14159265358979323846
// Type definitions
typedef struct {
double x;
double y;
} Point;
// Function declarations
extern double distance(Point a, Point b);
extern double area_of_circle(double radius);
#endif /* GEOMETRY_H */
Object Files and Linkingβ
Object Filesβ
An object file (.o
or .obj
) is the output of the compilation phase before linking. It contains:
- Machine code for the compiled source file
- A symbol table with information about functions and variables
- Relocation information for linking with other object files
Object files cannot be executed directly because they may have unresolved references to symbols defined in other files.
The Linking Processβ
The linker combines multiple object files into an executable by:
- Resolving external references between object files
- Combining sections (code, data, etc.)
- Assigning final addresses to all symbols
- Generating the executable file format
main.o:
- Defined: main
- Referenced: calculate
math.o:
- Defined: calculate
- Referenced: (none)
β Linker β executable
Static vs. Dynamic Linkingβ
Static Linkingβ
In static linking, all library code is copied into the final executable:
[App Object Files] + [Library Object Files] β [Complete Executable]
Advantages:
- Executable runs without dependencies on library versions
- Potentially faster startup time
Disadvantages:
- Larger executable size
- Updates require recompilation
Dynamic Linkingβ
With dynamic linking, the executable contains references to shared libraries that are loaded at runtime:
[App Object Files] β [Executable with References] + [Shared Libraries]
Advantages:
- Smaller executable size
- Libraries can be updated independently
- Libraries can be shared between applications
Disadvantages:
- Dependencies on correct library versions being available
- Slightly slower startup time
On Linux, shared libraries typically have the .so
(shared object) extension. On Windows, they use .dll
(dynamic-link library), and on macOS, they often use .dylib
(dynamic library).
Build Automation with Makeβ
We've already covered Makefile here, but let's review the basics and add a few more things that will be useful later.
For larger projects with multiple source files, manually compiling each file becomes tedious. The make
utility automates this process.
A Makefile
contains rules that specify how to derive target files from source files, and is structured like this:
target: prerequisites
command
command
...
- The
target
is the file to be created prerequisites
are the files that the target depends oncommands
are the actions needed to create the target (must be indented with a tab)
Example Makefileβ
# Compiler and flags
CC = gcc
CFLAGS = -Wall -g
# Target executable
TARGET = myprogram
# Object files
OBJS = main.o utils.o math_funcs.o
# Default target
all: $(TARGET)
# Linking rule
$(TARGET): $(OBJS)
$(CC) $(CFLAGS) -o $(TARGET) $(OBJS)
# Compilation rules
main.o: main.c utils.h math_funcs.h
$(CC) $(CFLAGS) -c main.c
utils.o: utils.c utils.h
$(CC) $(CFLAGS) -c utils.c
math_funcs.o: math_funcs.c math_funcs.h
$(CC) $(CFLAGS) -c math_funcs.c
# Clean target (phony)
.PHONY: clean
clean:
rm -f $(OBJS) $(TARGET)
- Compile only (source to object):
gcc -c file.c
: generatesfile.o
without linking. - Link only (object to executable):
gcc -o program file.o
: linksfile.o
into theprogram
executable. - Compile & link in one step:
gcc file.c -o program
: compilesfile.c
and immediately links intoprogram
.
Implicit rules and Pattern rulesβ
Make has built-in implicit rules. For example, it knows how to create .o
files from .c
files:
CC = gcc
CFLAGS = -Wall -g
TARGET = myprogram
OBJS = main.o utils.o math_funcs.o
all: $(TARGET)
$(TARGET): $(OBJS)
$(CC) $(CFLAGS) -o $@ $^
# No explicit compilation rules needed due to implicit rules
.PHONY: clean
clean:
rm -f $(OBJS) $(TARGET)
Makefile Variables and Automatic Variablesβ
$@
: Represents the target filename$<
: Represents the first prerequisite filename$^
: Represents all prerequisites
output: input1.c input2.c
gcc -o $@ $^
# Equivalent to: gcc -o output input1.c input2.c
Make doesnβt know anything about the internals of C compilation, it only compares file modification times. A rule is reβrun if its target is older than any listed prerequisite. If you forget to list a header or source file, changes to it wonβt trigger rebuilding. Always declare every dependency in your Makefile to keep builds correct.
Best Practices for Project Organizationβ
Recommended Directory Structureβ
For medium to large projects, a good structure is:
myproject/
βββ Makefile
βββ include/ # Public header files
β βββ module1.h
β βββ module2.h
βββ src/ # Source files
β βββ module1.c
β βββ module2.c
βββ test/ # Test files
β βββ test_module1.c
β βββ test_module2.c
βββ build/ # Build artifacts (generated)
βββ obj/
βββ bin/
Header File Best Practicesβ
- Include only necessary headers
- Use forward declarations when possible
- Use include guards consistently
- Keep headers lightweight and focused
Compilation Best Practicesβ
- Use
-Wall
to enable important warnings - Consider using
-Werror
to treat warnings as errors - Use separate compilation for faster builds when making changes
- Use header dependencies in your Makefile
π Exercise: Multi-file projectβ
Create a simple multi-file project with:
- A header file
math_utils.h
with declarations foradd
,subtract
, andmultiply
functions - An implementation file
math_utils.c
with definitions for those functions - A main program
calculator.c
that uses those functions - A Makefile to build the project
Show solution
math_utils.h:
#ifndef MATH_UTILS_H
#define MATH_UTILS_H
extern int add(int a, int b);
extern int subtract(int a, int b);
extern int multiply(int a, int b);
#endif /* MATH_UTILS_H */
math_utils.c:
#include "math_utils.h"
int add(int a, int b) {
return a + b;
}
int subtract(int a, int b) {
return a - b;
}
int multiply(int a, int b) {
return a * b;
}
calculator.c:
#include <stdio.h>
#include "math_utils.h"
int main(void) {
int x = 10;
int y = 5;
printf("%d + %d = %d\n", x, y, add(x, y));
printf("%d - %d = %d\n", x, y, subtract(x, y));
printf("%d * %d = %d\n", x, y, multiply(x, y));
return 0;
}
Makefile:
CC = gcc
CFLAGS = -Wall -g
TARGET = calculator
OBJS = calculator.o math_utils.o
all: $(TARGET)
$(TARGET): $(OBJS)
$(CC) $(CFLAGS) -o $@ $^
calculator.o: calculator.c math_utils.h
$(CC) $(CFLAGS) -c $<
math_utils.o: math_utils.c math_utils.h
$(CC) $(CFLAGS) -c $<
.PHONY: clean
clean:
rm -f $(OBJS) $(TARGET)