Photo by Clément H on Unsplash

This is What Would Happen if You Wrote: gcc main.c

This article breaks down gcc and explains the steps that occur when compiling a file with it.

Michelle Giraldo
3 min read · Sep 20, 2019

First off, gcc actually stands for GNU Compiler Collection.

The GNU Compiler Collection was produced by the GNU Project, a free-software project that had a large number of people collaborating and cooperating to create it.

It is a standard compiler for many languages and for many projects on GNU and Linux systems, which are free and open-source. The languages most commonly compiled with gcc are C and C++, as well as Objective-C and Objective-C++.

In the case of gcc main.c, gcc compiles a C file. This is indicated by the .c extension at the end of the file name, main.c.

When gcc compiles main.c, it will pass through four main stages: Pre-processing, Compilation, Assembly, and Linking.

Each stage uses a different tool, and gcc can be told to stop after a given stage. Otherwise, it runs through all four and produces an executable file.

To sum up: in order to produce an executable file, pieces of code need to be rearranged or filled in, for example when one function refers to another, or when code must be added for functions the program uses from a library.

STAGE 1: Pre-Processing

This stage is called pre-processing because it prepares the source file before the actual compilation begins. Besides removing the comments from the file, it processes macros, include files, and conditional compilation instructions.

Macros are processed by expanding them. A macro is a piece of code that has been given a name. Wherever that name is written in the file, this step replaces it with the macro’s contents, which is how the macro ends up doing what it does. There are two kinds of macros: object-like, which resemble data objects, and function-like, which resemble function calls. Both are made by using #define.
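To make the two kinds of macros concrete, here is a minimal sketch. The names BUFFER_SIZE, SQUARE, and buffer_area are made up for illustration and are not part of any library.

```c
/* Object-like macro: a name that stands for a value. */
#define BUFFER_SIZE 64

/* Function-like macro: takes an argument, like a function call.
 * The extra parentheses keep operator precedence from changing
 * the result when the argument is an expression such as 1 + 2. */
#define SQUARE(x) ((x) * (x))

/* Hypothetical helper that uses both macros.
 * After pre-processing, the return line reads:
 *     return ((64) * (64));
 */
int buffer_area(void)
{
    return SQUARE(BUFFER_SIZE);
}
```

By the time the compiler sees this function, the macro names are gone and only the expanded text is left.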

Include files are the header files. They are normally written at the top of the program in #include lines, and the pre-processor replaces each of those lines with the contents of the named header. There are many different header files. A really common one is written like this:

#include <stdio.h>

Using the -E option with gcc stops the process after this stage, so the output contains only the pre-processed source, which gcc prints to standard output by default. It is written like so:

gcc -E main.c
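Conditional compilation is the one pre-processing job not illustrated above, so here is a small sketch of it. DEBUG and log_level are hypothetical names chosen for this example.

```c
/* Hypothetical switch for this example. */
#define DEBUG 1

/* The pre-processor keeps only one of the two branches below,
 * so the compiler never sees the other one. */
int log_level(void)
{
#if DEBUG
    return 2;   /* kept, because DEBUG is non-zero */
#else
    return 0;   /* removed before compilation */
#endif
}
```

If this were saved in main.c, running gcc -E main.c should show the function with only the return 2; line inside it.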

STAGE 2: Compilation

This is the stage in which the compiler takes the pre-processor’s output, that is, the pre-processed source code, and translates it into assembly source code.

The -S option tells gcc to stop after this stage:

gcc -S main.c

This produces the assembly code, by default in a file named main.s. It may seem a bit confusing that gcc produces assembly code without actually going on to the stage of assembly. That’s because this second stage, compilation, is what generates the assembly code as…

STAGE 3: Assembly

This stage translates that assembly code into object code, which is necessary for the last step:

STAGE 4: Linking

This is the final stage, in which the object file, or files, are taken along with the libraries they use. They are then combined, or linked, to produce a single executable file.
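As a rough illustration of what the linker fills in, consider a function that calls something from the C library. The helper name_length is hypothetical; strlen and string.h are standard.

```c
#include <string.h>   /* declares strlen; its definition lives in the C library */

/* Hypothetical helper. To compile this file, gcc only needs the
 * declaration of strlen from the header. In a typical build, the
 * object file records strlen as an unresolved reference, and the
 * linker connects the call to the library's definition. */
size_t name_length(const char *name)
{
    return strlen(name);
}
```

This is the "filling in" mentioned earlier: the object code for name_length is complete only once the linker has matched the reference to strlen with the library code that implements it.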

Writing gcc main.c will pass through all these stages, producing an executable file with the name “a.out”.

“a.out” is the default name that gcc gives the executable file. In order to execute this file, you’d run “./a.out”.

To give the executable file a different name, use the -o option with this syntax:

gcc [c file name].c -o [executable file name]

The executable file name is generally the same as the C file name, just without the .c extension.

(As clarification, “a.out” is actually written without quotes. It’s only in quotes here so that the “.” isn’t mistaken for the end of the sentence.)
