For my second post about compilers, I thought I’d show what happens with various optimization levels in GCC. Note I’m still focusing on gcc because I’m too lazy to fire up VirtualBox and Windows right now.
As a reminder from the first post, here’s my wonderfully useless sample program. All it does is run through a loop; the result is never used.
int main(int argc, char* argv[])
{
    int i = 0;
    int loop;
    for (loop = 0; loop < 50; loop++)
        ++i;
    return 0;
}
The option flag -O1 turns on the first level of optimizations in gcc/g++. Instead of copying them here, I’ll refer you to the GNU manpage for optimizations for gcc/g++.
gcc -O1 main.c -o main_c1
main():
55 push ebp
89 e5 mov ebp,esp
b8 32 00 00 00 mov eax,0x32
83 e8 01 sub eax,0x1
75 fb jne 804839c
b8 00 00 00 00 mov eax,0x0
5d pop ebp
c3 ret
90 nop
90 nop
90 nop
90 nop
90 nop
90 nop
90 nop
90 nop
Compared to the previous post (no optimization), the assembly here is noticeably tighter. This version no longer allocates space for the local variables i and loop, sets them to zero, and increments both at each pass. Instead, it loads 50 (0x32) into the EAX register, subtracts 1, and checks the zero flag. If the flag is not set, jne jumps back to the subtraction, and this repeats until EAX reaches zero. It then puts the return value 0 in EAX, restores the caller’s frame pointer, and returns; the trailing nops are padding (again, I’ll go over that another day). As we turn more optimization on, the compiler does more analysis and realizes that it doesn’t need to run the loop at all since we’re not doing anything with the variables. For brevity, note that the assembly output from g++ is identical to the main function from gcc.
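To make the -O1 listing easier to follow, here is a rough C rendering of what the optimized loop does. This is only a sketch of the countdown, not something gcc emits; the function name countdown_passes and the passes counter are made up for illustration so the number of trips through the loop can be observed.

```c
#include <assert.h>

/* Hypothetical helper: mirrors the shape of the -O1 loop in C.
 * `counter` plays the role of EAX; `passes` exists only so the
 * caller can see how many times the loop body ran. */
static int countdown_passes(int start)
{
    int counter = start;        /* mov eax,0x32 (when start is 50) */
    int passes = 0;

    assert(start > 0);          /* the sketch assumes a positive start */
    do {
        counter -= 1;           /* sub eax,0x1 */
        ++passes;
    } while (counter != 0);     /* jne: jump back while not zero */

    return passes;
}
```

Calling countdown_passes(50) goes around the loop 50 times, the same trip count as the original for loop, but with a single counter instead of two variables living on the stack.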
gcc -O2 main.c -o main_c2
At -O2, the compiler does even more analysis and has more “smarts” turned on. Let’s look at the assembler output below.
main():
55 push ebp
31 c0 xor eax,eax
89 e5 mov ebp,esp
5d pop ebp
c3 ret
90 nop
90 nop
90 nop
90 nop
90 nop
90 nop
90 nop
90 nop
90 nop
Here the compiler realized that nothing is done with the variables, and nothing is done with the output. So, the compiler doesn’t even do the loop any more. It still pushes the stack frame pointer. It sets EAX to 0 with xor eax,eax, which encodes in two bytes (31 c0) instead of the five bytes (b8 00 00 00 00) that mov eax,0x0 took in the -O1 listing, and is a touch faster than actually moving zero there. It still sets up the stack frame (main is a function after all) and then brings back the previous frame and returns. Again, with optimization the g++ assembly matches the gcc output for main(). gcc/g++ -O3 generates the same output as -O2.
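As a small illustration of the xor trick, the sketch below copies the opcode bytes from the two listings into arrays and compares their lengths, and shows in plain C why XOR-ing a value with itself always gives zero. The names mov_eax_zero, xor_eax_eax, zero_by_xor, and bytes_saved are invented for this example.

```c
#include <stddef.h>

/* Opcode bytes copied from the listings in this post. */
static const unsigned char mov_eax_zero[] = { 0xb8, 0x00, 0x00, 0x00, 0x00 }; /* mov eax,0x0 */
static const unsigned char xor_eax_eax[]  = { 0x31, 0xc0 };                   /* xor eax,eax */

/* Hypothetical helper: any value XOR-ed with itself is 0,
 * which is why xor eax,eax clears the register. */
static unsigned int zero_by_xor(unsigned int value)
{
    return value ^ value;
}

/* Hypothetical helper: how many bytes the xor encoding saves over the mov. */
static size_t bytes_saved(void)
{
    return sizeof mov_eax_zero - sizeof xor_eax_eax;
}
```

With the bytes above, bytes_saved() comes out to 3, and zero_by_xor() returns 0 no matter what is passed in.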
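You might ask yourself “Why do the compilers even set up the stack frame for the main function?” The answer is that main() HAS to exist in a C or C++ program, even if it really doesn’t do anything, and in these builds the compiler gives it the same frame-pointer setup and teardown as any other function. (GCC does have an -fomit-frame-pointer option for skipping the frame pointer in functions that don’t need one.)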
So have fun, and poke around your programs to see what all gets generated from the compiler. You’ll see that there’s a lot more in your program than you realized. Plus, the Art of Assembly Language is now available in a Linux version if you want a decent reference to learn assembler.