Is there any performance difference between i++ and ++i in a for loop in C? Let’s find out!

Prerequisites

For the examples I used FreeBSD 12.1-RELEASE-p2 and its default C compiler, clang 8.0.1.

Let’s go!

Let’s write a simple program that uses the postfix increment:

int main(void) {
        for (int i = 0; i < 10; i++);
}

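Compile it without optimizations: cc -O0 inc.c. Then disassemble it with objdump(1): objdump --no-show-raw-insn -M intel -S -D -l a.out | less. The --no-show-raw-insn option omits the raw opcode bytes to keep the listing readable, and -M intel selects the Intel syntax, which I use because my syntax highlighter doesn’t understand the AT&T syntax. The -S and -l options interleave source lines and file names with the disassembly; they rely on debug information, so if your listing lacks the source lines, try compiling with -g as well: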

int main(void) {
  2012e0:       push   rbp
  2012e1:       mov    rbp,rsp
  2012e4:       mov    DWORD PTR [rbp-0x4],0x0
/home/dvz/src/inc.c:2
        for (int i = 0; i < 10; i++);
  2012eb:       mov    DWORD PTR [rbp-0x8],0x0
  2012f2:       cmp    DWORD PTR [rbp-0x8],0xa
  2012f6:       jge    20130f <main+0x2f>
  2012fc:       jmp    201301 <main+0x21>
  201301:       mov    eax,DWORD PTR [rbp-0x8]
  201304:       add    eax,0x1
  201307:       mov    DWORD PTR [rbp-0x8],eax
  20130a:       jmp    2012f2 <main+0x12>
/home/dvz/src/inc.c:3
}
  20130f:       mov    eax,DWORD PTR [rbp-0x4]
  201312:       pop    rbp
  201313:       ret    

Let’s go through the assembly listing of the for loop step-by-step. First, we initialize the i variable (the counter) with zero:

mov    DWORD PTR [rbp-0x8],0x0

Then we compare the counter with immediate value 10 (hexadecimal 0xA):

cmp    DWORD PTR [rbp-0x8],0xa

If it is greater than or equal to 10, we leave the loop:

jge    20130f <main+0x2f>

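Otherwise, we fall through to the empty loop body, which is just a jump to the next instruction (jmp 201301), and then increment the counter: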

mov    eax,DWORD PTR [rbp-0x8]
add    eax,0x1
mov    DWORD PTR [rbp-0x8],eax

And continue the loop:

jmp    2012f2 <main+0x12>
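To make the control flow easier to follow, here is a rough sketch of the same loop written in C with explicit labels and goto, mirroring the jumps in the listing. The function name count_to_ten and the labels are hypothetical and only serve the illustration; this is not what the compiler sees, just one way to read its output.

```c
/* Hypothetical illustration, not from the original program:
 * the for loop rewritten with the same jumps as the -O0 listing. */
int count_to_ten(void)
{
    int i = 0;          /* mov DWORD PTR [rbp-0x8],0x0 */
check:
    if (i >= 10)        /* cmp DWORD PTR [rbp-0x8],0xa ; jge 20130f */
        goto done;
    /* the empty loop body: jmp 201301 */
    i = i + 1;          /* load into eax, add 1, store back */
    goto check;         /* jmp 2012f2 */
done:
    return i;           /* the counter is 10 once the loop ends */
}
```

The original program does not return the counter; the sketch returns it only so that the result can be checked.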

As you can see, the increment is performed in three instructions:

  1. Load the counter from memory into the eax register:
mov    eax,DWORD PTR [rbp-0x8]
  2. Add 1 to eax:
add    eax,0x1
  3. Store eax back to the same memory location:
mov    DWORD PTR [rbp-0x8],eax

OK, that’s clear. But what about the prefix increment? Let’s see!

Let’s write the same program with the prefix increment:

int main(void) {
        for (int i = 0; i < 10; ++i);
}

Compile and disassemble it using the same commands as for the previous example:

int main(void) {
  2012e0:       push   rbp
  2012e1:       mov    rbp,rsp
  2012e4:       mov    DWORD PTR [rbp-0x4],0x0
/home/dvz/src/inc.c:2
        for (int i = 0; i < 10; ++i);
  2012eb:       mov    DWORD PTR [rbp-0x8],0x0
  2012f2:       cmp    DWORD PTR [rbp-0x8],0xa
  2012f6:       jge    20130f <main+0x2f>
  2012fc:       jmp    201301 <main+0x21>
  201301:       mov    eax,DWORD PTR [rbp-0x8]
  201304:       add    eax,0x1
  201307:       mov    DWORD PTR [rbp-0x8],eax
  20130a:       jmp    2012f2 <main+0x12>
/home/dvz/src/inc.c:3
}
  20130f:       mov    eax,DWORD PTR [rbp-0x4]
  201312:       pop    rbp
  201313:       ret    

What do we see?

mov    eax,DWORD PTR [rbp-0x8]
add    eax,0x1
mov    DWORD PTR [rbp-0x8],eax

The same three instructions!

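So, is there any performance difference between i++ and ++i in a for loop in C? No, there is not! Even with optimizations disabled (-O0), clang emits exactly the same instructions for both forms, because the value of the increment expression is not used.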
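The two forms produce the same code here because the loop discards the value of the increment expression. As a minimal sketch of the case where they do differ in C, consider using that value. The helper names use_postfix and use_prefix are hypothetical and not part of the original examples.

```c
/* Hypothetical helpers for illustration only. */

/* Postfix: the expression yields the old value of *p;
 * the increment of *p happens as a side effect. */
int use_postfix(int *p)
{
    return (*p)++;
}

/* Prefix: *p is incremented, and the expression yields the new value. */
int use_prefix(int *p)
{
    return ++(*p);
}
```

Starting from a variable holding 5, use_postfix returns 5 and use_prefix returns 6; in both cases the variable holds 6 afterwards. In the for loop above nothing reads that returned value, so only the shared side effect remains.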

Last modified: February 12, 2020
