LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 33763 - Potential code size reduction optimization: reusing function tails
Status: NEW
Alias: None
Product: clang
Classification: Unclassified
Component: -New Bugs
Version: trunk
Hardware: PC All
Importance: P enhancement
Assignee: Unassigned Clang Bugs
 
Reported: 2017-07-12 10:37 PDT by jpakkane
Modified: 2017-07-12 10:37 PDT

Description jpakkane 2017-07-12 10:37:39 PDT
Suppose you have code that looks like this:

int func1();
int func2();
int func3();
int func4();

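int func5() {
  int i = 0;
  i += func2();
  i += func3();          // from here on, identical to func6
  i += func4();
  return i + func1();
}

int func6() {
  int i = 1;
  i += func3();          // first of two calls to func3 (intentional)
  i += func3();          // from here on, identical to func5
  i += func4();
  return i + func1();
}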

It gets compiled to the following x86-64 assembly when using -Os:

func5():
        pushq   %rbx
        call    func2()
        movl    %eax, %ebx
        call    func3()
        addl    %eax, %ebx
        call    func4()
        addl    %eax, %ebx
        call    func1()
        addl    %ebx, %eax
        popq    %rbx
        ret
func6():
        pushq   %rbx
        call    func3()
        leal    1(%rax), %ebx
        call    func3()
        addl    %eax, %ebx
        call    func4()
        addl    %eax, %ebx
        call    func1()
        addl    %ebx, %eax
        popq    %rbx
        ret


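However, the ends of the two functions are identical: the last eight instructions, from the last call to func3() through ret, match exactly. The code could instead be compiled into the following, which is functionally the same but takes less space: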

func5():
        pushq   %rbx
        call    func2()
        movl    %eax, %ebx
common_tail:
        call    func3()
        addl    %eax, %ebx
        call    func4()
        addl    %eax, %ebx
        call    func1()
        addl    %ebx, %eax
        popq    %rbx
        ret
func6():
        pushq   %rbx
        call    func3()
        leal    1(%rax), %ebx
        jmp     common_tail
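
The shared tail can be found mechanically as the longest common suffix of the two instruction sequences. The following is only a minimal sketch of that idea, not how any particular compiler implements it. The names common_suffix_length, kFunc5 and kFunc6 are made up for illustration, and instructions are compared as plain strings, which ignores everything a real pass would have to check (relocations, alignment, unwind information, and so on).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Instruction listings copied from the -Os output in this report.
// (kFunc5 / kFunc6 are hypothetical names used only in this sketch.)
const std::vector<std::string> kFunc5 = {
    "pushq %rbx",       "call func2()",     "movl %eax, %ebx",
    "call func3()",     "addl %eax, %ebx",  "call func4()",
    "addl %eax, %ebx",  "call func1()",     "addl %ebx, %eax",
    "popq %rbx",        "ret"};

const std::vector<std::string> kFunc6 = {
    "pushq %rbx",       "call func3()",     "leal 1(%rax), %ebx",
    "call func3()",     "addl %eax, %ebx",  "call func4()",
    "addl %eax, %ebx",  "call func1()",     "addl %ebx, %eax",
    "popq %rbx",        "ret"};

// Counts how many trailing instructions the two sequences have in common,
// walking backwards from the end until the first mismatch.
std::size_t common_suffix_length(const std::vector<std::string>& a,
                                 const std::vector<std::string>& b) {
    std::size_t n = 0;
    while (n < a.size() && n < b.size() &&
           a[a.size() - 1 - n] == b[b.size() - 1 - n]) {
        ++n;
    }
    return n;
}

// Instruction count of a function once its shared tail of length `shared`
// is replaced by a single jmp to the other function's copy.
std::size_t size_after_merge(std::size_t original, std::size_t shared) {
    return original - shared + 1;
}
```

For the two listings above, common_suffix_length(kFunc5, kFunc6) is 8, and size_after_merge(kFunc6.size(), 8) is 4, which matches the four-instruction func6 in the proposed output.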

Testing with Compiler Explorer shows that none of GCC, Clang, MSVC, or ICC performs this optimization, but some embedded compilers do.
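
Until a compiler does this on its own, the sharing can be approximated by hand at the source level. The sketch below assumes stub definitions for func1 to func4 (the report only declares them) and introduces a hypothetical helper named common_tail, plus func5_shared and func6_shared as the rewritten variants. The stubs record the call order so the two forms can be compared. Whether this actually saves space is not guaranteed: an optimizer may inline the helper again, and that has not been tested here.

```cpp
#include <string>

// Records the order in which the stubs are called.
std::string trace;

// Hypothetical stand-ins for the externally defined functions.
int func1() { trace += '1'; return 10; }
int func2() { trace += '2'; return 20; }
int func3() { trace += '3'; return 30; }
int func4() { trace += '4'; return 40; }

// Original forms from the report.
int func5() {
  int i = 0;
  i += func2();
  i += func3();
  i += func4();
  return i + func1();
}

int func6() {
  int i = 1;
  i += func3();
  i += func3();
  i += func4();
  return i + func1();
}

// Hypothetical helper holding the code both functions end with.
int common_tail(int i) {
  i += func3();
  i += func4();
  return i + func1();
}

// Hand-merged variants: only the differing prefix stays in each function.
int func5_shared() { return common_tail(func2()); }
int func6_shared() { return common_tail(1 + func3()); }
```

With these stubs, each hand-merged variant returns the same value as its original and calls the stubs in the same order.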