The calling convention for picking registers for arguments is described at (https://gcc.gnu.org/wiki/avr-gcc).

I've found a mismatch between when AVR-GCC loads arguments from the stack and when we do.

AVR-GCC test case:

    avr-gcc -S tmp.c -o tmp.s -O2

    #include <stdint.h>

    typedef uint8_t i8;
    typedef uint16_t i16;
    typedef uint32_t i32;
    typedef uint64_t i64;

    void thing(i64 a, i64 b) {
        *((i64*)0x4) = b;
    }

This generates the following assembly to load argument 'b' from the stack:

    /* prologue: function */
    /* frame size = 0 */
    /* stack size = 8 */
    .L__stack_usage = 8
        ldi r30,lo8(4)
        ldi r31,0
        st Z,r10
        std Z+1,r11
        std Z+2,r12
        std Z+3,r13
        std Z+4,r14
        std Z+5,r15
        std Z+6,r16
        std Z+7,r17

AVR-LLVM example:

    define void @ret_void_args_i64_i64(i64 %a, i64 %b) {
      store volatile i64 %b, i64* inttoptr (i64 4 to i64*)
      ret void
    }

This generates the following assembly:

    sts 4, r18
    sts 11, r17
    sts 10, r16
    sts 9, r15
    sts 8, r14
    sts 7, r13
    sts 6, r12
    sts 5, r11
    sts 4, r10
    ret

Clearly, we aren't following the calling convention in this case. We should be loading 'b' from the stack. This will break programs that pass many or large arguments on the stack.

From what I can see, AVR-GCC is not living up to the calling convention listed at (https://gcc.gnu.org/wiki/avr-gcc). If we start with Rn = 26 and process two arguments, i64 %a and i64 %b, we get:

* Rn = 26
* Begin processing %a
  * Rn -= 8, giving Rn = 18
  * Rn >= 8, therefore this argument will be stored in registers r18-r25
* Begin processing %b
  * Rn -= 8, giving Rn = 10
  * Rn >= 8, therefore this argument will be stored in registers r10-r17

LLVM does this exactly, but AVR-GCC does not. I will empirically check at what point AVR-GCC starts passing arguments on the stack.

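The register walk above can be sketched in C to make the arithmetic easy to check. This is a minimal sketch, not code from either compiler: the names `arg_loc` and `place_arg` are hypothetical, rounding odd sizes up to an even number is an assumption taken from a reading of the wiki page rather than from this thread, and so is the rule that once one argument goes to the stack all later ones do too.

```c
#include <assert.h>

/* Where one argument ends up (hypothetical helper type). */
struct arg_loc {
    int in_regs;   /* 1 if passed in registers, 0 if passed on the stack */
    int first_reg; /* lowest register number used, -1 if on the stack */
    int last_reg;  /* highest register number used, -1 if on the stack */
};

/*
 * Place one argument of size_bytes, starting from the current *rn.
 * The caller initialises *rn to 26 before the first argument.
 */
struct arg_loc place_arg(int *rn, int size_bytes)
{
    assert(size_bytes > 0);

    /* Assumption: odd sizes are rounded up to the next even size. */
    int rounded = (size_bytes + 1) & ~1;
    int candidate = *rn - rounded;
    struct arg_loc loc = {0, -1, -1};

    if (candidate >= 8) {
        /* Fits: the argument occupies r<candidate> upwards. */
        *rn = candidate;
        loc.in_regs = 1;
        loc.first_reg = candidate;
        loc.last_reg = candidate + size_bytes - 1;
    } else {
        /* Assumption: after the first spill, every later argument
           is also passed on the stack, so force later checks to fail. */
        *rn = 0;
    }
    return loc;
}
```

Starting from `rn = 26`, two 8-byte arguments land in r18-r25 and r10-r17, which matches the steps listed above; a third 8-byte argument would give Rn = 2 and go to the stack.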
AVR-GCC seems to pass only up to 8 bytes in registers. If I create a function with the arguments (i32, i32) or (i64), everything is passed in registers. The moment I try (i32, i32, i32) or (i64, i8), every argument past the first 8 bytes is loaded from the stack. I will email the AVR-GCC mailing list to get their thoughts.
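For anyone repeating the experiment, the four signatures mentioned above could be collected into one file and compiled with the same `avr-gcc -S ... -O2` invocation as in the first comment. This is only a sketch: the function and variable names are hypothetical, and it writes to volatile globals instead of a fixed address so that it also builds on a host compiler. It makes no claim about what the generated assembly will show.

```c
#include <stdint.h>

typedef uint8_t i8;
typedef uint32_t i32;
typedef uint64_t i64;

/* Volatile sinks keep the compiler from discarding the arguments. */
volatile i8 sink8;
volatile i32 sink32;
volatile i64 sink64;

/* 8 bytes of arguments in total. */
void args_i32_i32(i32 a, i32 b) { sink32 = a; sink32 = b; }
void args_i64(i64 a) { sink64 = a; }

/* More than 8 bytes of arguments in total. */
void args_i32_i32_i32(i32 a, i32 b, i32 c) { sink32 = a; sink32 = b; sink32 = c; }
void args_i64_i8(i64 a, i8 b) { sink64 = a; sink8 = b; }
```

Each store in the assembly output can then be traced back to the register or stack slot its argument came from.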
I misunderstood the assembly output; this is not a bug.