As reported in https://bugs.freebsd.org/237074, the multimedia/vlc3 port fails to build for armv6 with a fatal backend error:

[... lots of stuff ... ]
*** Bad machine code: Using an undefined physical register ***
- function: AVI_ChunkRead
- basic block: %bb.11 if.then13 (0x804066b68)
- instruction: TCRETURNri %22:tcgpr, implicit $sp, implicit $r0, implicit killed $r1
- operand 2: implicit $r0

*** Bad machine code: Using an undefined physical register ***
- function: AVI_ChunkRead
- basic block: %bb.11 if.then13 (0x804066b68)
- instruction: TCRETURNri %22:tcgpr, implicit $sp, implicit $r0, implicit killed $r1
- operand 3: implicit killed $r1
fatal error: error in backend: Found 2 machine code errors.

Minimized test case:

// clang -cc1 -triple armv6kz---gnueabihf -S -target-cpu arm1176jzf-s -O2 -stack-protector 2 libavi-min.c
void c(_Bool *);
void a() {
  _Bool b;
  c(&b);
}
const struct {
  int d;
  int (*e)();
} f[] = {};
int h(void);
int AVI_ChunkRead_p_chk() {
  int g = h();
  if (g)
    return f[g].e(0, 0);
  a();
  return 0;
}

Using -stack-protector 1, or lowering the optimization level to -O1, makes it work again. Also, if you add (int, int) to the 'e' member of the struct, like so:

const struct {
  int d;
  int (*e)(int, int);
} f[] = {};

it works. So maybe this is related to C varargs functions.
As far as I can tell, there are three key aspects here:

1. The function must trigger stack protection.
2. The function must make a tail call through a function pointer.
3. The scheduler must place the load of the function pointer between the physical register copies and the tail call.

These three factors combined make FindSplitPointForStackProtector pick the wrong split point, and fail.

The final schedule for the block with the tail call follows for the case that fails. I guess it's suspicious that the TCRETURNri isn't attached to the CopyToRegs with glue.

*** Final schedule ***
SU(5): t3: i32,ch = CopyFromReg t0, Register:i32 %0

SU(6): t21: i32,ch = LDRLIT_ga_pcrel_ldr<Mem:(load 4 from got)> TargetGlobalAddress:i32<[0 x %struct.anon]* @f> 0 [TF=8], t0

SU(4): t6: i32 = ADDrsi t21, t3, TargetConstant:i32<26>, TargetConstant:i32<14>, Register:i32 $noreg, Register:i32 $noreg

SU(2): t9: i32 = MOVi TargetConstant:i32<0>, TargetConstant:i32<14>, Register:i32 $noreg, Register:i32 $noreg

SU(1): t17: ch,glue = CopyToReg t15, Register:i32 $r1, t9, t15:1

    t15: ch,glue = CopyToReg t0, Register:i32 $r0, t9

SU(3): t11: i32,ch = LDRi12<Mem:(load 4 from %ir.e, !tbaa !3)> t6, TargetConstant:i32<4>, TargetConstant:i32<14>, Register:i32 $noreg, t0

SU(0): t18: ch = TCRETURNri t11, Register:i32 $r0, Register:i32 $r1, t17
I think I'll have a patch soon.
https://reviews.llvm.org/D60427
r360099