Created attachment 21123 [details] reproducer The regression starts happening with r346397: Return "[IndVars] Smart hard uses detection" The patch has been reverted because it ended up prohibiting propagation of a constant to exit value. For such values, we should skip all checks related to hard uses because propagating a constant is always profitable. Differential Revision: https://reviews.llvm.org/D53691 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346397 91177308-0d34-0410-b5e6-96231b3b80d8 If I run opt -S -o - foo2.ll -indvars with the patch I get define i16 @test10() { entry: br label %loop1 loop1: ; preds = %loop1, %entry %l1 = phi i16 [ 0, %entry ], [ %l1.add, %loop1 ] %l1.add = add nuw nsw i16 %l1, 1 %cmp1 = icmp ult i16 %l1.add, 2 br i1 %cmp1, label %loop1, label %loop2.preheader loop2.preheader: ; preds = %loop1 br label %loop2 loop2: ; preds = %loop2, %loop2.preheader %k2 = phi i16 [ %k2.add, %loop2 ], [ 182, %loop2.preheader ] %l2 = phi i16 [ %l2.add, %loop2 ], [ 0, %loop2.preheader ] %l2.add = add nuw nsw i16 %l2, 1 tail call void @foo(i16 %k2) %k2.add = add nuw nsw i16 %k2, 1 %cmp2 = icmp ult i16 %l2.add, 2 br i1 %cmp2, label %loop2, label %loop2.end loop2.end: ; preds = %loop2 %k2.add.lcssa = phi i16 [ %k2.add, %loop2 ] ret i16 %k2.add.lcssa } and without it: define i16 @test10() { entry: br label %loop1 loop1: ; preds = %loop1, %entry %l1 = phi i16 [ 0, %entry ], [ %l1.add, %loop1 ] %l1.add = add nuw nsw i16 %l1, 1 %cmp1 = icmp ult i16 %l1.add, 2 br i1 %cmp1, label %loop1, label %loop2.preheader loop2.preheader: ; preds = %loop1 br label %loop2 loop2: ; preds = %loop2, %loop2.preheader %k2 = phi i16 [ %k2.add, %loop2 ], [ 182, %loop2.preheader ] %l2 = phi i16 [ %l2.add, %loop2 ], [ 0, %loop2.preheader ] %l2.add = add nuw nsw i16 %l2, 1 tail call void @foo(i16 %k2) %k2.add = add nuw nsw i16 %k2, 1 %cmp2 = icmp ult i16 %l2.add, 2 br i1 %cmp2, label %loop2, label %loop2.end loop2.end: ; preds = %loop2 %0 = add i16 182, 2 ret i16 %0 } Note the difference in how the return value is calculated. I suppose that indvars doesn't consider this to be the constant case that should always be allowed, but I think it's quite unfortunate if this simplification isn't done.
Taking a look.
Ok, it seems that isa<SCEVConstant> check is too restrictive. We could instead check availability on loop entry. It fails some other tests though (likely they've improved); I'll take a careful look at them and come up with a patch tomorrow.
Fix on review: https://reviews.llvm.org/D58113
I think this may be the same issue we're seeing in <https://bugs.freebsd.org/237515> ("textproc/htmldoc hangs at runtime when compiled with clang 8"), and which is also reported at <https://github.com/michaelrsweet/htmldoc/issues/349>. In this case, a loop like: do { *lineptr++ = ' '; col ++; } while (col & 7); is optimized to: # %bb.52: # %do.body.preheader # in Loop: Header=BB53_46 Depth=3 #DEBUG_VALUE: parse_pre:y <- [DW_OP_constu 10312, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:col <- $esi #DEBUG_VALUE: parse_pre:lineptr <- $r12 #DEBUG_VALUE: parse_pre:dataptr <- $rbx #DEBUG_VALUE: parse_pre:flat <- [DW_OP_constu 10352, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:start <- $r13 #DEBUG_VALUE: parse_pre:t <- [DW_OP_constu 10368, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:left <- [DW_OP_constu 10336, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:right <- [DW_OP_constu 10356, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:bottom <- [DW_OP_constu 10360, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:top <- [DW_OP_constu 10332, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:x <- [DW_OP_constu 10296, DW_OP_minus] [$rbp+0] .loc 2 5493 27 # ps-pdf.cxx:5493:27 movl %esi, %eax xorb $7, %al movzbl %al, %edx movl %edx, %r14d andl $7, %r14d movq %rsi, %r13 .Ltmp7097: movl %esi, %eax negb %al movzbl %al, %r15d andl $7, %r15d andl $7, %edx incq %rdx .loc 2 5493 30 is_stmt 0 # ps-pdf.cxx:5493:30 movq %r12, %rdi .loc 2 5493 27 # ps-pdf.cxx:5493:27 leaq (%r12,%r15), %r12 .Ltmp7098: .loc 2 5493 30 # ps-pdf.cxx:5493:30 movl $32, %esi .Ltmp7099: callq memset .Ltmp7100: .LBB53_53: # %do.body # Parent Loop BB53_15 Depth=1 # Parent Loop BB53_30 Depth=2 # Parent Loop BB53_46 Depth=3 # => This Inner Loop Header: Depth=4 #DEBUG_VALUE: parse_pre:y <- [DW_OP_constu 10312, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:dataptr <- $rbx #DEBUG_VALUE: parse_pre:flat <- [DW_OP_constu 10352, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:t <- [DW_OP_constu 10368, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:left <- [DW_OP_constu 10336, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:right <- [DW_OP_constu 10356, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:bottom <- [DW_OP_constu 10360, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:top <- [DW_OP_constu 10332, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:x <- [DW_OP_constu 10296, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:lineptr <- undef #DEBUG_VALUE: parse_pre:col <- undef #DEBUG_VALUE: parse_pre:lineptr <- [DW_OP_plus_uconst 1, DW_OP_stack_value] undef #DEBUG_VALUE: parse_pre:col <- [DW_OP_plus_uconst 1, DW_OP_stack_value] undef .loc 2 5496 17 is_stmt 1 # ps-pdf.cxx:5496:17 decq %r15 .Ltmp7101: .loc 2 5495 10 # ps-pdf.cxx:5495:10 jne .LBB53_53 .Ltmp7102: # %bb.54: # %for.inc170.loopexit What happens is that in case col & 7 is already true before the loop, the value calculated with "negb %al" becomes zero, and then the loop will count back from that. E.g the "decq %r15" just before .Ltmp7101 will effectively never reach zero, and a stack overflow follows. Before r346397 (and its predecessor r345814) it got optimized to: # %bb.71: # %do.body.preheader # in Loop: Header=BB53_45 Depth=3 #DEBUG_VALUE: parse_pre:flat <- [DW_OP_constu 10400, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:lineptr <- $rdi #DEBUG_VALUE: parse_pre:dataptr <- $rbx #DEBUG_VALUE: parse_pre:col <- $r15d #DEBUG_VALUE: parse_pre:start <- $r13 #DEBUG_VALUE: parse_pre:t <- [DW_OP_constu 10360, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:left <- [DW_OP_constu 10344, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:right <- [DW_OP_constu 10348, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:bottom <- [DW_OP_constu 10352, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:top <- [DW_OP_constu 10340, DW_OP_minus] [$rbp+0] #DEBUG_VALUE: parse_pre:page <- [DW_OP_constu 10296, DW_OP_minus] [$rbp+0] .loc 2 5493 27 # ps-pdf.cxx:5493:27 movl %r15d, %eax xorb $7, %al movzbl %al, %r12d movl %r12d, %edx andl $7, %edx leaq (%rdi,%rdx), %r14 addq $1, %r14 andl $7, %r12d incq %rdx .loc 2 5493 30 is_stmt 0 # ps-pdf.cxx:5493:30 movl $32, %esi callq memset .Ltmp7053: .loc 2 5493 27 # ps-pdf.cxx:5493:27 leal (%r15,%r12), %r15d .Ltmp7054: addl $1, %r15d
(In reply to Dimitry Andric from comment #4) > I think this may be the same issue we're seeing in > <https://bugs.freebsd.org/237515> ("textproc/htmldoc hangs at runtime when > compiled with clang 8"), and which is also reported at > <https://github.com/michaelrsweet/htmldoc/issues/349>. This report is about a missed optimization; if you're seeing a crash or hang, please file a separate bug.
(In reply to Eli Friedman from comment #5) > (In reply to Dimitry Andric from comment #4) > > I think this may be the same issue we're seeing in > > <https://bugs.freebsd.org/237515> ("textproc/htmldoc hangs at runtime when > > compiled with clang 8"), and which is also reported at > > <https://github.com/michaelrsweet/htmldoc/issues/349>. > > This report is about a missed optimization; if you're seeing a crash or > hang, please file a separate bug. Right, this is now bug 41998.
(In reply to Max Kazantsev from comment #3) > Fix on review: https://reviews.llvm.org/D58113 This patch just landed today. The test case on the bug works, so I'm closing this, but I have no idea if this was reduced from something which might need re-examined. The fix was rather narrow to the particular test case.
(In reply to listmail from comment #7) > (In reply to Max Kazantsev from comment #3) > > Fix on review: https://reviews.llvm.org/D58113 > This patch just landed today. The test case on the bug works, so I'm > closing this, but I have no idea if this was reduced from something which > might need re-examined. The fix was rather narrow to the particular test > case. Thanks for pushing the fix. It indeed solves the problem I saw.