LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 39673 - -indvars regression after r346397
Summary: -indvars regression after r346397
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: PC Linux
: P enhancement
Assignee: Max Kazantsev
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-11-15 01:23 PST by Mikael Holmén
Modified: 2019-06-12 23:20 PDT (History)
5 users (show)

See Also:
Fixed By Commit(s):


Attachments
reproducer (1.05 KB, text/plain)
2018-11-15 01:23 PST, Mikael Holmén
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mikael Holmén 2018-11-15 01:23:18 PST
Created attachment 21123 [details]
reproducer

The regression starts happening with r346397:

    Return "[IndVars] Smart hard uses detection"
    
    The patch has been reverted because it ended up prohibiting propagation
    of a constant to exit value. For such values, we should skip all checks
    related to hard uses because propagating a constant is always profitable.
    
    Differential Revision: https://reviews.llvm.org/D53691
    
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346397 91177308-0d34-0410-b5e6-96231b3b80d8


If I run

  opt -S -o - foo2.ll -indvars

with the patch I get

define i16 @test10() {
entry:
   br label %loop1

loop1:                                            ; preds = %loop1, %entry
   %l1 = phi i16 [ 0, %entry ], [ %l1.add, %loop1 ]
   %l1.add = add nuw nsw i16 %l1, 1
   %cmp1 = icmp ult i16 %l1.add, 2
   br i1 %cmp1, label %loop1, label %loop2.preheader

loop2.preheader:                                  ; preds = %loop1
   br label %loop2

loop2:                                            ; preds = %loop2, 
%loop2.preheader
   %k2 = phi i16 [ %k2.add, %loop2 ], [ 182, %loop2.preheader ]
   %l2 = phi i16 [ %l2.add, %loop2 ], [ 0, %loop2.preheader ]
   %l2.add = add nuw nsw i16 %l2, 1
   tail call void @foo(i16 %k2)
   %k2.add = add nuw nsw i16 %k2, 1
   %cmp2 = icmp ult i16 %l2.add, 2
   br i1 %cmp2, label %loop2, label %loop2.end

loop2.end:                                        ; preds = %loop2
   %k2.add.lcssa = phi i16 [ %k2.add, %loop2 ]
   ret i16 %k2.add.lcssa
}

and without it:

define i16 @test10() {
entry:
   br label %loop1

loop1:                                            ; preds = %loop1, %entry
   %l1 = phi i16 [ 0, %entry ], [ %l1.add, %loop1 ]
   %l1.add = add nuw nsw i16 %l1, 1
   %cmp1 = icmp ult i16 %l1.add, 2
   br i1 %cmp1, label %loop1, label %loop2.preheader

loop2.preheader:                                  ; preds = %loop1
   br label %loop2

loop2:                                            ; preds = %loop2, 
%loop2.preheader
   %k2 = phi i16 [ %k2.add, %loop2 ], [ 182, %loop2.preheader ]
   %l2 = phi i16 [ %l2.add, %loop2 ], [ 0, %loop2.preheader ]
   %l2.add = add nuw nsw i16 %l2, 1
   tail call void @foo(i16 %k2)
   %k2.add = add nuw nsw i16 %k2, 1
   %cmp2 = icmp ult i16 %l2.add, 2
   br i1 %cmp2, label %loop2, label %loop2.end

loop2.end:                                        ; preds = %loop2
   %0 = add i16 182, 2
   ret i16 %0
}

Note the difference in how the return value is calculated.

I suppose that indvars doesn't consider this to be the constant case that should always be allowed, but I think it's quite unfortunate if this simplification isn't done.
Comment 1 Max Kazantsev 2019-02-11 04:24:05 PST
Taking a look.
Comment 2 Max Kazantsev 2019-02-11 05:04:14 PST
Ok, it seems that isa<SCEVConstant> check is too restrictive. We could instead check availability on loop entry. It fails some other tests though (likely they've improved); I'll take a careful look at them and come up with a patch tomorrow.
Comment 3 Max Kazantsev 2019-02-12 03:57:55 PST
Fix on review: https://reviews.llvm.org/D58113
Comment 4 Dimitry Andric 2019-04-24 10:38:28 PDT
I think this may be the same issue we're seeing in <https://bugs.freebsd.org/237515> ("textproc/htmldoc hangs at runtime when compiled with clang 8"), and which is also reported at <https://github.com/michaelrsweet/htmldoc/issues/349>.

In this case, a loop like:

                do
                {
                  *lineptr++ = ' ';
                  col ++;
                }
                while (col & 7);

is optimized to:

# %bb.52:                               # %do.body.preheader
                                        #   in Loop: Header=BB53_46 Depth=3
        #DEBUG_VALUE: parse_pre:y <- [DW_OP_constu 10312, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:col <- $esi
        #DEBUG_VALUE: parse_pre:lineptr <- $r12
        #DEBUG_VALUE: parse_pre:dataptr <- $rbx
        #DEBUG_VALUE: parse_pre:flat <- [DW_OP_constu 10352, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:start <- $r13
        #DEBUG_VALUE: parse_pre:t <- [DW_OP_constu 10368, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:left <- [DW_OP_constu 10336, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:right <- [DW_OP_constu 10356, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:bottom <- [DW_OP_constu 10360, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:top <- [DW_OP_constu 10332, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:x <- [DW_OP_constu 10296, DW_OP_minus] [$rbp+0]
        .loc    2 5493 27               # ps-pdf.cxx:5493:27
        movl    %esi, %eax
        xorb    $7, %al
        movzbl  %al, %edx
        movl    %edx, %r14d
        andl    $7, %r14d
        movq    %rsi, %r13
.Ltmp7097:
        movl    %esi, %eax
        negb    %al
        movzbl  %al, %r15d
        andl    $7, %r15d
        andl    $7, %edx
        incq    %rdx
        .loc    2 5493 30 is_stmt 0     # ps-pdf.cxx:5493:30
        movq    %r12, %rdi
        .loc    2 5493 27               # ps-pdf.cxx:5493:27
        leaq    (%r12,%r15), %r12
.Ltmp7098:
        .loc    2 5493 30               # ps-pdf.cxx:5493:30
        movl    $32, %esi
.Ltmp7099:
        callq   memset
.Ltmp7100:
.LBB53_53:                              # %do.body
                                        #   Parent Loop BB53_15 Depth=1
                                        #     Parent Loop BB53_30 Depth=2
                                        #       Parent Loop BB53_46 Depth=3
                                        # =>      This Inner Loop Header: Depth=4
        #DEBUG_VALUE: parse_pre:y <- [DW_OP_constu 10312, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:dataptr <- $rbx
        #DEBUG_VALUE: parse_pre:flat <- [DW_OP_constu 10352, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:t <- [DW_OP_constu 10368, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:left <- [DW_OP_constu 10336, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:right <- [DW_OP_constu 10356, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:bottom <- [DW_OP_constu 10360, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:top <- [DW_OP_constu 10332, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:x <- [DW_OP_constu 10296, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:lineptr <- undef
        #DEBUG_VALUE: parse_pre:col <- undef
        #DEBUG_VALUE: parse_pre:lineptr <- [DW_OP_plus_uconst 1, DW_OP_stack_value] undef
        #DEBUG_VALUE: parse_pre:col <- [DW_OP_plus_uconst 1, DW_OP_stack_value] undef
        .loc    2 5496 17 is_stmt 1     # ps-pdf.cxx:5496:17
        decq    %r15
.Ltmp7101:
        .loc    2 5495 10               # ps-pdf.cxx:5495:10
        jne     .LBB53_53
.Ltmp7102:
# %bb.54:                               # %for.inc170.loopexit

What happens is that in case col & 7 is already true before the loop, the value calculated with "negb %al" becomes zero, and then the loop will count back from that.  E.g the "decq %r15" just before .Ltmp7101 will effectively never reach zero, and a stack overflow follows.

Before r346397 (and its predecessor r345814) it got optimized to:

# %bb.71:                               # %do.body.preheader
                                        #   in Loop: Header=BB53_45 Depth=3
        #DEBUG_VALUE: parse_pre:flat <- [DW_OP_constu 10400, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:lineptr <- $rdi
        #DEBUG_VALUE: parse_pre:dataptr <- $rbx
        #DEBUG_VALUE: parse_pre:col <- $r15d
        #DEBUG_VALUE: parse_pre:start <- $r13
        #DEBUG_VALUE: parse_pre:t <- [DW_OP_constu 10360, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:left <- [DW_OP_constu 10344, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:right <- [DW_OP_constu 10348, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:bottom <- [DW_OP_constu 10352, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:top <- [DW_OP_constu 10340, DW_OP_minus] [$rbp+0]
        #DEBUG_VALUE: parse_pre:page <- [DW_OP_constu 10296, DW_OP_minus] [$rbp+0]
        .loc    2 5493 27               # ps-pdf.cxx:5493:27
        movl    %r15d, %eax
        xorb    $7, %al
        movzbl  %al, %r12d
        movl    %r12d, %edx
        andl    $7, %edx
        leaq    (%rdi,%rdx), %r14
        addq    $1, %r14
        andl    $7, %r12d
        incq    %rdx
        .loc    2 5493 30 is_stmt 0     # ps-pdf.cxx:5493:30
        movl    $32, %esi
        callq   memset
.Ltmp7053:
        .loc    2 5493 27               # ps-pdf.cxx:5493:27
        leal    (%r15,%r12), %r15d
.Ltmp7054:
        addl    $1, %r15d
Comment 5 Eli Friedman 2019-04-29 16:26:51 PDT
(In reply to Dimitry Andric from comment #4)
> I think this may be the same issue we're seeing in
> <https://bugs.freebsd.org/237515> ("textproc/htmldoc hangs at runtime when
> compiled with clang 8"), and which is also reported at
> <https://github.com/michaelrsweet/htmldoc/issues/349>.

This report is about a missed optimization; if you're seeing a crash or hang, please file a separate bug.
Comment 6 Dimitry Andric 2019-05-23 14:35:12 PDT
(In reply to Eli Friedman from comment #5)
> (In reply to Dimitry Andric from comment #4)
> > I think this may be the same issue we're seeing in
> > <https://bugs.freebsd.org/237515> ("textproc/htmldoc hangs at runtime when
> > compiled with clang 8"), and which is also reported at
> > <https://github.com/michaelrsweet/htmldoc/issues/349>.
> 
> This report is about a missed optimization; if you're seeing a crash or
> hang, please file a separate bug.

Right, this is now bug 41998.
Comment 7 listmail 2019-06-12 15:42:49 PDT
(In reply to Max Kazantsev from comment #3)
> Fix on review: https://reviews.llvm.org/D58113
This patch just landed today.  The test case on the bug works, so I'm closing this, but I have no idea if this was reduced from something which might need re-examined.  The fix was rather narrow to the particular test case.
Comment 8 Mikael Holmén 2019-06-12 23:20:53 PDT
(In reply to listmail from comment #7)
> (In reply to Max Kazantsev from comment #3)
> > Fix on review: https://reviews.llvm.org/D58113
> This patch just landed today.  The test case on the bug works, so I'm
> closing this, but I have no idea if this was reduced from something which
> might need re-examined.  The fix was rather narrow to the particular test
> case.

Thanks for pushing the fix. It indeed solves the problem I saw.