LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 31181 - wrong code with "opt -instcombine -early-cse -jump-threading -correlated-propagation -loop-rotate -indvars -gvn"
Summary: wrong code with "opt -instcombine -early-cse -jump-threading -correlated-prop...
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: PC All
: P normal
Assignee: Artur Pilipenko
URL:
Keywords:
: 31325 32047 (view as bug list)
Depends on:
Blocks:
 
Reported: 2016-11-28 01:17 PST by Zhendong Su
Modified: 2019-06-29 09:35 PDT (History)
9 users (show)

See Also:
Fixed By Commit(s): 362292, 364228


Attachments
bitcode before optimization (1.68 KB, application/octet-stream)
2016-11-28 01:19 PST, Zhendong Su
Details
bitcode after optimization (1.63 KB, application/octet-stream)
2016-11-28 01:19 PST, Zhendong Su
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Zhendong Su 2016-11-28 01:17:21 PST
$ clang -v
clang version 4.0.0 (trunk 287975)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/clang-trunk/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9.4
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5.3.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.3.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.2.0
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
$
$ clang -c -emit-llvm -o small.bc small.c
$ opt -instcombine -early-cse -jump-threading -correlated-propagation -loop-rotate -indvars -gvn -o small-opt.bc small.bc
$ llc -o small.s small-opt.bc
$ gcc small.s
$ timeout -s 9 10 ./a.out
Killed
$
$ clang -O0 small.c
$ ./a.out
$


-------------------------------


int a;

int main ()
{ 
  for (a = -2; a < -1; a++)
    ;
  return 0;
}
Comment 1 Zhendong Su 2016-11-28 01:19:30 PST
Created attachment 17663 [details]
bitcode before optimization
Comment 2 Zhendong Su 2016-11-28 01:19:47 PST
Created attachment 17664 [details]
bitcode after optimization
Comment 3 Hans Wennborg 2017-01-30 17:51:41 PST
Looks like a 3.9 -> 4.0 regression.
Comment 4 Hans Wennborg 2017-02-02 18:27:07 PST
Bisection points to:

r282872 - CVP. Turn marking adds as no wrap on by default (was turned off by 279082)

Artur, can you take a look?
Comment 5 Artur Pilipenko 2017-02-03 05:43:54 PST
CVP correctly marks the add as nuw basing on the loop latch check. After CVP:

define i32 @main() #0 {
entry:
  br label %for.cond

for.cond:                                         ; preds = %for.inc, %entry
  %storemerge = phi i32 [ -2, %entry ], [ %inc, %for.inc ]
  store i32 %storemerge, i32* @a, align 4
  %cmp = icmp slt i32 %storemerge, -1
  br i1 %cmp, label %for.inc, label %for.end

for.inc:                                          ; preds = %for.cond
  %inc = add nuw nsw i32 %storemerge, 1
  br label %for.cond

for.end:                                          ; preds = %for.cond
  ret i32 0
}

The problem is that LoopRotation later moves the addition over the check which guarantees nuw: 

for.cond:                                         ; preds = %for.cond, %entry
  %storemerge = phi i32 [ -2, %entry ], [ %inc, %for.cond ]
  store i32 %storemerge, i32* @a, align 4
  %cmp = icmp slt i32 %storemerge, -1
  %inc = add nuw nsw i32 %storemerge, 1
  br i1 %cmp, label %for.cond, label %for.end

Then indvars transforms the loop latch from pre increment check to the post increment check:

for.cond:                                         ; preds = %for.cond, %entry
  %storemerge = phi i32 [ -2, %entry ], [ %inc, %for.cond ]
  store i32 %storemerge, i32* @a, align 4
  %inc = add nuw nsw i32 %storemerge, 1
  %exitcond = icmp ne i32 %inc, 0
  br i1 %exitcond, label %for.cond, label %for.end

The check is never false because of nuw flag.

Since nsw, nuw flags can be control dependant, probably we should treat them similarly to metadata and strip them when we speculate the instruction.
Comment 6 Richard Smith 2017-02-05 20:10:57 PST
> Since nsw, nuw flags can be control dependant, probably
> we should treat them similarly to metadata and strip them
> when we speculate the instruction.

As far as I can see, LoopRotate's speculation is fine by itself, but indvars should presuambly be stripping nsw and nuw when reusing an existing add instruction.
Comment 7 Hans Wennborg 2017-02-16 13:45:40 PST
*** Bug 31325 has been marked as a duplicate of this bug. ***
Comment 8 Hans Wennborg 2017-02-23 14:22:13 PST
(In reply to Richard Smith from comment #6)
> > Since nsw, nuw flags can be control dependant, probably
> > we should treat them similarly to metadata and strip them
> > when we speculate the instruction.
> 
> As far as I can see, LoopRotate's speculation is fine by itself, but indvars
> should presuambly be stripping nsw and nuw when reusing an existing add
> instruction.

Hmm, when you say it's fine by itself you mean as long as the value isn't used until the loop condition has been checked, as it might be poison, right?

Only indvars does exactly that: uses the poison value.

I think the transformation happens in IndVarSimplify::linearFunctionTestReplace. We could of course check if CmpIndVar is an add and strip the flags.

But what if CmpIndVar is poison for some other reason, or has passed through other instructions. Can we "un-poison" it in a safe manner somehow?
Comment 9 Sanjoy Das 2017-02-23 14:26:11 PST
(In reply to Hans Wennborg from comment #8)
> (In reply to Richard Smith from comment #6)
> > > Since nsw, nuw flags can be control dependant, probably
> > > we should treat them similarly to metadata and strip them
> > > when we speculate the instruction.
> > 
> > As far as I can see, LoopRotate's speculation is fine by itself, but indvars
> > should presuambly be stripping nsw and nuw when reusing an existing add
> > instruction.
> 
> Hmm, when you say it's fine by itself you mean as long as the value isn't
> used until the loop condition has been checked, as it might be poison, right?
> 
> Only indvars does exactly that: uses the poison value.
> 
> I think the transformation happens in
> IndVarSimplify::linearFunctionTestReplace. We could of course check if
> CmpIndVar is an add and strip the flags.
> 
> But what if CmpIndVar is poison for some other reason, or has passed through
> other instructions. Can we "un-poison" it in a safe manner somehow?

That is what I'm working on now, but doing that without regressing other places (in terms of performance) is probably going to be too risky to pull in to 4.0.

So my two cents would be that despite this being a bug in IndVarSimplify, we cherry-pick a revert of Artur's CVP change into the release branch for now.
Comment 10 Hans Wennborg 2017-02-23 14:40:06 PST
(In reply to Sanjoy Das from comment #9)
> So my two cents would be that despite this being a bug in IndVarSimplify, we
> cherry-pick a revert of Artur's CVP change into the release branch for now.

Sounds good to me. I've reverted in r296030 and will merge that to the branch once it's been on the buildbots a bit.
Comment 11 Hans Wennborg 2017-02-24 10:48:57 PST
(In reply to Hans Wennborg from comment #10)
> (In reply to Sanjoy Das from comment #9)
> > So my two cents would be that despite this being a bug in IndVarSimplify, we
> > cherry-pick a revert of Artur's CVP change into the release branch for now.
> 
> Sounds good to me. I've reverted in r296030 and will merge that to the
> branch once it's been on the buildbots a bit.

Merged the revert to 4.0 in r296148.
Comment 12 Robert Lougher 2017-02-27 10:41:16 PST
*** Bug 32047 has been marked as a duplicate of this bug. ***
Comment 13 Hans Wennborg 2017-08-04 16:40:22 PDT
Has there been any progress here?
Comment 14 Nikita Popov 2019-04-20 02:05:08 PDT
For reference, the full repro IR for -indvars is:

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

@a = global i32 0

define i32 @main() {
entry:
  br label %loop 

loop:                                              
  %storemerge = phi i32 [ -2, %entry ], [ %inc, %loop ]
  store i32 %storemerge, i32* @a
  %cmp = icmp slt i32 %storemerge, -1
  %inc = add nuw nsw i32 %storemerge, 1
  br i1 %cmp, label %loop, label %exit

exit:                                              
  ret i32 0
}

I initially thought this no longer reproduces, because I habitually dropped the datalayout, but apparently it's important here.
Comment 15 Nikita Popov 2019-04-20 05:41:44 PDT
Attempt at fixing this: https://reviews.llvm.org/D60935
Comment 16 Nikita Popov 2019-06-29 09:35:26 PDT
LFTR issue fixed in https://reviews.llvm.org/rL362292 and CVP nowrap inference reenabled in https://reviews.llvm.org/rL364228.