$ clang -v clang version 4.0.0 (trunk 287975) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/clang-trunk/bin Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9 Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9.4 Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5 Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5.3.0 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4.7 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7.3 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.5 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.4 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.3.0 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.2.0 Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9 Candidate multilib: .;@m64 Candidate multilib: 32;@m32 Candidate multilib: x32;@mx32 Selected multilib: .;@m64 $ $ clang -c -emit-llvm -o small.bc small.c $ opt -instcombine -early-cse -jump-threading -correlated-propagation -loop-rotate -indvars -gvn -o small-opt.bc small.bc $ llc -o small.s small-opt.bc $ gcc small.s $ timeout -s 9 10 ./a.out Killed $ $ clang -O0 small.c $ ./a.out $ ------------------------------- int a; int main () { for (a = -2; a < -1; a++) ; return 0; }
Created attachment 17663 [details] bitcode before optimization
Created attachment 17664 [details] bitcode after optimization
Looks like a 3.9 -> 4.0 regression.
Bisection points to: r282872 - CVP. Turn marking adds as no wrap on by default (was turned off by 279082) Artur, can you take a look?
CVP correctly marks the add as nuw basing on the loop latch check. After CVP: define i32 @main() #0 { entry: br label %for.cond for.cond: ; preds = %for.inc, %entry %storemerge = phi i32 [ -2, %entry ], [ %inc, %for.inc ] store i32 %storemerge, i32* @a, align 4 %cmp = icmp slt i32 %storemerge, -1 br i1 %cmp, label %for.inc, label %for.end for.inc: ; preds = %for.cond %inc = add nuw nsw i32 %storemerge, 1 br label %for.cond for.end: ; preds = %for.cond ret i32 0 } The problem is that LoopRotation later moves the addition over the check which guarantees nuw: for.cond: ; preds = %for.cond, %entry %storemerge = phi i32 [ -2, %entry ], [ %inc, %for.cond ] store i32 %storemerge, i32* @a, align 4 %cmp = icmp slt i32 %storemerge, -1 %inc = add nuw nsw i32 %storemerge, 1 br i1 %cmp, label %for.cond, label %for.end Then indvars transforms the loop latch from pre increment check to the post increment check: for.cond: ; preds = %for.cond, %entry %storemerge = phi i32 [ -2, %entry ], [ %inc, %for.cond ] store i32 %storemerge, i32* @a, align 4 %inc = add nuw nsw i32 %storemerge, 1 %exitcond = icmp ne i32 %inc, 0 br i1 %exitcond, label %for.cond, label %for.end The check is never false because of nuw flag. Since nsw, nuw flags can be control dependant, probably we should treat them similarly to metadata and strip them when we speculate the instruction.
> Since nsw, nuw flags can be control dependant, probably > we should treat them similarly to metadata and strip them > when we speculate the instruction. As far as I can see, LoopRotate's speculation is fine by itself, but indvars should presuambly be stripping nsw and nuw when reusing an existing add instruction.
*** Bug 31325 has been marked as a duplicate of this bug. ***
(In reply to Richard Smith from comment #6) > > Since nsw, nuw flags can be control dependant, probably > > we should treat them similarly to metadata and strip them > > when we speculate the instruction. > > As far as I can see, LoopRotate's speculation is fine by itself, but indvars > should presuambly be stripping nsw and nuw when reusing an existing add > instruction. Hmm, when you say it's fine by itself you mean as long as the value isn't used until the loop condition has been checked, as it might be poison, right? Only indvars does exactly that: uses the poison value. I think the transformation happens in IndVarSimplify::linearFunctionTestReplace. We could of course check if CmpIndVar is an add and strip the flags. But what if CmpIndVar is poison for some other reason, or has passed through other instructions. Can we "un-poison" it in a safe manner somehow?
(In reply to Hans Wennborg from comment #8) > (In reply to Richard Smith from comment #6) > > > Since nsw, nuw flags can be control dependant, probably > > > we should treat them similarly to metadata and strip them > > > when we speculate the instruction. > > > > As far as I can see, LoopRotate's speculation is fine by itself, but indvars > > should presuambly be stripping nsw and nuw when reusing an existing add > > instruction. > > Hmm, when you say it's fine by itself you mean as long as the value isn't > used until the loop condition has been checked, as it might be poison, right? > > Only indvars does exactly that: uses the poison value. > > I think the transformation happens in > IndVarSimplify::linearFunctionTestReplace. We could of course check if > CmpIndVar is an add and strip the flags. > > But what if CmpIndVar is poison for some other reason, or has passed through > other instructions. Can we "un-poison" it in a safe manner somehow? That is what I'm working on now, but doing that without regressing other places (in terms of performance) is probably going to be too risky to pull in to 4.0. So my two cents would be that despite this being a bug in IndVarSimplify, we cherry-pick a revert of Artur's CVP change into the release branch for now.
(In reply to Sanjoy Das from comment #9) > So my two cents would be that despite this being a bug in IndVarSimplify, we > cherry-pick a revert of Artur's CVP change into the release branch for now. Sounds good to me. I've reverted in r296030 and will merge that to the branch once it's been on the buildbots a bit.
(In reply to Hans Wennborg from comment #10) > (In reply to Sanjoy Das from comment #9) > > So my two cents would be that despite this being a bug in IndVarSimplify, we > > cherry-pick a revert of Artur's CVP change into the release branch for now. > > Sounds good to me. I've reverted in r296030 and will merge that to the > branch once it's been on the buildbots a bit. Merged the revert to 4.0 in r296148.
*** Bug 32047 has been marked as a duplicate of this bug. ***
Has there been any progress here?
For reference, the full repro IR for -indvars is: target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" @a = global i32 0 define i32 @main() { entry: br label %loop loop: %storemerge = phi i32 [ -2, %entry ], [ %inc, %loop ] store i32 %storemerge, i32* @a %cmp = icmp slt i32 %storemerge, -1 %inc = add nuw nsw i32 %storemerge, 1 br i1 %cmp, label %loop, label %exit exit: ret i32 0 } I initially thought this no longer reproduces, because I habitually dropped the datalayout, but apparently it's important here.
Attempt at fixing this: https://reviews.llvm.org/D60935
LFTR issue fixed in https://reviews.llvm.org/rL362292 and CVP nowrap inference reenabled in https://reviews.llvm.org/rL364228.