Reproducer: https://rise4fun.com/Alive/ne8y In InstCombiner::reassociateShiftAmtsOfTwoSameDirectionShifts(), when identifying two shifts to combine, it does so while ignoring any zero-extensions of those values. In the problematic case above, the values were both i1s zero-extended to i3s. When their zero-extensions are dropped (taking them back to i1s) their sum becomes an xor in SimplifyAddInst(), which causes the shift to basically disappear (and to be dropped in a later pass). I'm not familiar with the history of this code, particularly why the zero extensions are ignored, but changing the code to leave them in place does fix the issue.
The fix is likely ```diff - if (ShAmt0->getType() != ShAmt1->getType()) + if (ShAmt0->getType() != ShAmt1->getType() || + ShAmt0->getType()->isIntOrIntVectorTy(1)) ``` @ https://github.com/llvm/llvm-project/blob/9f507bfd8d4761c619f21d8165b7e6e6e5bd09bd/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp#L56-L59 I'm not sure if there is a deeper problem.
I might be missing something, but I don't think it's ever safe to drop the extension - certainly, the u1 case is easy to see, but consider if the value a uN, it could be "saturated", i.e., 0xF...F. If zero-extended to uM (N+1 or greater), it could hold the sum of two uNs, but if the extension is dropped, then semantics might not be preserved. I think the below might be necessary. diff --git a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp index 43be42a4096..d782fe9076c 100644 --- a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp +++ b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp @@ -35,7 +35,7 @@ Value *InstCombiner::reassociateShiftAmtsOfTwoSameDirectionShifts( Instruction *Sh0Op0; Value *ShAmt0; if (!match(Sh0, - m_Shift(m_Instruction(Sh0Op0), m_ZExtOrSelf(m_Value(ShAmt0))))) + m_Shift(m_Instruction(Sh0Op0), m_Value(ShAmt0)))) return nullptr; // If there is a truncation between the two shifts, we must make note of it @@ -50,7 +50,7 @@ Value *InstCombiner::reassociateShiftAmtsOfTwoSameDirectionShifts( // Inner shift: (x shiftopcode ShAmt1) // Like with other shift, ignore zext of shift amount if any. Value *X, *ShAmt1; - if (!match(Sh1, m_Shift(m_Value(X), m_ZExtOrSelf(m_Value(ShAmt1))))) + if (!match(Sh1, m_Shift(m_Value(X), m_Value(ShAmt1)))) return nullptr; // We have two shift amounts from two different shifts. The types of those
(In reply to Rob Springer from comment #2) > I might be missing something, but I don't think it's ever safe to drop the > extension - certainly, the u1 case is easy to see, but consider if the value > a uN, it could be "saturated", i.e., 0xF...F. If zero-extended to uM (N+1 or > greater), it could hold the sum of two uNs, but if the extension is dropped, > then semantics might not be preserved. If we have an `shl iN %x, %y`, the `%y` must be within `[iN 0, iN N)`, anything else is just UB. There are several important observations here: 1. For bitwidth=1, there is only one valid shift amount = 0 2. The shift amount is always positive - sign bits are always zeros Now, is there an `iN` where `iN (N-1) + iN (N-1)` overflows? I don't think so? * i1 (1-1) + i1 (1-1) = i1 0 + i1 0 = i1 0 <- No overflow? * i2 (2-1) + i2 (2-1) = i2 1 + i2 1 = i2 2 (0b10) <- No overflow? * i3 (3-1) + i3 (3-1) = i3 2 + i3 2 = i3 4 (0b100) <- No overflow? * i4 (4-1) + i4 (4-1) = i4 3 + i4 3 = i4 6 (0b0110) <- No overflow? and so on. So this is fine in general, but there is clearly something wrong.
(In reply to Roman Lebedev from comment #3) > (In reply to Rob Springer from comment #2) > > I might be missing something, but I don't think it's ever safe to drop the > > extension - certainly, the u1 case is easy to see, but consider if the value > > a uN, it could be "saturated", i.e., 0xF...F. If zero-extended to uM (N+1 or > > greater), it could hold the sum of two uNs, but if the extension is dropped, > > then semantics might not be preserved. > > If we have an `shl iN %x, %y`, the `%y` must be within `[iN 0, iN N)`, > anything else is just UB. There are several important observations here: > 1. For bitwidth=1, there is only one valid shift amount = 0 > 2. The shift amount is always positive - sign bits are always zeros > > Now, is there an `iN` where `iN (N-1) + iN (N-1)` overflows? > I don't think so? > * i1 (1-1) + i1 (1-1) = i1 0 + i1 0 = i1 0 <- No overflow? > * i2 (2-1) + i2 (2-1) = i2 1 + i2 1 = i2 2 (0b10) <- No overflow? > * i3 (3-1) + i3 (3-1) = i3 2 + i3 2 = i3 4 (0b100) <- No overflow? > * i4 (4-1) + i4 (4-1) = i4 3 + i4 3 = i4 6 (0b0110) <- No overflow? > and so on. > > So this is fine in general, but there is clearly something wrong. Err, i lost the thought there. But of course, while `iN 2 * iN (N-1)` never overflows, that isn't guaranteed to be so for `iM 2 * iM (N-1)` where M < N. Forgetting about fixed-width integers, we have the following requirement: `2*(x-1) <= (2^y)-1`, and it doesn't yield full-set: https://www.wolframalpha.com/input/?i=2+*+%28x-1%29+%3C%3D+%282%5Ey%29-1+for+x+%3E+0%2C+y+%3E+0%2C+x+%3E+y%2C+x+is+integer So indeed, we need one more sanity check for the shift bitwidths vs the shift amount bitwidth.
> So indeed, we need one more sanity check for the shift bitwidths vs the > shift amount bitwidth. Roman, are you still looking into this?
(In reply to Hans Wennborg from comment #5) > > So indeed, we need one more sanity check for the shift bitwidths vs the > > shift amount bitwidth. > > Roman, are you still looking into this? Yes, just made commits locally, will push today.
There was similar issue in foldShiftIntoShiftInAnotherHandOfAndInICmp(), fixed that one too. I'm not seeing the problem elsewhere. @Hans: all set as far as i'm concerned
(In reply to Roman Lebedev from comment #7) > There was similar issue in foldShiftIntoShiftInAnotherHandOfAndInICmp(), > fixed that one too. I'm not seeing the problem elsewhere. > > @Hans: all set as far as i'm concerned Great! I'll let the fixes bake in master for a bit, and then merge them.
(In reply to Hans Wennborg from comment #8) > (In reply to Roman Lebedev from comment #7) > > There was similar issue in foldShiftIntoShiftInAnotherHandOfAndInICmp(), > > fixed that one too. I'm not seeing the problem elsewhere. > > > > @Hans: all set as far as i'm concerned > > Great! I'll let the fixes bake in master for a bit, and then merge them. Pushed to 10.x as 77e448c0d3a87e7944381d7d53e55c997c8b936a f115a88191c3dc80c5140fbbf63f74ca77fcc74b ac293ede5e62cfc569f2d5d8f4667e6188afced0 b2b41bc3b51a083fb9e36e50d0131dfbd79e00ce