New user self-registration is disabled due to spam. For an account please email bugs-admin@lists.llvm.org with your e-mail address and full name.

Bug 44802 - InstCombine unsafely drops zero extensions of shift amounts when combining shifts - miscompile
Summary: InstCombine unsafely drops zero extensions of shift amounts when combining sh...
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: PC Linux
: P normal
Assignee: Roman Lebedev
URL:
Keywords:
Depends on:
Blocks: release-10.0.0
  Show dependency tree
 
Reported: 2020-02-05 09:22 PST by Rob Springer
Modified: 2020-02-27 04:47 PST (History)
8 users (show)

See Also:
Fixed By Commit(s): 425ef999385 781d077afb0 6f807ca00d9 2855c8fed93


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Rob Springer 2020-02-05 09:22:38 PST
Reproducer: https://rise4fun.com/Alive/ne8y

In InstCombiner::reassociateShiftAmtsOfTwoSameDirectionShifts(), when identifying two shifts to combine, it does so while ignoring any zero-extensions of those values. In the problematic case above, the values were both i1s zero-extended to i3s.

When their zero-extensions are dropped (taking them back to i1s) their sum becomes an xor in SimplifyAddInst(), which causes the shift to basically disappear (and to be dropped in a later pass).

I'm not familiar with the history of this code, particularly why the zero extensions are ignored, but changing the code to leave them in place does fix the issue.
Comment 1 Roman Lebedev 2020-02-05 09:34:59 PST
The fix is likely
```diff
-  if (ShAmt0->getType() != ShAmt1->getType())
+  if (ShAmt0->getType() != ShAmt1->getType() ||
+      ShAmt0->getType()->isIntOrIntVectorTy(1))
```
@ https://github.com/llvm/llvm-project/blob/9f507bfd8d4761c619f21d8165b7e6e6e5bd09bd/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp#L56-L59

I'm not sure if there is a deeper problem.
Comment 2 Rob Springer 2020-02-13 08:47:56 PST
I might be missing something, but I don't think it's ever safe to drop the extension - certainly, the u1 case is easy to see, but consider if the value a uN, it could be "saturated", i.e., 0xF...F. If zero-extended to uM (N+1 or greater), it could hold the sum of two uNs, but if the extension is dropped, then semantics might not be preserved.

I think the below might be necessary.

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
index 43be42a4096..d782fe9076c 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
@@ -35,7 +35,7 @@ Value *InstCombiner::reassociateShiftAmtsOfTwoSameDirectionShifts(
   Instruction *Sh0Op0;
   Value *ShAmt0;
   if (!match(Sh0,
-             m_Shift(m_Instruction(Sh0Op0), m_ZExtOrSelf(m_Value(ShAmt0)))))
+             m_Shift(m_Instruction(Sh0Op0), m_Value(ShAmt0))))
     return nullptr;
 
   // If there is a truncation between the two shifts, we must make note of it
@@ -50,7 +50,7 @@ Value *InstCombiner::reassociateShiftAmtsOfTwoSameDirectionShifts(
   // Inner shift: (x shiftopcode ShAmt1)
   // Like with other shift, ignore zext of shift amount if any.
   Value *X, *ShAmt1;
-  if (!match(Sh1, m_Shift(m_Value(X), m_ZExtOrSelf(m_Value(ShAmt1)))))
+  if (!match(Sh1, m_Shift(m_Value(X), m_Value(ShAmt1))))
     return nullptr;
 
   // We have two shift amounts from two different shifts. The types of those
Comment 3 Roman Lebedev 2020-02-13 10:17:19 PST
(In reply to Rob Springer from comment #2)
> I might be missing something, but I don't think it's ever safe to drop the
> extension - certainly, the u1 case is easy to see, but consider if the value
> a uN, it could be "saturated", i.e., 0xF...F. If zero-extended to uM (N+1 or
> greater), it could hold the sum of two uNs, but if the extension is dropped,
> then semantics might not be preserved.

If we have an `shl iN %x, %y`, the `%y` must be within `[iN 0, iN N)`,
anything else is just UB. There are several important observations here:
1. For bitwidth=1, there is only one valid shift amount = 0
2. The shift amount is always positive - sign bits are always zeros

Now, is there an `iN` where `iN (N-1) + iN (N-1)` overflows?
I don't think so?
* i1 (1-1) + i1 (1-1) = i1 0 + i1 0 = i1 0         <- No overflow?
* i2 (2-1) + i2 (2-1) = i2 1 + i2 1 = i2 2 (0b10)  <- No overflow?
* i3 (3-1) + i3 (3-1) = i3 2 + i3 2 = i3 4 (0b100)  <- No overflow?
* i4 (4-1) + i4 (4-1) = i4 3 + i4 3 = i4 6 (0b0110)  <- No overflow?
and so on. 

So this is fine in general, but there is clearly something wrong.
Comment 4 Roman Lebedev 2020-02-14 02:29:22 PST
(In reply to Roman Lebedev from comment #3)
> (In reply to Rob Springer from comment #2)
> > I might be missing something, but I don't think it's ever safe to drop the
> > extension - certainly, the u1 case is easy to see, but consider if the value
> > a uN, it could be "saturated", i.e., 0xF...F. If zero-extended to uM (N+1 or
> > greater), it could hold the sum of two uNs, but if the extension is dropped,
> > then semantics might not be preserved.
> 
> If we have an `shl iN %x, %y`, the `%y` must be within `[iN 0, iN N)`,
> anything else is just UB. There are several important observations here:
> 1. For bitwidth=1, there is only one valid shift amount = 0
> 2. The shift amount is always positive - sign bits are always zeros
> 
> Now, is there an `iN` where `iN (N-1) + iN (N-1)` overflows?
> I don't think so?
> * i1 (1-1) + i1 (1-1) = i1 0 + i1 0 = i1 0         <- No overflow?
> * i2 (2-1) + i2 (2-1) = i2 1 + i2 1 = i2 2 (0b10)  <- No overflow?
> * i3 (3-1) + i3 (3-1) = i3 2 + i3 2 = i3 4 (0b100)  <- No overflow?
> * i4 (4-1) + i4 (4-1) = i4 3 + i4 3 = i4 6 (0b0110)  <- No overflow?
> and so on. 
> 
> So this is fine in general, but there is clearly something wrong.

Err, i lost the thought there.
But of course, while `iN 2 * iN (N-1)` never overflows,
that isn't guaranteed to be so for `iM 2 * iM (N-1)` where M < N.
Forgetting about fixed-width integers, we have the following requirement:
`2*(x-1) <= (2^y)-1`, and it doesn't yield full-set:
https://www.wolframalpha.com/input/?i=2+*+%28x-1%29+%3C%3D+%282%5Ey%29-1+for+x+%3E+0%2C+y+%3E+0%2C+x+%3E+y%2C+x+is+integer

So indeed, we need one more sanity check for the shift bitwidths vs the shift amount bitwidth.
Comment 5 Hans Wennborg 2020-02-25 06:57:07 PST
> So indeed, we need one more sanity check for the shift bitwidths vs the
> shift amount bitwidth.

Roman, are you still looking into this?
Comment 6 Roman Lebedev 2020-02-25 07:04:27 PST
(In reply to Hans Wennborg from comment #5)
> > So indeed, we need one more sanity check for the shift bitwidths vs the
> > shift amount bitwidth.
> 
> Roman, are you still looking into this?

Yes, just made commits locally, will push today.
Comment 7 Roman Lebedev 2020-02-25 07:29:29 PST
There was similar issue in foldShiftIntoShiftInAnotherHandOfAndInICmp(),
fixed that one too. I'm not seeing the problem elsewhere.

@Hans: all set as far as i'm concerned
Comment 8 Hans Wennborg 2020-02-25 07:31:14 PST
(In reply to Roman Lebedev from comment #7)
> There was similar issue in foldShiftIntoShiftInAnotherHandOfAndInICmp(),
> fixed that one too. I'm not seeing the problem elsewhere.
> 
> @Hans: all set as far as i'm concerned

Great! I'll let the fixes bake in master for a bit, and then merge them.
Comment 9 Hans Wennborg 2020-02-27 04:47:51 PST
(In reply to Hans Wennborg from comment #8)
> (In reply to Roman Lebedev from comment #7)
> > There was similar issue in foldShiftIntoShiftInAnotherHandOfAndInICmp(),
> > fixed that one too. I'm not seeing the problem elsewhere.
> > 
> > @Hans: all set as far as i'm concerned
> 
> Great! I'll let the fixes bake in master for a bit, and then merge them.

Pushed to 10.x as

77e448c0d3a87e7944381d7d53e55c997c8b936a
f115a88191c3dc80c5140fbbf63f74ca77fcc74b
ac293ede5e62cfc569f2d5d8f4667e6188afced0
b2b41bc3b51a083fb9e36e50d0131dfbd79e00ce