This: https://godbolt.org/g/b2t1a9 unsigned long f(unsigned long x, int amt) { x += (x << amt) | (x >> (64 - amt)); return x & 0xffffffffu; } unsigned long g(unsigned long x, int amt) { x += (x << amt) | (x >> (64 - amt)); return x; } produces: f(unsigned long, int): # @f(unsigned long, int) mov rax, rdi mov ecx, esi shl rax, cl mov ecx, 64 sub ecx, esi mov rdx, rdi shr rdx, cl or edx, eax add edi, edx mov rax, rdi ret g(unsigned long, int): # @g(unsigned long, int) mov rax, rdi mov ecx, esi rol rax, cl lea rax, [rax + rdi] ret Found while investigating a significant perf difference between LLVM and gcc on a hash function. (Yes, "amt" is variable in the code in question.)
I think it's because the AND gets pushed into the OR, so DAGCombiner::MatchRotate sees something like (or (trunc (srl ..)) (trunc (shl ..))) Maybe it could be taught to see through that.
Actually, I'm not sure it makes sense that (trunc (or x y)) has become (or (trunc x) (trunc y)).
If there are any rules in the transformations that can propagate truncates "upwards" (towards producers) through logic ops, it would make sense that the fixed point you eventually end up at has propagated the truncates as far up as they will go. I mentioned this bug on Twitter and it turned into a lengthy thread on the subject :) https://twitter.com/FioraAeterna/status/936387540113694720 As far as I see it, the two primary options are: 1. Inhibit "bubbling up" the truncates when they're part of a rotate, i.e. effectively blacklist rotate-like patterns for the truncate-movement rules (what's suggested in that tweet). 2. Teach MatchRotate to recognize the (or (trunc ...) (trunc ...)) and match it into (trunc (rotate ...)) (what Hans suggests; also suggested by Steve Canon later in that Twitter thread). Of the two, I think 2) is probably preferable, if for no other reason than that the rotate-matching seems a lot more specific than bubbling up truncates, so it feels more natural to teach rotates about truncations than it does to teach truncations about rotates. (If that makes sense.)
Patch:
Now with actual paste: https://reviews.llvm.org/D40792
r319692