LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 35487 - Poor code generation for narrowed rotates
Summary: Poor code generation for narrowed rotates
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: X86 (show other bugs)
Version: 5.0
Hardware: PC Linux
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-30 16:02 PST by Fabian Giesen
Modified: 2017-12-04 13:19 PST (History)
7 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Fabian Giesen 2017-11-30 16:02:24 PST
This: https://godbolt.org/g/b2t1a9

unsigned long f(unsigned long x, int amt)
{
    x += (x << amt) | (x >> (64 - amt));
    return x & 0xffffffffu;
}

unsigned long g(unsigned long x, int amt)
{
    x += (x << amt) | (x >> (64 - amt));
    return x;
}

produces:

f(unsigned long, int): # @f(unsigned long, int)
  mov rax, rdi
  mov ecx, esi
  shl rax, cl
  mov ecx, 64
  sub ecx, esi
  mov rdx, rdi
  shr rdx, cl
  or edx, eax
  add edi, edx
  mov rax, rdi
  ret
g(unsigned long, int): # @g(unsigned long, int)
  mov rax, rdi
  mov ecx, esi
  rol rax, cl
  lea rax, [rax + rdi]
  ret

Found while investigating a significant perf difference between LLVM and gcc on a hash function. (Yes, "amt" is variable in the code in question.)
Comment 1 Hans Wennborg 2017-12-01 15:07:57 PST
I think it's because the AND gets pushed into the OR, so DAGCombiner::MatchRotate sees something like

  (or (trunc (srl ..)) (trunc (shl ..)))

Maybe it could be taught to see through that.
Comment 2 Hans Wennborg 2017-12-01 15:13:55 PST
Actually, I'm not sure it makes sense that (trunc (or x y)) has become (or (trunc x) (trunc y)).
Comment 3 Fabian Giesen 2017-12-01 16:45:17 PST
If there are any rules in the transformations that can propagate truncates "upwards" (towards producers) through logic ops, it would make sense that the fixed point you eventually end up at has propagated the truncates as far up as they will go.

I mentioned this bug on Twitter and it turned into a lengthy thread on the subject :)

https://twitter.com/FioraAeterna/status/936387540113694720

As far as I see it, the two primary options are:
1. Inhibit "bubbling up" the truncates when they're part of a rotate, i.e. effectively blacklist rotate-like patterns for the truncate-movement rules (what's suggested in that tweet).
2. Teach MatchRotate to recognize the (or (trunc ...) (trunc ...)) and match it into (trunc (rotate ...)) (what Hans suggests; also suggested by Steve Canon later in that Twitter thread).

Of the two, I think 2) is probably preferable, if for no other reason than that the rotate-matching seems a lot more specific than bubbling up truncates, so it feels more natural to teach rotates about truncations than it does to teach truncations about rotates. (If that makes sense.)
Comment 4 Hans Wennborg 2017-12-04 10:25:45 PST
Patch:
Comment 5 Hans Wennborg 2017-12-04 10:26:11 PST
Now with actual paste: https://reviews.llvm.org/D40792
Comment 6 Hans Wennborg 2017-12-04 13:19:12 PST
r319692