LLVM Bugzilla is read-only and serves as the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 35448 - Loop unrolling breaks vectorization
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: Loop Optimizer
Version: trunk
Hardware: PC All
Importance: P enhancement
Assignee: Unassigned LLVM Bugs
 
Reported: 2017-11-28 09:40 PST by Sanjoy Das
Modified: 2017-11-30 02:48 PST
CC List: 4 users

Attachments
D source code that demonstrates the problem in LLVM 5.0.0 (1.96 KB, text/plain)
2017-11-30 02:48 PST, Igor Shirkalin

Description Sanjoy Das 2017-11-28 09:40:36 PST
Perhaps this is already a known issue, but Clang/LLVM trunk does not vectorize the inner matmult loop unless the "#pragma unroll" is enabled:

void f(int * __restrict__ a, int * __restrict__ b, int * __restrict__ r) {
  for (int m = 0; m < 64; m++) {
    int c = 0;
    // #pragma unroll
    for (int i = 0; i < 32; i++) {
      c += a[i] * b[m * 32 + i];
    }
    r[m] = c;
  }
}

It looks like the loop unroller fully unrolls the inner loop, and the SLP vectorizer then cannot vectorize the resulting straight-line code as well as the loop vectorizer would have vectorized the original, non-unrolled loop.
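
One way to probe this hypothesis is to keep the inner loop from being unrolled, so that it is still a loop when the loop vectorizer runs. The sketch below uses Clang's documented `#pragma clang loop` hints on the kernel from this report. It is only an experiment under that assumption, not a confirmed workaround: whether the hints change the generated code depends on the compiler version and target. The names `f_hinted`, `dot_row`, and `self_check` are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Kernel from the report with Clang loop hints on the inner loop.
   Assumption: unroll(disable) leaves the loop intact for the loop
   vectorizer; vectorize(enable) asks for it to be vectorized.
   Other compilers ignore the pragma (possibly with a warning). */
void f_hinted(int *__restrict__ a, int *__restrict__ b, int *__restrict__ r) {
  for (int m = 0; m < 64; m++) {
    int c = 0;
#pragma clang loop unroll(disable) vectorize(enable)
    for (int i = 0; i < 32; i++) {
      c += a[i] * b[m * 32 + i];
    }
    r[m] = c;
  }
}

/* Plain scalar dot product of a with row m of b, used as a reference. */
static int dot_row(const int *a, const int *b, int m) {
  int c = 0;
  for (int i = 0; i < 32; i++) {
    c += a[i] * b[m * 32 + i];
  }
  return c;
}

/* Hypothetical helper: runs f_hinted on small deterministic inputs,
   asserts that every row matches the reference, and returns the
   number of rows checked. */
int self_check(void) {
  static int a[32], b[64 * 32], r[64];
  for (int i = 0; i < 32; i++) a[i] = i + 1;
  for (int k = 0; k < 64 * 32; k++) b[k] = (k % 7) - 3;

  f_hinted(a, b, r);

  int rows = 0;
  for (int m = 0; m < 64; m++) {
    assert(r[m] == dot_row(a, b, m));
    rows++;
  }
  return rows;
}
```

To see which vectorizer, if any, handled the loop, Clang's optimization remarks can be enabled when compiling, for example `clang -O3 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize -c file.c`, where `file.c` is a placeholder for the file holding the kernel.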
Comment 1 Igor Shirkalin 2017-11-30 02:48:23 PST
Created attachment 19495
D source code that demonstrates the problem in LLVM 5.0.0

Unrolling the loop breaks vectorization in some cases, depending on the size of the inner loop. For 16 and 32 items, vectorization is broken. If we accumulate the results of the inner loop, vectorization works fine.