LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 24234 - [AArch64] error in backend: fixup value out of range
Summary: [AArch64] error in backend: fixup value out of range
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: AArch64
Version: trunk
Hardware: PC FreeBSD
Importance: P normal
Assignee: Diana Picus
URL: https://bugs.freebsd.org/bugzilla/sho...
Keywords:
Depends on:
Blocks:
 
Reported: 2015-07-23 14:15 PDT by emaste
Modified: 2016-08-01 09:30 PDT
CC List: 9 users

See Also:
Fixed By Commit(s):


Attachments
.c and .sh files produced after assertion failure (992.50 KB, application/octet-stream)
2015-07-23 14:15 PDT, emaste
Reduced bitcode for the cockney testcase (110.29 KB, application/octet-stream)
2016-01-04 05:27 PST, Charlie Turner
Reattach the original testcases. The reduced one no longer triggers with ToT Clang (320.00 KB, application/x-tar)
2016-01-07 12:02 PST, Charlie Turner
Reduced bitcode testcase (119.18 KB, application/octet-stream)
2016-07-26 09:16 PDT, Diana Picus
Reduced bitcode testcase (119.18 KB, application/octet-stream)
2016-07-26 09:18 PDT, Diana Picus

Description emaste 2015-07-23 14:15:41 PDT
Created attachment 14637
.c and .sh files produced after assertion failure

Found during FreeBSD/arm64 ports build and reported in FreeBSD PR 201762; I reproduced on a recent SVN build.

fatal error: error in backend: fixup value out of range
cc: error: clang frontend command failed with exit code 70 (use -v to see invocation)
FreeBSD clang version 3.6.1 (tags/RELEASE_361/final 237755) 20150525
Target: aarch64-unknown-freebsd11.0
Thread model: posix
cc: note: diagnostic msg: PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the crash backtrace, preprocessed source, and associated run script.
cc: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
cc: note: diagnostic msg: /tmp/cockney-d900cb.c
cc: note: diagnostic msg: /tmp/cockney-d900cb.sh
Comment 1 andrew 2015-07-24 13:13:22 PDT
This is because LLVM is trying to create a tbnz instruction; however, after calculating the fixup, it finds that the value is too large to fit into the 14-bit field.

In the attached case, I find that LLVM is generating the tbnz fixup with an offset of 34612 bytes, 1842 bytes past where it could branch to. This most likely comes from the large switch statement.
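
For reference, the range check described above can be sketched as follows. This is a minimal illustration, not LLVM's code: the function name `tbz_offset_encodable` is hypothetical, and it assumes the architectural TBZ/TBNZ encoding, in which the signed 14-bit immediate counts 4-byte instructions rather than bytes.

```python
# Sketch only: models the TBZ/TBNZ branch-offset field, not LLVM's actual check.
IMM_BITS = 14     # width of the branch-offset immediate
INSN_BYTES = 4    # AArch64 instructions are 4 bytes; the field counts instructions


def tbz_offset_encodable(byte_offset: int) -> bool:
    """Return True if byte_offset fits the signed 14-bit, 4-byte-scaled field."""
    if byte_offset % INSN_BYTES != 0:
        return False  # branch targets must be instruction-aligned
    words = byte_offset // INSN_BYTES
    return -(1 << (IMM_BITS - 1)) <= words < (1 << (IMM_BITS - 1))


print(tbz_offset_encodable(32764))  # largest encodable forward offset
print(tbz_offset_encodable(34612))  # the offset reported in this comment
```

Under these assumptions the first call reports the offset as encodable and the second does not, which matches the "fixup value out of range" diagnostic.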
Comment 2 Charlie Turner 2016-01-04 05:27:52 PST
Created attachment 15548
Reduced bitcode for the cockney testcase

I invested a significant amount of compute time to get this reduction :-)
It's still not very useful, since the disassembly is over 7500 lines, but it runs a bit quicker for testing.

Here's how I'm reproducing the fault with the attached file:

$ llc -filetype=obj -O1 -relocation-model=pic cockney-reduced.bc

This fault does not occur when generating assembly, or when running at -O0.
Comment 3 Charlie Turner 2016-01-07 12:02:21 PST
Created attachment 15581
Reattach the original testcases. The reduced one no longer triggers with ToT Clang
Comment 4 Charlie Turner 2016-01-08 07:28:57 PST
I'm not able to continue investigating this ticket, so I'll do my best
to hand over what I learned while trying (and failing) to fix it.

I initially thought this would be a bug in AArch64's branch relaxation
pass [lib/Target/AArch64/AArch64BranchRelaxation.cpp], the purpose of
which is to transform branch instructions whose targets are out of
range for the instruction encodings. Transforms like,

tbz  LBL_TOO_FAR_AWAY
==XFORM=>
tbnz NEXT_BB
B LBL_TOO_FAR_AWAY

I checked that this invariant wasn't being broken by inspecting the
basic block offsets. Some offsets were within a few thousand bytes
of 32K (the limit for TB[N]Z), but none were over it.

So we then go into MC. My first hack was to catch this oversized fixup
value in ELFAArch64AsmBackend::processFixupValue and emit a
relocation. That's almost certainly not the right fix, because the
relocation might be truncated, but I didn't verify any of this. It's
just one possible way of curing the symptom and not getting the crash;
I doubt it solves the real problem!

Properties of the bug I've noticed:
  - Does not show up in -O0 or -O1 from Clang.
  - You have to compile in -O2 mode and above in Clang, but -O1 and
  above in llc.
  - Only shows up when emitting an object, not assembly. So the
  problem is in the object streamer bits.
  - Only shows up with -fPIC (i.e., -relocation-model=pic for llc)

This command will show the error without having to use the provided
"sh" file.

$ clang -target aarch64-unknown -c -fPIC -O1 cockney-d900cb.c

Once a bitcode file has been produced, this is the llc rune:

$ llc -O1 -relocation-model=pic -filetype=obj cockney-d900cb.bc

Using bugpoint can reduce the test-case significantly, but it's still
too big for human analysis of the source. It does make interactive
runs go faster. This is the bugpoint command I used:

$ bugpoint -llc-safe cockney-d900cb.bc --tool-args -relocation-model=pic -filetype=obj

That took several hours on my machine. (Un)fortunately, the test-case
I reduced a few days ago got fixed upstream and no longer fails on
trunk. So if you want a smaller reproducer, use the above.

As to what is causing the bug, my only hunch is that there's something
wrong in the symbol generation. I see MCValues whose "A symbol" is larger
than 32K, but that might be OK, because the fixup offsets are supposed
to account for that, if I've understood correctly. On my machine, the
value that causes the crash has an "A symbol" with an offset of 45084
and a corresponding fixup with offset 9660, giving a difference of
35424; 2657 bytes too big.
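
The arithmetic above can be checked with a few lines. The variable names
are illustrative only, and the limit of 32767 is the figure quoted
elsewhere in this report.

```python
# Values quoted in this comment; LIMIT is the figure cited elsewhere in this report.
SYMBOL_OFFSET = 45084  # offset of the "A symbol"
FIXUP_OFFSET = 9660    # offset of the corresponding fixup
LIMIT = 32767

distance = SYMBOL_OFFSET - FIXUP_OFFSET
overshoot = distance - LIMIT
print(distance, overshoot)  # 35424 2657
```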

The fixup offsets are computed in what appears to be a sensible
fashion in MCELFStreamer::EmitInstToData. A fixup's offset points to
the start of the corresponding instruction's code in the data
fragment.

Another hunch was that relaxation is somehow pushing the symbol
offsets over the instruction encoding's maximums. I know the assembler
is supposed to do branch shortening, so maybe there's a bug there?

I've run out of time to load more context in to solve this one, so
I'll have to drop it.
Comment 5 Diana Picus 2016-07-26 09:16:52 PDT
Created attachment 16806
Reduced bitcode testcase

This is what bugpoint has reduced now. It's pretty fragile, because it's only 1 byte over the limit, so any change that reduces that huge function by 1 byte might avoid the assertion.
Comment 6 Diana Picus 2016-07-26 09:18:57 PDT
Created attachment 16807
Reduced bitcode testcase

This is what bugpoint managed to reduce this time around. It's pretty fragile because it's only 1 byte over the limit of 32767, so any change that will make that function smaller will avoid this error.
Comment 7 Diana Picus 2016-07-27 10:42:13 PDT
Potential fix: https://reviews.llvm.org/D22870

Note that this bug is not in the object streaming code, as previously assumed in the comments. In fact, the corresponding assembly file (clang -O2 -S) crashes gas as well as llvm-mc with the same out-of-range errors.
Comment 8 Diana Picus 2016-08-01 09:30:43 PDT
r277331