New user self-registration is disabled due to spam. For an account please email bugs-admin@lists.llvm.org with your e-mail address and full name.

Bug 45335 - Crash when building Linux kernel’s multi_v5_defconfig’s kernel/trace/trace_clock.o
Summary: Crash when building Linux kernel’s multi_v5_defconfig’s kernel/trace/trace_cl...
Status: RESOLVED FIXED
Alias: None
Product: clang
Classification: Unclassified
Component: Frontend (show other bugs)
Version: trunk
Hardware: Other All
: P enhancement
Assignee: Unassigned Clang Bugs
URL: https://github.com/ClangBuiltLinux/li...
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-27 23:13 PDT by Nathan Chancellor
Modified: 2020-04-30 10:26 PDT (History)
10 users (show)

See Also:
Fixed By Commit(s):


Attachments
IR before globalopt (3.69 KB, text/plain)
2020-04-16 06:40 PDT, David Spickett
Details
IR after globalopt (3.55 KB, text/plain)
2020-04-16 06:40 PDT, David Spickett
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Nathan Chancellor 2020-03-27 23:13:16 PDT
With assertions enabled:

$ curl -LSs https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.5.13.tar.xz | tar xJf -

$ cd linux-5.5.13

$ make -j$(nproc) -s ARCH=arm CC=clang CROSS_COMPILE=arm-linux-gnueabi- O=out/arm32 distclean multi_v5_defconfig kernel/trace/trace_clock.o
fragment covers entire variable
!1079 = !DIGlobalVariableExpression(var: !1074, expr: !DIExpression(DW_OP_LLVM_fragment, 0, 64))
!1074 = distinct !DIGlobalVariable(name: "trace_clock_struct", scope: !2, file: !1057, line: 89, type: !1075, isLocal: true, isDefinition: true)
fatal error: error in backend: Broken module found, compilation aborted!
clang-11: error: clang frontend command failed with exit code 70 (use -v to see invocation)
...

creduce spits out:

$ cat trace_clock.i
typedef struct {
} a;
static struct {
  long b;
  a c
} d;
e() {
  long f = d.b + 1;
  d.b = f;
}

$ clang —O2 —-target=arm-linux-gnueabi -march=armv5te -g -c -o /dev/null trace_clock.i
fragment covers entire variable
!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32))
!1 = distinct !DIGlobalVariable(name: "d", scope: !2, file: !3, line: 6, type: !7, isLocal: true, isDefinition: true)
fatal error: error in backend: Broken module found, compilation aborted!

Full preprocessed source and interestingness test available at https://github.com/nathanchance/creduce-files/tree/d09780fbcf83f4707527f6bf6e553c79ea43d5e4/multi_v5_defconfig-trace_clock-crash
Comment 1 Kristof Beyls 2020-03-31 05:52:53 PDT
This also reproduces with
clang -target x86_64-unknown-linux-gnu -O2 -g -c t.c
So this seems like a target-independent issue in the Clang frontend.
Comment 2 Nick Desaulniers 2020-03-31 10:42:08 PDT
Indeed, it seems dependent on -O2 and -g and -DLLVM_ENABLE_ASSERTIONS=ON, but not target ISA (Kristof thank you for testing, and Nathan for the concise reproducer).

The definition of `a` looks funny; an anonymous struct with no members, but the error seems to come from the definition of `d`, IIUC.

I can also reproduce with the test case, then:
$ clang -O2 -g foo.c -emit-llvm -S
$ llc foo.c
Comment 3 David Spickett 2020-04-02 06:23:32 PDT
Mostly confirming your suspicions.

https://llvm.org/docs/LangRef.html#diexpression:
"DW_OP_LLVM_fragment, 16, 8 specifies the offset and size (16 and 8 here, respectively) of the variable fragment from the working expression. Note that contrary to DW_OP_bit_piece, the offset is describing the location within the described source variable."

This is declaring a global variable based on "d". This starts at offset 0 of "d" and has size 64.
!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression(DW_OP_LLVM_fragment, 0, 64))

We can get the size from the type referenced:
type: !8 = distinct !DICompositeType(tag: DW_TAG_structure_type, file: !7, line: 3, size: 64, elements: !9)

Which is 64. Hence the global covers the whole of "d". I assume a fragment must be by definition a subset of the type you're referencing. 

If you change the triple to arm you get an offset of 32, so I think this may be related to the alignment of the structure. You can add another long to "d" to get size=128 and the offset will remain 64 on x86. (not confirmed)
Comment 4 David Spickett 2020-04-02 09:08:14 PDT
So that size in DW_OP_LLVM_fragment is the size of the variable in bits.

I took out the verify checks in clang, and compared it with gcc's debug output.

GCC:
 <1><63>: Abbrev Number: 7 (DW_TAG_variable)
    <64>   DW_AT_name        : d
    <66>   DW_AT_decl_file   : 1
    <67>   DW_AT_decl_line   : 6
    <68>   DW_AT_decl_column : 3
    <69>   DW_AT_type        : <0x3c>
    <6d>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0  (DW_OP_addr: 0)

<1><3c>: Abbrev Number: 4 (DW_TAG_structure_type)
    <3d>   DW_AT_byte_size   : 8
    <3e>   DW_AT_decl_file   : 1
    <3f>   DW_AT_decl_line   : 3
    <40>   DW_AT_decl_column : 8
    <41>   DW_AT_sibling     : <0x5c>

GCC says that "d" is a struct at address 0, which has size 8. (this is a "simple location")

Clang:
<1><2a>: Abbrev Number: 2 (DW_TAG_variable)
    <2b>   DW_AT_name        : (indirect string, offset: 0x9e): d
    <2f>   DW_AT_type        : <0x41>
    <33>   DW_AT_decl_file   : 1
    <34>   DW_AT_decl_line   : 6
    <35>   DW_AT_location    : 11 byte block: 3 0 0 0 0 0 0 0 0 93 8    (DW_OP_addr:
 0; DW_OP_piece: 8)

<1><41>: Abbrev Number: 3 (DW_TAG_structure_type)
    <42>   DW_AT_byte_size   : 8
    <43>   DW_AT_decl_file   : 1
    <44>   DW_AT_decl_line   : 3

Clang says that "d" is a struct at address 0, with size 8 again. Except that we put it in the location itself instead of the struct type. (this is a "composite location"

At -O0 clang uses the same simple location expression as GCC's -O2. So maybe I understand the check now. If the thing you're pointing to is the same size as the location being used, you don't need a "fragment" of that location.

From the DWARF standard: "
Single location descriptions are of two kinds:
a. Simple location descriptions, which describe the location of one contiguous piece
(usually all) of an object. A simple location description may describe a location in
addressable memory, or in a register, or the lack of a location (with or without a
known value).
b. Composite location descriptions, which describe an object in terms of pieces each
of which may be contained in part of a register or stored in a memory location
unrelated to other pieces."

So option a. is correct here. (though it doesn't totally disavow using the second)

Indeed, if I replace the global var expression in clang's ir:
- !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression(DW_OP_LLVM_fragment, 0, 64))
+ !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())

llc no longer complains about the debug info.

I will see if I can find where the fragment part is generated. Perhaps at that point we assume that "b" is a fragment of {b,c} then c gets removed, so "b" is essentially the whole struct and we don't realise that.
Comment 5 David Spickett 2020-04-16 06:35:29 PDT
Narrowed it down to somewhere in the globalopt pass. Using the attached IR with:
$ ./opt --globalopt test_before_global_opt.ll -S -o -
(after is also attached)

You can see that DW_OP_LLVM_fragment is inserted. This is done when copying the debug statements from the old structures to the new ones. In lib/Transforms/IPO/GlobalOpt.cpp::transferSRADebugInfo.
Comment 6 David Spickett 2020-04-16 06:40:29 PDT
Created attachment 23371 [details]
IR before globalopt
Comment 7 David Spickett 2020-04-16 06:40:51 PDT
Created attachment 23372 [details]
IR after globalopt
Comment 8 David Spickett 2020-04-16 07:15:31 PDT
Discovered another issue like this https://bugs.llvm.org/show_bug.cgi?id=35416, which was fixed for the SROA pass. I will apply a similar thing here.
Comment 9 David Spickett 2020-04-23 08:53:24 PDT
Git commit for the change mentioned above is: d7f6f1636d53c3e2faf55cdf20fbb44a1a149df1

Fix for review: https://reviews.llvm.org/D78720
Comment 10 David Spickett 2020-04-30 03:40:18 PDT
Committed fix as 3929429347d398773577b79f7fdb780d4f7ed887.