Bug 45632 - Linux kernel's multi_v5_defconfig no longer boots after max-page-size increase to 64k
Status: RESOLVED FIXED
Alias: None
Product: tools
Classification: Unclassified
Component: llvm-objcopy/strip
Version: trunk
Hardware: PC Windows NT
Importance: P normal
Assignee: Fangrui Song
URL:
Keywords:
Depends on:
Blocks: 4068
Reported: 2020-04-21 13:31 PDT by Nathan Chancellor
Modified: 2020-05-04 17:03 PDT
See Also:
Fixed By Commit(s):


Description Nathan Chancellor 2020-04-21 13:31:20 PDT
https://github.com/llvm/llvm-project/commit/87383e408d41623ada41e2bbc371b037fa29e894

To reproduce (assuming LLVM/clang, arm-linux-gnueabi binutils, zstd, and qemu-system-arm are in your PATH):

$ git clone -b v5.7-rc2 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

$ git clone https://github.com/ClangBuiltLinux/boot-utils

$ make -C linux -j$(nproc) -s ARCH=arm CC=clang CROSS_COMPILE=arm-linux-gnueabi- LLVM=1 O=out/arm32 distclean multi_v5_defconfig zImage aspeed-bmc-opp-palmetto.dtb

$ ./boot-utils/boot-qemu.sh -a arm32_v5 -k linux/out/arm32 -t 30s
...
+ RET=124
...

To test the behavior before that change:

$ sed -i '55 a KBUILD_LDFLAGS\t+= -z max-page-size=4096' linux/arch/arm/Makefile

$ make -C linux -j$(nproc) -s ARCH=arm CC=clang CROSS_COMPILE=arm-linux-gnueabi- LLVM=1 O=out/arm32 distclean multi_v5_defconfig zImage aspeed-bmc-opp-palmetto.dtb

$ ./boot-utils/boot-qemu.sh -a arm32_v5 -k linux/out/arm32 -t 30s
...
Linux version 5.7.0-rc2-dirty (nathan@ubuntu-s3-xlarge-x86) (ClangBuiltLinux clang version 11.0.0 (git://github.com/llvm/llvm-project a9b137f9ffba8cb25dfd7dd1fb613e8aac121b37), LLD 11.0.0 (git://github.com/llvm/llvm-project a9b137f9ffba8cb25dfd7dd1fb613e8aac121b37)) #1 PREEMPT Tue Apr 21 13:21:14
...

I am unsure how to debug this further. I tried attaching gdb to vmlinux, but the boot appears to get stuck extremely early; it does not even reach start_kernel.
Comment 1 Peter Smith 2020-04-22 02:47:06 PDT
Thanks for the reproducer. I'll try to take a look this evening after work. I'm not experienced in debugging Linux kernels myself, so I may not get very far either. I can check the ELF file to see if there is anything obviously wrong. When an image fails really quickly, it can sometimes help to use the qemu instruction trace. It produces a huge amount of output, but it can sometimes tell you where an exception has been hit or where an infinite loop occurs.
Comment 2 Peter Smith 2020-04-22 07:25:50 PDT
Worth mentioning that at present you'll need to disable assertions due to bug 45335 (https://bugs.llvm.org/show_bug.cgi?id=45335). As this affects debug, I don't think it is the cause of the kernel not booting.
Comment 3 Peter Smith 2020-04-22 09:31:43 PDT
When I tried to execute the program I got an error from qemu telling me that it was trying to execute code from 0xf77e0908, which is outside the memory allocated for the kernel.

Adding -d guest_errors,nochain,in_asm shows that some code is executing from 0x40000000, presumably the decompressor for vmlinux. This then tries to jump to 0xf77e0908, which causes my qemu-system-arm to fall over.

Not sure there is much more I can do with debugging from the binary. There is something in the program headers that is confusing either the compressor or the decompressor.

There is one interesting difference between the page sizes in LLD. When we set the page size to 64k, we get a PT_PHDR program header generated and a PT_LOAD header that just describes the program header. It is possible that this is confusing whatever tool is creating zImage. There is quite a complex calculation for when LLD creates a PT_PHDR. In summary, it finds the lowest alloc address, called min in the source code:
min = 0xc0008000
Allocate headers if header size <= min - AlignDown(min, MaxPageSize). 
With a 64k page size this is true as 0x514 <= 0xc0008000 - 0xc0000000 
With a 4k page size this is false as 0x514 > 0xc0008000 - 0xc0008000 /* 0 */
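
The check above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic quoted in this comment, not LLD's actual code; the names align_down and headers_fit are made up for illustration, and the constants are the ones given above.

```python
def align_down(value: int, alignment: int) -> int:
    """Round value down to a multiple of alignment (a power of two)."""
    return value & ~(alignment - 1)


def headers_fit(min_addr: int, header_size: int, max_page_size: int) -> bool:
    """True if the headers fit between the start of the page and min_addr."""
    return header_size <= min_addr - align_down(min_addr, max_page_size)


MIN_ADDR = 0xC0008000   # lowest alloc address, from this comment
HEADER_SIZE = 0x514     # size of ELF header plus program headers, from this comment

for page in (0x1000, 0x10000):
    gap = MIN_ADDR - align_down(MIN_ADDR, page)
    fits = headers_fit(MIN_ADDR, HEADER_SIZE, page)
    print(f"max-page-size={page:#x}: gap={gap:#x}, allocate headers={fits}")
```

With a 4k page size min is already page aligned, so the gap is zero and the headers can never fit; with 64k the gap is 0x8000.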

ld.bfd with a 64k page size does not allocate headers, and also has a single giant PT_LOAD segment whereas LLD has many smaller ones. This is the case when LLD has 4k pages as well. However, there are at least a few . = ALIGN(((1 << 12))); directives in the linker script, so there is an outside chance that LLD is overaligning at a page boundary.

I don't have a lot of spare time to debug this unfortunately. My first thought is to try manually suppressing the PT_PHDR generation to see if that fixes the problem. If it doesn't then it is likely to be an interaction between the manual ALIGN directives and the existing one.

Appendix: Program Headers
Output is from Arm's fromelf tool. It is roughly equivalent to llvm-readelf, except I find its output to be more readable.


BFD
========================================================================

** Program header #0

    Type          : PT_LOAD (1)
    File Offset   : 0 (0x0)
    Virtual Addr  : 0xc0000000
    Physical Addr : 0xc0000000
    Size in file  : 12340264 bytes (0xbc4c28)
    Size in memory: 12340264 bytes (0xbc4c28)
    Flags         : PF_X + PF_W + PF_R (0x7)
    Alignment     : 65536


====================================

** Program header #1

    Type          : PT_LOAD (1)
    File Offset   : 12386304 (0xbd0000)
    Virtual Addr  : 0xffff0000
    Physical Addr : 0xc0bc5000
    Size in file  : 32 bytes (0x20)
    Size in memory: 32 bytes (0x20)
    Flags         : PF_X + PF_R (0x5)
    Alignment     : 65536


====================================

** Program header #2

    Type          : PT_LOAD (1)
    File Offset   : 12390400 (0xbd1000)
    Virtual Addr  : 0xffff1000
    Physical Addr : 0xc0bc5020
    Size in file  : 684 bytes (0x2ac)
    Size in memory: 684 bytes (0x2ac)
    Flags         : PF_X + PF_R (0x5)
    Alignment     : 65536


====================================

** Program header #3

    Type          : PT_LOAD (1)
    File Offset   : 12407520 (0xbd52e0)
    Virtual Addr  : 0xc0bc52e0
    Physical Addr : 0xc0bc52e0
    Size in file  : 840288 bytes (0xcd260)
    Size in memory: 1548989 bytes (0x17a2bd)
    Flags         : PF_X + PF_W + PF_R (0x7)
    Alignment     : 65536


====================================

LLD
========================================================================

** Program header #0

    Type          : PT_PHDR (6)
    File Offset   : 52 (0x34)
    Virtual Addr  : 0xc0000034
    Physical Addr : 0xc0000034
    Size in file  : 1248 bytes (0x4e0)
    Size in memory: 1248 bytes (0x4e0)
    Flags         : PF_R (0x4)
    Alignment     : 4


====================================

** Program header #1

    Type          : PT_LOAD (1)
    File Offset   : 0 (0x0)
    Virtual Addr  : 0xc0000000
    Physical Addr : 0xc0000000
    Size in file  : 1300 bytes (0x514)
    Size in memory: 1300 bytes (0x514)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #2

    Type          : PT_LOAD (1)
    File Offset   : 32768 (0x8000)
    Virtual Addr  : 0xc0008000
    Physical Addr : 0xc0008000
    Size in file  : 476 bytes (0x1dc)
    Size in memory: 476 bytes (0x1dc)
    Flags         : PF_X + PF_R (0x5)
    Alignment     : 65536


====================================

** Program header #3

    Type          : PT_LOAD (1)
    File Offset   : 33792 (0x8400)
    Virtual Addr  : 0xc0008400
    Physical Addr : 0xc0008400
    Size in file  : 9721864 bytes (0x945808)
    Size in memory: 9721864 bytes (0x945808)
    Flags         : PF_X + PF_W + PF_R (0x7)
    Alignment     : 65536


====================================

** Program header #4

    Type          : PT_LOAD (1)
    File Offset   : 9756672 (0x94e000)
    Virtual Addr  : 0xc094e000
    Physical Addr : 0xc094e000
    Size in file  : 1962080 bytes (0x1df060)
    Size in memory: 1962080 bytes (0x1df060)
    Flags         : PF_W + PF_R (0x6)
    Alignment     : 65536


====================================

** Program header #5

    Type          : PT_LOAD (1)
    File Offset   : 11718752 (0xb2d060)
    Virtual Addr  : 0xc0b2d060
    Physical Addr : 0xc0b2d060
    Size in file  : 8224 bytes (0x2020)
    Size in memory: 8224 bytes (0x2020)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #6

    Type          : PT_LOAD (1)
    File Offset   : 11726976 (0xb2f080)
    Virtual Addr  : 0xc0b2f080
    Physical Addr : 0xc0b2f080
    Size in file  : 62124 bytes (0xf2ac)
    Size in memory: 62124 bytes (0xf2ac)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #7

    Type          : PT_LOAD (1)
    File Offset   : 11789100 (0xb3e32c)
    Virtual Addr  : 0xc0b3e32c
    Physical Addr : 0xc0b3e32c
    Size in file  : 59916 bytes (0xea0c)
    Size in memory: 59916 bytes (0xea0c)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #8

    Type          : PT_LOAD (1)
    File Offset   : 11849016 (0xb4cd38)
    Virtual Addr  : 0xc0b4cd38
    Physical Addr : 0xc0b4cd38
    Size in file  : 206259 bytes (0x325b3)
    Size in memory: 206259 bytes (0x325b3)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #9

    Type          : PT_LOAD (1)
    File Offset   : 12055276 (0xb7f2ec)
    Virtual Addr  : 0xc0b7f2ec
    Physical Addr : 0xc0b7f2ec
    Size in file  : 4960 bytes (0x1360)
    Size in memory: 4960 bytes (0x1360)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #10

    Type          : PT_LOAD (1)
    File Offset   : 12060236 (0xb8064c)
    Virtual Addr  : 0xc0b8064c
    Physical Addr : 0xc0b8064c
    Size in file  : 78762 bytes (0x133aa)
    Size in memory: 78762 bytes (0x133aa)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #11

    Type          : PT_LOAD (1)
    File Offset   : 12139000 (0xb939f8)
    Virtual Addr  : 0xc0b939f8
    Physical Addr : 0xc0b939f8
    Size in file  : 48 bytes (0x30)
    Size in memory: 48 bytes (0x30)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #12

    Type          : PT_LOAD (1)
    File Offset   : 12140544 (0xb94000)
    Virtual Addr  : 0xc0b94000
    Physical Addr : 0xc0b94000
    Size in file  : 272368 bytes (0x427f0)
    Size in memory: 272368 bytes (0x427f0)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #13

    Type          : PT_LOAD (1)
    File Offset   : 12451840 (0xbe0000)
    Virtual Addr  : 0xffff0000
    Physical Addr : 0xc0bd7000
    Size in file  : 32 bytes (0x20)
    Size in memory: 32 bytes (0x20)
    Flags         : PF_X + PF_R (0x5)
    Alignment     : 65536


====================================

** Program header #14

    Type          : PT_LOAD (1)
    File Offset   : 12455936 (0xbe1000)
    Virtual Addr  : 0xffff1000
    Physical Addr : 0xc0bd7020
    Size in file  : 684 bytes (0x2ac)
    Size in memory: 684 bytes (0x2ac)
    Flags         : PF_X + PF_R (0x5)
    Alignment     : 65536


====================================

** Program header #15

    Type          : PT_LOAD (1)
    File Offset   : 12481248 (0xbe72e0)
    Virtual Addr  : 0xc0bd72e0
    Physical Addr : 0xc0bd72e0
    Size in file  : 309268 bytes (0x4b814)
    Size in memory: 309268 bytes (0x4b814)
    Flags         : PF_X + PF_R (0x5)
    Alignment     : 65536


====================================

** Program header #16

    Type          : PT_LOAD (1)
    File Offset   : 12790516 (0xc32af4)
    Virtual Addr  : 0xc0c22af4
    Physical Addr : 0xc0c22af4
    Size in file  : 5160 bytes (0x1428)
    Size in memory: 5160 bytes (0x1428)
    Flags         : PF_R (0x4)
    Alignment     : 65536


====================================

** Program header #17

    Type          : PT_LOAD (1)
    File Offset   : 12795904 (0xc34000)
    Virtual Addr  : 0xc0c24000
    Physical Addr : 0xc0c24000
    Size in file  : 75948 bytes (0x128ac)
    Size in memory: 75948 bytes (0x128ac)
    Flags         : PF_W + PF_R (0x6)
    Alignment     : 65536


====================================

** Program header #18

    Type          : PT_LOAD (1)
    File Offset   : 12943360 (0xc58000)
    Virtual Addr  : 0xc0c38000
    Physical Addr : 0xc0c38000
    Size in file  : 411552 bytes (0x647a0)
    Size in memory: 411552 bytes (0x647a0)
    Flags         : PF_W + PF_R (0x6)
    Alignment     : 65536


====================================

** Program header #19

    Type          : PT_LOAD (1)
    File Offset   : 13354912 (0xcbc7a0)
    Virtual Addr  : 0xc0c9c7a0
    Physical Addr : 0xc0c9c7a0
    Size in file  : 32160 bytes (0x7da0)
    Size in memory: 32160 bytes (0x7da0)
    Flags         : PF_W + PF_R (0x6)
    Alignment     : 65536


====================================

** Program header #20

    Type          : PT_LOAD (1)
    File Offset   : 13387072 (0xcc4540)
    Virtual Addr  : 0xc0ca4540
    Physical Addr : 0xc0ca4540
    Size in file  : 0 bytes (0x0)
    Size in memory: 708701 bytes (0xad05d)
    Flags         : PF_W + PF_R (0x6)
    Alignment     : 65536


====================================
Comment 4 Peter Smith 2020-04-22 10:42:57 PDT
I did a quick experiment to force LLD not to emit the PT_PHDR program header, and this did not make the kernel boot.

I think the most likely explanation is a mismatch between the linker script and LLD's alignment of program headers. It seems that PAGE_SHIFT is hard-coded to 4k in Linux, which results in the linker script having numerous . = ALIGN(1 << 12) directives. However, it is possible that LLD, with the extra program headers, is breaking something by aligning these program headers to 64K.
Comment 5 Peter Smith 2020-04-22 10:45:00 PDT
Apologies I've got about as far as I'm going to get for a while. I may be able to spend some time at the weekend.

One question: is it just this particular configuration, or are other similar 32-bit Arm kernels failing? If it is just this one, I'd like some help to identify what is special about it. It may help to work out what we need to change.
Comment 6 Ilie Halip 2020-04-22 13:36:43 PDT
It happens because these symbols are relocated to higher addresses:

max-page-size=4096
    00601560 g       .image_end     00000000 __bss_start
    00601578 g       .bss   00000000 _end
    00601560 g       .pad   00000000 _edata
    00601530 g       .piggydata     00000000 _got_start
    00601558 g       .got   00000000 _got_end
    0060152e g       .piggydata     00000000 input_data_end

max-page-size=65536
    006fcb80 g       .image_end     00000000 __bss_start
    006fcb98 g       .bss   00000000 _end
    006fcb80 g       .pad   00000000 _edata
    006fcb50 g       .piggydata     00000000 _got_start
    006fcb78 g       .got   00000000 _got_end
    006fcb4f g       .piggydata     00000000 input_data_end

They're used to do some pointer voodoo in arch/arm/boot/compressed/head.S when decompressing and relocating the kernel code. In particular:
    restart:	adr	r0, LC0
        ldmia	r0, {r1, r2, r3, r6, r11, r12}

Inside the restart function, get_inflated_image_size computes the kernel image size in $r9 - I think this one ends up having a wrong value and messes things up later.
Comment 7 Peter Smith 2020-04-22 15:09:50 PDT
Thanks very much, that is very helpful.

My understanding of the boot process is still in its early stages. As I understand it, we take the vmlinux image, objcopy it, compress it, and insert it into piggydata. The bootloader then uses a combination of linker-defined symbols and symbols defined in piggydata to work out how to decompress the kernel.

At the moment I think we are going wrong while the vmlinux kernel is just a binary blob, we haven't tried to jump into it yet.

I would expect some of the linker defined symbols to have a higher address than the 4k page alignment as the 64k page alignment increases the size of the vmlinux image, largely down to the number of program headers LLD is generating. There is also additional alignment in the boot loader that increases the addresses. Whether the symbols are correct is another matter, will need to work out how to check them.

I'll try and take another look later on in the week. I've only got out of work time to look at this so my progress is likely to be slow.
Comment 8 Ilie Halip 2020-04-23 06:44:46 PDT
> At the moment I think we are going wrong while the vmlinux kernel is just a binary blob, we haven't tried to jump into it yet.

Yes, that's right. To debug this code I'm using (with qemu -s -S):

> $ gdb-multiarch -ex "target remote :1234" -ex "stepi" -ex "stepi" -ex "stepi" -ex "stepi" -ex "stepi" -ex "add-symbol-file arch/arm/boot/compressed/vmlinux 0x40010000" -ex "b restart"

This should take you right before those symbols are loaded into the registers. You'll notice that right after "get_inflated_image_size	r9, r10, lr", $r9 has a very high value, in my case it's 0x3fNNNNNN.
Comment 9 Peter Smith 2020-04-25 08:27:20 PDT
I know what is causing the problem and I think it is most likely a bug in llvm-objcopy and not LLD. At least if I replace llvm-objcopy with arm-linux-gnueabihf-objcopy the 64k page size kernel will boot.

The large size for the decompressed kernel is, from LLD's perspective, correct: it is loaded from the end of the compressed data (the LLD-computed location is correct). The root cause seems to be the size of the Image file produced by llvm-objcopy.

Tracing through the boot process, the size of the decompressed kernel is extracted from the last word of the compressed data. This can be obtained from the command line with gzip -l piggy_data
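
For anyone who wants to check that value without gzip -l: the last four bytes of a gzip member are the ISIZE field, the uncompressed size modulo 2**32, stored little-endian. The sketch below reads it from an in-memory blob. The helper name and the zero-filled payload are placeholders for illustration, not the real piggy_data.

```python
import gzip
import struct


def gzip_uncompressed_size(blob: bytes) -> int:
    """Return ISIZE: the last 4 bytes of a gzip member, little-endian."""
    if len(blob) < 4:
        raise ValueError("too short to hold a gzip trailer")
    (isize,) = struct.unpack("<I", blob[-4:])
    return isize


# Stand-in for Image: zero-filled data, which compresses very well.
payload = bytes(0x3000)
compressed = gzip.compress(payload)
print(hex(gzip_uncompressed_size(compressed)), len(compressed))
```

To inspect a real build, the same helper can be applied to the bytes read from the piggy_data file.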

piggy_data is a compressed form of Image, which is the binary form of vmlinux after llvm-objcopy -O binary -R .comment -S

With llvm-objcopy, Image is a gigantic file of 1073577984 bytes, bigger than the original ELF file and much bigger than the corresponding arm-linux-gnueabihf-objcopy output of 13223232 bytes. The file compresses very well, so I'm guessing it is mostly zeros.

With this gigantic file the boot code tries to relocate part of itself after the decompressed image but the decompressed image size is so large it ends up at an invalid address.

As the output of LLD seems to be fine when processed with GNU objcopy, I think this is much more likely an llvm-objcopy problem than an LLD one. It is possible that the image layout is confusing llvm-objcopy but not GNU objcopy.

Just to make sure I'm not going crazy, it would be great if someone could verify that the kernel boots when GNU objcopy is used instead of llvm-objcopy. If that is indeed the case, I suggest reassigning this bug to llvm-objcopy.

It certainly looks like LLD's program header table creation logic can be improved; LLD is creating way more program headers than it needs to. I can't give this any more time though.
Comment 10 Ilie Halip 2020-04-25 10:14:14 PDT
Yes, indeed the issue seems to be because of llvm-objcopy. I modified the kernel Makefile to use binutils' objcopy, and the kernel boots just fine.

I also see the huge 1G file created by llvm-objcopy, so I'm going to look into that.

Thank you!
Comment 11 James Henderson 2020-04-29 00:34:44 PDT
Looking at the comments here, this sounds like a duplicate of bug 42277, which I helped triage, and Jordan Rupprecht started trying to fix. However, he went quiet after a bit, so I don't know what happened, and I believe he is currently not available for a while, so I wouldn't get your hopes up on a fix from that source. Bug 42277 comment 4 contains a detailed explanation of my investigation of the problem and the results from last year. @MaskRay has been working in this area recently, so he might well be able to provide further commentary that is useful for you.
Comment 12 Fangrui Song 2020-04-30 23:21:56 PDT
Thanks a lot for the previous troubleshooting. I have figured out the llvm-objcopy problem.

* Peter Smith
> I know what is causing the problem and I think it is most likely a bug in llvm-objcopy and not LLD. At least if I replace llvm-objcopy with arm-linux-gnueabihf-objcopy the 64k page size kernel will boot.

lld(max-page-size=4096) + llvm-objcopy => good
lld(max-page-size=4096) + arm-linux-gnueabi-objcopy => good
lld(max-page-size=65536) + llvm-objcopy => bad
lld(max-page-size=65536) + arm-linux-gnueabi-objcopy => good

> ld.bfd with a 64k page size does not allocate headers, and also has a single giant PT_LOAD segment whereas LLD has many smaller ones.

The kernel linker script does not use PHDRS. Writer<ELFT>::createPhdrs creates PT_LOAD segments.
Unfortunately for every output section description like

  __ksymtab         : AT(ADDR(__ksymtab) - LOAD_OFFSET) { ... }

lmaExpr is present and we will start a new PT_LOAD

    //////// sameLMARegion is false
    bool sameLMARegion =
        load && !sec->lmaExpr && sec->lmaRegion == load->firstSec->lmaRegion;
    if (!(load && newFlags == flags && sec != relroEnd &&
          sec->memRegion == load->firstSec->memRegion &&
          (sameLMARegion || load->lastSec == Out::programHeaders))) {
      load = addHdr(PT_LOAD, newFlags);
      flags = newFlags;
    }

This can be tricky to fix (reduce the number of PT_LOAD).
We can't know for sure whether __ksymtab can reuse the previous program header,
because lmaExpr may be a more complex expression and may not follow the address of the previous output section.
We have to place program header creation into the finalizeAddressDependentContent() loop
if we are going to fix it.
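
To illustrate why every AT(...) starts a new segment, here is a heavily simplified Python model of the condition quoted above. It is an assumption-laden sketch, not LLD's code: it ignores memRegion, lmaRegion, relroEnd and the Out::programHeaders case, and all section names other than __ksymtab are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class OutSec(NamedTuple):
    name: str
    flags: str          # e.g. "R", "RW", "RX"
    has_lma_expr: bool  # True when the output section uses AT(...)


def count_pt_load(sections: Sequence[OutSec]) -> int:
    """Count PT_LOAD segments under the simplified rule."""
    loads = 0
    current_flags: Optional[str] = None
    for sec in sections:
        have_load = current_flags is not None
        # Simplified sameLMARegion: false whenever an lmaExpr is present.
        same_lma_region = have_load and not sec.has_lma_expr
        if not (have_load and sec.flags == current_flags and same_lma_region):
            loads += 1
            current_flags = sec.flags
    return loads


with_at = [OutSec("__ksymtab", "R", True),
           OutSec("hypothetical_b", "R", True),
           OutSec("hypothetical_c", "R", True)]
without_at = [s._replace(has_lma_expr=False) for s in with_at]

print(count_pt_load(with_at), count_pt_load(without_at))
```

Under this model, three read-only sections with identical flags share one PT_LOAD when no AT(...) is used, but each gets its own PT_LOAD once an lmaExpr is present, which is consistent with the long list of PF_R segments in comment 3.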

* James Henderson
> Looking at the comments here, this sounds like a duplicate of bug 42277, which I helped triage, and Jordan Rupprecht started trying to fix.

bug 42277 is unrelated. bug 42277 was fixed by https://reviews.llvm.org/D71035

The issue is that the arch/arm/boot/Image created by llvm-objcopy is too large.

% arm-linux-gnueabi-objcopy -O binary -R .comment -S vmlinux arch/arm/boot/Image
# size=13M
% llvm-objcopy -O binary -R .comment -S vmlinux arch/arm/boot/Image
# size=0xfffe0000-0xc0008000=0x3ffd8000 i.e. ~1GiB

% readelf -WS vmlinux
...
  [35] .text_itcm        PROGBITS        fffe0000 c50000 000000 00  WA  0   0  1
  [36] .data_dtcm        PROGBITS        fffe8000 c58000 000000 00  WA  0   0  1

Empty .text_itcm is expected to be skipped by -O binary. GNU objcopy guarantees this:

// binutils-gdb/bfd/binary.c

  static bfd_boolean
  binary_set_section_contents (bfd *abfd,
  			     asection *sec,
  			     const void * data,
  			     file_ptr offset,
  			     bfd_size_type size)
  {
    if (size == 0)
      return TRUE;

Honestly I don't like the special case, but we probably have to match its behavior.
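
The effect of that size == 0 special case on the output size can be sketched as follows. This is a toy model, under the assumption that a flat binary spans from the lowest to the highest load address of the sections that get written; it is not how either objcopy is implemented. The first and last entries are stand-ins taken from the LLD program header dump in comment 3, and only .text_itcm is included as the empty section so that the result lines up with the size arithmetic above.

```python
from typing import NamedTuple, Sequence


class Section(NamedTuple):
    name: str
    addr: int   # load address
    size: int   # bytes of file content


def binary_extent(sections: Sequence[Section], skip_empty: bool) -> int:
    """Span from the lowest start address to the highest end address."""
    used = [s for s in sections if s.size > 0 or not skip_empty]
    if not used:
        return 0
    low = min(s.addr for s in used)
    high = max(s.addr + s.size for s in used)
    return high - low


SECTIONS = [
    Section("first_loaded", 0xC0008000, 0x1DC),   # stand-in: program header #2
    Section(".text_itcm", 0xFFFE0000, 0),          # empty section from readelf
    Section("last_loaded", 0xC0C9C7A0, 0x7DA0),    # stand-in: program header #19
]

print(hex(binary_extent(SECTIONS, skip_empty=False)))
print(binary_extent(SECTIONS, skip_empty=True))
```

Counting the empty section stretches the image to 0x3ffd8000 bytes (1073577984, the size reported in comment 9), while skipping it gives 13223232 bytes, the size reported there for GNU objcopy.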

reviews.llvm.org complains that "the disk is full"

  % /usr/bin/arc diff 'HEAD^'
   Exception 
  [HTTP/500] Internal Server Error
  FilesystemException: Failed to create a temporary directory: the disk is full.
  (Run with `--trace` for a full exception trace.)

I'll post an llvm-objcopy patch when reviews.llvm.org gets restored.
Comment 13 Fangrui Song 2020-04-30 23:57:00 PDT
https://reviews.llvm.org/D79229
Comment 14 Peter Smith 2020-05-01 00:57:17 PDT
> This can be tricky to fix (reduce the number of PT_LOAD).
> We can't know for sure whether __ksymtab can reuse the previous program header,
> because lmaExpr may be a more complex expression and may not follow the address of the previous output section.
> We have to place program header creation into the finalizeAddressDependentContent()
> loop if we are going to fix it.

One possible approach that I've seen before is to do program header allocation in two passes. The first is a pessimistic maximum estimate of how many program headers are needed. This permits the maximum size of the program header table to be known. The contents of the program header table are only written after addresses are allocated so it is possible to write fewer program headers and zero out the remainder.
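
A rough Python sketch of the second pass of that idea, assuming the 32-byte little-endian Elf32_Phdr layout and that unused slots are left as all-zero entries (p_type 0 is PT_NULL). The function name, the slot count of 21 and the single example header are illustrative only; this is not a proposal for how LLD should implement it.

```python
import struct

PT_NULL, PT_LOAD = 0, 1
# Elf32_Phdr: p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align
PHDR = struct.Struct("<8I")


def write_phdr_table(reserved_slots: int, final_headers) -> bytes:
    """Emit the final headers, then zero-fill the slots reserved in pass one."""
    if len(final_headers) > reserved_slots:
        raise ValueError("pessimistic estimate was too small")
    table = b"".join(PHDR.pack(*h) for h in final_headers)
    table += bytes(PHDR.size) * (reserved_slots - len(final_headers))
    return table


# Pass one reserved 21 slots (hypothetical); after address assignment only
# one PT_LOAD turned out to be needed (values borrowed from the BFD dump).
final = [(PT_LOAD, 0, 0xC0000000, 0xC0000000, 0xBC4C28, 0xBC4C28, 0x7, 65536)]
table = write_phdr_table(21, final)
print(len(table), PHDR.unpack_from(table, PHDR.size)[0] == PT_NULL)
```

The table size is fixed by the first pass, so addresses assigned in between do not move when fewer headers are finally written.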

> % readelf -WS vmlinux
> ...
>  [35] .text_itcm        PROGBITS        fffe0000 c50000 000000 00  WA  0   0  1
>  [36] .data_dtcm        PROGBITS        fffe8000 c58000 000000 00  WA  0   0  1

What I don't understand right now is that those sections are also present in the 4k page size build, yet llvm-objcopy was able to handle that case without the extra padding.

It is possible that there is some other property that by coincidence works out ok.

I expect the itcm and dtcm sections to represent tightly coupled memory (TCM). This is essentially cache-speed memory, but it is managed by the programmer. I'm not sure how much Linux makes use of this (optional) CPU feature though.
Comment 15 Fangrui Song 2020-05-01 11:43:51 PDT
(In reply to Peter Smith from comment #14)
> > This can be tricky to fix (reduce the number of PT_LOAD).
> > We can't know for sure whether __ksymtab can reuse the previous program header,
> > because lmaExpr may be a more complex expression and may not follow the address of the previous output section.
> > We have to place program header creation into the finalizeAddressDependentContent()
> > loop if we are going to fix it.
> 
> One possible approach that I've seen before is to do program header
> allocation in two passes. The first is a pessimistic maximum estimate of how
> many program headers are needed. This permits the maximum size of the
> program header table to be known. The contents of the program header table
> are only written after addresses are allocated so it is possible to write
> fewer program headers and zero out the remainder.
> 
> > % readelf -WS vmlinux
> > ...
> >  [35] .text_itcm        PROGBITS        fffe0000 c50000 000000 00  WA  0   0  1
> >  [36] .data_dtcm        PROGBITS        fffe8000 c58000 000000 00  WA  0   0  1
> 
> What I don't understand right now is that those sections are also present in
> the 4k page size build, yet llvm-objcopy was able to handle that case without
> the extra padding.
> 
> It is possible that there is some other property that by coincidence works
> out ok.
> 
> I expect the itcm and dtcm sections to represent tightly coupled memory (TCM).
> This is essentially cache-speed memory, but it is managed by the programmer.
> I'm not sure how much Linux makes use of this (optional) CPU feature though.

Thanks for the comments. I investigated more.

ld.lld -z max-page-size=4096
  LOAD           0xc1e000 0xc0c24000 0xc0c24000 0x128ac 0x128ac RW  0x1000
  LOAD           0xc31000 0xc0c38000 0xc0c38000 0x647a0 0x647a0 RW  0x1000
  LOAD           0xc957a0 0xc0c9c7a0 0xc0c9c7a0 0x07db8 0x07db8 RW  0x1000

  [34] .init.data        PROGBITS        c0c24000 c1e000 0128ac 00  WA  0   0 4096
  ## Note sh_offset(.text_itcm) = sh_offset(.data_dtcm) = sh_offset(.data)
  ## llvm-objcopy thinks .text_itcm/.data_dtcm are included in a PT_LOAD segment.
  [35] .text_itcm        PROGBITS        fffe0000 c31000 000000 00  WA  0   0  1
  [36] .data_dtcm        PROGBITS        fffe8000 c31000 000000 00  WA  0   0  1
  [37] .data             PROGBITS        c0c38000 c31000 0647a0 00  WA  0   0 32

ld.lld -z max-page-size=65536 (0x10000)
  LOAD           0xc34000 0xc0c24000 0xc0c24000 0x128ac 0x128ac RW  0x10000
  LOAD           0xc58000 0xc0c38000 0xc0c38000 0x647a0 0x647a0 RW  0x10000
  LOAD           0xcbc7a0 0xc0c9c7a0 0xc0c9c7a0 0x07db8 0x07db8 RW  0x10000

  [34] .init.data        PROGBITS        c0c24000 c34000 0128ac 00  WA  0   0 4096
  ## Note sh_offset(.text_itcm) != sh_offset(.data_dtcm)
  ## This is because to make sh_offset%maxpagesize = sh_addr%maxpagesize
  ## lld advances the file offset from 0xc50000 to 0xc58000 for .data_dtcm
  ## llvm-objcopy thinks .text_itcm/.data_dtcm are NOT included in a PT_LOAD segment.
  [35] .text_itcm        PROGBITS        fffe0000 c50000 000000 00  WA  0   0  1
  [36] .data_dtcm        PROGBITS        fffe8000 c58000 000000 00  WA  0   0  1
  [37] .data             PROGBITS        c0c38000 c58000 0647a0 00  WA  0   0 32

arm-linux-gnueabi-ld -z max-page-size=65536
  [23] .init.data        PROGBITS        c0c12000 c22000 0128ac 00  WA  0   0 4096
  [24] .text_itcm        PROGBITS        fffe0000 ca2558 000000 00   W  0   0  1
  [25] .data_dtcm        PROGBITS        fffe8000 ca2558 000000 00   W  0   0  1
  [26] .data             PROGBITS        c0c26000 c36000 0647a0 00  WA  0   0 32

In lld/ELF/Writer.cpp, removeEmptyPTLoad() removes empty (p_memsz=0) PT_LOAD segments.
For ld.lld -z max-page-size=65536 (0x10000), sh_offset(.data_dtcm) does not actually need to be advanced
because .data_dtcm's containing PT_LOAD was removed.
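
The file offsets in the listings above can be reproduced with a small model of the sh_offset%maxpagesize = sh_addr%maxpagesize rule. This is a sketch of the arithmetic only, assuming each section is placed at the smallest offset at or after the previous section's end that is congruent to its address; the function name is made up and this is not LLD's implementation.

```python
def next_congruent_offset(offset: int, addr: int, page: int) -> int:
    """Smallest value >= offset with value % page == addr % page."""
    return offset + ((addr - offset) % page)


def layout(init_data_offset: int, page: int):
    """Offsets of .text_itcm, .data_dtcm and .data after .init.data."""
    end = init_data_offset + 0x128AC           # end of .init.data
    itcm = next_congruent_offset(end, 0xFFFE0000, page)
    dtcm = next_congruent_offset(itcm, 0xFFFE8000, page)   # both are empty
    data = next_congruent_offset(dtcm, 0xC0C38000, page)
    return itcm, dtcm, data


print([hex(o) for o in layout(0xC1E000, 0x1000)])
print([hex(o) for o in layout(0xC34000, 0x10000)])
```

With 4k pages all three sections land on 0xc31000, while with 64k pages .data_dtcm is pushed from 0xc50000 to 0xc58000, so .text_itcm no longer shares an offset with .data.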

Created https://reviews.llvm.org/D79254 to improve lld's file offset assignment.
For my vmlinux build, this change saves (162692352 - 162592184 = 100168) bytes.
Theoretically this can save at most 2*maxpagesize bytes.



The llvm-objcopy -O binary change still makes sense. I will tweak the description and submit. If linkers did not remove p_memsz=0 PT_LOAD segments, leading/trailing empty sections would have meaningful LMA, thus we would not need size>0 special cases in -O binary.
Comment 16 Fangrui Song 2020-05-01 16:32:16 PDT
Pushed the llvm-objcopy -O binary change https://reviews.llvm.org/D79229 ec786906f5feb4dceba1b5338927079e63e78095 (will be included in llvm 11.0.0)
Comment 17 Fangrui Song 2020-05-04 17:03:10 PDT
Pushed lld side change https://reviews.llvm.org/D79254 

> One possible approach that I've seen before is to do program header allocation in two passes. The first is a pessimistic maximum estimate of how many program headers are needed. This permits the maximum size of the program header table to be known. The contents of the program header table are only written after addresses are allocated so it is possible to write fewer program headers and zero out the remainder.

As a further follow-up, we can extend removeEmptyPTLoad to merge adjacent PT_LOAD segments.