LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 26761 - clang 3.8.0 messes up __builtin_dwarf_cfa (), at least for TARGET_ARCH=powerpc and powerpc64 (gcc/g++ mismatch)
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Common Code Generator Code
Version: trunk
Hardware: Other FreeBSD
Importance: P normal
Assignee: Unassigned LLVM Bugs
Blocks: 25780
Reported: 2016-02-27 17:44 PST by Mark Millard
Modified: 2016-11-28 19:27 PST (History)

Description Mark Millard 2016-02-27 17:44:45 PST
When run in a TARGET_ARCH=powerpc environment produced by a buildworld that used clang 3.8.0 from FreeBSD's projects/clang380-import source, the following 8-line program ends in a SEGV. Before that happens, it ignores the catch clause and calls std::terminate.

#include <exception>

int main(void)
{
    try { throw std::exception(); }
    catch (std::exception& e) {} // same result without &
    return 0;
}

(The above is a simplification of the original discovery context. The actual problem code is not in the above source but in supporting FreeBSD library code when compiled via clang 3.8.0.)
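A minimal sketch of a variant that makes the outcome observable from code instead of from a crash. The names throw_is_caught and check_exception_handling are hypothetical and not part of the original report; on an affected build the handler is never reached and std::terminate runs instead, so the function would not return at all.

```cpp
#include <cassert>
#include <exception>

// Returns true only if the catch clause actually receives the exception.
// On an affected build the unwinder never reaches the handler and
// std::terminate is called instead, so this function does not return.
bool throw_is_caught() {
    try {
        throw std::exception();
    } catch (const std::exception&) {
        return true;
    }
    return false;  // not reached: the throw above always leaves the try block
}

// Hypothetical driver for a quick manual check on the target system.
int check_exception_handling() {
    assert(throw_is_caught());
    return 0;
}
```

Calling check_exception_handling() from main() on a working toolchain returns 0; any other outcome points at the unwinding path rather than at the test program.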

I've tracked down the problem to misbehavior of clang 3.8.0 code generation for __builtin_dwarf_cfa () as used in:

#define uw_init_context(CONTEXT)                                           \
  do                                                                       \
    {                                                                      \
      /* Do any necessary initialization to access arbitrary stack frames. \
         On the SPARC, this means flushing the register windows.  */       \
      __builtin_unwind_init ();                                            \
      uw_init_context_1 (CONTEXT, __builtin_dwarf_cfa (),                  \
                         __builtin_return_address (0));                    \
    }                                                                      \
  while (0)
. . .
_Unwind_Reason_Code
_Unwind_RaiseException(struct _Unwind_Exception *exc)
{
  struct _Unwind_Context this_context, cur_context;
  _Unwind_Reason_Code code;

  /* Set up this_context to describe the current stack frame.  */
  uw_init_context (&this_context);

In the below r4 ends up with the __builtin_dwarf_cfa () value supplied to uw_init_context_1:

Dump of assembler code for function _Unwind_RaiseException:
   0x419a8fd8 <+0>:	mflr    r0
   0x419a8fdc <+4>:	stw     r31,-148(r1)
   0x419a8fe0 <+8>:	stw     r30,-152(r1)
   0x419a8fe4 <+12>:	stw     r0,4(r1)
   0x419a8fe8 <+16>:	stwu    r1,-2992(r1)
   0x419a8fec <+20>:	mr      r31,r1
. . .
   0x419a9094 <+188>:	mr      r4,r31
   0x419a9098 <+192>:	mflr    r30
   0x419a909c <+196>:	lwz     r5,2996(r31)
   0x419a90a0 <+200>:	mr      r3,r28
   0x419a90a4 <+204>:	bl      0x419a929c <uw_init_context_1>

That r4 ends up holding the stack pointer value from after it has been decremented. So r4 is not pointing at the boundary with the caller's frame.

The .eh_frame information and the unwind code are set up for a cfa that points at the boundary with the caller's frame. So cfa-relative addressing extracts the wrong data.
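The arithmetic behind this can be sketched as follows. This is only a model built from the two constants visible in the 32-bit disassembly (the 2992-byte frame and the r31 save at -148 from the entry stack pointer); the function names and the sample stack pointer value are hypothetical, and it assumes the .eh_frame data records r31 at cfa-148, matching the stw.

```cpp
#include <cstdint>

// Values read off the 32-bit disassembly of _Unwind_RaiseException.
constexpr std::int32_t kFrameSize = 2992;      // stwu r1,-2992(r1)
constexpr std::int32_t kR31SaveOffset = -148;  // stw r31,-148(r1), issued before the stwu

// cfa as the .eh_frame data expects it: r1 on entry, i.e. before the stwu.
constexpr std::uint32_t expected_cfa(std::uint32_t sp_after_prologue) {
    return sp_after_prologue + static_cast<std::uint32_t>(kFrameSize);
}

// What clang 3.8.0 passes in r4: a copy of the already decremented r1.
constexpr std::uint32_t clang380_cfa(std::uint32_t sp_after_prologue) {
    return sp_after_prologue;
}

// Address an unwinder would read to recover r31, given some cfa value.
// Unsigned arithmetic wraps, so adding the converted negative offset subtracts.
constexpr std::uint32_t r31_slot(std::uint32_t cfa) {
    return cfa + static_cast<std::uint32_t>(kR31SaveOffset);
}
```

With a hypothetical decremented stack pointer of 0xffffd000, expected_cfa gives 0xffffdbb0, and the two r31_slot results differ by the full 2992-byte frame size, which is why the unwinder reads unrelated stack contents.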

Contrast this with some other compiler's TARGET_ARCH=powerpc64 code (for FreeBSD's projects/clang380-import's source code again) where r4 is made to be at the boundary with the caller's frame:

Dump of assembler code for function _Unwind_RaiseException:
   0x00000000501cb810 <+0>:	mflr    r0
   0x00000000501cb814 <+4>:	stdu    r1,-5648(r1)
. . .
   0x00000000501cb8d0 <+192>:	addi    r4,r1,5648
   0x00000000501cb8d4 <+196>:	stw     r12,5656(r1)
   0x00000000501cb8d8 <+200>:	mr      r28,r3
   0x00000000501cb8dc <+204>:	addi    r31,r1,2544
   0x00000000501cb8e0 <+208>:	mr      r3,r27
   0x00000000501cb8e4 <+212>:	addi    r29,r1,112
   0x00000000501cb8e8 <+216>:	bl      0x501cae60 <uw_init_context_1>

(clang 3.8.0 is unable to complete a buildworld for FreeBSD last I checked. Thus my use of another compiler.)

NOTE: The powerpc (32-bit) issue may in some way be associated with the clang 3.8.0 FreeBSD powerpc ABI violation in how it handles the stack pointer: TARGET_ARCH=powerpc builds are currently using a "red zone" in the stack, decrementing the stack pointer late, and incrementing the stack pointer early compared to the FreeBSD ABI rules. (This is similar to the official FreeBSD ABI for TARGET_ARCH=powerpc64.)
Comment 1 Mark Millard 2016-02-27 18:19:55 PST
(In reply to comment #0)

Here is a two-line self-contained program that shows the problem when the .o is examined with objdump.

I provide comparisons with .o's from g++49 or g++5.

# more builtin_dwarf_cfa.cpp 
extern void g(void*);
void f() { g(__builtin_dwarf_cfa()); }

In a TARGET_ARCH=powerpc64 context:

# clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd


Disassembly of section .text:
0000000000000000 <._Z1fv> mflr    r0
0000000000000004 <._Z1fv+0x4> std     r31,-8(r1)
0000000000000008 <._Z1fv+0x8> std     r0,16(r1)
000000000000000c <._Z1fv+0xc> stdu    r1,-128(r1)
0000000000000010 <._Z1fv+0x10> mr      r31,r1
0000000000000014 <._Z1fv+0x14> mr      r3,r31
0000000000000018 <._Z1fv+0x18> bl      0000000000000018 <._Z1fv+0x18>
000000000000001c <._Z1fv+0x1c> nop
0000000000000020 <._Z1fv+0x20> addi    r1,r1,128
0000000000000024 <._Z1fv+0x24> ld      r0,16(r1)
0000000000000028 <._Z1fv+0x28> ld      r31,-8(r1)
000000000000002c <._Z1fv+0x2c> mtlr    r0
0000000000000030 <._Z1fv+0x30> blr
        ...

r3 does not point to the boundary with the caller's stack frame.

By contrast for g++49:

# g++49 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o | more

builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd


Disassembly of section .text:
0000000000000000 <._Z1fv> mflr    r0
0000000000000004 <._Z1fv+0x4> std     r0,16(r1)
0000000000000008 <._Z1fv+0x8> std     r31,-8(r1)
000000000000000c <._Z1fv+0xc> stdu    r1,-128(r1)
0000000000000010 <._Z1fv+0x10> mr      r31,r1
0000000000000014 <._Z1fv+0x14> addi    r9,r31,128
0000000000000018 <._Z1fv+0x18> mr      r3,r9
000000000000001c <._Z1fv+0x1c> bl      000000000000001c <._Z1fv+0x1c>
0000000000000020 <._Z1fv+0x20> nop
0000000000000024 <._Z1fv+0x24> addi    r1,r31,128
0000000000000028 <._Z1fv+0x28> ld      r0,16(r1)
000000000000002c <._Z1fv+0x2c> mtlr    r0
0000000000000030 <._Z1fv+0x30> ld      r31,-8(r1)
0000000000000034 <._Z1fv+0x34> blr
0000000000000038 <._Z1fv+0x38> .long 0x0
000000000000003c <._Z1fv+0x3c> .long 0x90001
0000000000000040 <._Z1fv+0x40> lwz     r0,1(r1)

r3 does point to the boundary with the caller's stack frame.

For TARGET_ARCH=powerpc, clang 3.8.0 first:

# clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd


Disassembly of section .text:
00000000 <_Z1fv> mflr    r0
00000004 <_Z1fv+0x4> stw     r31,-4(r1)
00000008 <_Z1fv+0x8> stw     r0,4(r1)
0000000c <_Z1fv+0xc> stwu    r1,-16(r1)
00000010 <_Z1fv+0x10> mr      r31,r1
00000014 <_Z1fv+0x14> mr      r3,r31
00000018 <_Z1fv+0x18> bl      00000018 <_Z1fv+0x18>
0000001c <_Z1fv+0x1c> addi    r1,r1,16
00000020 <_Z1fv+0x20> lwz     r0,4(r1)
00000024 <_Z1fv+0x24> lwz     r31,-4(r1)
00000028 <_Z1fv+0x28> mtlr    r0
0000002c <_Z1fv+0x2c> blr

Then g++5 (5.3):

# g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd


Disassembly of section .text:
00000000 <_Z1fv> stwu    r1,-16(r1)
00000004 <_Z1fv+0x4> mflr    r0
00000008 <_Z1fv+0x8> stw     r0,20(r1)
0000000c <_Z1fv+0xc> stw     r31,12(r1)
00000010 <_Z1fv+0x10> mr      r31,r1
00000014 <_Z1fv+0x14> addi    r9,r31,16
00000018 <_Z1fv+0x18> mr      r3,r9
0000001c <_Z1fv+0x1c> bl      0000001c <_Z1fv+0x1c>
00000020 <_Z1fv+0x20> nop
00000024 <_Z1fv+0x24> addi    r11,r31,16
00000028 <_Z1fv+0x28> lwz     r0,4(r11)
0000002c <_Z1fv+0x2c> mtlr    r0
00000030 <_Z1fv+0x30> lwz     r31,-4(r11)
00000034 <_Z1fv+0x34> mr      r1,r11
00000038 <_Z1fv+0x38> blr
Comment 2 Mark Millard 2016-02-27 19:04:21 PST
(In reply to comment #1)

I should have been explicit:

The stack frame boundary that I reference in the 2-line examples is between:

A) f's frame
and
B) f's caller's frame

(Not between f vs. g.)

(The external g function just avoided any potential optimization that might eliminate the code I was trying to produce.)


(B) is rather implicit as I wrote comment #1. It could lead to confusion. Thus this note.
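A rough sketch of a runtime variant of the two-line example, for checking the boundary without reading disassembly. It assumes a GCC- or Clang-compatible compiler that provides the __builtin_dwarf_cfa and __builtin_frame_address builtins; FrameProbe, probe_frame, and report_frame are hypothetical names. The distance printed between the two values is target- and compiler-specific, so no particular number is claimed here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

struct FrameProbe {
    std::uintptr_t cfa;            // value of __builtin_dwarf_cfa()
    std::uintptr_t frame_address;  // value of __builtin_frame_address(0)
};

// noinline so that the function keeps a frame of its own, like f in the
// two-line example.
__attribute__((noinline)) FrameProbe probe_frame() {
    FrameProbe p;
    p.cfa = reinterpret_cast<std::uintptr_t>(__builtin_dwarf_cfa());
    p.frame_address = reinterpret_cast<std::uintptr_t>(__builtin_frame_address(0));
    return p;
}

// Hypothetical helper: prints both values and their signed distance.
void report_frame() {
    const FrameProbe p = probe_frame();
    assert(p.cfa != 0);
    std::printf("cfa=%#llx frame_address=%#llx delta=%lld\n",
                static_cast<unsigned long long>(p.cfa),
                static_cast<unsigned long long>(p.frame_address),
                static_cast<long long>(p.cfa) - static_cast<long long>(p.frame_address));
}
```

Building the same file with both compilers on one target and comparing the printed delta gives the same information as the objdump comparison: the compiler whose cfa sits on the boundary between probe_frame's frame and its caller's frame is the one that matches the .eh_frame convention.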
Comment 3 Mark Millard 2016-02-27 19:13:52 PST
Looks like arm has the same sort of distinction vs. g++:

# clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf32-littlearm


Disassembly of section .text:
00000000 <_Z1fv> push	{fp, lr}
00000004 <_Z1fv+0x4> mov	fp, sp
00000008 <_Z1fv+0x8> mov	r0, fp
0000000c <_Z1fv+0xc> bl	00000000 <_Z1gPv>
00000010 <_Z1fv+0x10> pop	{fp, pc}
# g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
# /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o

builtin_dwarf_cfa.o:     file format elf32-littlearm


Disassembly of section .text:
00000000 <_Z1fv> push	{fp, lr}
00000004 <_Z1fv+0x4> add	fp, sp, #4, 0
00000008 <_Z1fv+0x8> add	r3, fp, #4, 0
0000000c <_Z1fv+0xc> mov	r0, r3
00000010 <_Z1fv+0x10> bl	00000000 <_Z1gPv>
00000014 <_Z1fv+0x14> nop			; (mov r0, r0)
00000018 <_Z1fv+0x18> pop	{fp, pc}
Comment 4 Mark Millard 2016-02-28 23:00:16 PST
Here is what the "ABI for the ARM 32-bit Architecture" "DWARF for the ARM Architecture" document says about the CFA:

3.4 Canonical Frame Address

The term Canonical Frame Address (CFA) is defined in [GDWARF], §6.4, Call Frame Information. This ABI adopts the typical definition of CFA given there.
 The CFA is the value of the stack pointer (r13) at the call site in the previous frame.


This, with the armv6 code I've shown via "objdump -d", indicates that for armv6 the value returned by clang++'s __builtin_dwarf_cfa() is not the CFA that the official ARM ABI defines. It also indicates that what g++ returns does match the official ARM ABI.
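The two armv6 listings from comment 3 can be checked against this definition with a little arithmetic. The sketch below only models the instructions shown there; the function names and the sample stack pointer value are hypothetical.

```cpp
#include <cstdint>

constexpr std::uint32_t kPushBytes = 8;  // push {fp, lr} stores two 4-byte registers

// sp after the push, given the sp at the call site (the CFA per the quoted text).
constexpr std::uint32_t sp_after_push(std::uint32_t cfa) {
    return cfa - kPushBytes;
}

// clang++ listing: mov fp, sp ; mov r0, fp
constexpr std::uint32_t clang_r0(std::uint32_t cfa) {
    return sp_after_push(cfa);
}

// g++ listing: add fp, sp, #4 ; add r3, fp, #4 ; mov r0, r3
constexpr std::uint32_t gxx_r0(std::uint32_t cfa) {
    return (sp_after_push(cfa) + 4u) + 4u;
}
```

For any starting value, gxx_r0 returns the call-site stack pointer unchanged, while clang_r0 returns a value 8 bytes lower, i.e. the low-address side of f's two saved registers.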
Comment 5 Mark Millard 2016-03-03 04:13:03 PST
I do not claim that the following is the proper, global, __builtin_dwarf_cfa () fix given the history of it being a gcc/g++ mismatch since clang 2.7 or so when it was added to clang. But this work around on a powerpc FreeBSD box has let me investigate later issues in the C++ exception handling while using FreeBSD's libgcc_s. Thanks go to Roman Divacky for finding what I needed to look at in clang/llvm for this.

For case Intrinsic::eh_dwarf_cfa in SelectionDAGBuilder::visitIntrinsicCall . . .

# svnlite diff contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
Index: contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
===================================================================
--- contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	(revision 296011)
+++ contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp	(working copy)
@@ -4618,7 +4618,7 @@
                                  CfaArg);
     SDValue FA = DAG.getNode(
         ISD::FRAMEADDR, sdl, TLI.getPointerTy(DAG.getDataLayout()),
-        DAG.getConstant(0, sdl, TLI.getPointerTy(DAG.getDataLayout())));
+        DAG.getConstant(1, sdl, TLI.getPointerTy(DAG.getDataLayout())));
     setValue(&I, DAG.getNode(ISD::ADD, sdl, FA.getValueType(),
                              FA, Offset));
     return nullptr;


In other words: use Frame Depth 1 instead of Frame Depth 0. That yields the frame/stack boundary between the frame for the routine using __builtin_dwarf_cfa () (_Unwind_RaiseException here) and the frame for its caller (throw code), matching what gcc/g++ does when it is used to compile that same code.
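A toy model of why depth 1 gives the wanted boundary on powerpc. This is not LLVM code: it assumes only that the word at the current frame pointer is the back chain written by stwu (the caller's stack pointer), and that asking for depth 1 means following that back chain once. Word, kStack, load_word, and frame_address are hypothetical names, and the addresses are made up; the 16-byte frame matches the two-line powerpc example.

```cpp
#include <cstddef>
#include <cstdint>

// One 32-bit word of simulated stack memory.
struct Word {
    std::uint32_t address;
    std::uint32_t value;
};

// Hypothetical stack contents: each frame's lowest word holds the back chain,
// i.e. the caller's stack pointer, as stored by stwu r1,-16(r1).
constexpr Word kStack[] = {
    {0x7000u, 0x7010u},  // f's frame: back chain -> caller's frame
    {0x7010u, 0x7050u},  // caller's frame: back chain -> its own caller
};
constexpr std::size_t kStackWords = sizeof(kStack) / sizeof(kStack[0]);

// Looks up the word stored at the given address (0 if the address is unknown).
constexpr std::uint32_t load_word(std::uint32_t address, std::size_t i = 0) {
    return i == kStackWords ? 0u
         : kStack[i].address == address ? kStack[i].value
         : load_word(address, i + 1);
}

// Depth 0 is the current frame pointer; each extra level follows one back chain.
constexpr std::uint32_t frame_address(std::uint32_t frame_pointer, unsigned depth) {
    return depth == 0 ? frame_pointer
                      : frame_address(load_word(frame_pointer), depth - 1);
}
```

In this model frame_address(0x7000, 0) is f's own decremented stack pointer, while frame_address(0x7000, 1) is 0x7010, the stack pointer f was entered with, which is the value the gcc/g++ code passes.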


For TARGET_ARCH=powerpc (and likely powerpc64?) this allowed getting much farther into the exception handling for some types of contexts. [__builtin_dwarf_cfa () is not the only issue overall.]

FreeBSD has not had clang for TARGET_ARCH=powerpc (or powerpc64) yet and so the clang history does not matter so much for it for those architectures but matching gcc/g++ helps, allowing a mix of clang and gcc use overall. (Actually some users of lang/clang*'s in ports might well notice the difference.)

It is less clear to me what is appropriate for TARGET_ARCH's that have been using clang for buildworld in FreeBSD land for some time --or for outside FreeBSD contexts: Frame Depth 0 use here has been around a long time.

Since other things use the lower-level code interface involved, the above avoids changing the "API" results for the other uses by changing the calling code instead. (I've not checked if any of the other uses of the lower-level code have off-by-one problems compared to gcc/g++ as well.)
Comment 6 Mark Millard 2016-03-06 14:48:21 PST
Adjusting the example source shows that the __builtin_dwarf_cfa() result depends on where it is used:

# more builtin_dwarf_cfa.cpp
#include <stdlib.h>

extern void g(void*);
void f0() { g(__builtin_dwarf_cfa()); }
void f1()
{ auto f1_cfa = __builtin_dwarf_cfa(); g(f1_cfa); }

f0 and f1 pass g different offsets from the frame pointer. See below for a TARGET_ARCH=powerpc example. g++ behaves like f1 for both f1 and f0.

# clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp

results in:

Disassembly of section .text:
00000000 <_Z2f0v> mflr    r0
00000004 <_Z2f0v+0x4> stw     r31,-4(r1)
00000008 <_Z2f0v+0x8> stw     r0,4(r1)
0000000c <_Z2f0v+0xc> stwu    r1,-16(r1)
00000010 <_Z2f0v+0x10> mr      r31,r1
00000014 <_Z2f0v+0x14> mr      r3,r31
00000018 <_Z2f0v+0x18> lwz     r3,0(r3)
0000001c <_Z2f0v+0x1c> bl      0000001c <_Z2f0v+0x1c>
00000020 <_Z2f0v+0x20> addi    r1,r1,16
00000024 <_Z2f0v+0x24> lwz     r0,4(r1)
00000028 <_Z2f0v+0x28> lwz     r31,-4(r1)
0000002c <_Z2f0v+0x2c> mtlr    r0
00000030 <_Z2f0v+0x30> blr
00000034 <_Z2f1v> mflr    r0
00000038 <_Z2f1v+0x4> stw     r31,-4(r1)
0000003c <_Z2f1v+0x8> stw     r0,4(r1)
00000040 <_Z2f1v+0xc> stwu    r1,-16(r1)
00000044 <_Z2f1v+0x10> mr      r31,r1
00000048 <_Z2f1v+0x14> mr      r3,r31
0000004c <_Z2f1v+0x18> lwz     r3,0(r3)
00000050 <_Z2f1v+0x1c> stw     r3,8(r31)
00000054 <_Z2f1v+0x20> bl      00000054 <_Z2f1v+0x20>
00000058 <_Z2f1v+0x24> addi    r1,r1,16
0000005c <_Z2f1v+0x28> lwz     r0,4(r1)
00000060 <_Z2f1v+0x2c> lwz     r31,-4(r1)
00000064 <_Z2f1v+0x30> mtlr    r0
00000068 <_Z2f1v+0x34> blr
Comment 7 Mark Millard 2016-03-06 15:13:07 PST
(In reply to comment #6)

Ignore comment 6 (I wish I could just delete it to avoid creating confusion).

> Adjusting the example source shows that the __builtin_dwarf_cfa() result
> depends on where it is used:
> . . .

WRONG!

I misread where an offset was used and was using a clang++ 3.8.0 with a local workaround in it as well.

Not one of my better days.
Comment 8 Mark Millard 2016-03-09 02:58:17 PST
Another way of seeing which boundary of a frame (low memory address side vs. high memory address side) the cfa refers to is the sign of the offsets in the DW_CFA_offset entries that apply after the stack pointer has been adjusted. A powerpc example (from dwarfdump -v -v -F) is:

DW_CFA_offset r28 -160  (40 * -4)

Negative offsets are for the cfa having a high-address boundary value.

Positive offsets would be for the cfa having a low-address boundary value.

So the .eh_frame information for powerpc (and powerpc64) indicates that the high-address side is supposed to be used for the cfa. This is also how the official documents read: the stack pointer value on entry before the adjustment for the local frame. (Armv6/armv7 also get negative offsets. Likely others do as well.)
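A small worked example of the dwarfdump line above. DW_CFA_offset stores a factored offset that is multiplied by the CIE's data alignment factor, which is where the "(40 * -4)" in the output comes from. The function names and the sample cfa value are hypothetical; only the 40, the -4, and the resulting -160 come from the dump.

```cpp
#include <cstdint>

// Byte offset from the cfa for a DW_CFA_offset entry:
// factored offset times the CIE's data alignment factor.
constexpr std::int64_t cfa_relative_offset(std::uint64_t factored_offset,
                                           std::int64_t data_alignment_factor) {
    return static_cast<std::int64_t>(factored_offset) * data_alignment_factor;
}

// Address of the register's save slot for a given cfa.
// Unsigned arithmetic wraps, so a converted negative offset subtracts.
constexpr std::uint64_t save_slot(std::uint64_t cfa, std::int64_t offset) {
    return cfa + static_cast<std::uint64_t>(offset);
}

// Negative offsets mean the saved registers sit below the cfa, so the cfa is
// the high-address boundary of the frame.
constexpr bool cfa_is_high_address_side(std::int64_t offset) {
    return offset < 0;
}
```

For the r28 entry this gives an offset of -160, so with a hypothetical cfa of 0x1000 the save slot is at 0xf60, below the cfa as expected for a high-address boundary.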

PPCTargetLowering::LowerFRAMEADDR for depth zero returns the lower address side, in part because PPCTargetLowering::LowerRETURNADDR uses PPCTargetLowering::LowerFRAMEADDR based on a numbering where that would be the zero position: as stands both count Frame Depth the same way.

But PPCTargetLowering::LowerRETURNADDR works correctly as is: it would have to be adjusted if the PPCTargetLowering::LowerFRAMEADDR Frame Depth numbering was changed.

Thus it appears that "case Intrinsic::eh_dwarf_cfa" in SelectionDAGBuilder::visitIntrinsicCall should convert from the ISD::FRAMEADDR "low-address side" view to the cfa "high-address side" view in its ISD::FRAMEADDR use, by requesting the "low-address side" of the "Depth 1 Frame" (i.e., the prior Frame Pointer register value): ISD::FRAMEADDR returns the "low-address side" for the given depth.
Comment 9 Hal Finkel 2016-08-30 11:13:15 PDT
Patch posted for review: https://reviews.llvm.org/D24038
Comment 10 Hal Finkel 2016-09-01 05:38:48 PDT
(In reply to comment #9)
> Patch posted for review: https://reviews.llvm.org/D24038

r280350. Also, PR30231 filed to track the potential issue on ARM.
Comment 11 Mark Millard 2016-09-10 15:53:21 PDT
(In reply to comment #10)
> (In reply to comment #9)
> > Patch posted for review: https://reviews.llvm.org/D24038
> 
> r280350. Also, PR30231 filed to track the potential issue on ARM.

Thanks Hal.

Dimitry Andric (dim at FreeBSD.org) has written:

> I merged the upstream fix to projects/clang390-import:
> 
> https://svnweb.freebsd.org/changeset/base/305683

So FreeBSD stable/12 will be adopting your changes.



As for my activity:

I'll not have access to powerpc64s/powerpcs for a few weeks yet.
Comment 12 Mark Millard 2016-09-10 16:02:24 PDT
(In reply to comment #11)
> So FreeBSD stable/12 will be adopting your changes.

That should have been head (current) for FreeBSD 12.
Comment 13 Mark Millard 2016-11-28 19:27:57 PST
(In reply to comment #12)
> (In reply to comment #11)
> > So FreeBSD stable/12 will be adopting your changes.
> 
> That should have been head (current) for FreeBSD 12.

powerpc64 notes (and only ppc64 for now). . .

With my amd64 -> TARGET_ARCH=powerpc64 buildworld and the FreeBSD
clang 3.9.0 that it includes, the simple 2-line example works fine
(verified by code inspection of the .o file).

Thanks!

But there are other problems that still prevent the following from
working overall. Yet 26761's issue is fixed.

#include <exception>

int main(void)
{
    try { throw std::exception(); }
    catch (std::exception& e) {} // same result without &
    return 0;
}

An inspection of the code produced in gdb shows:

   0x00000000501c72bc <+0>:	mflr    r0
   0x00000000501c72c0 <+4>:	mfcr    r12
   0x00000000501c72c4 <+8>:	std     r31,-152(r1)
   0x00000000501c72c8 <+12>:	std     r0,16(r1)
   0x00000000501c72cc <+16>:	stw     r12,8(r1)
   0x00000000501c72d0 <+20>:	stdu    r1,-5840(r1)
   0x00000000501c72d4 <+24>:	mr      r31,r1
. . .
   0x00000000501c7394 <+216>:	addi    r4,r31,5840
. . .
   0x00000000501c7414 <+344>:	bl      0x501c76dc <uw_init_context_1>

So that much is now correct (matching 26761's issue).

But overall it gets:

Program terminated with signal SIGABRT, Aborted.
#0  0x00000000502f8868 in .__sys_thr_kill () from /lib/libc.so.7
(gdb) bt
#0  0x00000000502f8868 in .__sys_thr_kill () from /lib/libc.so.7
#1  0x00000000502f8818 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2  0x00000000502f8748 in abort () at /usr/src/lib/libc/stdlib/abort.c:65
#3  0x00000000501c9cbc in _Unwind_GetGR (context=<optimized out>, index=65) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:180
#4  uw_update_context_1 (context=<optimized out>, fs=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:1353
#5  0x00000000501c78b0 in uw_init_context_1 (context=0xffffffffffffd1e0, outer_cfa=0xffffffffffffd940, outer_ra=0x50179ea0 <__cxa_throw(void*, std::type_info*, void (*)(void*))+248>)
    at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:1442
#6  0x00000000501c7418 in _Unwind_RaiseException (exc=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:92
#7  0x0000000050179ea0 in throw_exception (ex=<optimized out>) at /usr/src/lib/libcxxrt/../../contrib/libcxxrt/exception.cc:774
#8  __cxa_throw (thrown_exception=<optimized out>, tinfo=<optimized out>, dest=<optimized out>) at /usr/src/lib/libcxxrt/../../contrib/libcxxrt/exception.cc:801
#9  0x0000000010000cf0 in .main ()

because of other issues with C++ exception handling.

Note: That gdb can do the bt now is a big improvement for powerpc64 as
I remember. More is definitely working than when I reported 26761.

Again: Thanks!