LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 21573 - libxcb-1.11 miscompiles with Clang -O2 on x86_32/linux
Summary: libxcb-1.11 miscompiles with Clang -O2 on x86_32/linux
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: X86 (show other bugs)
Version: trunk
Hardware: PC Linux
: P normal
Assignee: Reid Kleckner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-14 09:47 PST by Joakim Gebart
Modified: 2014-11-14 19:45 PST (History)
3 users (show)

See Also:
Fixed By Commit(s):


Attachments
LLVM IR with -O1 (114.27 KB, text/plain)
2014-11-14 09:47 PST, Joakim Gebart
Details
LLVM IR with -O2 (178.68 KB, text/plain)
2014-11-14 09:48 PST, Joakim Gebart
Details
assembly output with -O1 (170.76 KB, text/plain)
2014-11-14 09:48 PST, Joakim Gebart
Details
assembly output with -O2 (232.67 KB, text/plain)
2014-11-14 09:49 PST, Joakim Gebart
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Joakim Gebart 2014-11-14 09:47:18 PST
The function get_peer_sock_name in src/xcb_auth.c of libxcb-1.11 generates broken assembly code on x86_32 when compiled with Clang/LLVM-3.5.0 using -O2 optimization level, this does not happen on -O1 or lower level, it also seems to work correctly on x86_64 even with -O2.

The C code first calls malloc(), then checks the returned pointer for NULL, finally calls the function getpeername() or getsockname() via a function pointer, using the malloc'ed pointer as second argument.

When compiled with -O2 the assembly code generated will (incorrectly) dereference the pointer returned by malloc() and pass the value pointed at as second argument to the getpeername() function call.

By coincidence, the uninitialized malloced area may contain a valid pointer which results in a corrupt heap since the getpeername function will write its result to an incorrect address.

C-code snippet below:

/* Return a dynamically allocated socket address structure according
   to the value returned by either getpeername() or getsockname()
   (according to POSIX, applications should not assume a particular
   length for `sockaddr_un.sun_path') */
static struct sockaddr *get_peer_sock_name(int (*socket_func)(int,
                                                              struct sockaddr *,
                                                              socklen_t *),
                                           int fd)
{
    socklen_t socknamelen = sizeof(struct sockaddr) + INITIAL_SOCKNAME_SLACK;
    socklen_t actual_socknamelen = socknamelen;
    struct sockaddr *sockname = malloc(socknamelen);

    if (sockname == NULL)
        return NULL;

    /* Both getpeername() and getsockname() truncates sockname if
       there is not enough space and set the required length in
       actual_socknamelen */
    if (socket_func(fd, sockname, &actual_socknamelen) == -1) // <======= This is where the incorrect dereference happens. /////////////
        goto sock_or_realloc_error;

    if (actual_socknamelen > socknamelen)
    {
        struct sockaddr *new_sockname = NULL;
        socknamelen = actual_socknamelen;

        if ((new_sockname = realloc(sockname, actual_socknamelen)) == NULL)
            goto sock_or_realloc_error;

        sockname = new_sockname;

        if (socket_func(fd, sockname, &actual_socknamelen) == -1 ||
            actual_socknamelen > socknamelen)
            goto sock_or_realloc_error;
    }

    return sockname;

 sock_or_realloc_error:
    free(sockname);
    return NULL;
}
Comment 1 Joakim Gebart 2014-11-14 09:47:59 PST
Created attachment 13345 [details]
LLVM IR with -O1
Comment 2 Joakim Gebart 2014-11-14 09:48:19 PST
Created attachment 13346 [details]
LLVM IR with -O2
Comment 3 Joakim Gebart 2014-11-14 09:48:41 PST
Created attachment 13347 [details]
assembly output with -O1
Comment 4 Joakim Gebart 2014-11-14 09:49:00 PST
Created attachment 13348 [details]
assembly output with -O2
Comment 5 Joakim Gebart 2014-11-14 09:57:10 PST
(In reply to comment #4)
> Created attachment 13348 [details]
> assembly output with -O2

See line 88 in the attached file (%esi contains the pointer returned by malloc):

87	movl	%edi, 8(%esp)
88	movl	(%esi), %eax
89	movl	%eax, 4(%esp)
90	movl	%ebx, (%esp)
91	calll	getpeername
92	cmpl	$-1, %eax

With O1 (line 370 of attached dump, %ebp contains the pointer returned by malloc):

367	movl	%eax, 8(%esp)
.Ltmp54:
369	#DEBUG_VALUE: get_peer_sock_name:actual_socknamelen <- undef
370	movl	%ebp, 4(%esp)
371	movl	%edi, (%esp)
372	calll	*%esi
373	cmpl	$-1, %eax
Comment 6 Reid Kleckner 2014-11-14 12:19:08 PST
The -O1 and -O2 IR are correct. The miscompilation must be happening somewhere in the backend.
Comment 7 Reid Kleckner 2014-11-14 12:51:46 PST
(In reply to comment #6)
> The -O1 and -O2 IR are correct. The miscompilation must be happening
> somewhere in the backend.

Actually, you can see the problem in the -O1 IR. The original program appears to be doing something like this:

union __SOCKADDR_ARG { void *p; };
int getpeername(int, union __SOCKADDR_ARG, int*);
typedef int (*socket_func_t)(int, void *, int*);
int _xcb_get_auth_info(int fd, void *info, int display) {
  socket_func_t socket_function_ptr = (socket_func_t)&getpeername;
  socket_function_ptr(0, NULL, NULL);
}

Basically, a function that takes a 4-byte union by value is being casted to a function that takes a pointer instead of a union and then called with that prototype. I think that might be ill-formed C, so you might be out of luck.

The way that this goes awry in LLVM is that we have an attribute mismatch between the call and the function declaration. The declaration of getpeername has the byval attribute while the call does not:

declare i32 @getpeername(i32, %union.__SOCKADDR_ARG* byval align 4, i32*) #0

  %7 = call i32 @getpeername(i32 %fd, %union.__SOCKADDR_ARG* %6, i32* %actual_socknamelen.i) #2

CodeGen is "noticing" the byval on the declaration of getpeername that is missing on the call of getpeername, and is dereferencing %6 before performing the call. IMO CodeGen should *not* pull attributes off of the call target if it just happens to be direct.

More reduced:

struct PtrWrapper { void *p; };
void g(struct PtrWrapper p);
void f(void *p) {
  ((void (*)(void*))g)(p);
}

LLVM will load p when using before calling g. It appears that the SDag lowering path will only do this when optimizations are enabled.
Comment 8 Reid Kleckner 2014-11-14 19:45:34 PST
I went and looked at sys/socket.h, and it uses transparent unions. After reading the gcc docs (https://gcc.gnu.org/onlinedocs/gcc/Type-Attributes.html#index-g_t_0040code_007btransparent_005funion_007d-attribute-3157), we decided this was a bug in clang. transparent union changes how the value is passed.

Turns out, we had a test case with a FIXME already documenting the problem. I fixed it in r222074.