Originally filed at https://bugzilla.opensuse.org/show_bug.cgi?id=1067478

While working on making openSUSE Linux package builds reproducible, I found that our gnustep-libobjc2 differed across builds because libobjc2-1.8.1 contains an arc.m that produces 8 different .o files when compiled with clang-4.0.1, unless ASLR is disabled.

From that I condensed this standalone minimal reproducer:

cat > test.m <<EOF
typedef id (*IMP)(id, ...);
static Class c1;
void f1(void) {
    @selector(f2);
    IMP v1 = @selector(f3);
}
EOF
for i in $(seq 15) ; do
    clang -c test.m -o /dev/stdout 2>/dev/null | md5sum
done | sort | uniq -c | wc -l

Actual Result: 2
Expected Result: 1

Same result with clang-5.0.0
Any chance you can narrow down/bisect to a single commit?
(and, FWIW, check whether clang trunk (top of tree) still has the problem :)
A bisect would need a version without this problem; 3.8.0 also has it.
I'll leave testing the trunk to someone else... but I did

for i in $(seq 5) ; do
    echo 10000 > /proc/sys/kernel/ns_last_pid ;
    ltrace -f -o /tmp/clang.ltrace.$i \
        clang-5.0.0 -c test.m -o /dev/stdout 2>/dev/null | md5sum ;
done
perl -i -pe 's/0x[0-9a-f]{7,16}/0x55501234/g' /tmp/clang.ltrace.*

and could see that runs with different results differed in ~160 lines, while runs with the same results differed only in ~90 lines.
Created attachment 19404: diff of diffs
run 1 and 2 produced different results
run 1 and 5 produced same results
I can reproduce at trunk. The difference already shows at the LLVM IR level, suggesting this is a Clang issue.

Here's a diff of the output from two runs of bin/clang -c test.m -S -emit-llvm -o /tmp/x.ll (I'll attach the files)

--- /tmp/14014e1ee9b6f16fac95459b1c1e3d19.ll 2017-11-13 12:41:14.420157440 -0800
+++ /tmp/428f773972593598e2ee5d92eaf888b5.ll 2017-11-13 12:41:04.308240890 -0800
@@ -7,9 +7,9 @@
 @1 = private unnamed_addr constant [33 x i8] c"__ObjC_Protocol_Holder_Ugly_Hack\00", align 1
 @.objc_protocol_list = internal global { i8*, i64, [0 x i8*] } zeroinitializer, align 8
 @2 = internal global { i8*, i8*, i8*, i8*, i8* } { i8* getelementptr inbounds ([12 x i8], [12 x i8]* @0, i64 0, i64 0), i8* getelementptr inbounds ([33 x i8], [33 x i8]* @1, i64 0, i64 0), i8* null, i8* null, i8* bitcast ({ i8*, i64, [0 x i8*] }* @.objc_protocol_list to i8*) }, align 8
-@.objc_sel_namef2 = linkonce_odr constant [3 x i8] c"f2\00"
 @.objc_sel_namef3 = linkonce_odr constant [3 x i8] c"f3\00"
-@.objc_selector_list = internal global [3 x { i8*, i8* }] [{ i8*, i8* } { i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.objc_sel_namef2, i64 0, i64 0), i8* null }, { i8*, i8* } { i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.objc_sel_namef3, i64 0, i64 0), i8* null }, { i8*, i8* } zeroinitializer], align 8
+@.objc_sel_namef2 = linkonce_odr constant [3 x i8] c"f2\00"
+@.objc_selector_list = internal global [3 x { i8*, i8* }] [{ i8*, i8* } { i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.objc_sel_namef3, i64 0, i64 0), i8* null }, { i8*, i8* } { i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.objc_sel_namef2, i64 0, i64 0), i8* null }, { i8*, i8* } zeroinitializer], align 8
 @3 = internal global { i64, { i8*, i8* }*, i16, i16, [3 x i8*] } { i64 2, { i8*, i8* }* getelementptr inbounds ([3 x { i8*, i8* }], [3 x { i8*, i8* }]* @.objc_selector_list, i32 0, i32 0), i16 0, i16 1, [3 x i8*] [i8* bitcast ({ i8*, i8*, i8*, i8*, i8* }* @2 to i8*), i8* null, i8* null] }, align 8
 @.objc_source_file_name = private unnamed_addr constant [9 x i8] c"./test.m\00", align 1
 @4 = internal global { i64, i64, i8*, { i64, { i8*, i8* }*, i16, i16, [3 x i8*] }* } { i64 8, i64 32, i8* getelementptr inbounds ([9 x i8], [9 x i8]* @.objc_source_file_name, i64 0, i64 0), { i64, { i8*, i8* }*, i16, i16, [3 x i8*] }* @3 }, align 8
@@ -19,7 +19,7 @@
 define void @f1() #0 {
 entry:
   %v1 = alloca i8* (i8*, ...)*, align 8
-  store i8* (i8*, ...)* bitcast ({ i8*, i8* }* getelementptr inbounds ([3 x { i8*, i8* }], [3 x { i8*, i8* }]* @.objc_selector_list, i64 0, i32 1) to i8* (i8*, ...)*), i8* (i8*, ...)** %v1, align 8
+  store i8* (i8*, ...)* bitcast ([3 x { i8*, i8* }]* @.objc_selector_list to i8* (i8*, ...)*), i8* (i8*, ...)** %v1, align 8
   ret void
 }
Created attachment 19416: ir output one
Created attachment 19417: ir output two
+John, do you know someone familiar with Obj-C codegen who might want to take a look?
(In reply to Hans Wennborg from comment #8)
> +John, do you know someone familiar with Obj-C codegen who might want to
> take a look?

It looks like the GNU-runtime lowering iterates over SelectorTable, which is ultimately a DenseMap, when building the module initialization function (CGObjCGNU::ModuleInitFunction). That should be easy for anyone to fix, but CC'ing David Chisnall in case he's still interested in maintaining that.
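To illustrate why iterating a hash table can leak address-space layout into the output, here is a minimal standalone sketch. It is not the clang code: `Selector`, `emitInTableOrder`, and `emitSorted` are hypothetical names, and `std::unordered_map` keyed by pointers stands in for the DenseMap mentioned above. The point is only that iteration order over pointer keys may depend on where the objects happen to live, while a sort on a stable key (here the selector name) does not.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a selector object owned by the compiler.
struct Selector {
  std::string name;
};

// Collects selector names in hash-table iteration order. The keys are
// pointers, so this order can change with the objects' addresses
// (and therefore with ASLR).
std::vector<std::string> emitInTableOrder(
    const std::unordered_map<const Selector*, int>& table) {
  std::vector<std::string> out;
  for (const auto& entry : table) {
    out.push_back(entry.first->name);
  }
  return out;
}

// Collects the same names, then sorts them, so the result depends only
// on the set of names and not on any address.
std::vector<std::string> emitSorted(
    const std::unordered_map<const Selector*, int>& table) {
  std::vector<std::string> out = emitInTableOrder(table);
  std::sort(out.begin(), out.end());
  return out;
}
```

With selectors named f2 and f3, `emitSorted` always yields f2 before f3, whereas the order from `emitInTableOrder` is unspecified.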
I guess the correct solution is to sort the table before emitting the selectors, though I wonder if any of the other things (class, category tables?) have the same problem? I probably won't get to it for a week or so, so if there's a simple fix that someone else can commit then please go ahead.
(In reply to David Chisnall from comment #10)
> I guess the correct solution is to sort the table before emitting the
> selectors, though I wonder if any of the other things (class, category
> tables?) have the same problem? I probably won't get to it for a week or
> so, so if there's a simple fix that someone else can commit then please go
> ahead.

Sorting would be a reasonable option; so would just switching the data structure to being an llvm::MapVector. Sorting sounds like the better option because it's probably something that the runtime could actually take advantage of, either by insisting on it (with a flag, maybe) or just by writing its initialization algorithms in a way that benefits if the table happens to be sorted.
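For comparison with sorting, the insertion-ordered alternative can be sketched as follows. This is a simplified illustration of the general idea behind a map that iterates in insertion order (a key-to-index table plus a vector of entries), not llvm::MapVector itself; the class name `InsertionOrderedMap` is made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical minimal map that iterates in insertion order.
template <typename K, typename V>
class InsertionOrderedMap {
  std::unordered_map<K, std::size_t> index_;   // key -> position in entries_
  std::vector<std::pair<K, V>> entries_;       // entries in insertion order

 public:
  // Returns the value for key, appending a default-constructed entry
  // the first time the key is seen.
  V& operator[](const K& key) {
    auto it = index_.find(key);
    if (it == index_.end()) {
      it = index_.emplace(key, entries_.size()).first;
      entries_.emplace_back(key, V());
    }
    return entries_[it->second].second;
  }

  // Iteration walks the vector, so the hash table's order never shows.
  typename std::vector<std::pair<K, V>>::const_iterator begin() const {
    return entries_.begin();
  }
  typename std::vector<std::pair<K, V>>::const_iterator end() const {
    return entries_.end();
  }
  std::size_t size() const { return entries_.size(); }
};
```

Inserting f3 and then f2 makes iteration yield f3 first, that is, the order of first use in the source. That is deterministic but not canonical, which is one reason sorting may be preferable if the runtime is meant to rely on the order.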
I'm (intermittently, as time permits) working on a new GNUstep ABI that is not constrained by some of the original GCC runtime's requirement to work with linkers from the '80s, which will make this somewhat moot (in the new ABI, selectors are deduplicated across compilation units by the linker and the possible nondeterminism may come back as a result of linker behaviour).
Hi, was there any progress towards creating reproducible binaries? Anything I can help with?
It sounds like David does have a patch prepared for that, yeah.
(In reply to John McCall from comment #14)
> It sounds like David does have a patch prepared for that, yeah.

The patch is now up here: https://reviews.llvm.org/D50559
Interestingly, the minimized reproducer now produces just 1 result with clang-6.0.0, but the original arc.m still produces 8. I'm building a patched clang now to test it.