New user self-registration is disabled due to spam. For an account please email bugs-admin@lists.llvm.org with your e-mail address and full name.

Bug 26308 - clang crashes on valid code at -O1 and above on x86_64-linux-gnu with no diagnostic msg (SimplifyCFG)
Summary: clang crashes on valid code at -O1 and above on x86_64-linux-gnu with no diag...
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: PC Linux
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-01-25 23:44 PST by Qirun Zhang
Modified: 2016-01-27 15:54 PST (History)
4 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Qirun Zhang 2016-01-25 23:44:16 PST
The current clang trunk crashes when compiling the following valid code at -O1 and above on x86_64-linux-gnu in 64-bit mode. The 32-bit mode and -O0 level work fine.

It might be a regression from 3.7.0.


$ clang-trunk -v
clang version 3.9.0 (trunk 258673)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.8.4
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.3
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64


$ clang-trunk -Wall -Wpedantic -Wextra -O0 -c abc.c
$ clang-trunk -O1 -c abc.c
clang-3.9: error: unable to execute command: Segmentation fault (core dumped)
clang-3.9: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 3.9.0 (trunk 258673)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
clang-3.9: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.
clang-3.9: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-3.9: note: diagnostic msg: /tmp/abc-d7f04c.c
clang-3.9: note: diagnostic msg: /tmp/abc-d7f04c.sh
clang-3.9: note: diagnostic msg:

********************


----
 cat abc.c
# 2 "" 3
int a, b, c;
fn1() {
  int d = "";
  while (fn1) {
    int *e;
    {
      int f;
      if (!memcmp(f, c, 0))
        continue;
    }
    for (;;) {
      b = e;
      for (; d; b = a)
        ;
      e++;
    }
  }
}
Comment 1 David Majnemer 2016-01-26 00:33:36 PST
Simplifycfg, reduced testcase:
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i32 @fn1(i1 %B, i64 %load) {
entry:
  br label %while.body

while.body:
  br label %cleanup

cleanup:                                          ; preds = %if.end, %if.then
  %cleanup.dest.slot.0 = phi i1 [ false, %while.body ]
  br i1 %cleanup.dest.slot.0, label %for.cond, label %cleanup4

for.cond:                                         ; preds = %cleanup, %for.end
  %e.0 = phi i64* [ undef, %cleanup ], [ %incdec.ptr, %for.cond2 ]
  %pi = ptrtoint i64* %e.0 to i64
  %incdec.ptr = getelementptr inbounds i64, i64* %e.0, i64 1
  br label %for.cond2

for.cond2:                                        ; preds = %for.inc, %for.cond
  %storemerge = phi i64 [ %pi, %for.cond ], [ %load, %for.cond2 ]
  br i1 %B, label %for.cond2, label %for.cond

cleanup4:                                         ; preds = %cleanup
  br label %while.body
}
Comment 2 David Majnemer 2016-01-26 16:22:53 PST
Bisection points to r228826:
Author: James Molloy <james.molloy@arm.com>
Date:   Wed Feb 11 12:15:41 2015 +0000

    [SimplifyCFG] Swap to using TargetTransformInfo for cost
     analysis.
    
    We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
    heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
    different heuristic now so there is a slight test churn.
    
    Test changes:
      * combine-comparisons-by-cse.ll: Removed unneeded branch check.
      * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
      * coalesce-subregs.ll: Superfluous block check.
      * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
      * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
      * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

James, can you please take a look?
Comment 3 James Molloy 2016-01-27 02:42:51 PST
Hi,

r228826 appears to be a red herring - I'm not quite sure how your bisect script got there but to me it looks like r255660:

    [SimplifyCFG] allow speculation of exactly one expensive instruction (PR24818)
    
    This is the last general step to allow more IR-level speculation with a safety harness in place in CodeGenPrepare.
    
    The intent is to restore the behavior enabled by:
    http://reviews.llvm.org/rL228826
    
    but prevent bad performance such as:
    https://llvm.org/bugs/show_bug.cgi?id=24818
    
    Earlier patches in this sequence:
    D12882 (disable SimplifyCFG speculation for expensive instructions)
    D13297 (have CGP despeculate expensive ops)
    D14630 (have CGP despeculate special versions of cttz/ctlz)
    
    As shown in the test cases, we only have two instructions currently affected: ctz for some x86 and fdiv generally.
    Allowing exactly one expensive instruction is a bit of a hack, but it lines up with what is currently implemented
    in CGP. If we make the despeculation more general in CGP, we can make the speculation here more liberal.
    
    A follow-up patch will adjust the cost for sqrt and possibly other typically expensive math intrinsics (currently
    everything is cheap by default). GPU targets would likely want to override those expensive default costs (just as
    they probably should already override the cost of div/rem) because just about any math is cheaper than control-flow
    on those targets.
    
    Differential Revision: http://reviews.llvm.org/D15213

Running your testcase with `opt -simplifycfg -disable-output` I see the segfault, and the cause is a stack overflow:

... snip ...
    frame #87325: 0x0000000101281368 opt`DominatesMergePoint(V=<unavailable>, BB=0x0000000102a17c30, AggressiveInsts=0x00007fff5fbfeb58, CostRemaining=0x00007fff5fbfeb44, TTI=0x0000000102a181d0, Depth=2) + 504 at SimplifyCFG.cpp:325
    frame #87326: 0x0000000101281368 opt`DominatesMergePoint(V=<unavailable>, BB=0x0000000102a17c30, AggressiveInsts=0x00007fff5fbfeb58, CostRemaining=0x00007fff5fbfeb44, TTI=0x0000000102a181d0, Depth=1) + 504 at SimplifyCFG.cpp:325
    frame #87327: 0x000000010126a25f opt`(anonymous namespace)::SimplifyCFGOpt::run(llvm::BasicBlock*) + 353 at SimplifyCFG.cpp:1857
    frame #87328: 0x000000010126a0fe opt`(anonymous namespace)::SimplifyCFGOpt::run(this=0x00007fff5fbfec50, BB=0x0000000102a17c30) + 1374 at SimplifyCFG.cpp:5251
... snip ...

DominatesMergePoint is infinitely recursing. It is keeping a Depth argument that keeps increasing but AFAICT there is a case where that cutoff is never checked.

Sanjay, can you comment more?
Comment 4 Sanjay Patel 2016-01-27 09:52:10 PST
(In reply to comment #3) 
> Sanjay, can you comment more?

I hope so - getting a debug setup in place now.
Comment 5 Sanjay Patel 2016-01-27 11:02:39 PST
David's regression analysis is correct. The reduced test will fail even if we revert:
http://reviews.llvm.org/rL255660

Alternatively, we included a chicken switch for the functionality that was added with that commit, and you can see that the test still fails with "-speculate-one-expensive-inst=0".

The problem is that we're hitting a zero cost cycle with the PHI/gep sequence:
  %e.0 = phi i64* [ undef, %cleanup ], [ %incdec.ptr, %for.cond2 ]
  %incdec.ptr = getelementptr inbounds i64, i64* %e.0, i64 1

Since the CostRemaining is never changing, we spin until we crash. I think we need to limit recursion depth to avoid this. I'll post a patch for review.
Comment 6 Sanjay Patel 2016-01-27 12:08:35 PST
http://reviews.llvm.org/D16637
Comment 7 Sanjay Patel 2016-01-27 15:54:15 PST
Marking as fixed since we shouldn't crash after:
http://reviews.llvm.org/rL258971

...but as noted by Daniel Berlin and David, we can do better than using a recursion limit to prevent this bug.