Hi, if one compiles the following sample with llvm/clang r351192 and the command line "clang -x c++ test.cpp -fdebug-info-for-profiling -fstack-protector -o a.out -O0 -g -c", --------8<-------- int foo (int input_param) { int array1[input_param]; array1[0]=18; return array1[0]; } -------->8-------- And then examine the line tables with dwarfdump -l a.out: --------8<-------- 0x00000000 [ 2, 0] NS uri: "/home/jmorse/fdebug.cpp" 0x00000011 [ 1, 0] NS PE DI=0x2 0x00000015 [ 1, 0] DI=0x4 0x00000018 [ 3,16] NS 0x0000001d [ 3, 5] 0x00000020 [ 3, 5] DI=0x2 0x00000024 [ 3, 5] DI=0x4 ... -------->8-------- We have the line numbers going backwards at the start of the function (two to one to three). This is due to the x86-discriminate-memops pass picking the default line number to instrument memory operations with as being the start of the function. (See ReferenceDI in X86DiscriminateMemOps.cpp). These line numbers are potentially misleading IMHO -- would we be able to use a zero line number by default instead? According to the DWARF spec this corresponds to compiler-generated code and will be ignored by debuggers. That way X86DiscriminateMemOps won't have any affect on debugging, and the change (AFAIK) will have no effect on its operation.
If I understand the proposal correctly, we would use the line number "0" for all instructions that have memops and do not already have some debug info. This would likely very quickly hit the limit of 4096 base discriminator values that we can currently use in llvm (see https://bugs.llvm.org/show_bug.cgi?id=39890) I understand that the current approach may generate misleading line numbers. For the purposes of the pass (unique ID-ing instructions with memops), that's fine. However - and noting that the pass is enabled only in profiling scenarios - is there a specific scenario that is blocked by the current behavior?
(In reply to Mircea Trofin from comment #1) > I understand that the current approach may generate misleading line numbers. > For the purposes of the pass (unique ID-ing instructions with memops), > that's fine. However - and noting that the pass is enabled only in profiling > scenarios - is there a specific scenario that is blocked by the current > behavior? If this would have a negative impact on debug stepping, then that's a problem for us. From the initial llvm-dev RFC for -fdebug-info-for-profiling [1]: ~~~~ For our users we fully expect their profile/debug builds to be the same thing (that is, almost all debugging is on fully optimized code, where fully optimized also means part of an iterative profiling workflow). Having extra debug information that only benefits the profiling part seems reasonable. What would be unacceptable is if we ever had to ask our users to make a trade off between debug info that was best for profiling at the expense of debugging quality or vice versa. The "-femit-debug-for-profiling" flag seems to fit this model of it being more a strict superset rather than a tuning option more clearly. ~~~~ [1] http://lists.llvm.org/pipermail/llvm-dev/2016-November/107438.html
I see - we could make this pass (x86-discriminate-memops) an explicit opt-in. Would that work?
(In reply to Mircea Trofin from comment #1) > If I understand the proposal correctly, we would use the line number "0" for > all instructions that have memops and do not already have some debug info. > This would likely very quickly hit the limit of 4096 base discriminator > values that we can currently use in llvm (see > https://bugs.llvm.org/show_bug.cgi?id=39890) > So I understand correctly, how often in practice does this happen? I know we found it with in Bug 39890 where we had a case of heavy use of nested macros causing >4096 memops on a single line, but beyond that is it that common? I'm wondering whether it would be feasible to just cut-off after the first 4096 discriminators and not add any more? Potentially then, we could made support for >4096 discriminators an opt-in feature instead of the entire pass?
I built an experiment where I use 0 as the line number for instructions with memops that don't have any debug info. I do so on a per-function basis, so instructions in different functions in the same module might have the same debug info, which is not what we want, but it was quick to prototype and it gives a lower bound on the increase of number of instructions for which we can't generate a discriminator. For a reasonably-sized project, this led to an over 2x increase in the number of such instructions. Equally important, the number of functions this happens in grew by 4x. I'd prefer going the route of opting in the pass for the short term (i.e. '-mllvm -discriminate-mem-ops' or something like that). My long term plan is to explore adding support for base discriminator values > 4096. DWARF encodes discriminators as ULEB128, so there is capacity, it's just that LLVM uses a 32 bit value for encoding, and adding extra storage seemed like a non-trivial patch, from what I can tell. Would this staging work for your scenario?
I'm with Jeremy overall on "-fdebug-info-for-profiling shouldn't decrease debug info quality for non-profile uses" seems like a good premise to start with (ie: requires a pretty strong justification (discussion on llvm-dev with relevant folks, etc) to break from that premise) Mircea - yeah, if you want to have the pass off-by-default for now while you experiment with ways to flesh this out, that seems OK (internally - I probably wouldn't want to turn this on for Google without this being resolved, just to ensure we have a motivation to clean this up (& having this on by default in upstream means Google doesn't diverge too much/end up with weird features no one else is using, etc) Though as for the original problem - I think I'm misunderstanding something. It sounded like "under some circumstances, we use the function starting line number for <things>". The counterproposal is "use line zero in those conditions instead". And this somehow produces a greater number of these duplicates, so higher discriminator count? But why do these numbers need to be unique across functions? The address ranges in the debug info would distinguish between one discriminator used in one function, and one used in another function? (I guess related to this comment: "so instructions in different functions in the same module might have the same debug info, which is not what we want" - why is that a problem, couldn't the values be uniqued per-function?)
(In reply to Mircea Trofin from comment #5) > I'd prefer going the route of opting in the pass for the short term (i.e. > '-mllvm -discriminate-mem-ops' or something like that). > > My long term plan is to explore adding support for base discriminator values > > 4096. DWARF encodes discriminators as ULEB128, so there is capacity, it's > just that LLVM uses a 32 bit value for encoding, and adding extra storage > seemed like a non-trivial patch, from what I can tell. > > Would this staging work for your scenario? This works for us. We're certainly very keen to help out evaluating the benefits of this work for our iterative profiling workflow, but we can't get away with trading them for any regression in debugging at the same time. Having it on a switch and off by default will let us do some comparative benchmarking (time permitting), hopefully with a view to enabling it if we can get the best of both worlds.
Re. Comment #6 - David is correct, we could have duplicate discriminators for line 0 in different functions, however, I realized there's an additional hurdle - create_llvm_prof references lines as an offset from the function first line, and there is an additional assumption that that number is positive. Nothing insurmountable, and will fit well with the larger work item of extending the range of base discriminators.
Many thanks for the fix in r352246 -- FYI I've filed bug 40560 to have that revision pulled into v8.0, to avoid any diminished debug data there. It's my assumption that this won't interfere with anyone as X86DiscriminateMemOps can still be enabled with the option.
Fixed on trunk and 8.0, many thanks again.