To reproduce: (1) Without exception escape check: clang-tidy -checks=-*,bugprone-*,-bugprone-exception-escape \ /tmp/ref_count.cpp -- \ -target x86_64-linux-gnu \ -m64 -g -O0 \ -std=c++11 \ -fno-exceptions 423 warnings generated. Suppressed 423 warnings (408 in non-user code, 15 with check filters). (2) Hangs with exception escape check: clang-tidy -checks=-*,bugprone-* \ /tmp/ref_count.cpp -- \ -target x86_64-linux-gnu \ -m64 -g -O0 \ -std=c++11 \ -fno-exceptions The problem is reproducible with current source and r336997, which added the exception escape checker. The problem does not exist in r336996.
Created attachment 20955 [details] contains test input ref_count.cpp
Hi Chih-Hung, thank you for reporting the bug. I can reproduce the problem. The fix is that the check will not do its work when `-fno-exceptions` is provided. A patch that solves the issue is here: https://reviews.llvm.org/D52880 Best Jonas
The test case is from one Android open source file. Although -fno-exceptions is used for most files, some could be compiled with exceptions. I tried the change in D52880 and the test case again. Now clang-tidy won't hang with `-fno-exceptions` but still hangs without -fno-exceptions.
Ok, I start creducing the test-file now, if i have something i will upload the file. Given that this is a massive file, do you know other files that trigger the problem?
Among thousands of AOSP source files, I only found two files failed with this clang-tidy check. So now I disabled this check for those two cases: https://android-review.googlesource.com/c/platform/bionic/+/774478 https://android-review.googlesource.com/c/platform/frameworks/compile/slang/+/774546 Test input ref_count.cpp is preprocessed output from one of the cases. The other case has about the same size and also takes about 8 to 10 seconds to run through clang-tidy when bugprone-exception-escape is disabled. I hope it is easy enough for the writer of that check to find the infinite loop with ref_count.cpp. If there is any fix, I can test it quickly with other AOSP files and the real AOSP compilation flags.
@adam.balogh I am investigating the issue, but it would help a lot if you take a look at it, too, as you know the internals. Right now it seems that many functions are analyzed in an endless loop (maybe recursion, direct/indirect?). I added trivial recursive test-cases that did _NOT_ result in an endless loop. Applying the patch in the attachement does resolve the endless loop, but I can not evaluate if it makes sense. The tests _DO_ work with the patch applied!
Created attachment 20962 [details] A potential patch, removes the endless loop at least This includes a recursive test case and some debug output I used to check. I could not use `Func->getName()` as it failed an assertion, so i just printed the pointer to debug a bit.
Comment on attachment 20962 [details] A potential patch, removes the endless loop at least Please ignore the whitespace changes in the test, clang-format produced them! The notable change is `llvm::dbgs() << ...` and the added test-case at the end of the test file.
Created attachment 20963 [details] Counting of the functions that are analyzed, sorted This was produced by running the check 5 seconds, so its fast in what its doing :)
Created attachment 20964 [details] Cleaned patch with actual "fix" Sorry, the last patch was too noisy. This one is better
Created attachment 20965 [details] Debugoutput with patch applied Probably not very helpful, but the counts of the analyzed functions go down significantly.
Hello! Of course I will investigate it, I was busy with another task, but I will continue with this one.
Perfect, is it ok if I assign the bug to you again?
Created attachment 20990 [details] Patch Attempt I tried the most straightforward thing which is checking whether the function to be analyzed is already in the call stack (see attachment), but it still does not terminate. It seems that it has nothing to do with the recursion.
(In reply to Jonas Toth from comment #8) > Comment on attachment 20962 [details] > A potential patch, removes the endless loop at least > > Please ignore the whitespace changes in the test, clang-format produced them! > > The notable change is `llvm::dbgs() << ...` and the added test-case at the > end of the test file. How did it terminate for you? Even with my patch attempt it seems to run forever, but i do not know, why yet. How many minutes did it take?
Created attachment 20991 [details] Patch Attempt with Optimized Matcher Expression I optimized the matcher expression to first check whether a function is allowed to throw and only after that check whether it throws. The output became much shorter, but at some point it hangs silently.
Oh, `CallStack` is a `SmallSet` so disregard my first attempt. It does not fix anything.
There is one codepath, that actually removes the analyzed function from the SmallSet, i just commented this line out (see https://bugs.llvm.org/attachment.cgi?id=20964) and it terminated "instantly" (i think 5 seconds but most of it is probably the frontend parsing this big file). I added some direct and indirect test-cases as well, which did not end up in an endless loop even without the patch. But it might be the case, that this is only triggered in some special recursive setting.
(In reply to Jonas Toth from comment #18) > There is one codepath, that actually removes the analyzed function from the > SmallSet, i just commented this line out (see > https://bugs.llvm.org/attachment.cgi?id=20964) and it terminated "instantly" > (i think 5 seconds but most of it is probably the frontend parsing this big > file). > > I added some direct and indirect test-cases as well, which did not end up in > an endless loop even without the patch. But it might be the case, that this > is only triggered in some special recursive setting. There is a test-case attached at the beginning. We could reduce it down to figure out exactly what causes this problem.
Created attachment 20992 [details] Optimized Query It seems that optimization of the query is enough.
Created attachment 20993 [details] Counting of the functions with optimized query The numbers are much lower than before.
(In reply to Jonas Toth from comment #18) > There is one codepath, that actually removes the analyzed function from the > SmallSet, i just commented this line out (see > https://bugs.llvm.org/attachment.cgi?id=20964) and it terminated "instantly" > (i think 5 seconds but most of it is probably the frontend parsing this big > file). > > I added some direct and indirect test-cases as well, which did not end up in > an endless loop even without the patch. But it might be the case, that this > is only triggered in some special recursive setting. There is probably no endless loop at all, just an insufferable execution time. The analyzed function must be removed from the SmallSet unless we want to loose some true positives: if A calls B then C, but B also calls C, C throws an exception which B catches but not A, and A should not throw then by not removing C from the SmallSet after visiting it after called from B we will not detect the exception it throws and A does not catch. Such an optimization could be done if we changed the whole call graph visit from depth-first to width-first.
> There is probably no endless loop at all, just an insufferable execution > time. The analyzed function must be removed from the SmallSet unless we want > to loose some true positives: if A calls B then C, but B also calls C, C > throws an exception which B catches but not A, and A should not throw then > by not removing C from the SmallSet after visiting it after called from B we > will not detect the exception it throws and A does not catch. Such an > optimization could be done if we changed the whole call graph visit from > depth-first to width-first. I see, that makes sense. > There is a test-case attached at the beginning. We could reduce it down to figure out exactly what causes this problem. I was trying to do that. creduced didn't anymore proceed at some and it seemed like a logical issue. Lets see if Adams fix is actually a fix ;) @Chih-Hung Hsieh It would be awesome if you could try out the patch https://reviews.llvm.org/D53187 here and check if everything would be fine with it.
With patch D53187, I was able to compile all Android source code. While it looks like a safe and correct optimization, I don't think it fixes the important problem of repeated computation inside throwsException. It now works with Android because Android code has used little exception specifications, and the failed case I got probably has many and deep function calls but not much exception specification in those functions. If the Func pointer is unique for each function, earlier debug dump showed that some TypeVec was computed for the same function too many times: 276857 Insertion Function 0xca14368 276857 Insertion Function 0xca14230 270491 Insertion Function 0xc9b49c8 During this tidy check, isn't a function's TypeVec, collected by throwsException constant? If that's the case, the optimal solution should compute TypeVec only once per function. The CallStack does not prevent a function being checked multiple times when it is called many times from another function. Instead of using a CallStack, could you try to use a map to store the check status of all Func to TypeVec? Inside throwsException, if a function (1) is not in the map yet, add a 'checking' state to the map, and start checking. (2) has 'checking' status, return an empty TypeVec, like the current code. (3) has 'checked' status, return the saved TypeVec in the map. After checking a function, save its TypeVec in the map and change its status to 'checked' so it won't be checked again.
I think we should go step by step. So the question is whether this first optimization step is sound. I know it is incomplete, but I think it is far better then the current status. There are two further possible optimizations and I think both should be done. The one is width-first searching for exceptions and ignoring subsequent calls to the same function. (This is based on the fact that an indirect call never throws more exceptions than a direct one.) This ensures that each function is visited only once per checked function (a function that should not throw). The second possible optimization could be caching thrown (direct and indirect!) exceptions for each visited function globally. This is a bit difficult because matchers are usually stateless.
The gain from the two further steps I proposed (and I will do them eventually) are minor compared to this. The matcher expression was definitely wrong.
Thanks for validating! The optimizations require a prior refactoring on the matching logic which might take a while. For now this fix is a good start (as Adam stated) and the thousands of checks seem to be gone and now its more max ~150 (or so), which is 2 orders of magnitude less.
As said earlier, I think D53187 is fine for an incremental improvement. As it showed effectiveness of the filter anyOf(isNoThrow(), ...) to be done before the filter 'throw(unless(...)), have you considered embedding the anyOf(...) filter as part of the 'throw' check or ExceptionEscapeCheck? If all easy-to-check filters are applied before any recursive calls inside ExceptionEscapeCheck, would there be any difference between depth-first and breadth-first search? After all, every function that passes those simple filters must be checked exactly once.
*** Bug 39621 has been marked as a duplicate of this bug. ***
I am currently refactoring the check a bit and implemented `FunctionDecl` based caching. For your example there is no noticeable performance difference, _maybe_ for other cases it is slightly faster.