So recently a very hyped memory corruption security vulnerability was discovered in the OpenSSL punycode parser.
Some folks including Hanno (https://twitter.com/hanno/status/1587775675397726209) asked why this is still happenning, why no one wrote a fuzzer for the punycode parser and if we as the security community have learned nothing from Heartbleed.
I think we should give the developers the benefit of doubt and assume they were acting in good faith and try to see what could be improved.
In fact, there already exists a fuzz testing harness for the X.509 in the OpenSSL source code.
All of the fuzzers from the OpenSSL source tree are also supposedly automatically deployed to ClusterFuzz via OSS-Fuzz: https://github.com/google/oss-fuzz/blob/master/projects/openssl/build.sh
Examining call chains
Let’s start by examining the call chain for the vulnerable function.
- ossl_punycode_decode: called by ossl_a2ulabel
- ossl_a2ulabel: ossl_a2ucompare is not really referenced anywhere in C code, only mentioned in documentation.
Let's examine who calls "ossl_a2ulabel" then.
openssl/crypto$ grep -rI ossl_a2ulabel .
./x509/v3_ncons.c: if (ossl_a2ulabel(baseptr, ulabel + 1, &size) <= 0) {
Let's remember the name of this file and examine the coverage produced by the corpus shipped with the OpenSSL source for the X.509 fuzzing harness.
- nc_email_eai <- nc_match_single <- nc_match <- NAME_CONSTRAINTS_check, NAME_CONSTRAINTS_check_CN
- NAME_CONSTRAINTS_check, NAME_CONSTRAINTS_check_CN <- check_name_constraints in crypto/x509/x509_vfy.c
- check_name_constraints <- verify_chain <- X509_verify_cert
- X509_verify_cert: this one has A LOT of callers in the OpenSSL code, but was not reached by the fuzzing harness.
- X509_verify_cert: (other ways to reach it are circular - looks like we have to call it directly): check_crl_path <- check_crl <- check_cert <- check_revocation <- verify_chain
Examining coverage
Here is what I did:
- Compiled the fuzzing harness with coverage (added -fprofile-instr-generate -fcoverage-mapping) before -DPEDANTIC when building fuzzers.
- Minimized the x.509 fuzzing corpus to speed up the next step:
- ./fuzz/x509 -merge=1 fuzz/corpora/x509_min fuzz/corpora/x509
- Ran the executable on all input vectors. This is very slow because while parsing is fast, executable takes time to start up. One solution here could be to use the toolchain from OSS-Fuzz which replaces libFuzzer with a library which triages inputs somewhat like AFL persistent mode.
- for i in corpora/x509_min/*; do ./x509 $i; mv default.profraw $(basename $i).profraw; done
- llvm-profdata-10 merge -sparse *.profraw -o default.profdata
- llvm-cov-10 show --output-dir=cov_html --format=html -instr-profile=default.profdata x509
- Update (regarding the persistent mode/perf comment above): once you build the harness with coverage flags, there is no need to execute each input file separately, one can just use the "runs=N" option of libFuzzer:
- ./x509 -runs=3000 ./corpora/x509_min
So, why did this all happen?
My first (un)educated guess was: fuzzing will waste time in ASN.1 deserialization with little time spent on parsing decoded fields. Turns out, it's slightly worse.
Short answer: the code is not reachable by the current corpus and harness. As there exists an X.509 fuzzer, perhaps developers and other folks assumed it could theoretically reach all parsers, but this is not the case.
The file through which it’s reachable (v3_ncons.c) has little coverage.
The specific call chain which we traced to "check_name_constraints" ends up in "crypto/x509/x509_vfy.c" which has ZERO coverage.
(Update): "verify_chain" is reachable in other fuzzers, but it's still not enough.
Jonathan Metzman from Google's Open Source Security Team pointed me to an OSS-Fuzz dashboard where "verify_chain" is reachable.
- https://twitter.com/metzmanj/status/1588174176229199873
- https://storage.googleapis.com/oss-fuzz-coverage/openssl/reports/20221031/linux/src/openssl/crypto/x509/x509_vfy.c.html#L221
Unfortunately, this is still NOT enough:
- "verify_chain" is covered 4.6K times whereas the inner loop of the X.509 fuzzer is invoked 12.4 times so it's reachable but NOT by the X.509 fuzzer.
- Whichever harness reaches "verify_chain" (most likely the "server" ssl test but not the X.509 one) needs to be modified to either set up a valid certificate chain for verification or mark the certificate as self-signed so that "build_chain" does not return error
- https://twitter.com/astarasikov/status/1588175841615261702
(Update 2): making "verify_chain" and "build_chain" pass (still not there).
- https://gist.github.com/astarasikov/4a60bb17499d4351bb27189e5e8ba8f4
What could we try improving?
- Write separate parsers for each function (like Hanno did) - for that, it'd be necessary to examine coverage to see 1. where coverage is low and 2. where the code processes decoded ASN.1 elements
- Write a harness to cover X509_verify_cert. Looks like this is currently only called from "test" but not "fuzz" tests. While it may be slow to fuzz the verification, it will definitely cover a larger attack surface.
- Update: while this function is reachable via "client" and "server" tests, it returns early. To really cover it and the "punycode" parsing, it's necessary to set up certificate chains in these tests, as well as generate valid "proxy" certificates and add them to the corpus.
- Periodically examine coverage reports. This is a bit tedious to do manually, but if there was a public access to the OSS-Fuzz coverage report from Google, this would be much easier. Additionally, the OSS-Fuzz Introspector tool could be helpful in identifying roadblocks/unreachable code.
Generally, the Introspector does not always work perfectly and is easy to break - it's a static analysis tool so it gets confused by function pointers and cannot infer anything that happens at runtime, like when your code is heavily using C++ classes or a hashtable for lookups. In case of X.509 code, however, it may work fine - function pointers are mostly used by the ASN.1 code for its internals (which we actualy do NOT want to fuzz or review in most cases) whereas the core top-level logic is reachable by direct function calls from the entrypoint - a good candidate for static analysis.
- https://github.com/ossf/fuzz-introspector
- https://openssf.org/blog/2022/06/09/introducing-fuzz-introspector-an-openssf-tool-to-improve-fuzzing-coverage/
If we had the harness which could theoretically reach X509_verify_cert
- Add well-formed (encoded) ASN.1 elements to the dictionary file (-dict= option for libFuzzer). This one is currently not used by the OpenSSL fuzz/helper.py, but at least oids.txt is used by OSS-Fuzz as the dictionary.
- Add well-formed X.509 certificates which make use of the "name constraints" field. And strictly speaking, all other fields too - instead of just storing the libFuzzer-generated corpus in the tree, it would be better to manually provide various inputs exercising difficult functionality. However, as libFuzzer is spending far too much time on ASN.1 and is overloaded with "features", this will likely only uncover new issues during long (days) runs on ClusterFuzz. Whereas parsers for individual leaf functions, as demonstrated by Hanno, can find (some) bugs in mere seconds.
Thanks to:
Hanno for his twitter thread for motivating me to look into this.
My colleagues for introducing me to the Introspector tool.
P.S. Linking to my evergreen tweet https://twitter.com/astarasikov/status/1122364899362054144