Design

SQLCipher is a security extension to the SQLite database platform that facilitates the creation of encrypted databases. It uses the internal SQLite Codec API to insert a callback into the pager system that can operate on database pages immediately before they are written to and read from storage. Thus, SQLCipher's encryption is

  • Transparent – An application doesn’t require any special knowledge of the underlying database security. Applications use the standard SQLite API to manipulate tables using SQL. Behind the scenes the library silently manages the security.
  • On-the-fly – SQLCipher encrypts and decrypts in chunks called pages, as needed, so it doesn’t operate on the database all at one time. This means SQLCipher
    • starts up and closes down quickly
    • performs very well even with very large databases
    • works with SQLite indexing (i.e. retrieving a single record using an indexed search can incur as little as 5% overhead above a standard SQLite database)

Security features

The following is a short list SQLCiphers security features

  1. The encryption algorithm is 256-bit AES in CBC mode
  2. Each database page is encrypted and decrypted individually. The default page size is 4096 bytes but this can be adjusted at runtime to improve performance for certain query types.
  3. Each page has it’s own random initialization vector. The IV is generated by a cryptographically secure random number generator (e.g. OpenSSL’s RAND_bytes, CommonCrypto's SecRandom, LibTomCrypt Fortuna), and is stored at the end of the page. IVs are regenerated on write so that the same IV is not reused on subsequent writes of the same page data.
  4. Every page write includes a Message Authetication Code (HMAC-SHA512) of the ciphertext and the initialization vector at the end of the page. The MAC is checked when the page is read back from disk. If the ciphertext or IV have been tampered with or corrupted the HMAC check will cause SQLCipher to report a problem with the database. 
  5. When initialized with a passphrase SQLCipher derives the key data using PBKDF2-HMAC-SHA512. Each database is initialized with a unique random salt in the first 16 bytes of the file. This salt is used for key derivation and it ensures that even if two databases are created using the same password, they will not have the same encryption key. The default configuration uses 256,000 iterations for key derivation (this can be changed at runtime using PRAGMA kdf_iter).
  6. The key used to calculate page HMACs is different that the encryption key. It is derived from the encryption key and using PBKDF2 with 2 iterations and a variation of the random database salt. 
  7. If use of a passphrase is undesirable, an application may provide raw binary key data (for instance to support vaulted keys, or the use of PKI based key exchange).
  8. When encrypted, the entire database file appears to contain random data.
  9. Memory allocated by SQLCipher is locked when possible (via mlock/VirtualLock) and wiped before it is freed
  10. SQLCipher does not implement its own encryption. Instead it uses the widely available encryption libraries like OpenSSL libcrypto, LibTomCrypt, and CommonCrypto for all cryptographic functions. The cryptography provider used depends on the platform and configuration options.

Database Encryption and Temporary Files

All data in the main database file is encrypted. In addtion, SQLCipher encrypt data pages in journal files. Provided that you taken the important step of disabling file base temporary stores (i.e. --enable-tempstore=yes during configuration and define SQLITE_TEMP_STORE=2 during build), we are primarily concerned with the following:

  • Rollback journals - Pages in the rollback journal are encrypted using the same key as the main database. Note that there is an unencrypted header in a rollback journal, but it doesn't contain any data. The journal created by a vacuum run is encrypted in the same way as a rollback journal. Verification: create an encrypted database, start a transaction, make changes, and then inspect the -journal file using hexdump or a similar program. 
  • Write Ahead Log Files - Using the new WAL mode (i.e. PRAGMA journal_mode = WAL;), page data stored in the WAL file is encrypted using the datbase key. Pages in the rollback journal are encrypted using the same key as the main database. Verification: create an encrypted database, start a transaction, make changes, and then inspect the -wal file using hexdump or a similar program. 
  • Statement journals - Statement journals are also encrypted. This is harder to "observe" because they are only created under very limited circumstances, and even then they use temporary files that are immediately deleted after use. Note that statement journals are maintained in memory if temporary files are disabled. Verification: Compile a build under linux that forces a minimal temp cache size (so that pages are actually written to disk) and allows the use of temporary files, start a transaction with updates that cause a statement journal to be written, and then inspect the file descriptor of the temporary journal in the /proc//fd directory. 
  • Master journals - The master journal does not contain data (see http://www.sqlite.org/atomiccommit.html). Unlike the rollback journals, the master journal does not contain any original database page content. Instead, the master journal contains the full pathnames for rollback journals for every database that is participating in the transaction. 

Other transient files are not encrypted, so you must disable file based temporary storage if your application will use temp space, as noted above.

An Illustrative Terminal Listing

~ sjlombardo$ hexdump -C sqlite.db
00000000  53 51 4c 69 74 65 20 66  6f 72 6d 61 74 20 33 00  |SQLite format 3.|
...
000003c0  65 74 32 74 32 03 43 52  45 41 54 45 20 54 41 42  |et2t2.CREATE TAB|
000003d0  4c 45 20 74 32 28 61 2c  62 29 24 01 06 17 11 11  |LE t2(a,b)$.....|
...
000007e0  20 74 68 65 20 73 68 6f  77 15 01 03 01 2f 01 6f  | the show..../.o|
000007f0  6e 65 20 66 6f 72 20 74  68 65 20 6d 6f 6e 65 79  |ne for the money|

~ $ sqlite3 sqlcipher.db
sqlite> PRAGMA KEY='test123';
sqlite> CREATE TABLE t1(a,b);
sqlite> INSERT INTO t1(a,b) VALUES ('one for the money', 'two for the show');
sqlite> .quit

~ $ hexdump -C sqlcipher.db
00000000  84 d1 36 18 eb b5 82 90  c4 70 0d ee 43 cb 61 87  |.?6.?..?p.?C?a.|
00000010  91 42 3c cd 55 24 ab c6  c4 1d c6 67 b4 e3 96 bb  |.B?..?|
00000bf0  8e 99 ee 28 23 43 ab a4  97 cd 63 42 8a 8e 7c c6  |..?(#C??.?cB..|?|

~ $ sqlite3 sqlcipher.db
sqlite> SELECT * FROM t1;
Error: file is encrypted or is not a database

Packaging

SQLCipher is an extension to SQLite, but it does not function as a loadable plugin for many reasons. Instead, SQLCipher modifies SQLite itself, and is maintained as a separate version of the source tree. SQLCipher releases are baselined against a specific source version of SQLite. However, the project minimizes alterations to core SQLite code to reduce the risk of breaking changes during upstream SQLite merges.

The reasons that SQLCipher is packaged this way, as opposed to a "plugin" or extension to the SQLite amalgamation, follow:

  • Enabling an SQLite codec requires the compile-time definition of SQLITE_HAS_CODEC, which is not present on standard, unmodified SQLite builds.
  • Even when enabled, SQLite isn't setup to load codecs as plugins. While SQLite does have a plugin function for loadable extensions, it does not extend access to any system internals (it mainly used to allow custom user functions).
  • SQLCipher makes calls to internal functions that are not part of the public SQLite API. Sometimes these APIs change, even in between minor SQLite versions. Thus, each update adn merge requires inspection, testing and verification. Making SQLCipher portable across multiple versions of SQLite would not be feasible, nor could it to use only the public API (for instance, even the first critical step of attaching the codec callback to the pager uses an internal API).
  • SQLCipher modifies supporting functions to introduce special pragmas, built in functions, etc (e.g. PRAGMA cipher_*). Injecting this functionality in a plugin architecture wouldn't be possible.
  • SQLCipher's test harness relies on support in testfixture to take advantage of the test API and various internal checks (memory reference counting, etc.)
  • Even if it were possible to use a loadable plugin, dynamic libraries aren't available on all supported platforms, for example iOS