Technical Guidance: Using Random Values as SQLCipher Keys

2019-06-07 09:00:00 -0400

This has been cross-posted from our discussion forum.

There are two options that applications commonly use to manage key material when using SQLCipher:

  1. Password or Passphrase - An example of this type of integration is an application that requires the user to enter a password or PIN with a device keyboard; the input is then provided to SQLCipher for use as a key. SQLCipher includes special functionality to derive an encryption key from user-provided input.
  2. Raw or “Random” Key Material - In some advanced situations an application might generate random key material for a database (i.e. random bytes) which is separately secured by an enclave, key store, hardware security module, or otherwise encrypted. The application provides this random data to SQLCipher for use as a key.

In the interest of interoperability it is strongly recommended that any key material passed to SQLCipher consist solely of valid UTF-8 encoded strings that do not include zero byte values internally. This recommendation applies universally to all keys, whether random or not, and no matter how they are supplied (e.g. sqlite3_key, PRAGMA key, or any wrapping API).

With Option 1, it is a given that the key material will be a string value because it is entered by the user. However, Option 2 requires additional consideration on the part of the developer. While it may be technically possible in some languages to generate random bytes in code and coerce them to behave like strings, this must be avoided due to the potential for unpredictable behavior and compatibility problems, for instance:

  1. difficulty opening a database with such a key using certain tools (e.g. one can’t provide raw binary values via SQL statements such as PRAGMA or ATTACH, the SQLCipher command line shell, database GUI applications, or certain programming languages)
  2. corruption of key material due to encoding or decoding operations that encounter invalid code points
  3. truncation of key material passed through languages or APIs that rely on NULL terminators when zero bytes are present in the random key material
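
The second and third problems can be reproduced in a few lines. The following is a minimal sketch rather than anything SQLCipher-specific: the key bytes are a hypothetical example chosen to trigger both failures, and the helper names are illustrative only.

```python
import ctypes

# Hypothetical 32-byte "random" key, chosen so that it contains a zero byte
# and bytes (0xFF) that can never appear in valid UTF-8.
raw_key = b"\x41\x42\x00" + b"\xff" * 29


def as_c_string(data: bytes) -> bytes:
    """Return the portion of `data` a NUL-terminated C API would see."""
    return ctypes.create_string_buffer(data).value


def is_valid_utf8(data: bytes) -> bool:
    """Report whether `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def survives_lossy_round_trip(data: bytes) -> bool:
    """Decode with replacement characters, re-encode, and compare."""
    return data.decode("utf-8", errors="replace").encode("utf-8") == data


print(len(raw_key), len(as_c_string(raw_key)))  # key is cut at the zero byte
print(is_valid_utf8(raw_key))                   # strict decoding fails
print(survives_lossy_round_trip(raw_key))       # lenient decoding alters the key
```

In this sketch the 32-byte key shrinks to 2 bytes once it is treated as a C string, and a lenient decode silently produces different key material, which is why such keys may open a database in one environment and fail in another.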

For these reasons, SQLCipher has standardized on Raw Key syntax since its initial release. It is the supported method for using binary data directly as key material. A Raw Key is simply a fixed-length byte array with a standardized BLOB literal representation and hexadecimal encoding that is interpreted by SQLCipher and used directly as an encryption key. Using this feature, an application can safely generate random key material and instruct SQLCipher to use the exact 32-byte (256-bit) sequence as a key, for example:

PRAGMA key = "x'2DD29CA851E7B56E4697B0E1F08507293D761A05CE4D1B628663F411A8086D99'";

Using this approach, it is the calling application’s responsibility to ensure that the data provided is a 64-character hex string in BLOB format (i.e. x'hexvalue'). As an added benefit, SQLCipher will bypass the key derivation function (KDF) when a Raw Key is provided, which can result in a substantial performance enhancement when opening databases.
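
As an illustration, an application might generate and validate a Raw Key along these lines. This is a sketch only: the function names are hypothetical, and the resulting statement would still need to be executed through whichever SQLCipher binding the application uses.

```python
import re
import secrets

# A Raw Key is x'...' wrapping exactly 64 hex characters (32 bytes).
RAW_KEY_PATTERN = re.compile(r"x'[0-9A-Fa-f]{64}'")


def generate_raw_key() -> str:
    """Return 32 cryptographically random bytes as a BLOB literal."""
    return "x'" + secrets.token_bytes(32).hex().upper() + "'"


def key_pragma(raw_key: str) -> str:
    """Build the PRAGMA key statement, rejecting malformed Raw Keys."""
    if not RAW_KEY_PATTERN.fullmatch(raw_key):
        raise ValueError("raw key must be x'<64 hex characters>'")
    return 'PRAGMA key = "' + raw_key + '";'


print(key_pragma(generate_raw_key()))
```

Validating the format before use catches keys of the wrong length or with non-hex characters early, instead of leaving the application to diagnose a database that will not open.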

If it is not possible for an application to use the Raw Key syntax for some reason, it is still the application’s responsibility to guarantee that the passphrase or key material consists of a valid UTF-8 string. This can be accomplished either by converting the random bytes to a UTF-8-safe string using an intermediate encoding like Base64, or by constructing the random key from a set of valid UTF-8 code points.
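
A minimal sketch of the Base64 option follows, assuming the application passes the result to SQLCipher as an ordinary passphrase. The function names are hypothetical. Note that a key supplied this way is a string rather than a Raw Key, so it goes through key derivation like any other passphrase.

```python
import base64
import secrets


def generate_base64_key(num_bytes: int = 32) -> str:
    """Return random bytes encoded as an ASCII (and so valid UTF-8) string."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def passphrase_pragma(passphrase: str) -> str:
    """Build a PRAGMA key statement for a string passphrase."""
    # Standard Base64 output never contains a single quote, but doubling
    # quotes keeps the statement well-formed for other passphrases too.
    escaped = passphrase.replace("'", "''")
    return "PRAGMA key = '" + escaped + "';"


key = generate_base64_key()
print(len(key))  # 32 random bytes encode to 44 Base64 characters
print(passphrase_pragma(key))
```

Because the Base64 alphabet is plain ASCII with no zero bytes, the resulting string avoids the encoding and truncation problems described earlier.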

Finally, for applications that may already be using potentially invalid keys for some subset of users, we recommend converting key material going forward by using the “rekey” feature to re-encrypt the database with a new key that is of a valid format, i.e. either a Raw Key or a valid UTF-8 string.
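
One way such a migration might look is sketched below. It assumes a hypothetical `conn` object from an SQLCipher-enabled binding that exposes an `execute` method and has already been opened and keyed through the application's existing (legacy) code path; neither the object nor the function name comes from SQLCipher itself.

```python
import secrets


def rekey_to_raw_key(conn) -> str:
    """Re-encrypt an already-unlocked database under a fresh Raw Key.

    `conn` is assumed to be keyed with the legacy key already. Returns the
    new key as 64 hex characters so the caller can store it securely.
    """
    new_hex = secrets.token_bytes(32).hex().upper()
    # PRAGMA rekey re-encrypts the database with the new key.
    conn.execute('PRAGMA rekey = "x\'' + new_hex + '\'";')
    return new_hex
```

The application would need to persist the returned key in its enclave or key store only after the rekey succeeds, and use the Raw Key form for every subsequent open.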