Docs: Configuration

This document describes, in general terms, how to configure Skeema using options. For a list of every specific option, see the options reference instead.

Skeema is configured by setting options. These options may be provided to the Skeema CLI on the command-line, as well as via option files. Handling and parsing of options is intentionally designed to be very similar to the MySQL client and server programs.

This document is primarily geared towards the Skeema command-line tool, although much of the same behavior is matched in Skeema Cloud Linter.

Option types

Options generally take values, which can be string, enum, regular expression, int, size, or boolean types depending on the option.

Non-boolean options require a value. For example, you cannot provide --host on the command-line without also specifying a value, nor can you have a line that only contains “host\n” in an options file. The only special case is the password option, which behaves like it does in the MySQL client: you may omit a value to prompt for password on STDIN.

Boolean options do not require a value; simply specifying the option alone is equivalent to passing a true value. The option name may be prefixed with “skip-” or “disable-” to set a false value. In other words, on the command-line --skip-foo is equivalent to --foo=false or --foo=0; this may also be used in option files without the -- prefix.

String options may be set to any string of 0 or more characters. Due to shell interactions on the command-line, string values containing spaces or control characters must be quote-wrapped; empty strings on the command-line also must use quotes.

Enum options behave like string options, except the set of allowed values is restricted. The option reference lists what values are permitted in each case. Values are case-insensitive.

Regular expression options are used for string-matching. Unless otherwise noted in the option’s documentation, the value should be supplied without any surrounding delimiter; for example, use --ignore-schema='^test.*', NOT --ignore-schema='/^test.*/'. To make a match case-insensitive, put (?i) at the beginning of the regex. For example, --ignore-schema='(?i)^test.*' will match “TESTING”, “Test”, “test”, etc.
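The effect of the (?i) prefix can be tried out with a few lines of Python. This is only an illustration: Skeema is not a Python program and its regex engine may differ in other respects, and the schema names below are made up.

```python
import re

# Pattern as it would be supplied to --ignore-schema, with no surrounding delimiters.
pattern = re.compile(r"(?i)^test.*")

# Hypothetical schema names, used only for illustration.
schemas = ["TESTING", "Test", "test", "production", "my_test"]

# Keep the names the pattern matches; "my_test" is excluded by the ^ anchor.
ignored = [name for name in schemas if pattern.search(name)]
print(ignored)
```

Without the (?i) prefix, only the lowercase name “test” would match.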

Int options must be set to an integer value.

Size options are a special case of int options. They are used for options that deal with file or table sizes, in bytes. Size values may optionally have a suffix of “K”, “M”, or “G” to multiply the preceding number by 1024, 1024^2, or 1024^3 respectively. Options that deal with table sizes query information_schema to compute the size of a table; be aware that the value obtained may be slightly inaccurate. As a special case, Skeema treats any table without any rows as size 0 bytes, even though such a table actually takes up a few KB on disk. This way, you may set a size option to a value of 1 to mean any table with at least one row.
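The suffix arithmetic can be sketched in Python as follows. This is not Skeema's own parser, and the function name parse_size is hypothetical. The sketch accepts only the uppercase suffixes named above, since the text does not say how other casing is handled.

```python
def parse_size(value: str) -> int:
    """Convert a size string such as "10M" into a number of bytes."""
    multipliers = {"K": 1024, "M": 1024**2, "G": 1024**3}
    text = value.strip()
    if not text:
        raise ValueError("size value must not be empty")
    suffix = text[-1]
    if suffix in multipliers:
        # Multiply the number preceding the suffix.
        return int(text[:-1]) * multipliers[suffix]
    # No suffix: the value is already a plain byte count.
    return int(text)


print(parse_size("4K"))   # 4096
print(parse_size("10M"))  # 10485760
```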

Specifying options on the command-line

All options have a “long” POSIX name, supplied on the command-line in format --option-name. Many also have a “short” flag name format, such as -o.

Non-boolean options require a corresponding value, and may be specified on the command-line with one of the following formats:

  • --option-name value
  • --option-name=value
  • -o value
  • -ovalue

Note that the password option is a special case, since it is a string option that does not require a value. If a value is supplied, either the 2nd or 4th form listed above must be used on the command-line. This is consistent with how a password is supplied to the MySQL command-line client.

Boolean options never require a value. They may be supplied in any of these formats:

  • --option-name=value (value of "false", "off", "", or "0" is treated as false; any other is treated as true)
  • --option-name (implies =true)
  • -o (implies true)
  • --skip-option-name (same meaning as --option-name=false)

The short form of boolean options may be “stacked”. For example, if -o and -x are both boolean options, you may supply -xo to set both at once.
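The long-form boolean rules can be summarized in a short Python sketch. It is not Skeema's implementation, and the function name is hypothetical. The text does not cover casing of the value or a “skip-” prefix combined with an explicit value, so the sketch compares values exactly and rejects that combination.

```python
FALSE_VALUES = {"false", "off", "", "0"}


def parse_bool_flag(arg: str):
    """Return (option name, boolean value) for one long-form boolean argument."""
    if not arg.startswith("--"):
        raise ValueError("expected a long option beginning with --")
    name, has_value, value = arg[2:].partition("=")
    for prefix in ("skip-", "disable-"):
        if name.startswith(prefix):
            if has_value:
                # Assumption: this combination is left out of the sketch.
                raise ValueError("prefix combined with a value is not handled here")
            return name[len(prefix):], False
    if has_value:
        # Any value outside the false set is treated as true.
        return name, value not in FALSE_VALUES
    # The bare option name implies true.
    return name, True


print(parse_bool_flag("--foo"))       # ('foo', True)
print(parse_bool_flag("--foo=off"))   # ('foo', False)
print(parse_bool_flag("--skip-foo"))  # ('foo', False)
```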

Specifying options via option files

Skeema option files are a variant of INI format, designed like MySQL cnf files, supporting the following syntax:

option-name=value
some-bool-option   # inline comment

# full-line comment
; full-line comment (alternative syntax -- only works at beginning of line)

[environment-name]
this=that

Options must be provided using their full names (“long” POSIX name, but without the double-dash prefix). Values may be omitted for options that do not require them, such as boolean flags.

Values may optionally be wrapped in quotes, but this is not required, even for values containing spaces. The # character will not start an inline comment if it appears inside of a quoted value. Outside of a quoted value, it may also be backslash-escaped as \# to insert a literal # character.

Sections in option files are interpreted as environment names – typically one of “production”, “staging”, or “development”, but any arbitrary name is allowed. Every Skeema command takes an optional positional arg specifying an environment name, which will cause options in the corresponding section to be applied. Options that appear at the top of the file, prior to any environment name, are always applied; these may be overridden by options subsequently appearing in a selected environment.

If no environment name is supplied to the Skeema CLI, the default environment name is “production”. The Cloud Linter service also always operates using the “production” environment’s configuration.

Environment sections allow you to define different hosts, or even different schema names, for specific environments. You can also define configuration options that only affect one environment – for example, loosening protections in development, or only using online schema change tools in production.
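The way sectionless options combine with a selected environment's section can be sketched in Python. This is a deliberately simplified model, not Skeema's parser: it handles only name=value lines, full-line comments, and section headers, and ignores quoting, inline comments, and value-less boolean options. The function name and all host, user, and schema values are hypothetical.

```python
def load_options(text: str, environment: str = "production") -> dict:
    """Collect sectionless options plus those in the selected environment's section."""
    top = {}        # options above the first [section] header
    selected = {}   # options inside the selected environment's section
    section = None  # section currently being read; None while above any header
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or full-line comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        name, _, value = line.partition("=")
        if section is None:
            top[name.strip()] = value.strip()
        elif section == environment:
            selected[name.strip()] = value.strip()
    # Options in the selected section override sectionless options of the same name.
    return {**top, **selected}


sample = """\
user=skeema_user
schema=myapp

[production]
host=db.example.com

[development]
host=127.0.0.1
user=root
"""

print(load_options(sample))
print(load_options(sample, "development"))
```

With no environment given, the sketch falls back to “production”, mirroring the default described above. Selecting “development” replaces the sectionless user value while keeping the sectionless schema.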

The Skeema CLI always looks for several "global" option file paths, regardless of the current working directory. On Linux and macOS, these files are used:

  • /etc/skeema
  • /usr/local/etc/skeema
  • ~/.my.cnf (special parsing rules apply)
  • ~/.skeema

On Windows, these global config files are used:

  • C:\Program Files\Skeema\skeema.cnf
  • %USERPROFILE%\.my.cnf (special parsing rules apply)
  • %USERPROFILE%\.skeema

Skeema then also searches the current working directory (and its tree of parent directories) for additional option files; see the execution model and priority sections below.

Parsing of the MySQL config file ~/.my.cnf is a special case: instead of the normal environment logic applying, only the sections [skeema], [client], and [mysql] are evaluated. Parsing ignores any options that are unknown to Skeema (which will be most of them, aside from options shared between Skeema and MySQL). If you do not want Skeema to parse ~/.my.cnf at all, you may specify skip-my-cnf in a global option file.

Execution model and per-directory option files

After parsing and applying global option files, Skeema next looks for option files in the current directory path. Starting with the current working directory, parent directories are climbed until one of the following is hit:

  • ~ (user’s home directory)
  • a directory containing .git (the root of a git repository)
  • / (the root of the filesystem)

Then, each evaluated directory (starting with the root-most) is checked for a file called .skeema, which will be parsed and applied if found.

Most Skeema commands – including skeema diff, skeema push, skeema pull, skeema lint, and skeema format – then operate in a recursive fashion. Starting from the current directory, they proceed as follows:

  1. Read and apply any .skeema file present
  2. If both a host and schema have been defined (by this directory’s .skeema file and/or a parent directory’s), execute command logic as appropriate on the *.sql table files in this directory.
  3. If step 2 did not apply, then recurse into subdirectories, repeating steps 1-3 on each subdirectory.

For example, if you have multiple MySQL clusters, each with multiple schemas, your schema repo layout may be of the format reporoot/hostname/schemaname/*.sql. Each hostname subdir will have a .skeema file defining a different host, and each schemaname subdir will have a .skeema file defining a different schema. If you run skeema diff from reporoot, all hosts and all schemas will be diffed. But if you run skeema diff in some leaf-level schemaname subdir, only that schema (and the host defined by its parent dir) will be diffed.
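The three recursive steps can be modeled with a small Python sketch that walks an in-memory directory tree instead of a real filesystem. The tree structure, the function name, and the host and schema values are all hypothetical, and the sketch only reports where command logic would run.

```python
def find_targets(tree: dict, inherited: dict, path: str = ".") -> list:
    """Return (path, host, schema) for each directory where the command would execute."""
    # Step 1: apply this directory's .skeema options over those from parent directories.
    options = {**inherited, **tree.get("skeema", {})}
    # Step 2: with both host and schema defined, operate here and do not descend.
    if "host" in options and "schema" in options:
        return [(path, options["host"], options["schema"])]
    # Step 3: otherwise repeat the process in each subdirectory.
    targets = []
    for name, subtree in sorted(tree.get("subdirs", {}).items()):
        targets.extend(find_targets(subtree, options, f"{path}/{name}"))
    return targets


# Layout of the form reporoot/hostname/schemaname, with invented names.
repo = {
    "subdirs": {
        "db1": {
            "skeema": {"host": "db1.example.com"},
            "subdirs": {
                "app": {"skeema": {"schema": "app"}},
                "logs": {"skeema": {"schema": "logs"}},
            },
        },
        "db2": {
            "skeema": {"host": "db2.example.com"},
            "subdirs": {"app": {"skeema": {"schema": "app"}}},
        },
    }
}

for target in find_targets(repo, {}):
    print(target)
```

Starting from the root of this tree yields three targets. Starting from a leaf, with the parent's host passed in as the inherited options, yields only that one schema.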

Priority of options set in multiple places

The same option may be set in multiple places. Conflicts are resolved as follows, from lowest priority to highest:

  • Option default value
  • /etc/skeema
  • /usr/local/etc/skeema
  • C:\Program Files\Skeema\skeema.cnf (on Windows)
  • ~/.my.cnf
  • ~/.skeema
  • Per-directory .skeema files, in order from ancestors to current dir
    • The root-most .skeema file has the lowest priority
    • The current directory’s .skeema file has the highest priority
  • Options provided on the command-line

This ordering allows you to add configuration options that only affect specific hosts or schemas, by putting them only in a specific subdir’s .skeema file.

In cases of conflicts within the same .skeema file, sectionless options (at the top of the file) have lower priority than options inside of the currently-selected environment’s section.
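The priority order amounts to applying each source in turn, from lowest priority to highest, so that later sources overwrite earlier ones. A minimal Python sketch of that idea follows. The function name, the source labels, and all option values are invented for illustration and are not Skeema's real defaults.

```python
def resolve(sources: list) -> dict:
    """Merge (label, options) pairs listed from lowest to highest priority.

    Returns a mapping of option name to (winning value, label of winning source).
    """
    resolved = {}
    for label, options in sources:
        for name, value in options.items():
            # A later (higher-priority) source replaces any earlier value.
            resolved[name] = (value, label)
    return resolved


sources = [
    ("option default", {"user": "default-user"}),
    ("/etc/skeema", {"user": "etc-user"}),
    ("~/.skeema", {"user": "home-user"}),
    ("reporoot/db1/.skeema", {"host": "db1.example.com"}),
    ("reporoot/db1/app/.skeema", {"schema": "app", "user": "app-user"}),
    ("command-line", {}),
]

for name, (value, label) in resolve(sources).items():
    print(f"{name} = {value}  (from {label})")
```

In this sample, user is set in four places and the deepest .skeema file wins. Adding user to the command-line entry would override all of them.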

Invalid options

Passing unknown/invalid options to the Skeema CLI, either in an option file or on the command-line, causes the program to abort except in two cases:

  • In addition to its own option files, Skeema also parses the MySQL per-user file ~/.my.cnf to look for connection-related options (user, password, etc). Other options in this file are specific to MySQL and unknown to Skeema, but these will simply be ignored instead of throwing an error.

  • Option names may be prefixed with “loose-”, in which case they are ignored if they do not exist in the current version of Skeema. (MySQL also provides the same mechanism, although it is not well-known.) If combining this with the boolean “skip-” prefix, then “loose-” must appear first: “loose-skip-foo”, not “skip-loose-foo”.

Env variables

Beginning with Skeema v1.9.0, some options support the use of environment variables in option files. This feature provides a mechanism for dynamically configuring Skeema’s behavior without having to hard-code values in option files. For example, if the line user=$MY_SKEEMA_USER appears in a .skeema file, the user option will take on the value of the MY_SKEEMA_USER environment variable. This functionality is also useful for supplying separate passwords to different hosts: for example, maindb/.skeema could contain password=$MAINDB_PASSWORD while otherdb/.skeema could contain password=$OTHERDB_PASSWORD.

An environment variable may be used as the value of these options:

To use an environment variable in an option file, simply ensure the value begins with the $ symbol, for example user=$DB_USER. You may optionally wrap the value in double-quotes, such as user="$DB_USER". Either way, if the specified environment variable is not set, the option will be set to a blank string value.

If an option value is wrapped in single quotes, it will not be processed as an environment variable. This provides a mechanism to express values that just happen to begin with a $ symbol but are not environment variables.

Environment variable substitution only works if the full option value is a single complete environment variable name. In other words, usage such as schema=mydb_$FOO, host=${DBSUBDOMAIN}.myhost.com, or schema=$SHARD1,$SHARD2 is not supported at this time.
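The substitution rules above can be sketched in Python. This is a model of the documented behavior, not Skeema's code, and the function name is hypothetical. The text does not say what happens to the unsupported partial forms, so the sketch simply leaves them unchanged; that part is an assumption.

```python
import re

# A value qualifies only if it is exactly one variable name after the $ symbol.
_FULL_VARIABLE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def resolve_value(raw: str, env: dict) -> str:
    """Resolve one option value from an option file against the given environment."""
    # Single-quoted values are never treated as environment variables.
    if len(raw) >= 2 and raw[0] == raw[-1] == "'":
        return raw[1:-1]
    # Double quotes are optional and do not prevent substitution.
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        raw = raw[1:-1]
    match = _FULL_VARIABLE.fullmatch(raw)
    if match:
        # An unset variable results in a blank string.
        return env.get(match.group(1), "")
    # Assumption: any other value, including partial forms, is returned unchanged.
    return raw


env = {"DB_USER": "alice"}
print(resolve_value("$DB_USER", env))    # alice
print(resolve_value("'$DB_USER'", env))  # $DB_USER
```

Passing the environment as a plain dict keeps the sketch easy to test; os.environ could be supplied in its place.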

The password option is a notable case since its default value uses an environment variable: if omitted entirely from option files and the command-line, the default value is “$MYSQL_PWD”, meaning the MYSQL_PWD environment variable will be used as the default password (or no password, if this variable is not set). This is equivalent to the password behavior of the standard mysql client. This default behavior specifically for $MYSQL_PWD works prior to Skeema v1.9.0 as well, even though other environment variable support was not present yet.

Windows users, note that only $UNIX_STYLE syntax is supported for environment variables in .skeema files, not %WINDOWS_STYLE%, even when using the Windows port of Skeema Premium CLI.

Limitations on host and schema options

The host and schema options may be used on the command-line only in skeema init and skeema add-environment. For all other commands, these two options cannot be supplied on the command-line, and should only appear in .skeema files. Also note these two options cannot be set in global option files.

All other commands (skeema diff, skeema push, skeema pull, skeema lint, skeema format) are designed to recursively crawl the directory structure and obtain host and schema information from the .skeema files in each subdirectory. Many Skeema users track multiple database hosts and/or schemas in the same repo, and the values for host and schema vary by subdirectory and environment, so it does not make sense to supply values “globally” for these options.

To interact with just a single schema at a time, simply cd to the schema’s subdirectory before invoking skeema; see the FAQ entry for more information.

If you truly need to supply host and/or schema dynamically, use environment variables in .skeema files in Skeema v1.9.0+. For handling more complicated situations, you can use external program shell-outs, as described in the next section.

Options with variable interpolation

Some string-type options, such as alter-wrapper, are always interpreted as external commands to execute. A few other string-type options, such as schema, are optionally interpreted as external commands only if the entire option value is wrapped in backticks.

In either case, the external command-line supports interpolation of variable placeholders, which appear in all-caps and are wrapped in braces like {VARNAME}. For example, this line may appear in a .skeema file to configure use of pt-online-schema-change:

alter-wrapper=/usr/local/bin/pt-online-schema-change --execute --alter {CLAUSES} D={SCHEMA},t={TABLE},h={HOST},P={PORT},u={USER},p={PASSWORDX}

Or this line might be used in a .skeema file to configure service discovery via host-wrapper, to dynamically map host values to database instances, instead of using the host value literally as an address:

host-wrapper=/path/to/service_discovery_lookup.sh /databases/{ENVIRONMENT}/{HOST}

The placeholders are automatically replaced with the correct values for the current operation. The doc for each option lists what variables it supports.

If a variable value contains spaces, quotes, or control characters, these will be automatically escaped and the value will be wrapped in single quotes. On Linux or macOS, the entire command string is then passed to /bin/sh -c. On Windows, as of Skeema Premium CLI v1.11.0, the entire command string is automatically base64-encoded to avoid issues with control characters, and then passed to powershell.exe -EncodedCommand.
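Placeholder interpolation with shell-safe quoting can be approximated in Python using shlex.quote, which wraps a value in single quotes whenever it contains characters that are unsafe for a POSIX shell. This is only an approximation of the behavior described above: Skeema's own escaping rules may differ in detail, the function name is hypothetical, and raising an error for an unknown placeholder is a choice made for this sketch.

```python
import re
import shlex


def interpolate(command: str, variables: dict) -> str:
    """Replace {VARNAME} placeholders with shell-quoted values."""
    def substitute(match: re.Match) -> str:
        # An unknown placeholder raises KeyError in this sketch.
        return shlex.quote(variables[match.group(1)])

    # Placeholders are all-caps names wrapped in braces.
    return re.sub(r"\{([A-Z]+)\}", substitute, command)


lookup = interpolate(
    "/path/to/service_discovery_lookup.sh /databases/{ENVIRONMENT}/{HOST}",
    {"ENVIRONMENT": "production", "HOST": "db1"},
)
print(lookup)

# A value containing spaces is wrapped in single quotes.
print(interpolate("sometool --alter {CLAUSES}", {"CLAUSES": "ADD COLUMN foo int"}))
```

The first call leaves both values unquoted because they contain only safe characters; the second prints the clause wrapped in single quotes.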