Safety is the primary design consideration throughout all of Skeema’s features and functionality. There are several types of guardrails throughout the system.
- Schema design safety: linter engine with many configurable rules for enforcing company policies and best practices
- Unsafe change detection: automatic detection and prevention of generated DDL which is destructive or highly risky
- Operational safety: numerous options for fine-tuning how DDL is generated and executed
- Internal correctness: functionality for confirming that Skeema has introspected all tables correctly, and empirically verifying that all generated DDL has the expected effect
Schema linting
Skeema’s linter provides a set of configurable rules for detecting schema design problems, and enforcing company policies on use of various database features.
To check for problems in all CREATE statements in your *.sql files, use the skeema lint command. By default, this command also reformats statements to their canonical format, but this behavior may be disabled with the skip-format option.
The skeema diff and skeema push commands also perform linting on the new definitions of all modified objects in the diff. This behavior is enabled by default, but can be disabled with the skip-lint option.
Each linter rule has a corresponding enum option, which can be configured to generate an error, a warning, or be ignored entirely. For example, the lint-pk option specifies how to handle tables that lack a primary key; it can be set to either “error” (emit a fatal error), “warning” (log a non-fatal warning), or “ignore” (disable this linter rule). This is sometimes referred to as configuring the severity of the linter rule.
Linter rule enum options treat skip- prefixes as “ignore”; in other words, skip-lint-pk is equivalent to lint-pk=ignore. This can be especially useful as an ad-hoc command-line override, such as skeema push --skip-lint-pk, if you are confident that a failing linter check can be bypassed safely for a specific operation or situation.
Some linter rules also have supplemental configuration options, such as an allow-list. For example, the lint-charset option checks for acceptable character sets, using the allow-list specified by allow-charset. Together, this pair of options controls character set linting: the allow-charset option is set to a comma-separated list specifying which character sets to allow, while the lint-charset option indicates the severity (error/warning/ignore) for non-allowed character sets.
| Linter option | Default | What it checks/flags |
|---|---|---|
| lint-pk | warning | Tables that lack a primary key |
| lint-dupe-index | warning | Duplicate secondary indexes |
| lint-charset | warning | Column character sets, using allow-list allow-charset |
| lint-engine | warning | Table storage engines, using allow-list allow-engine |
| lint-fk-parent | warning | Foreign key parent-side table missing or lacks unique index |
| lint-reserved-word | warning | Names of objects which conflict with server reserved words |
| lint-name-case | ignore | Tables or views that have uppercase letters in their names |
| lint-zero-date | warning | date, datetime, or timestamp column default values with 00 in day/month/year |
| lint-auto-inc | warning | auto_increment column data types, using allow-list allow-auto-inc |
| lint-pk-type | ignore | Primary key column data types, using allow-list allow-pk-type |
| lint-compression | warning | Table compression options, using allow-list allow-compression |
| lint-display-width | warning | Integer column types with nonstandard display width modifiers |
| lint-definer | error | Definer users for stored objects, using allow-list allow-definer |
| lint-has-enum | ignore | Columns using enum or set data types |
| lint-has-float | ignore | Columns using float or double data types |
| lint-has-time | ignore | Columns using timestamp, datetime, or time data types |
| lint-has-fk | ignore | Any use of foreign keys constraints |
| lint-has-routine | ignore | Any use of stored procedures or functions |
| lint-has-view | ignore | Any use of views (Premium Edition only) |
| lint-has-trigger | ignore | Any use of triggers (Premium Edition only) |
| lint-has-event | ignore | Any use of events (Premium Edition v1.12+ only) |
Supplemental allow-list options include the following:
| Allow-list | Default value | Notes |
|---|---|---|
| allow-auto-inc | int unsigned, bigint unsigned | |
| allow-charset | latin1, utf8mb4 | |
| allow-compression | none, 4kb, 8kb | |
| allow-definer | %@% | Default is permissive (allows any user) |
| allow-engine | innodb | |
| allow-pk-type | n/a | Must be explicitly configured if using lint-pk-type |
Unsafe change detection
By default, skeema push refuses to execute DDL on a schema if any of the requested operations are “unsafe” – that is, they have the potential to destroy data, or a high risk of causing subtle problems for application workloads. Similarly, skeema diff also returns a fatal error in this case. Even though skeema diff never executes DDL anyway, it serves as an accurate “dry run” for skeema push and therefore aborts in the same fashion.
The following operations are considered unsafe, as of the most recent version of Skeema:
- Dropping a table
- Altering a table to drop a normal column or stored (non-virtual) generated column
- Altering a table to modify the data type of an existing column in a way that potentially causes data loss, truncation, corruption, or reduction in precision
- Altering a table to modify the character set of an existing column
- Altering a table to modify the collation of an existing column, if that column is already part of a uniqueness constraint
- Altering a table to change its storage engine
- Dropping a stored procedure or function
- Dropping and re-creating a stored procedure or function to change its parameters or return type
- Dropping and re-creating a MySQL stored procedure or function to change its body, DEFINER, DETERMINISTIC property, or creation-time metadata
- This is only unsafe in MySQL, which lacks atomic
CREATE OR REPLACEsyntax. In MariaDB, this operation safe.
- This is only unsafe in MySQL, which lacks atomic
- Dropping a view (only supported in premium Skeema products)
- Dropping a trigger (only supported in premium Skeema products)
- Dropping an event, unless it is already disabled/expired (only supported in premium Skeema products)
In order to proceed with unsafe changes, you must enable the allow-unsafe option. Run skeema diff --allow-unsafe to first preview the full set of changes with unsafe operations permitted, potentially followed by skeema push --allow-unsafe to execute the operations.
It is not recommended to permanently enable allow-unsafe in an option file or wrapper script, especially in the production environment. It is safer to require users to supply it manually on the command-line on an as-needed basis, to serve as a confirmation step for unsafe operations.
You may also configure Skeema to conditionally permit unsafe operations on tables below a certain size (in bytes), or always permit unsafe operations on tables that have no rows. This behavior is configured through the safe-below-size option. For example, you could configure safe-below-size=10M to automatically permit unsafe operations on tables under 10 megabytes in size, or safe-below-size=1 to automatically permit unsafe operations only for completely empty tables.
Operational safety
Even if your desired schema change is non-destructive and passes all configured linter checks, there can be additional safety concerns around operational execution. These risks depend on your company’s specific database environment, and often require human involvement, configuration, and/or coordination.
Dry run
Before running skeema push, be sure to always review the generated DDL diff by first running skeema diff. Under the hood, these commands share the exact same underlying code paths, with only two major behavioral differences:
skeema diffdisplays generated DDL without actually executing it on the target database server, whereasskeema pushdisplays and then executes the generated DDL.skeema diffhas different process exit code behavior thanskeema push, as described in each commands' documentation.
Running skeema diff is actually completely equivalent to using skeema push --dry-run. In terms of the implementation, skeema diff is just a shim around skeema push --dry-run with slightly different command help/usage text.
ALTER TABLE operational safety
By default, altering a large table can be disruptive to application queries, depending on your database server version and the type of alteration being performed. Refer to the online schema change documentation for in-depth coverage on how to configure non-disruptive table alteration. The bottom of that page also covers metadata locking problems, which can cause unexpected query pile-ups.
DROP TABLE operational safety
Prior to MySQL 8.0.23, whenever innodb_buffer_pool_size is large, DROP TABLE can potentially impact database performance even if the table being dropped is small or empty.
Regardless of database server version, DROP TABLE performance is improved when innodb_adaptive_hash_index is disabled on the server. Additionally, if you self-host your databases (rather than using e.g. RDS or CloudSQL), choice of Linux filesystem can also greatly affect DROP TABLE performance for large tables. As an example, XFS can unlink files quickly regardless of file size, whereas EXT3/EXT4 cannot. Unlink performance can also depend on the specific filesystem mount options.
Foreign key constraint validation
By default, skeema push executes DDL in a session with foreign key checks disabled. When adding a new foreign key to an existing table, no immediate integrity check is performed on existing data. This results in faster ALTER TABLE execution, avoids order-of-operation issues with circular dependencies, and eliminates one possible failure vector for the DDL.
This behavior may be overridden by enabling the foreign-key-checks boolean option. When enabled, skeema push enables foreign key checks for any ALTER TABLE that adds one or more foreign keys to an existing table. This means the server will validate existing data’s referential integrity for new foreign keys, and the ALTER TABLE will fail with a fatal error if the constraint is not met for all rows.
Virtual column validation
When adding a new VIRTUAL column, by default MySQL servers will not confirm that the column generation expression is valid for all existing rows of the table. To perform this validation in MySQL, enable Skeema’s alter-validate-virtual boolean option, which will add a WITH VALIDATION clause to any generated ALTER TABLE which affects virtual columns. However, this validation can cause slower ALTER TABLE performance. Refer to the MySQL manual for more information.
In contrast, on a MariaDB server, validation of virtual columns occurs only when an ALGORITHM=COPY clause is used. To force validation on a MariaDB server, you must use Skeema’s alter-algorithm=copy option setting, which may cause ALTER TABLE to be significantly slower, and potentially disruptive to writes.
When an external online schema change tool is being used, such tools typically rebuild an entire table, in which case the database server will inherently perform virtual column validation as rows are bulk-copied into the new table. No special syntax is required in that situation.
Workspace safety
In order to accurately introspect the schemas represented in your filesystem’s *.sql files, Skeema actually runs your CREATE statements in a temporary location, called a workspace. By default, Skeema creates, uses, and then carefully drops a schema called _skeema_tmp on each database server it interacts with. However, this behavior is highly configurable.
When operating on schemas containing hundreds of table definitions, workspace activity can potentially be disruptive to busy production database servers, especially on older versions of MySQL with large InnoDB buffer pools. Be sure to carefully read the workspace documentation, which describes how to configure offloading workspace activity to either a local Docker container (any edition of Skeema) or a non-production database server of your choice (Skeema Premium only).
Internal correctness guardrails
Skeema verifies the correctness of its own behavior in several ways.
No reliance on SQL parsing
Skeema’s table introspection behavior does not rely on parsing/interpreting SQL DDL. Parsing SQL is brittle across various MySQL and MariaDB versions, which have subtle differences in features, functionality, and syntax. Instead, Skeema introspects your tables by querying information_schema and using SHOW CREATE commands, to obtain metadata directly from the database itself. See the workspace documentation to learn how this works for interpreting your CREATE statements from the filesystem.
When operating on a workspace, Skeema halts immediately if any workspace table is detected to be non-empty (contains any rows). This prevents disaster if someone accidentally misconfigures Skeema’s workspace-related options to point to a real/live database schema.
Integration test suite
Skeema’s codebase has an extensive suite of unit, integration, and end-to-end functional tests. Every commit goes through automated testing against the most commonly-used versions of MySQL and MariaDB. For new Skeema releases, the test suite is run using every supported version of MySQL, MariaDB, and Percona Server. Skeema Premium is additionally tested against every supported version of Amazon Aurora.
Generated ALTER TABLE verification
Whenever Skeema generates ALTER TABLE statements, it will automatically verify their correctness by testing them in a workspace. An empty copy of the old (live) table definition is created in the workspace schema, and then the auto-generated ALTER TABLE is run against it. Skeema then verifies that the resulting table definition correctly matches the desired state from the filesystem .sql file, confirming that the generated ALTER TABLE is empirically correct for your specific database server flavor/version.
If verification fails, Skeema aborts with a fatal error. This is extremely rare in practice; throughout Skeema’s decade-long history, it has only occurred in a few reported edge cases, which have long since been corrected.
If desired, diff verification can be skipped by disabling the verify option. This can improve diff performance slightly.
Table introspection validation
In order to validate that a table has been correctly introspected, Skeema internally reconstructs the expected output of SHOW CREATE TABLE, and compares this to the actual output of that command. This technique confirms that your version of Skeema fully supports all aspects of the table’s structure. A few rarely-used table features are not supported by Skeema, due to the necessary introspection code not being implemented yet; Skeema’s introspection validation process allows such cases to be caught automatically.
If an unexpected mismatch is found, any generated ALTER TABLE goes through an adjusted diff verification process in a workspace:
- If a table uses unsupported features but the generated
ALTER TABLEnonetheless successfully brings an empty clone of the table to the new desired state, this means the requested modification didn’t actually touch any of those unsupported features. Therefore the diff is correct and safe, and Skeema can proceed as normal. - However, if the generated
ALTER TABLEdid not bring the cloned table to the desired final state, this means the requested modification to the table potentially involves some table feature that Skeema does not understand or support yet.
When examining modifications to tables with unsupported features, this diff verification process differs from the usual procedure in two important ways:
- For such tables, diff verification occurs even if the verify option is disabled.
- For such tables, if diff verification fails, the affected table is simply skipped with a warning, instead of a fatal error. (Verification is expected to fail for diffs involving unsupported features, so this situation does not represent an internal correctness bug.)
If you need to modify unsupported features of a table, you will need to run the alter externally (e.g., direct invocation of a desired ALTER TABLE, or through pt-online-schema-change / gh-ost). Afterwards, you can update your schema repo using skeema pull, which will work properly even on these tables. Also note that Skeema can always CREATE or lint such tables; only ALTER is potentially problematic.
If you encounter an “unsupported diff” warning for a situation that isn’t already documented, first confirm you are running a recent release of Skeema, and upgrade if not. If the problem remains, please file a bug report issue on GitHub.