Skeema v1.4.0 released

November 22, 2019

Skeema v1.4.0 has been released! This new version includes diff support for partitioned tables and generated columns. Several other enhancements are included, especially for users with large-scale database deployments.

Partitioned tables

Previously, Skeema was unable to alter partitioned tables. This release adds support for common partitioning use-cases, as well as the ability to properly diff all non-partitioning-related aspects of a partitioned table.

A new partitioning configuration option has been added to handle different push workflows. The default setting of partitioning=keep permits initial partitioning of tables, but prevents accidental de-partitioning or re-partitioning by ignoring any subsequent changes to the partitioning clause for an already-partitioned table. To allow these behaviors, supply --partitioning=modify on the command-line for skeema diff and skeema push.

Some companies use partitioning in production, but omit it in development. Skeema fully supports this workflow – simply use partitioning=remove in a [dev] section of a .skeema file. Then, whenever running a diff or push against dev, all partitioning clauses in your *.sql files will automatically be ignored.

Many users of partitioning rely on external scripts/crons to maintain partition lists, especially with RANGE partitioning. Skeema’s partitioning support is designed to “stay out of your way” with this workflow. Once a table is partitioned in a live database, Skeema won’t interact with its partition list, regardless of the partitioning option. This way, your existing partition management scripts don’t need to interact with your schema repository.

Thank you to Etsy for generously sponsoring the development of this functionality!

Generated columns

Recent versions of MySQL and MariaDB support generated columns – the ability to define columns that are automatically computed based on an expression. Skeema is now able to properly diff tables that use generated columns in MySQL 5.7+, Percona Server 5.7+, and MariaDB 10.2+.

Generated columns may be stored (materialized into the table row data) or virtual (evaluated at read time). Skeema supports both possibilities. For schema management purposes, stored generated columns function similarly to normal columns, but virtual generated columns have a couple unique properties in Skeema:

  • Since virtual columns take up no storage space, dropping them does not require the allow-unsafe option, as any accidental drop can be reverted relatively painlessly.
  • MySQL supports a special WITH VALIDATION ALTER clause when adding virtual columns, to verify that all existing rows of a table fit into the column type. Skeema’s new alter-validate-virtual option can be used to enable this clause.

Other enhancements

As a sharding improvement, the schema option now permits use of regular expressions to match existing sharded schemas that follow a common naming convention. Simply wrap the value in forward slashes to denote a regex; for example, schema=/^foo/ in a directory’s .skeema file will map the dir to all schema names beginning with prefix “foo”.

A new linter check, lint-has-float, can be enabled to warn or error on use of float or double column types. This optional check is designed for companies storing monetary data, which requires exact precision.

Two new options are designed to improve the functionality of the default workspace=temp-schema, especially when operating at scale:

  • The temp-schema-threads option controls concurrency; increase it to improve performance in high-network-latency scenarios, or reduce it to prevent mutex contention in high-volume OLTP production environments.
  • The temp-schema-binlog option controls whether workspace statements are written to the database’s binary log. By default, binlogging is now skipped (preventing unnecessary replication of workspace statements) whenever Skeema’s user has sufficient privileges to do so.

Finally, the default readTimeout (driver-level query timeout) has been increased from 5 seconds to 20 seconds. Users may still override this using connect-options as before if desired; only the default has changed. As an improvement for users with large tables, any ALTER TABLE or DROP TABLE statements executed directly by Skeema (e.g. without an alter-wrapper or ddl-wrapper script) are now automatically excluded from the readTimeout.

CI system improvements

Skeema Cloud Linter now includes the above changes. Add it to your schema repo (or monorepo) on GitHub, and receive automated diff comments and inline linter annotations on pull requests and master commits.

Private repos are fully supported, and there’s no charge for individual accounts. A free trial is available for organization accounts.

Commercial support and consulting services

If your company needs expert assistance with configuring your schema management workflow, don’t hesitate to get in touch! We’re always happy to discuss potential consulting services, custom development, commercial support, and opportunities to sponsor open source improvements to Skeema.