Enforce Primary Key constraints on Replication

In this post, we introduce a configuration option that controls whether replication channels allow the creation of tables without primary keys. This continues our recent work on replication security, where we allowed users to enforce privilege checks, and/or enforce row-based events.

On 8.0.20 we introduce a new option for the CHANGE MASTER TO statement: REQUIRE_TABLE_PRIMARY_KEY_CHECK. This enables a replication channel to select its own policy when executing queries that create or alter table definitions and their primary keys.

Enforcing primary keys on table definitions is important for example when replicating using row-based logging were table keys play an important role in the replica performance. The tool in the server for enforcing this policy is the variable sql_require_primary_key. In the context of replication, the value of this variable will be sent together with all queries that change a table structure, also known as DDL, and so the replica will follow whatever restrictions were in place on the primary.

However, if the operator of the replica does not control or trust the primary server, it does not suffice to follow restrictions defined there. For this reason, this behavior can now be influence with the value of
REQUIRE_TABLE_PRIMARY_KEY_CHECK

This parameter can be set on a channel to:

  • ON: the replication channel always uses the value ON for the sql_require_primary_key system variable in replication operations, requiring a primary key in all create and alter table operations.
  • OFF: the replication channel always uses the value OFF for the sql_require_primary_key system variable in replication operations, so that a primary key is never required when creating or altering tables, even if the primary enforced such restrictions.
  • STREAM: the default; the replication channel uses whatever value is replicated from the primary for each transaction. This preserves the previous server behavior.

Usage and advantages

The first use case for this new addition is in scenarios where there is no tight control over the primary instance from which the data originates. In such cases, REQUIRE_TABLE_PRIMARY_KEY_CHECK=ON ensures that no primary keys are removed from your tables definitions thus causing performance issues.

This feature is also particularly interesting in multi source replication scenarios. It allows for a more uniform behavior across replication channels from different primaries, keeping a consistent value for sql_require_primary_key.
Using ON safeguards against the loss of primary keys when multiple primaries update the same set of tables and there was a mistake on one of them. Using OFF allows primaries that can manipulate primary keys to work alongside primaries that cannot.

This feature also has advantages when using privilege checks in the replication channel, as setting REQUIRE_TABLE_PRIMARY_KEY_CHECK to a value different from STREAM means the configured user account for PRIVILEGE_CHECKS_USER no longer needs privileges to manipulate sql_require_primary_key. If set to STREAM, besides the basic privileges to create or alter a table, the privilege checks user is required to have session administration level privileges to replicate any query that executes one of these actions in the replica.

Configuration

To explicitly change the behavior of channel in regards to how it handles primary key checks policies you need to stop the replica SQL thread.

Observability

The Performance Schema tables related to the slave applier status were enhanced to display the status of the new CHANGE MASTER TO … statement option, REQUIRE_TABLE_PRIMARY_KEY_CHECK:

Some notes on usage

This feature is affected by RESET SLAVE ALL, but not by RESET SLAVE.

Also, while the Group Replication plugin does enforce every query to be executed in a table with a primary key, the check does not depend on sql_require_primary_key and is less restrictive. Read more on that here.

Summary

This feature is a new tool to secure your replication streams in complex and diverse environments, while also allowing you better control over the privileges you give your replication applier users.

We hope this new feature will allow you to create more secure solutions with the MySQL server. Feel free to test it, and tell us your opinion.

 3,930 total views,  12 views today

About Pedro Gomes

Who am I? I'm a replication developer @ MySQL since 2013, and a fan of all things distributed so it's hard not to love my job. Raised on the distributed lab of Minho's University, home of great academic research on the field, I joined Oracle following this same passion and here I am!

Leave a Reply