Automatic Schema Synchronization in NDB Cluster 8.0: Part 2

In part 1, we took a brief, high-level look at the various protocols and mechanisms used to keep the Data Dictionary (DD) of MySQL servers connected to a MySQL Cluster in synchronization with each other and with the NDB Dictionary. More specifically, we explored the problems with the implementation of user-triggered synchronization in the NDB Cluster 7.x versions. These concerns are addressed in NDB Cluster 8.0 through a new feature: Automatic Schema Synchronization (or auto schema sync for short).

A new component called Metadata Change Monitor has been introduced to detect any NDB metadata changes. This component runs in the background and compares the contents of the NDB Dictionary with that of the MySQL server’s DD at fixed, user-configurable intervals of time. The Metadata Change Monitor detects any mismatch i.e. the scenario where a metadata object exists in NDB Dictionary and is missing from the MySQL server DD and vice versa. The metadata objects checked for mismatches are:

  • Logfile groups
  • NDB tablespaces
  • Databases (or schemata) containing NDB tables
  • NDB tables

The Metadata Change Monitor submits any mismatched objects detected to a queue from which they are eventually synchronized with the NDB Dictionary. These objects are eventually synchronized by the NDB Event Handling component making the discovery and synchronization of mismatched objects asynchronous by design. The NDB Event Handling component picks up an object from the head of the queue and attempts to synchronize it by either creating or deleting the object in the MySQL server’s DD depending on whether it exists in NDB Dictionary or not. The rate of schema object synchronization is throttled to avoid any significant performance overhead.

Usability

The primary goal for auto schema sync in terms of improving usability is to remove the need for users to perform a manual step in order for metadata changes made using native NdbApi to be visible in MySQL servers. By default the Metadata Change Monitor component polls for mismatches every 60 seconds which ensures that all metadata changes are eventually propagated to the MySQL servers without any user intervention. The feature can be enabled or disabled by setting the ndb_metadata_check MySQL server system variable to 1 or 0 while the interval can be tweaked using the ndb_metadata_check_interval system variable. The shorter the interval, the quicker the mismatches will be detected and synchronized but this also results in more resource utilization which is a trade-off the user will have to be wary of.

There are a couple of MySQL server status variables: Ndb_metadata_detected_count and Ndb_metadata_synced_count which contain the count of the number of objects detected and synchronized respectively.

The above mechanism ensures that the metadata is eventually present in the MySQL server’s DD and also serves as an option to fall back on for certain failed schema distribution or schema synchronization attempts. It is, however, not a drop-in replacement for the erstwhile SHOW TABLES behaviour. There’s still a valid use case where, for example, an application needs to restore metadata using the ndb_restore utility and then ensure that all the metadata is now present in the MySQL server before continuing with further processing. In such cases, the eventual consistency achieved by the polling Metadata Change Monitor and synchronization of the queue is not ideal as it would require additional application logic to see if the metadata is present or polling the above status variables until the desired state is detected. To solve this, a new MySQL server system variable called ndb_metadata_sync was introduced.

The usage is neatly summarized in the MySQL manual which is quoted verbatim below for the sake of convenience:

Setting this variable causes the change monitor thread to override any values set for ndb_metadata_check or ndb_metadata_check_interval, and to enter a period of continuous change detection. When the thread ascertains that there are no more changes to be detected, it stalls until the binary logging thread has finished synchronization of all detected objects. ndb_metadata_sync is then set to false, and the change monitor thread reverts to the behavior determined by the settings for ndb_metadata_check and ndb_metadata_check_interval.

This can be demonstrated with the help of a small example as follows:

Assume that the above metadata is backed up using the ndb_mgm client (skipped for the sake of brevity) and then the database ‘db1’ is dropped using the MySQL client. The ndb_restore utility can be used to create the metadata in the NDB Dictionary but not in the DD of the MySQL server. Rather than waiting for periodic polling to find the mismatch and synchronize the schema, a user can simply set the ndb_metadata_sync variable to true and wait until it is automatically flipped back to its default value of false.

Global Locks

In the NDB Cluster 7.x implementation, a global lock is taken which spans the entire duration of the synchronization activity. With auto schema sync, it is now held only for multiple, short intervals. The NDB Event Handling component acquires (and releases) this global lock on a per object basis. An important thing to note is that a try-lock strategy is employed when it comes to acquiring this lock. This, coupled with the fact that the locks are short-lived, makes auto schema sync less intrusive and less likely to affect other DDL changes that may be taking place in parallel.

No additional overhead during SHOW TABLES

In NDB Cluster 8.0, the SHOW TABLES query does just that and no more. The additional synchronization and resource contention in terms of locking that occurs as a side-effect in NDB Cluster 7.x versions has been completely removed.

Design concern

The Metadata Change Monitor component is used to simply detect any mismatches and submit it to the NDB Event Handling component. It is the NDB Event Handling component that is actually responsible for acquiring the appropriate global and metadata locks while modifying the MySQL server’s DD. This is in line with the design of the schema synchronization and schema distribution protocols therefore aligning the 3 different mechanisms from a design perspective. From a code point of view, this also enabled removal of code since the functionality is encapsulated in a single place.

One interesting design challenge for this feature is the scenario when the NDB Event Handling component fails to synchronize an object due to a permanent error in execution. In such cases, the same mismatch could be detected again and again by the Metadata Change Monitor along with (possibly) successive failed attempts by the NDB Event Handling component. This is prevented by maintaining a blacklist of objects that the NDB Event Handling component has failed to synchronize. On failure, the object is placed in the blacklist. The user is then expected to resolve the mismatch by attempting to discover the object using SELECT or SHOW queries or triggering a reconnection of the MySQL server to the MySQL Cluster in more extreme cases. The number of objects present in the blacklist can be checked using the Ndb_metadata_blacklist_size variable.

For as long as an object exists in the blacklist, it’s ignored by the Metadata Change Monitor in subsequent iterations. The validation of the objects in the blacklist is done by the Metadata Change Monitor at the beginning of the next detection cycle. Each object in the blacklist is checked to see if the mismatch still exists. If it doesn’t, then the object is removed from the blacklist and is considered to be a viable candidate for automatic schema synchronization from that point onwards. If the mismatch still exists, then the object is ignored for another detection cycle and will continue to be ignored until the user manually intervenes to correct the mismatch.

Summary

From a user point of view, the main change as a result of auto schema sync in NDB Cluster 8.0 is how metadata restored using the ndb_restore utility is propagated to the DD of the MySQL server.

In 7.x versions, users are expected to issue the following query in order to synchronize changes:

In 8.0, users can simply wait for the periodic polling and synchronization of the changes to occur. The polling period can be changed by tweaking the ndb_metadata_check_interval MySQL server system variable:

Alternatively, in 8.0, the user can set the ndb_metadata_sync MySQL server system variable to true and wait until it is automatically flipped back to false:

There’s more work planned in the area with increased functionality and exposing more details to the user on top of the wishlist. As with any new feature, early feedback from the community is vital and much appreciated!

 6,699 total views,  24 views today

About Arnab Ray

Arnab Ray is a Lead Member Technical Staff at MySQL Cluster, Oracle. He works primarily in the area of continuous integration of all MySQL Cluster components affected by changes to the MySQL Server. This includes a special focus on the ndbcluster storage engine plugin's usage of the MySQL server's Data Dictionary.

One thought on “Automatic Schema Synchronization in NDB Cluster 8.0: Part 2

Leave a Reply