MySQL-5.7.6: Introducing Multi-source replication

On March 10, 2015, we released MySQL-5.7.6 and among many other things it includes multi-source replication which provides the ability for a MySQL slave server to replicate from more than one MySQL master. We have been very careful with multi-source replication in terms of what exactly our users want and we have tried our best to incorporate as much feedback as possible. In the process, we released the feature twice under MySQL Labs asking users to try it out and tell us:

  1. If this caters to all their use cases,
  2. Plays well with the existing applications using single source replication so our users don’t have to change their old scripts at all if they do not intend to use multi-source and instead stick to single source replication.
  3. The user interface is in sync with our naming conventions to date, easy to understand and intuitive etc.

Note that the look and feel changed as we moved from one lab release to another and finally to the latest version as MySQL-5.7.6. In this post, I aim to introduce the released feature, the commands and how you can monitor the multi-source replication internals. Let’s start with the following figure that best illustrates the core of multi-source replication.

multi-source-basic

In the figure above we have three MySQL sources (independent MySQL servers- master 1, master 2 and master 3) replicating to the only slave acting as a sink to collect all the data from all the three sources.

The use cases of multi-source, as you have probably already guessed, are related to data aggregation. Note that there is no conflict detection or resolution built into multi-source replication. We expect the application to make sure that data coming from different sources are non-conflicting. A typical setup could be something like this:

multi-source-with-databases

 

The same concept could be extended to shards (instead of databases). So another use case of multi-source replication is to join shards and make a full table on the sink (aka slave). .

To understand how to configure and monitor multi-source replication, we introduced the notion of channels. A channel is an abstraction of the internals of MySQL replication’s finer details. It hides the machinery underneath while providing the level of detail that helps the user manage and understand multi-source replication. From a user perspective you can imagine a channel as a pipe between a master and a slave. If there are multiple masters, there are the same number of channels (or pipes) emerging out of the slave server as the number of sources as shown in the picture below:

multiple-channels

 

If you understand MySQL internals already and want to know exactly what constitutes a channel, look at the pink strip in the following picture. The replication channel documentation has all the details. But if you don’t know these details already, ignore this figure and move ahead. After all that is what we wanted to achieve with the concept of a channel.

channel

 

With the concept of channels established, you can now follow steps described in the tutorial section of our official documentation to work with multi-source replication. Note how the FOR CHANNEL <channel_name> clause now allows you to take each master-slave instance individually and work with them as if you were working with a single source replication topology.

Having set up multi-source replication and making sure there are no conflicts you can expect it to just work out of the box. But if you want more details you could look at our monitoring interfaces to provide you the details on every channel. In MySQL-5.7.2, we introduced performance_schema tables to monitor replication, the good news is that these tables were always designed with multi-source replication in mind so they should work seamlessly with multi-source replication. All six replication performance schema tables now have a “channel_name” field added to them to individually access the configuration and status on each channel. As an example, we have described performance_schema.replication_connection status in our manual. Given that you are working with multiple channels, lets walk through this table again and see how it presents the internals. Try out the following query to look at the receiver module:

ps-table3

We can see quite a few things here:

  1. Which master does channel1 replicate from?
    The one whose UUID is as given in the “source_uuid” column.
  2. Which thread is responsible for receiving transaction through channel1?
    Thread number 13. Note that this thread number is same as the one in performance_schema.threads table. You can use this information to now look into other performance_schema tables to mine more statistics using the joins.
  3. Is the receiver module of channel1 active or down at the moment?
    The channel is active and receiving transactions because its service state is ON.
  4. What transactions have been received via channel1?
    Transactions 1-4. See the global transaction identifiers in the field “received_transaction_set”.
  5. What if there was an error?
    The last three fields in each row give an idea of the error on that channel, if any.
    See  the example below where channel1 has lost connection:
    ps-table4
  6. How about my network?
    Look at the last two columns to get an idea of this. The last time a heartbeat signal was sent is shown in the “last_heartbeat_timestamp” column.
    The “count_received_heartbeat” indicates how frequently heartbeats are being sent out. A big number here would mean that either the connection is not stable or this channel was idle having nothing to replicate.

Likewise there are more replication performance_schema tables that you can use to get to finer details. If there is anything else that you think should be available in these tables per channel, leave us a comment and we would be happy to take it forward. We truly believe in working with our users and look forward to your experience with this feature. There is a lot more to explore in multi-source replication so go try out the feature and if there is anything that you have suggestions for, please do leave a comment. Don’t forget that MySQL 5.7.6 is a development milestone release (DMR), and therefore not yet declared generally available.

Purging and Rotating Logs Made Easy with MySQL Utilities

The MySQL Utilities team is happy to introduce a new MySQL utility named  ‘mysqlbinlogpurge‘, which allows users to easily and safely purge binary logs on a master by determining the binary logs that are obsolete. This utility is included in MySQL Utilities
release-1.6.1 Alpha release along with ‘mysqlslavetrx‘ and ‘mysqlbinlogrotate‘.
This utility enables you to purge binary logs by ensuring that any files which are in use or required by any of the slaves in a replication topology are not deleted. This is achieved by
checking which binary logs have been read on each slave. This determines the minimal set of binary log files that can be purged.
The ‘mysqlbinlogpurge’ Utility automates the steps described in:
to manually verify which binary logs are not required by any of the slaves.
The ‘mysqlbinlogpurge’ Utility can help you to quickly free some space on the master without increasing space for your current replication topology.

Main Features

————-
By default, ‘mysqlbinlogpurge’ purges all of the the binary log files that are not in use. You can use the –binlog option to override this behaviour and configure the first binary log file to omit from the purge list. The ‘mysqlbinlogpurge’ is safe to use because it only purges the binary logs if all of the slaves are actively participating in a replication topology. You can use the –dry-run option to display a list of binary log files to be purged.

Requirements

————
The ‘mysqlbinlogpurge’ utility requires the SUPER and REPLICATION SLAVE permissions for the connected user in order to be able to purge the binary logs.

Examples

——–
The following is an example of a purge operation of all the binary log files not in use from a master, specifying the slaves to check and displaying additional information in the process.

 

shell> mysqlbinlogpurge –master=root:root@localhost:3310 \
–slaves=root:root@localhost:3311,root:root@localhost:3312,root:root@localhost:3313 \
-vv
exec_util command=python -u ../scripts/mysqlbinlogpurge.py –master=root:root@localhost:3310 –slaves=root:root@localhost:3311,root:root@localhost:33
12,root:root@localhost:3313 -vv
# Checking user permission to purge binary logs…
#
# Master active binlog file: mysql-bin.000021
# Checking slave: localhost@3311
# I/O thread is currently reading: mysql-bin.000021
# File position of the I/O thread: 120
# Master binlog file with last event executed by the SQL thread: mysql-bin.000021
# I/O thread running: Yes
# SQL thread running: Yes
# Checking slave: localhost@3312
# I/O thread is currently reading: mysql-bin.000021
# File position of the I/O thread: 120
# Master binlog file with last event executed by the SQL thread: mysql-bin.000021
# I/O thread running: Yes
# SQL thread running: Yes
# Checking slave: localhost@3313
# I/O thread is currently reading: mysql-bin.000021
# File position of the I/O thread: 120
# Master binlog file with last event executed by the SQL thread: mysql-bin.000021
# I/O thread running: Yes
# SQL thread running: Yes
# Range of binlog files available: from mysql-bin.000016 to mysql-bin.000021
# Latest binlog file replicated by all slaves: mysql-bin.000020
# Latest not active binlog file: mysql-bin.000020
# Executing query PURGE BINARY LOGS TO ‘mysql-bin.000021′
# Binlog file available: mysql-bin.000021
# Range of binlog files purged: from mysql-bin.000016 to mysql-bin.000020
Upon completion, the utility will display additional information such as status of the I/O and SQL threads of each slave along with the list of binary logs purged and the remaining
binary logs available on the master.

Try it now and send us your feedback

MySQL Utilities release-1.6.1 alpha is available for download from the following links.
The documentation of MySQL Utilities can be obtained from the following link:
http://dev.mysql.com/doc/index-utils-fabric.html
Contributing Ideas:
- Fabric, Sharding, HA, Utilities
Forum: http://forums.mysql.com/list.php?144

Announcing MySQL Utilities release-1.5.4 GA!

The MySQL Utilities Team is pleased to announce the general availability (GA) release of MySQL Utilities. This release includes a number of improvements for useabilty, stability, and a few enhancements. A complete list of all improvements can be found in our release notes.

New Enhancements!

This release represents a stable release of the product. Along with several defect patches, we also include the following enhancements.

Improved support for MySQL 5.7 early releases
Improved output for mysqldbcompare
Improved SSL support

How Can I Download MySQL Utilities?

You can download MySQL Utilities 1.5.2 GA from the following link using one of the pre-built installation repositories including a source download.

http://dev.mysql.com/downloads/tools/utilities/

Where is the Documentation?

You can find online documentation for MySQL Utilities version 1.5 at:

http://dev.mysql.com/doc/index-utils-fabric.html

Announcing MySQL Utilities release-1.6.1 Alpha!

The MySQL Utilities Team is pleased to announce a new alpha release of MySQL Utilities. This release includes a number of improvements for useability, stability, and a few enhancements. A complete list of all improvements can be found in our release notes.

New Enhancements!

This release represents a stable release of the product. Along with several defect patches, we also include the following enhancements.

Improved support for MySQL 5.7 early releases
Improved output for mysqldbcompare
Improved SSL support

New Utilities!

This release also has three new utilities for you to try out:

mysqlslavetrx – skip transactions on one or more slaves to solve errant transactions and consistency errors – docs
mysqlbinlogpurge – purge old binary logs – docs
mysqlbinlogrotate – rotate the binary logs – docs

How Can I Download MySQL Utilities?

You can download MySQL Utilities 1.6.1 Alpha from the following link using one of the pre-built installation repositories including a source download. Click on “Development Releases” to see the 1.6.1 release downloads.

http://dev.mysql.com/downloads/tools/utilities/

Where is the Documentation?

You can find online documentation for MySQL Utilities version 1.6 at:

http://dev.mysql.com/doc/index-utils-fabric.html

Enabling Global Transaction Identifiers Without Downtime in MySQL 5.7.6

A much requested feature has made it to MySQL 5.7.6: Global Transaction Identifiers (GTIDs) can now be enabled online, without stopping writes, synchronizing servers, or restarting servers.

We introduced GTIDs in MySQL 5.6.6. GTIDs allow, among other things, seamless fail-over after a master has crashed. This enables highly available applications, since service can continue without interruption even if the master crashes. For new deployments, you could easily enable GTIDs right away – just set gtid-mode=ON in the configuration files. For deployments that accept a certain amount of downtime, you could switch from off to on too. However, for big deployments with strict limitations on downtime, this was more problematic, since you had to stop updates, sync all servers, and restart all servers simultaneously with GTIDs enabled, and this would lead to several minutes of downtime.

In MySQL 5.7.6, we have now made it possible to enable GTIDs and keep the service up at all times. In fact, we removed the need to stop updates, we removed the need to sync the cluster, and we removed the need to restart any servers. At the same time, we enabled steps to pre-check that the workload is GTID-compatible before forcing it to be so, so as to eliminate the risk for surprises. We also simplified and reinforced parts of the codebase, especially around GTID execution. You can now turn on GTIDs with no interruption at all in application service. Read more

More Awesome Replication Features in MySQL 5.7.6

More Awesome Replication Features in 5.7.6

It is time to celebrate again. The latest MySQL server development milestone (MySQL 5.7.6) was just released, and let me tell you, it is full of great replication enhancements. These new improvements cover many areas, ranging from performance to flexibility and easier deployment. Let me highlight some of them, and give you a brief summary of what they are.

Multi-source Replication

A MySQL 5.7.6 slave/server can now connect to multiple MySQL masters.

After a long development period, two labs releases, handling feedback from community, and a lot of internal (and external testing as well – thank you!), the multi-source replication feature was finally pushed into MySQL 5.7. This is a major milestone for replication itself. The feature allows a single MySQL server to aggregate data from multiple other MySQL servers. This has several use case scenarios: (i) aggregate data from multiple shards for doing cross-shard operations in a simpler manner; (ii) consolidate data for integrated backups; (iii) data-hub for inter-cluster replication; (iv) optimize some deployments of circular replication. I am pretty sure that I am missing other scenarios that I have not listed, but judging from the ones laid out, this feature looks very interesting already.

Enhanced Multi-Threaded Slave

MySQL 5.7.6 has improved the multi-threaded slave applier throughput. There is a more efficient way of tracking dependencies between transactions executing concurrently on the master. The ultimate result? The slave is faster.

It all started with MySQL 5.6. We began changing the replication codebase that had been single threaded since the beginning of MySQL replication. Traditional MySQL replication was single threaded: master executes and serializes transactions in a replication log (the binary log); then slave server pulls the changes to its own transient relay log and a single threaded applier installs these changes, one by one. And this changed in MySQL 5.6! The applier became parallel and it can actually apply transactions concurrently provided that certain conditions are met. The rule enforced is that any two transactions changing the same database will have to be applied sequentially.

In MySQL 5.7, we have gone further. We lifted the rule that only transactions operating on different databases are to be applied in parallel. In fact, the multi-threaded slave applier will apply transactions concurrently as long as they have executed (and thus started committing) concurrently on the master and have not blocked each other during execution. Thence, as the master commits more transactions concurrently, the slave will also apply more transactions in parallel. Pretty cool!

Enhanced Deployment of Global Transaction Identifiers

In MySQL 5.7.6 the user can enable/disable global transactions identifiers (GTID) in the replication topology without having to first synchronize and then stop and restart the entire set of servers.

Yes, MySQL 5.6 was a great release, packed with many replication features. It introduced a new feature that is a considerable leap forward regarding how one tracks the data in the replication stream. Global Transaction Identifiers is the name of the feature set that implements: (i) the ability to automatically track the position in the replication stream, thus reducing administration overhead; (ii) automatically skips transactions that have been already processed, thus enforcing consistency; and (iii) automatically fetches, from the source, only those changes that are actually needed – minimizing network and disk usage. It was also a disrupting feature, therefore, it required some coordination and synchronization between all servers in a replication topology before it could actually been turned on to make the replication stream GTID aware.

In MySQL 5.7.6, we have instrumented the server to be able to change the replication stream to be GTID aware by doing it in phases. I.e., without requiring the topology to be completely synchronized and then restarted in one go. The user has now a distributed procedure to enable or disable the GTIDs in the topology, but more importantly, the user can do it without having to synchronize the entire topology beforehand.

Yes, gtid_mode is now a dynamic variable. ;)

Tracking Replication Session State

MySQL 5.7.6 can be instructed to include global transaction identifiers information in the response the server sends back to the application after executing a read/write or a read only transaction. Such information can be leveraged to track dependencies when accessing data throughout a replication topology.

By tweaking a dynamic system variable, the user can instruct the server to track global transaction identifiers and report them in the response it gives back to the connector (i.e., include GTIDs in the OK packet of the MySQL protocol). Applications, middleware or even connectors can be made aware of this and then use that information for several purposes. One that immediately jumps into mind is to rely on that information to be able to transparently track changes throughout a farm or replicated MySQL servers.

All in all, the user can configure the server to get back no GTIDs at all (default), the GTID of the last committed transaction or the set of GTIDs committed up to the point in time the current transaction finished.

Bug Fixes

As usual, there are a lot of bugs that have been fixed in MySQL 5.7.6. This translates into a more stable, reliable and usable server, therefore a more user friendly replication.

Summary

MySQL 5.7.6 is yet another awesome release with cool new replication features. These new features along with those that were already in, make MySQL 5.7 very appealing. Go and give it a try. Let us know your feedback. The bug tracker is a good starting point for that, but there are other ways, such as the replication mailing list or even by commenting on this blog post. Mind you that MySQL 5.7.6 is a development milestone release (DMR), thence not yet declared generally available. Use it at your own risk.

Have fun playing with MySQL 5.7.6 DMR.

MySQL Utilities Makes Skipping Transactions on Multiple Servers Easy

We are very happy to introduce a new MySQL utility named “mysqlslavetrx“, which allows users to easily skip multiple transactions on multiple servers in a single step. This utility is one of three new utilities included in MySQL Utilities release-1.6.1 Alpha. The other utilities are “mysqlbinlogpurge” and “mysqlbinlogrotate“, which can be used to purge and rotate binary logs.

The mysqlslavetrx utility allows you to skip multiple transactions on several slaves. More precisely, it injects empty transactions for the specified Global Transaction Identifier (GTID) set and list of target slaves. Skipping transactions can be useful to quickly recover from erroneous situations that can occur during the replication process, or to handle errant transactions. Check out the following blog posts for more details about concrete situations where you might need to inject empty transactions:

The mysqlslavetrx utility can assist you in the above situations, making the process of injecting empty transactions less tedious. However, misuse of this technique may cause the slaves to skip the wrong transactions. Thus, take care to ensure the transactions skipped do not introduce data inconsistencies.

Note: You can use the mysqlrplsync utility to check the data consistency of a replication topology.

Main Features

Here is a summary of the main features of the mysqlslavetrx utility:

  • Skips multiple transactions (GTIDs), i.e. inject empty transactions, on multiple slaves.
  • Provides a dry run mode to allow users to confirm which transaction would be skipped with the specified input without actually injecting any empty transactions on the target slaves.
  • Checks existing transactions (GTIDs) and ignore them, since an empty transaction cannot be injected for an already executed GTID.

Requirements

The mysqlslavetrx utility requires all target servers to have GTID enabled.

Example

Here is an example of the execution of the utility:

shell> mysqlslavetrx --gtid-set=ce969d18-7b10-11e4-aaae-606720440b68:1-5 \
          --slaves=dba:pass@slave2:3312,dba:pass@slave3:3313
WARNING: Using a password on the command line interface can be insecure.
#
# GTID set to be skipped for each server:
# - slave2@3312: ce969d18-7b10-11e4-aaae-606720440b68:1-5
# - slave3@3313: ce969d18-7b10-11e4-aaae-606720440b68:1-5
#
# Injecting empty transactions for 'slave2:3312'...
# Injecting empty transactions for 'slave3:3313'...
#
#...done.
#

As expected the only options that need to be specified is the GTID set representing the transactions to skip and the list of target slaves (more precisely their connection strings).

Now with the mysqlslavetrx utility, you no longer need to manually inject an empty transaction one by one on each required slave. Isn’t that cool?

Try it now and send us your feedback

MySQL Utilities release-1.6.1 alpha is available for download from the following links.

MySQL developers website: http://dev.mysql.com/downloads/tools/utilities/
Launchpad project: https://launchpad.net/mysql-utilities

The documentation of MySQL Utilities can be obtained from the following link: http://dev.mysql.com/doc/index-utils-fabric.html

Contributing Ideas:

MySQL 5.7.5- More variables in replication performance_schema tables

At MySQL, replication usability is of utmost importance to us. Replication information has long been part of SHOW commands, SHOW SLAVE STATUS occupying a major chunk of it. The other sources of replication information being:

SHOW MASTER STATUS,
SHOW BINLOG EVENTS,
SHOW RELAYLOG EVENTS,
SHOW VARIABLES,
SHOW STATUS,
ERROR LOGS etc.

As the replication module grows further, there is a lot more monitoring information, so much that the present interfaces seem too rigid to accommodate all the information we would like to present. So we need to organize them in a more structured manner. In MySQL-5.7.2, we introduced replication performance_schema (P_S) tables providing an SQL interface to monitor replication configuration and status partitioned into different tables, each table grouping logically related information. You can read more about the contents of these tables from official MySQL documentation.

In MySQL-5.7.5 we added some more status variables to these performance_schema tables to enable monitoring the latest replication features viz. multi-source replication in labs. Multi-source allows a MySQL slave to replicate from multiple sources (masters) directly. Talking of multi-source, one needs the replication information per source. So we added global variables that would be useful to extend to per-source scope to the replication performance\_schema tables to help monitor multi-source replication. Note that these variables still work for the single sourced replication and can still be accessed as:

Show status like ‘Slave_running';
Show status like ‘Slave_retried_transactions';
Show status like ‘Slave_last_heartbeat';
Show status like ‘Slave_received_heartbeats';
show status like ‘Slave_heartbeat_period';

Note though that the status variables are now mostly useful in single-source mode ONLY. If more sources are added, the status variables still just apply to the first source. For other replication sources (masters), the only way to access these variables is to use the replication performance_schema tables as named in the table below. Here is how the names of server variables map to the names in the replication performance_schema tables:

Variable name P_S Table Name P_S field name
 SLAVE_HEARTBEAT_PERIOD  replication_connection_configuration  HEARTBEAT_INTERVAL
 SLAVE_RECEIVED_HEARTBEATS  replication_connection_status  COUNT_RECEIVED_HEARTBEATS
 SLAVE_LAST_HEARTBEAT  replication_connection_status  LAST_HEARTBEAT_TIMESTAMP
 SLAVE_RETRIED_TRANSACTIONS  replication_execute_status  COUNT_TRANSACTIONS_RETRIES

The variable ‘slave_running’ reports whether the slave is running or not. This can be found by inspecting the two replication components (receiver and applier) separately to see if the receiver module is running or not by executing

SELECT SERVICE_STATE FROM performance_schema.replication_connection_status;

And execute module is running or not by executing

SERVICE_STATE FROM performance_schema.replication_execute_status;

Please try out multi-source replication and our new monitoring interface in the form of replication performance_schema tables. As always, your feedback is very valuable to us.

MySQL 5.7.5-labs: Multi-source Replication

Multi-source replication for MySQL has been released as a part of 5.7.5-labs-preview
downloadable from labs.mysql.com. It is one among the several features that are
cooking in the replication technologies at MySQL.  (For a birds eye view of all
replication features introduced in 5.7 and labs, look  at the blog posts here and here.

Previously, we have introduced a preliminary multi-source feature labs preview. Based on the feed back from that labs release, we have released a feature preview based on 5.7.5 DMR which:

  1.  has a better user interface;
  2. is well tested;
  3. and has a better integration with other features such as Multi-threaded slave (MTS)Global Transaction ID(GTID) and replication performance_schema tables.

In this blog, I give you a description of what and how of a multi-source replication.

Motivation

Multi-source replication makes a replication slave connected to several sources (aka masterssimultaneously and aggregates the data from these sources. This setup has use cases like, having to backup of several servers to a single place or merging of tableshards or making any replication topology possible. Note that, multi-source replication doesn’t do any conflict detection and resolution and those tasks are left to the user’s application. If you are looking for a multi-master update everywhere with conflict detection and automatic resolution, look at the post here.

Replication Channel 

In my previous blog post, the concept of replication channel has been introduced. To recap, a replication channel encompasses the path of binary log events from master to the IO thread to the Relay log files to the applier threads (SQL thread or coordinator & worker threads). Multi-sourced slave creates a replication channel for each master. Each channel has a unique name and configuration parameters can be specified per channel (described below).

Preconditions for setting up multi-source replication

Replication repositories are either stored in FILE or TABLE based repository. TABLE based repository is crash-safe and multi-source replication is usable only in the case of TABLE based repositories. To setup multi-source replication, start mysqld with
master-info-repostiory=TABLE && –relay-log-info-repository=TABLE.
(A better way, is to put these options in the .cnf files). If you are using a single source replication using FILE repository and want to setup a multi-source replication, convert repositories to TABLE dynamically in the following way:
STOP SLAVE;
SET GLOBAL @@master_info_repository = TABLE;
SET GLOBAL @@relay_log_info_repository = TABLE;

Commands for operations on a single channel

To make replication commands act per channel, we have introduced a new clause
called FOR CHANNEL “<channel_name>” to the replication commands so that
a user can manage a channel independent of other channels.
Following commands are channel specific:

  • CHANGE MASTER… FOR CHANNEL “<channel_name>”
    – Creates a channel if it doesn’t exist.
  • START SLAVE FOR CHANNEL “<channel_name>”
  • STOP SLAVE… FOR CHANNEL “<channel_name>”
  • SHOW RELAYLOG EVENTSFOR CHANNEL “<channel_name>”
  • FLUSH RELAY LOGS FOR CHANNEL “<channel_name>”
  • SHOW SLAVE STATUS FOR CHANNEL “<channel_name>”
  • RESET SLAVE [ALL] FOR CHANNEL “<channel_name>”

There is an extra parameter introduced for the following functions.

  • MASTER_POS_WAIT(binlog_file, binlog_pos, time_out, channel_name);
  •  WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS(gtid_set, time_out, channel_name);

Compatibility with single source replication

The server automatically creates a channel whose name is the empty string (“”). We call this the default channel. Replication commands with no FOR CHANNEL clause act on this channel if there is no named channel configured (A named channel is one that has a name different than “”).
This means a default channel exists the user can still use the same commands to interact with replication. As such, there is no need in that case to use the FOR CHANNEL clause at all, since the behavior will match the one before multi-source replication. This means that with a single channel configured, we get backwards compatibility command-wise.

Commands for operations on all channels

Since a slave could have several replication channels, and if FOR CHANNEL clause is not provided, then replication commands shall act on all the channels wherever it is semantically valid. (Note that this behavior is different from previous labs release).
For example:

  • START SLAVE [IO_THREAD|SQL_THREAD] starts replication threads for all the  channels.
  • STOP SLAVE [IO_THREAD/SQL_THREAD] stops replication threads for all the      channels.
  • SHOW SLAVE STATUS reports the status for all the channels.
  • RESET SLAVE [ALL] resets the slave for for all channels.
    – To preserve backward compatibiliy though, server purges all the relay log files
             of named channels but initialize them again for only default channel.
    – The ALL option would delete the internal memory structures and thereby all the
             information and configurations assosiated with a particular channel.
  • FLUSH RELAY LOGS flushes the relay logs of all channels.

However, not all the commands make sense to do for all channels. The following
commands generate error 1964 (“Multiple channels exist on the slave. Please provide channel name as an argument.”) when a channel is not provided and if number of channels are greater than 1. (Note that, a default channel always exists).

  • SHOW RELAYLOG EVENTS;
  • CHANGE MASTER..;
  • MASTER_POS_WAIT();
  • WAIT_FOR_SQL_THREAD_AFTER_GTIDS();

Multi-source monitoring

To monitor the status of all slave channels, a user has two options:

  1. performance_schema tables.: 5.7.2 introduced replication performance_schema tables to monitor replication status. Multi-source replication extended these tables to add Channel_Name as the first column to these tables. This will enable the users to write complex queries based on Channel_Name as a key.
  2. SHOW SLAVE STATUS [FOR CHANNEL “<channel_name>”]This command by default shows the slave status for all channels with one row per channel. The identifier channel_name is added as the column in the result set. If a FOR CHANNEL clause is provided, the user will be provided the status of only that replication channel.

(NOTE: Replication SHOW VARIABLES: SHOW VARIABLES does not work with multi-source
replication. The information that was available through these variables have
been migrated to the replication performance tables. For more information, look at
Shiv’s post  hereThe current SHOW VARIABLES will show status of only the default channel.)

User session and server log error messages

Since, a multi-sourced slave can have multiple channels, we introduced explicit
error messages pertaining to a channel. To make this more consistent across all
channels, we introduced new error codes and messages.
For example: The notes that used to be emitted in previous versions of mysql
 “Slave is already running” or “Slave is already stopped”  is replaced with
  “Replication thread(s) for channel “channe_name” are already running”  and
  “Replication threads(s) for channel “channel_name” are already stopped” respectively.
We also changed server log messages to indicate the channel on which the note/warning/error was generated. This makes debugging and/or tracing easier.

Integration with other features

This feature has been well integrated with Global Transaction Identifiers and
Multi-Threaded Slave. Currently, replication filters are global and applies
to all channels and cannot be set per channel. Semi-sync replication only works when
all the masters and slave is semi-sync enabled. We are looking at further extending
the Multi-source framework for better support of these features.

Conclusion

Multi-source replication has been released as part of 5.7.5-labs-preview. Please try out this
new labs release which can be downloaded  at labs.mysql.com and tell us if you need
some things more to make this feature even better.

High-Availability at MySQL Central

This year’s MySQL Central at Oracle Open World was an exhilarating experience. In contrast to the previous year’s MySQL Connect events, MySQL have now got their own Central at the main Oracle Open World. In the previous years, we were always short on time and trying to get a lot of sessions into just two days was just to much. This time I could both present sessions, attend sessions by other users, and also to talk to people in the MySQL community: something that I really enjoy and also find very valuable to see where we should be heading.

This year, the “MySQL Fabric Team” representation on MySQL Central was me and Narayanan Venkateswaran, which is heading the sharding solution in MySQL Fabric. Together with the conference, we also released MySQL Fabric 1.5.2 as the GA release of MySQL Fabric 1.5 containing a few new features:

  • Server Provisioning support was developed and added to MySQL Fabric 1.5 by Alfranio Correia. This allow us to integrate OpenStack, some other cloud solutions, or in general any asset management system, with MySQL Fabric making it possible to automatically provision new machines and incorporate them into the MySQL Fabric farm. When talking to people and seeing the existing deployments we realized this is critical for operations, so we decided on implementing support for it.
  • In addition, Narayanan extended range sharding using DATETIME and string sharding keys. Before 1.5 there were only support in range sharding for integral keys, where hash sharding supported arbitrary keys, but now you can use range sharding to shard based on datetimes or columns that hold strings such as CHAR, VARCHAR, or BLOB.
  • MySQL-RPC support was added by Geert Vanderkelen so that it is now possible to use the MySQL protocol to talk to the MySQL Fabric node. This is critical to be able to implement Fabric support in connectors that do not have easy access to an HTTP and XML library, such as the C or C++ connector.
  • A Fabric-aware Connector/C was developed by Rafal Somla, Igor Solodovnikov, and Bogdan Degtyariov in the connector team and was distributed as a labs release. The C connector is base for many other connectors such as MySQL/Ruby connector and the Perl connector DBD::mysql, and this has been requested by several people, so we decided to prioritize implementing this. You can find more information in the blog Using Connector/C with Fabric.
  • A Fabric-aware Connector/.NET was developed by Roberto Ezequiel Garcia Ballesteros and Fernando Gonzalez Sanchez and increase our connector support for MySQL Fabric.
  • MySQL Workbench 6.2 have also been released with some support for Fabric so that you can view the structure of a MySQL Fabric farm and the status of the groups and the servers in the farm.

We had several sessions on MySQL Fabric-related topics:

There were a lot of interest in MySQL Fabric and you also had a lot of good questions and comments that will help us going forward. With release 1.5 of MySQL Fabric in GA, we now have a lot of support for managing a farm:

  • High-availability support with automatic fail-over of servers.
  • Range and hash sharding with support for datetime, strings, and integers.
  • Sharding operations to move and split shards with minimal locking time.
  • Integration with OpenStack and new interface to support adding extensions for integrating with other cloud or asset management systems.
  • Easy-to-use command-line interface for managing a farm of MySQL server.
  • Support for custom failure detectors so that you can use your own failure detector with MySQL Fabric.
  • Clear and easy to use APIs in XML-RPC and MySQL-RPC (RPC over MySQL protcol) so that you can write your own applications to automate whatever tasks you need done.
  • Fabric-aware connectors for Java, PHP, .NET, C, and Python.

However, it does not stop there. From the questions and the people I talked to on MySQL Central, it seems like the most important topics are:

  • Ease of use. Many questions were about how to do various tasks, for example: how to add new shards to a hash-sharded system to balance the load, how to repair and bring back servers into the farm, how to provision new slaves, and what changes to make to the application to use MySQL Fabric. Technically, this is already possible, but it can be easier and more intuitive to use. Ease of use is a continuous work we do and something that we always need to improve both in how to use the tools but also through providing good documentation and information on the all the features we have.
  • High-Availability of the Fabric system itself is a topic that came up quite frequently. We gave a solution for this in the “High-Availability on Different Levels” session above, but we need to ensure that we have good documentation for this so that is it clear what needs to be done. And this is, of course, always possible to improve on through introducing solutions that are easier to deploy and use.
  • Integrating with other cloud system than OpenStack was also raised a few times, so it seems like having a broader support for various cloud solutions (and probably some asset management systems that are not cloud based) is important to do.
  • Multi-cast for a sharded system was often requested directly or indirectly. This can be useful for reporting application where you need to collect data from all shards and aggregate information about it.

Does this give an accurate picture of what you consider the most needed items going forward? Do you have any other feature that you want to mention explicitly?