MySQL 5.7.5: More variables in replication performance_schema tables

At MySQL, replication usability is of utmost importance to us. Replication information has long been part of the SHOW commands, with SHOW SLAVE STATUS occupying a major chunk of it. The other sources of replication information are:

SHOW MASTER STATUS,
SHOW BINLOG EVENTS,
SHOW RELAYLOG EVENTS,
SHOW VARIABLES,
SHOW STATUS,
error logs, etc.

As the replication module grows further, there is a lot more monitoring information, so much that the present interfaces seem too rigid to accommodate all the information we would like to present, and we need to organize it in a more structured manner. In MySQL-5.7.2 we introduced the replication performance_schema (P_S) tables, providing an SQL interface to monitor replication configuration and status, partitioned into different tables, each table grouping logically related information. You can read more about the contents of these tables in the official MySQL documentation.

In MySQL-5.7.5 we added some more status variables to these performance_schema tables, to enable monitoring of the latest replication features, namely multi-source replication in labs. Multi-source replication allows a MySQL slave to replicate from multiple sources (masters) directly. With multiple sources, one needs the replication information per source, so the global variables that are useful on a per-source scope were moved into the replication performance_schema tables to help monitor multi-source replication. Note that these variables still work for single-source replication and can still be accessed as:

SHOW STATUS LIKE 'Slave_running';
SHOW STATUS LIKE 'Slave_retried_transactions';
SHOW STATUS LIKE 'Slave_last_heartbeat';
SHOW STATUS LIKE 'Slave_received_heartbeats';
SHOW STATUS LIKE 'Slave_heartbeat_period';

Note, though, that these status variables are now mostly useful in single-source mode ONLY: if more sources are added, the status variables still apply only to the first source. For the other replication sources (masters), the only way to access this information is through the replication performance_schema tables named below. Here is how the names of the server variables map to the names in the replication performance_schema tables:

Variable name              | P_S table name                        | P_S field name
Slave_heartbeat_period     | replication_connection_configuration  | HEARTBEAT_INTERVAL
Slave_received_heartbeats  | replication_connection_status         | COUNT_RECEIVED_HEARTBEATS
Slave_last_heartbeat       | replication_connection_status         | LAST_HEARTBEAT_TIMESTAMP
Slave_retried_transactions | replication_execute_status            | COUNT_TRANSACTIONS_RETRIES
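
For illustration, here is a sketch of reading the same information through the new tables on a single-source slave, using only the table and column names from the mapping above:

-- Heartbeat interval configured for the connection (was Slave_heartbeat_period):
SELECT HEARTBEAT_INTERVAL
  FROM performance_schema.replication_connection_configuration;

-- Heartbeat counters (were Slave_received_heartbeats and Slave_last_heartbeat):
SELECT COUNT_RECEIVED_HEARTBEATS, LAST_HEARTBEAT_TIMESTAMP
  FROM performance_schema.replication_connection_status;

-- Retried transactions (was Slave_retried_transactions):
SELECT COUNT_TRANSACTIONS_RETRIES
  FROM performance_schema.replication_execute_status;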

The variable 'Slave_running' reports whether the slave is running or not. Its equivalent can now be found by inspecting the two replication components (receiver and applier) separately: check whether the receiver module is running by executing

SELECT SERVICE_STATE FROM performance_schema.replication_connection_status;

and whether the applier (execute) module is running by executing

SELECT SERVICE_STATE FROM performance_schema.replication_execute_status;
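
Assuming a single configured source, a rough equivalent of the old 'Slave_running' check is to combine the two queries, as in this sketch:

SELECT c.SERVICE_STATE AS receiver_state,
       e.SERVICE_STATE AS applier_state
  FROM performance_schema.replication_connection_status AS c,
       performance_schema.replication_execute_status AS e;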

Please try out multi-source replication and our new monitoring interface in the form of replication performance_schema tables. As always, your feedback is very valuable to us.

MySQL 5.7.5-labs: Multi-source Replication

Multi-source replication for MySQL has been released as part of the 5.7.5-labs-preview,
downloadable from labs.mysql.com. It is one of the several features currently
cooking in the replication technologies at MySQL. (For a bird's-eye view of all
replication features introduced in 5.7 and labs, look at the blog posts here and here.)

Previously, we introduced a preliminary multi-source feature as a labs preview. Based on the feedback from that labs release, we have now released a feature preview based on the 5.7.5 DMR which:

  1.  has a better user interface;
  2. is well tested;
  3. and has better integration with other features such as Multi-Threaded Slave (MTS), Global Transaction IDs (GTIDs), and the replication performance_schema tables.

In this blog, I describe the what and the how of multi-source replication.

Motivation

Multi-source replication lets a replication slave connect to several sources (aka masters) simultaneously and aggregate the data from these sources. This setup has use cases such as backing up several servers to a single place, merging table shards, or making almost any replication topology possible. Note that multi-source replication does not do any conflict detection or resolution; those tasks are left to the user's application. If you are looking for a multi-master update-everywhere solution with conflict detection and automatic resolution, look at the post here.

Replication Channel 

In my previous blog post, the concept of a replication channel was introduced. To recap, a replication channel encompasses the path of binary log events from the master, through the IO thread, into the relay log files, and on to the applier threads (the SQL thread, or the coordinator and worker threads). A multi-sourced slave creates one replication channel per master. Each channel has a unique name, and configuration parameters can be specified per channel (described below).

Preconditions for setting up multi-source replication

Replication repositories are stored in either a FILE or a TABLE based repository. The TABLE based repository is crash-safe, and multi-source replication is usable only with TABLE based repositories. To set up multi-source replication, start mysqld with
--master-info-repository=TABLE --relay-log-info-repository=TABLE
(a better way is to put these options in the .cnf files; see the sketch after the conversion commands below). If you are running single-source replication with FILE repositories and want to set up multi-source replication, convert the repositories to TABLE dynamically in the following way:
STOP SLAVE;
SET GLOBAL master_info_repository = 'TABLE';
SET GLOBAL relay_log_info_repository = 'TABLE';
START SLAVE;
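
For the .cnf route mentioned above, a minimal sketch of the relevant section could look like:

[mysqld]
master-info-repository    = TABLE
relay-log-info-repository = TABLE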

Commands for operations on a single channel

To make replication commands act per channel, we have introduced a new clause,
FOR CHANNEL '<channel_name>', to the replication commands, so that
a user can manage a channel independently of other channels.
The following commands are channel specific:

  • CHANGE MASTER … FOR CHANNEL '<channel_name>'
    – Creates the channel if it doesn't exist.
  • START SLAVE FOR CHANNEL '<channel_name>'
  • STOP SLAVE … FOR CHANNEL '<channel_name>'
  • SHOW RELAYLOG EVENTS FOR CHANNEL '<channel_name>'
  • FLUSH RELAY LOGS FOR CHANNEL '<channel_name>'
  • SHOW SLAVE STATUS FOR CHANNEL '<channel_name>'
  • RESET SLAVE [ALL] FOR CHANNEL '<channel_name>'
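
As an illustration, a minimal per-channel setup might look like the following sketch (host, port, user, password, and channel name are all hypothetical):

CHANGE MASTER TO
  MASTER_HOST = 'master1.example.com',
  MASTER_PORT = 3306,
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'repl_password'
  FOR CHANNEL 'master-1';                  -- creates the channel if it does not exist
START SLAVE FOR CHANNEL 'master-1';        -- start only this channel's threads
SHOW SLAVE STATUS FOR CHANNEL 'master-1';  -- inspect only this channel
STOP SLAVE FOR CHANNEL 'master-1';         -- stop only this channel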

An extra parameter, the channel name, has been introduced for the following functions:

  • MASTER_POS_WAIT(binlog_file, binlog_pos, time_out, channel_name);
  • WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS(gtid_set, time_out, channel_name);
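
For example, to wait up to 60 seconds for the applier of one channel to reach a given position in that master's binary log (file name, position, and channel name are hypothetical):

SELECT MASTER_POS_WAIT('master-bin.000002', 120, 60, 'master-1');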

Compatibility with single source replication

The server automatically creates a channel whose name is the empty string (""). We call this the default channel. Replication commands with no FOR CHANNEL clause act on this channel when no named channel is configured (a named channel is one whose name differs from "").
This means that as long as only the default channel exists, the user can keep using the same commands to interact with replication. In that case there is no need to use the FOR CHANNEL clause at all, since the behavior matches the one before multi-source replication. In other words, with a single channel configured we get backwards compatibility command-wise.

Commands for operations on all channels

Since a slave can have several replication channels, replication commands issued without a FOR CHANNEL clause act on all channels wherever that is semantically valid. (Note that this behavior is different from the previous labs release.)
For example:

  • START SLAVE [IO_THREAD|SQL_THREAD] starts the replication threads for all channels.
  • STOP SLAVE [IO_THREAD|SQL_THREAD] stops the replication threads for all channels.
  • SHOW SLAVE STATUS reports the status of all channels.
  • RESET SLAVE [ALL] resets the slave for all channels.
    – To preserve backward compatibility, though, the server purges the relay log files
      of all named channels but reinitializes them only for the default channel.
    – The ALL option also deletes the internal memory structures, and thereby all the
      information and configuration associated with a particular channel.
  • FLUSH RELAY LOGS flushes the relay logs of all channels.
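
With several channels configured, a bare command thus fans out, for example:

START SLAVE;        -- starts the receiver and applier threads of every channel
SHOW SLAVE STATUS;  -- reports one row per channel
STOP SLAVE;         -- stops the threads of every channel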

However, not all commands make sense for all channels. The following
commands generate error 1964 ("Multiple channels exist on the slave. Please provide channel name as an argument.") when no channel is provided and more than one channel exists. (Note that a default channel always exists.)

  • SHOW RELAYLOG EVENTS;
  • CHANGE MASTER …;
  • MASTER_POS_WAIT();
  • WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS();

Multi-source monitoring

To monitor the status of all slave channels, a user has two options:

  1. performance_schema tables: MySQL 5.7.2 introduced the replication performance_schema tables to monitor replication status. Multi-source replication extends these tables by adding CHANNEL_NAME as the first column, which enables users to write complex queries with CHANNEL_NAME as a key (see the sketch after this list).
  2. SHOW SLAVE STATUS [FOR CHANNEL '<channel_name>']: by default, this command shows the slave status of all channels, one row per channel, with the channel name added as a column in the result set. If a FOR CHANNEL clause is provided, the status of only that replication channel is shown.
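
As a sketch, a per-channel health query using the tables and columns mentioned in this post could look like:

SELECT CHANNEL_NAME, SERVICE_STATE
  FROM performance_schema.replication_connection_status
 ORDER BY CHANNEL_NAME;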

(NOTE: replication SHOW VARIABLES: SHOW VARIABLES does not work with multi-source
replication. The information that was available through these variables has
been migrated to the replication performance_schema tables. For more information, look at
Shiv's post here. The current SHOW VARIABLES shows the status of only the default channel.)

User session and server log error messages

Since a multi-sourced slave can have multiple channels, we introduced explicit
error messages pertaining to a channel. To make this consistent across all
channels, we introduced new error codes and messages.
For example, the notes that used to be emitted in previous versions of MySQL,
"Slave is already running" and "Slave is already stopped", are replaced with
"Replication thread(s) for channel 'channel_name' are already running" and
"Replication thread(s) for channel 'channel_name' are already stopped" respectively.
We also changed the server log messages to indicate the channel on which a note/warning/error was generated. This makes debugging and tracing easier.

Integration with other features

This feature is well integrated with Global Transaction Identifiers and the
Multi-Threaded Slave. Currently, replication filters are global, apply
to all channels, and cannot be set per channel. Semi-sync replication works only when
all the masters and the slave are semi-sync enabled. We are looking at further extending
the multi-source framework for better support of these features.

Conclusion

Multi-source replication has been released as part of 5.7.5-labs-preview. Please try out this
new labs release, which can be downloaded at labs.mysql.com, and tell us what more
you need to make this feature even better.

High-Availability at MySQL Central

This year's MySQL Central at Oracle Open World was an exhilarating experience. In contrast to the previous years' MySQL Connect events, MySQL now has its own Central at the main Oracle Open World. In previous years we were always short on time, and trying to fit a lot of sessions into just two days was simply too much. This time I could present sessions, attend sessions by other users, and also talk to people in the MySQL community: something that I really enjoy and also find very valuable for seeing where we should be heading.

This year, the "MySQL Fabric Team" was represented at MySQL Central by me and Narayanan Venkateswaran, who is heading the sharding solution in MySQL Fabric. Together with the conference, we also released MySQL Fabric 1.5.2 as the GA release of MySQL Fabric 1.5, containing a few new features:

  • Server Provisioning support was developed and added to MySQL Fabric 1.5 by Alfranio Correia. This allows us to integrate OpenStack, some other cloud solutions, or in general any asset management system with MySQL Fabric, making it possible to automatically provision new machines and incorporate them into the MySQL Fabric farm. When talking to people and seeing existing deployments, we realized this is critical for operations, so we decided to implement support for it.
  • In addition, Narayanan extended range sharding to support DATETIME and string sharding keys. Before 1.5, range sharding supported only integral keys (hash sharding supported arbitrary keys), but now you can use range sharding to shard based on datetimes or on columns that hold strings such as CHAR, VARCHAR, or BLOB.
  • MySQL-RPC support was added by Geert Vanderkelen, so it is now possible to use the MySQL protocol to talk to the MySQL Fabric node. This is critical for implementing Fabric support in connectors that do not have easy access to an HTTP and XML library, such as the C or C++ connectors.
  • A Fabric-aware Connector/C was developed by Rafal Somla, Igor Solodovnikov, and Bogdan Degtyariov in the connector team and was distributed as a labs release. The C connector is the base for many other connectors, such as the MySQL/Ruby connector and the Perl connector DBD::mysql, and this has been requested by several people, so we decided to prioritize implementing it. You can find more information in the blog Using Connector/C with Fabric.
  • A Fabric-aware Connector/.NET was developed by Roberto Ezequiel Garcia Ballesteros and Fernando Gonzalez Sanchez, increasing our connector support for MySQL Fabric.
  • MySQL Workbench 6.2 has also been released with some support for Fabric, so you can view the structure of a MySQL Fabric farm and the status of the groups and servers in the farm.

We also had several sessions on MySQL Fabric-related topics.

There was a lot of interest in MySQL Fabric, and you also had a lot of good questions and comments that will help us going forward. With release 1.5 of MySQL Fabric now GA, we have a lot of support for managing a farm:

  • High-availability support with automatic fail-over of servers.
  • Range and hash sharding with support for datetime, strings, and integers.
  • Sharding operations to move and split shards with minimal locking time.
  • Integration with OpenStack and new interface to support adding extensions for integrating with other cloud or asset management systems.
  • Easy-to-use command-line interface for managing a farm of MySQL servers.
  • Support for custom failure detectors so that you can use your own failure detector with MySQL Fabric.
  • Clear and easy-to-use APIs in XML-RPC and MySQL-RPC (RPC over the MySQL protocol) so that you can write your own applications to automate whatever tasks you need done.
  • Fabric-aware connectors for Java, PHP, .NET, C, and Python.

However, it does not stop there. From the questions and the people I talked to at MySQL Central, it seems the most important topics are:

  • Ease of use. Many questions were about how to do various tasks, for example: how to add new shards to a hash-sharded system to balance the load, how to repair and bring back servers into the farm, how to provision new slaves, and what changes to make to the application to use MySQL Fabric. Technically, all of this is already possible, but it can be made easier and more intuitive. Ease of use is continuous work and something we always need to improve, both in how the tools are used and by providing good documentation and information on all the features we have.
  • High-availability of the Fabric system itself is a topic that came up quite frequently. We gave a solution for this in the "High-Availability on Different Levels" session above, but we need to ensure that we have good documentation for it so that it is clear what needs to be done. And it is, of course, always possible to improve on this by introducing solutions that are easier to deploy and use.
  • Integrating with cloud systems other than OpenStack was also raised a few times, so it seems that broader support for various cloud solutions (and probably some asset management systems that are not cloud based) is important.
  • Multi-cast queries for a sharded system were often requested, directly or indirectly. These can be useful for reporting applications where you need to collect data from all shards and aggregate it.

Does this give an accurate picture of what you consider the most needed items going forward? Do you have any other feature that you want to mention explicitly?

MySQL Group Replication – Transaction life cycle explained

The multi-master plugin for MySQL is here. MySQL Group Replication provides virtually synchronous updates on any node in a group of MySQL servers, with conflict handling, automatic group membership management, and failure detection.

For a better understanding of how things work, in this post we go under the hood and analyse the transaction life cycle in multi-master and the components it interacts with. But before that, we first need to understand what a group is.

Group Communication Toolkit

The multi-master plugin is powered by a group communication toolkit. This is what decides which servers belong to the group, performs failure detection, and orders server messages. Ordered messaging is the magic that allows the data to be consistent across all nodes. You can check the details of the group communication toolkit in the Group communication behind the scenes post.

These strong properties, together with the fact that each server joins the group in the same initial state, allowed us to implement database state machine replication (DBSM) [1].

Continue reading

New MySQL Releases

More MySQL releases than ever before. Check out the following that we just released:

– MySQL Connector Java 5.1.33 GA
– MySQL Fabric & MySQL Utilities 1.5.2 GA
– MySQL Connector Python 2.0.1 GA
– MySQL Fabric and Utilities 1.6.0 Alpha
– MySQL Connector C (Labs)
– MySQL Connector Python (Labs)
– MySQL Group Replication (Labs)
– MySQL HTTP Plugin (Labs)

Download the GA releases from http://dev.mysql.com/downloads/ and the labs releases from http://labs.mysql.com/.

MySQL Group Replication – Testing

The multi-master plugin for MySQL is here: "MySQL Group Replication". It is a virtually synchronous solution for MySQL with conflict detection. It also supports automatic group membership management, failure detection, and automatic distributed recovery.

With the introduction of this new feature there was a need to perform a good amount of testing, as it involves complex functionality such as:

  • Servers execute local transactions and broadcast the updates to the group.
  • All servers in the group, including the sender, receive the same transactions in the same order and check for conflicts.
  • All servers independently decide to commit a transaction when it has no conflicts.
  • A new node can join an existing group, in which case we need distributed recovery to bring it up to par with the other servers.

So a great deal of effort has been spent on testing this functionality and making MySQL Group Replication stable. Continue reading

MySQL Group Replication – Monitoring

The multi-master update-everywhere solution for MySQL is here. MySQL Group Replication ensures virtually synchronous updates on any node in a group of MySQL servers, with conflict handling and failure detection. Distributed recovery is also in the package, to ease the process of adding new nodes.

Starting with MySQL 5.7.2, there has been a constant effort from the replication team to provide more fields for monitoring replication performance in the performance_schema tables. This post gives a brief overview of the Performance Schema tables that have been introduced for MySQL Group Replication. Continue reading

WAIT_FOR_EXECUTED_GTID_SET for MySQL Replication

With the introduction of Global Transaction Identifiers (GTIDs) in MySQL as of the mysql-5.6 GA, a whole lot of functionality has been developed around them so that GTIDs can be used in a simpler and more efficient way.

One interesting piece of GTID functionality is the function for syncing a slave with its
master server: WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS. It performs a timed or an indefinite wait until the servers in question are in sync with respect to the executed GTID set.

That function addresses specific use cases in which transactions are applied by the SQL/worker threads. To give the user more flexibility, a new function for syncing servers with GTIDs, called WAIT_FOR_EXECUTED_GTID_SET, has been introduced in mysql-5.7.5. Continue reading
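
Before you do, here is a quick sketch of the two calls side by side (the GTID set shown is hypothetical; both calls wait at most 10 seconds):

-- Waits via the slave SQL thread:
SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5', 10);

-- New in mysql-5.7.5: waits on the executed GTID set itself:
SELECT WAIT_FOR_EXECUTED_GTID_SET('3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5', 10);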

Preserve Master’s Commit Order on Slave

In MySQL-5.7.2, we released a new type of multi-threaded slave (MTS), called logical clock based MTS. Because it can apply transactions in parallel even within the same schema, it has the potential to improve slave throughput on almost any application, regardless of the schema layout. After it was released, we continued working on the framework to improve it further, so 5.7.5 includes a few enhancements in addition to those released as part of previous DMRs. This blog post introduces one of the new features in the latest DMR.

slave_preserve_commit_order
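
As a teaser, a minimal sketch of turning the new option on, assuming a logical clock based MTS configuration (the slave may also need binary logging with log_slave_updates enabled):

STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';  -- logical clock based MTS
SET GLOBAL slave_parallel_workers = 4;             -- example worker count
SET GLOBAL slave_preserve_commit_order = ON;       -- commit in the master's order
START SLAVE SQL_THREAD;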

Continue reading

Transaction Retry Is Available On Multi-Threaded Slave

I am happy to announce that slave_transaction_retries now works on the multi-threaded slave. Since MySQL-5.7.5, you will no longer see the warning that slave_transaction_retries is not supported by the multi-threaded slave.

The multi-threaded slave and the single-threaded slave share the system variable slave_transaction_retries, and they have similar transaction retry logic. So this blog brings you a little new knowledge and then refreshes your memory of the transaction retry logic. Continue reading
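
As a quick sketch, setting the retry budget works the same way on a multi-threaded slave (the value shown is just an example):

SET GLOBAL slave_transaction_retries = 10;             -- retry a failed transaction up to 10 times
SHOW GLOBAL STATUS LIKE 'Slave_retried_transactions';  -- how many retries have happened so far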