More Awesome Replication Features in 5.7.6
It is time to celebrate again. The latest MySQL server development milestone (MySQL 5.7.6) was just released, and let me tell you, it is full of great replication enhancements. These new improvements cover many areas, ranging from performance to flexibility and easier deployment. Let me highlight some of them, and give you a brief summary of what they are.
Multi-Source Replication
A MySQL 5.7.6 slave server can now connect to multiple MySQL masters.
After a long development period, two labs releases, feedback from the community, and a lot of internal testing (and external testing as well, thank you!), the multi-source replication feature was finally pushed into MySQL 5.7. This is a major milestone for replication itself. The feature allows a single MySQL server to aggregate data from multiple other MySQL servers. It has several use cases: (i) aggregating data from multiple shards to make cross-shard operations simpler; (ii) consolidating data for integrated backups; (iii) acting as a data hub for inter-cluster replication; (iv) optimizing some deployments of circular replication. I am pretty sure there are other scenarios that I have not listed, but judging from the ones laid out, this feature looks very interesting already.
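To make the shard-aggregation scenario a little more concrete, here is a minimal Python sketch that builds the statements a slave could run to attach to several masters, one replication channel per master. It assumes the `FOR CHANNEL` clause of `CHANGE MASTER TO` and `START SLAVE`, and that GTIDs are enabled so that `MASTER_AUTO_POSITION=1` applies. The channel names, host names, and the `repl` account are hypothetical, and the helper only produces strings; it does not talk to a server.

```python
def channel_statements(masters, user="repl"):
    """Build the SQL a slave would run to attach to each master.

    `masters` maps a (hypothetical) channel name to a (host, port) pair.
    Nothing is executed here; the function only returns statement strings.
    Credentials other than the user name are deliberately left out.
    """
    statements = []
    for channel, (host, port) in sorted(masters.items()):
        # One CHANGE MASTER TO per channel, i.e. one per upstream master.
        statements.append(
            "CHANGE MASTER TO MASTER_HOST='{h}', MASTER_PORT={p}, "
            "MASTER_USER='{u}', MASTER_AUTO_POSITION=1 "
            "FOR CHANNEL '{c}'".format(h=host, p=port, u=user, c=channel)
        )
        statements.append("START SLAVE FOR CHANNEL '{c}'".format(c=channel))
    return statements


if __name__ == "__main__":
    shards = {"shard_a": ("db-a.example.com", 3306),
              "shard_b": ("db-b.example.com", 3306)}
    for stmt in channel_statements(shards):
        print(stmt + ";")
```

Each channel has its own connection and relay log, which is what lets a single server consume several independent replication streams.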
Enhanced Multi-Threaded Slave
MySQL 5.7.6 has improved the multi-threaded slave applier throughput. There is a more efficient way of tracking dependencies between transactions executing concurrently on the master. The ultimate result? The slave is faster.
It all started with MySQL 5.6, when we began changing a replication codebase that had been single-threaded since the beginning of MySQL replication. In traditional MySQL replication, the master executes transactions and serializes them in a replication log (the binary log); the slave then pulls the changes into its own transient relay log, and a single-threaded applier installs them one by one. This changed in MySQL 5.6! The applier became parallel, and it can apply transactions concurrently provided that certain conditions are met. The rule enforced is that any two transactions changing the same database have to be applied sequentially.
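The per-database rule can be illustrated with a small Python sketch. This is a simplified model, not the server's actual scheduler: the function name and the "shortest queue" policy for a newly seen database are assumptions made for illustration only.

```python
def schedule_by_database(transactions, workers):
    """Assign each transaction to a worker queue based on its database.

    `transactions` is a list of (txn_id, database) pairs in binary log order.
    Transactions on the same database always land on the same queue, so they
    are applied sequentially; different databases may proceed in parallel.
    """
    queues = [[] for _ in range(workers)]
    owner = {}  # database -> index of the worker queue that handles it
    for txn_id, database in transactions:
        if database not in owner:
            # Hypothetical policy: give a new database to the shortest queue.
            owner[database] = min(range(workers), key=lambda i: len(queues[i]))
        queues[owner[database]].append(txn_id)
    return queues


if __name__ == "__main__":
    log = [(1, "a"), (2, "b"), (3, "a"), (4, "c")]
    print(schedule_by_database(log, 2))  # [[1, 3], [2, 4]]
```

Transactions 1 and 3 both touch database `a`, so they stay on one queue and keep their relative order. A workload that only touches one database gets no parallelism at all under this rule.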
In MySQL 5.7, we have gone further. We lifted the rule that only transactions operating on different databases can be applied in parallel. In fact, the multi-threaded slave applier will apply transactions concurrently as long as they executed (and thus started committing) concurrently on the master and did not block each other during execution. Hence, as the master commits more transactions concurrently, the slave will also apply more transactions in parallel. Pretty cool!
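A rough way to picture the 5.7 rule is sketched below in Python. It is a deliberately simplified model, not the server's dependency tracking: it assumes each transaction in the log carries a hypothetical `commit_group` marker shared by the transactions that were committing together on the master, and it treats every such group as a batch the slave may apply in parallel.

```python
from itertools import groupby


def parallel_batches(transactions):
    """Split (txn_id, commit_group) pairs, given in binary log order, into
    batches. Each batch holds consecutive transactions that share a commit
    group and could be handed to the applier workers at the same time."""
    return [[txn_id for txn_id, _ in group]
            for _, group in groupby(transactions, key=lambda t: t[1])]


if __name__ == "__main__":
    log = [(1, 10), (2, 10), (3, 10), (4, 11), (5, 12), (6, 12)]
    print(parallel_batches(log))  # [[1, 2, 3], [4], [5, 6]]
```

Note that the database name plays no role here: the more transactions the master commits together, the larger the batches. On a real server this behaviour is selected with `slave_parallel_type=LOGICAL_CLOCK` together with a non-zero `slave_parallel_workers`.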
Enhanced Deployment of Global Transaction Identifiers
In MySQL 5.7.6, the user can enable or disable global transaction identifiers (GTIDs) in the replication topology without having to first synchronize and then stop and restart the entire set of servers.
Yes, MySQL 5.6 was a great release, packed with many replication features. It introduced a new feature that is a considerable leap forward in how one tracks the data in the replication stream. Global Transaction Identifiers is the name of the feature set that: (i) automatically tracks the position in the replication stream, thus reducing administration overhead; (ii) automatically skips transactions that have already been processed, thus enforcing consistency; and (iii) automatically fetches, from the source, only those changes that are actually needed, minimizing network and disk usage. It was also a disruptive feature: it required some coordination and synchronization between all servers in a replication topology before it could actually be turned on to make the replication stream GTID aware.
In MySQL 5.7.6, we have instrumented the server so that the replication stream can be made GTID aware in phases, i.e., without requiring the topology to be completely synchronized and then restarted in one go. The user now has a distributed procedure to enable or disable GTIDs in the topology and, more importantly, can follow it without having to synchronize the entire topology beforehand.
Yes, gtid_mode is now a dynamic variable. 😉
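The phased switch can be pictured with a short Python sketch. It assumes the four `gtid_mode` values `OFF`, `OFF_PERMISSIVE`, `ON_PERMISSIVE`, and `ON`, and that the variable may only move to an adjacent value at a time. The helper name is hypothetical, and it only lists the statements; the real procedure also involves checks that are left out here, such as adjusting `enforce_gtid_consistency` and waiting for anonymous transactions to drain before the final step.

```python
GTID_MODES = ["OFF", "OFF_PERMISSIVE", "ON_PERMISSIVE", "ON"]


def gtid_mode_steps(current, target):
    """Return the SET statements needed to move gtid_mode from `current` to
    `target`, one adjacent value at a time. Each statement is meant to be
    applied on every server in the topology before moving to the next one."""
    start, end = GTID_MODES.index(current), GTID_MODES.index(target)
    step = 1 if end >= start else -1
    return ["SET @@GLOBAL.gtid_mode = {}".format(GTID_MODES[i])
            for i in range(start + step, end + step, step)]


if __name__ == "__main__":
    for stmt in gtid_mode_steps("OFF", "ON"):
        print(stmt + ";")
```

The two permissive modes are what make the rollout safe: in those modes a server accepts both anonymous and GTID transactions, so servers that are one step apart can keep replicating from each other while the change propagates.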
Tracking Replication Session State
MySQL 5.7.6 can be instructed to include global transaction identifier information in the response the server sends back to the application after executing a read/write or a read-only transaction. Such information can be leveraged to track dependencies when accessing data throughout a replication topology.
By tweaking a dynamic system variable (session_track_gtids), the user can instruct the server to track global transaction identifiers and report them in the response it gives back to the connector (i.e., include GTIDs in the OK packet of the MySQL protocol). Applications, middleware, or even connectors can be made aware of this and then use that information for several purposes. One that immediately comes to mind is relying on that information to transparently track changes throughout a farm of replicated MySQL servers.
All in all, the user can configure the server to return no GTIDs at all (the default), the GTID of the last committed transaction, or the set of all GTIDs committed up to the point at which the current transaction finished.
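As an illustration of how an application or middleware might use the reported GTIDs, here is a minimal Python sketch of a "read your own writes" check: after a write, the client remembers the GTID it was given and only reads from a replica whose executed set already contains it. The function names are hypothetical, and how the GTID is pulled out of the OK packet depends on the connector and is not shown. The parser handles the textual `uuid:first-last:single` form of GTID sets and expands intervals into Python sets, which is fine for a sketch but wasteful for very large ranges.

```python
def parse_gtid_set(text):
    """Parse a GTID set such as 'uuid1:1-5:8,uuid2:3' into {uuid: set of ints}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A single number such as '8' is an interval of length one.
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def replica_has(replica_executed, session_gtids):
    """True if every GTID reported to the session is in the replica's set."""
    executed = parse_gtid_set(replica_executed)
    wanted = parse_gtid_set(session_gtids)
    return all(numbers <= executed.get(uuid, set())
               for uuid, numbers in wanted.items())


if __name__ == "__main__":
    source = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
    print(replica_has(source + ":1-5", source + ":5"))  # True
    print(replica_has(source + ":1-5", source + ":6"))  # False
```

A router built on this idea would send the read to any replica for which `replica_has` returns true, and fall back to the master (or wait) otherwise. The same containment test can also be done on the server side with the `GTID_SUBSET()` function.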
As usual, a lot of bugs have been fixed in MySQL 5.7.6. This translates into a more stable, reliable, and usable server, and therefore more user-friendly replication.
MySQL 5.7.6 is yet another awesome release with cool new replication features. These new features, along with those that were already in, make MySQL 5.7 very appealing. Go and give it a try, and let us know your feedback. The bug tracker is a good starting point for that, but there are other ways, such as the replication mailing list or even commenting on this blog post. Mind you that MySQL 5.7.6 is a development milestone release (DMR), and hence not yet declared generally available. Use it at your own risk.
Have fun playing with MySQL 5.7.6 DMR.