MySQL Router on Labs – The Newest Member of the MySQL family

The MySQL team is busy improving the ways you can use MySQL in scalable, highly available systems. As most of you are already aware, the MySQL Group Replication project is progressing at full speed. Indeed, we just got preview release 0.5 out the door, Fabric 1.5.6 was recently released, and work is ongoing on the Fabric 1.6 release.

In the Fabric project, we have received requests to support third-party connectors such as Perl, Ruby, and C/C++. We have also received requests to use Fabric with existing connectors for PHP, Java, and Python, since it is not always possible to switch to one of our Fabric-aware connectors.

Other requests include having a proxy that can handle connection multiplexing, and a way to split the server traffic and send queries to other servers for mirroring the traffic.

With these requests, and a few more, in mind, we decided to release an early version of the MySQL Router, now published on labs.mysql.com.

Introducing the MySQL Router

The MySQL team is dedicated to providing high-performance tools, and for that reason we started building the router with performance in mind from the start. In addition to that, we wanted to have a flexible tool that can be extended with features as necessary. Thus, the MySQL Router is built on a harness supporting a pluggable architecture with clear ABIs, dependency tracking, and version handling.

There are already a few plugins included with the MySQL Router, but the two most important plugins are the Connection Routing plugin and the Fabric Cache plugin.

  1. The Connection Routing plugin does connection-based routing, meaning that it just forwards the packets to the server without looking at them at all. This is a very simplistic approach, but it provides very high throughput.
  2. The Fabric Cache plugin maintains the connection with the Fabric instance and caches the information in the same way as the connector does.

To configure the router, you need to provide a configuration file, listing the plugins and providing configuration parameters for the plugins. The router harness will then load the plugins and start them.

Basic High Availability setup

For the most basic use case, that of providing Fabric support to legacy connectors, we create a configuration file fabric_ha.ini containing sections for the plugins.

The first two sections just set some default paths and declare the logging plugin to ensure that it is properly initialized.
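They could look something like this; the folder paths are illustrative and depend on where the router is installed:

    [DEFAULT]
    # Default paths picked up by all plugins; adjust to your installation.
    logging_folder = /var/log/mysql/router
    plugin_folder = /usr/lib/mysql/router

    # Listing the logger section ensures the logging plugin is loaded.
    [logger]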

The next two sections are the interesting part. This is where you configure the Fabric Cache and the Connection Routing plugin.

The fabric_cache section configures the plugin to fetch farm data from the Fabric instance at fabric.example.com and cache it in memory. A key, my_cache, is assigned to the section so that it can be referenced later.
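A sketch of what this section could look like, using the configuration format of the labs release:

    # The key after the colon (my_cache) names this cache instance.
    [fabric_cache:my_cache]
    address = fabric.example.com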

The routing section configures the plugin to listen (“bind to”) on port 7002 on localhost, fetch destination information from the Fabric Cache given by the key my_cache, and route the packets to the active primary of high-availability group my_group.
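A sketch of the corresponding routing section; the section key (directed) is illustrative, and the destinations URI references the cache by its key:

    [routing:directed]
    bind_address = localhost:7002
    # Route to the active primary of high-availability group my_group.
    destinations = fabric+cache://my_cache/group/my_group
    mode = read-write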

You can now start the router by pointing it at the configuration file.
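With the labs packages, and assuming the mysqlrouter binary is in your path, the command would look something like this:

    shell> mysqlrouter --config fabric_ha.ini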

After this you can connect to the bind address of the router using a normal MySQL connector or client.
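For example, using the standard mysql command-line client against the port configured above:

    # --host 127.0.0.1 forces a TCP connection through the router.
    shell> mysql --host 127.0.0.1 --port 7002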

Static Routing

Fabric handles slave promotion and the addition and removal of servers in groups. These changes are automatically propagated to the Fabric Cache in the router. Fabric can also detect failing servers and take them offline, which is great.

In some cases, such as when fail-over is already handled using existing scripts, it is not necessary to use Fabric for this. In other cases, fail-over is handled manually and it is only necessary to re-route the traffic should the primary go down. For those cases, you can define a static routing scheme and store it in the configuration file. To do that, you can replace the previous fabric_cache and routing sections with a static route.
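A sketch of such a static section; with mode = read-write, the labs release tries the destinations in order, using the first one available:

    [routing:static_route]
    bind_address = localhost:7002
    destinations = master1.example.com,master2.example.com
    mode = read-write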

In this case, all traffic will go to the first server in the list (master1.example.com); if that server fails, traffic will go to the second server (master2.example.com), and so on.

Connection Load Balancing

Using connection routing, it is also possible to do round-robin load balancing of connections.
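A sketch of such a configuration; the bind port is illustrative, and in the labs release mode = read-only selects round-robin scheduling among the destinations:

    [routing:balancing]
    bind_address = localhost:7001
    destinations = slave1.example.com,slave2.example.com
    mode = read-only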

With this configuration, the router will send the first connection to slave1.example.com, the second connection to slave2.example.com, and so on, circling back to the first slave once the list is exhausted.

Connection routing works best with the traditional MySQL workload, where a connection is opened for a batch of queries and closed after the job is done.

Minimizing aborts for Group Replication

Connection routing is especially useful for reducing the number of aborts due to concurrent execution at different masters. (If you use Group Replication in a master-slave setup – i.e., you write to only one server – then there is no problem to solve.)

In MySQL Group Replication, any two transactions that change the same data on different servers concurrently will create a conflict. On a conflict, one of the two transactions is rolled back, i.e., aborted. This means that the user gets an error and needs to re-submit the transaction. To minimize the aborts, an easy strategy is to partition the transactions based on what data they change. This way, transactions will not normally be aborted. This is fully described in the introductory post on MySQL Group Replication.

To do this using the router, you can set up different ports for different partitions of the data – as many as you like. The application can then connect to the port that represents the data partition it wants to change.

As an example, let’s assume that we have a setup similar to TPC-C, where there are several warehouses that keep inventory of the stock using a database. In this example, there would be one server in each warehouse and they would all replicate using MySQL Group Replication.

You could then either set up a router in each warehouse or a central router somewhere, and configure each one to listen on different ports depending on which warehouse a transaction is intended for. Since we want to handle fail-over as well, we list all servers in each entry, but in a different order, so that traffic intended for Warehouse 1 goes to the Warehouse 2 server if the Warehouse 1 server is down.

The configuration file for this would then contain one routing section per warehouse.
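A sketch for two warehouses; the warehouse host names and ports are illustrative, and the destination order implements the fail-over behavior described above:

    [routing:warehouse_1]
    bind_address = localhost:7001
    # First-available: prefer the Warehouse 1 server, fail over to Warehouse 2.
    destinations = warehouse1.example.com,warehouse2.example.com
    mode = read-write

    [routing:warehouse_2]
    bind_address = localhost:7002
    # Prefer the Warehouse 2 server, fail over to Warehouse 1.
    destinations = warehouse2.example.com,warehouse1.example.com
    mode = read-write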

In this case, we have a single configuration file, but a router can be set up in each warehouse and would then work independently of the others.

Idea: Connection Multiplexing

One request that is quite common, but currently not handled, is connection multiplexing.

The idea is that the router would listen on a range of addresses and, when queries arrive, send them over a single connection to the server. When the response comes back from the server, it is passed back to the client that sent the request.

The need for connection multiplexing, preferably at a ratio of around 200 to 1, comes from the fact that keeping a lot of connections open to a MySQL server affects performance. In setups with many application servers, each with very little traffic, the performance of MySQL suffers only because the server has to manage the sockets. In this case, it is better to have the application servers connect to a lightweight “router” that merges the incoming traffic into a few connections to the server.

This means that the traffic from multiple connections is interleaved and can potentially conflict, which needs to be avoided by, for example, sending only single-statement transactions (i.e., with AUTOCOMMIT=1).

This should be contrasted with connection pooling, where a client connecting to the router is handed an existing connection from a pool of connections, but the client's traffic is not interleaved with traffic from any other clients.

If connection multiplexing is combined with connection pooling, the traffic will be interleaved, but we also get the advantage of connection pooling: avoiding reconnection overhead by reusing existing connections.

Idea: Traffic Splitter

Another request that has been mentioned is the ability to take a copy of the packets and send them to a different server. This could be useful for testing a newly configured server with production traffic before bringing it into production, but also for recording problematic traffic situations and playing them back later to track down problems.

In this case, the router could take a copy of each packet, similar to how the “tee” command works in Unix, and either save it to a file or send it to a second server.

As you can see, even with the basic features already available in the MySQL Router it is possible to do a lot of things, and thanks to the plugin-oriented architecture, the possibilities are endless. I am sure that you have a few ideas about what the MySQL Router could be used for, and we would love to hear about those.

If you want to learn more about the MySQL Router and ask questions, you should go to the talks by Luis and Nuno.

About Mats Kindahl

Mats is team lead and head architect for the MySQL High-Availability Team, which maintains and releases MySQL Fabric. Mats has been working for MySQL since 2004 and has implemented a number of server features, mainly in replication. Prior to joining MySQL, Mats worked on implementing C and C++ compilers and as a researcher developing algorithms for verification of distributed systems.
