MySQL Binlog Events Use case and Examples

This post continues the previous one; if you have not read that yet, I would advise you to go through it before continuing. I discussed three use cases in my last post; here I will explain them in detail.

  1.  Extracting data from a binary log file:
    The binary log file is full of information, but what if you want selected info, for example:

    • printing the timestamp for every event in the binary log file.
    • extracting the event types of all the events occurring in the binary log file.

    And any other such data from an event in real time. Below is some sample code which shows how to do that step by step.

    • Connect to the available transport
      // the base class for both types of transport, the term driver and transport are
      // used interchangeably in code.
      binary_log::system::Binary_log_driver *drv;

      • For tcp transport
        // An example of the URI can be mysql://neha@localhost:3306
        drv= binary_log::system::create_transport("user@host:port_no");
      • For file transport
        // The path should be absolute like
        // "/home/neha/binlog/masterbin.000001"
        drv=binary_log::system::create_transport("path of the binary log file");
      • Common code for both transports
        binary_log::Binary_log binlog(drv);
        // This internally calls methods from libmysqlclient library
        binlog.connect();
    • Fetch the stream in a buffer
      // In this pair object the first value will store the stream buffer and the
      // second value will store the buffer length; only one event is received at a time.
      std::pair<unsigned char *, size_t> buffer_buflen;
      // This internally calls methods from the libmysqlclient library.
      drv->get_next_event(&buffer_buflen);
    • Decode the buffer and get an event object
      binary_log::Binary_log_event *event; // To store the event object
      // This class contains methods to decode all the binary log events
      Decoder decode;

      event= decode.decode_event((char *) buffer_buflen.first, buffer_buflen.second,
                                 NULL /* char array to store the error message */, 1);
    • Now that the event object is created, it can be used to extract all the information about that particular event, for example
      std::cout << "Event Data" << std::endl;
      std::cout << "Event type " << event->get_event_type() << std::endl;
      std::cout << "Data written " << event->header()->data_written << std::endl;
      std::cout << "Binlog Position " << event->header()->log_pos << std::endl;

    You can see the complete code below.

    [Screenshot: the complete example code]

    The above example will print just one event. If you want to continue reading events, place the code for decoding and printing in a while loop, with a break condition as per your requirement. If there are no more events left to read, the file transport will report an EOF message.
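The fields printed above (event type, data written, binlog position) all live in the common event header. As a self-contained sketch, independent of the library, here is how the 19-byte v4 common header can be decoded by hand from a raw buffer (the layout is: timestamp, type code, server id, event size, next position, flags, all integers little-endian):

```cpp
#include <cassert>
#include <cstdint>

// Decoded form of the 19-byte v4 binlog event common header.
struct Event_header {
    uint32_t timestamp;   // seconds since epoch when the event was written
    uint8_t  type_code;   // the event type, e.g. 2 = QUERY_EVENT
    uint32_t server_id;   // id of the server that originally wrote the event
    uint32_t event_size;  // total size of the event, header included
    uint32_t log_pos;     // position of the next event in the binlog
    uint16_t flags;
};

// Read a little-endian 32-bit integer from a byte buffer.
static uint32_t read_u32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Decode the fixed-layout header:
// timestamp(4) | type_code(1) | server_id(4) | event_size(4) | log_pos(4) | flags(2)
Event_header decode_header(const unsigned char *buf) {
    Event_header h;
    h.timestamp  = read_u32(buf);
    h.type_code  = buf[4];
    h.server_id  = read_u32(buf + 5);
    h.event_size = read_u32(buf + 9);
    h.log_pos    = read_u32(buf + 13);
    h.flags      = (uint16_t)buf[17] | ((uint16_t)buf[18] << 8);
    return h;
}
```

In practice you would let the library's Decoder do this for you, as shown above; the sketch just makes explicit what information every event carries in its header.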

  2. Filtering events from a Binary log file:
    Imagine a case where you want all the queries executed on your server, in a
    printable format, in one place (maybe in some file), or you want to be notified whenever a binary log file is rotated (this can be identified by the Rotate event). Similar to these, there can be many other use cases where you are only interested in some events and don't want to load yourself with extra information. In the last use case we saw how to get an event object; now, to filter events, we can do this:
    The steps for getting the event object are the same as in the last example.
    /*
      Filtering Query_events and printing the query for each one.
      The type of the event object is Binary_log_event, which is the abstract
      class for all events; to extract data specific to a particular event
      type we need the appropriate cast.
    */
    if (event->get_event_type() == QUERY_EVENT)
      std::cout << " The Query is " << static_cast<Query_event*>(event)->query << std::endl;
    And as this is real time, with a tcp transport you can just run your application and all the queries will be printed as soon as they are executed on the MySQL server.
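To see why the static_cast above is safe, here is a self-contained sketch using mock classes (the names mirror the library's, but the real Binary_log_event and Query_event in libbinlogevents carry much more state): the type code returned by get_event_type() tells us the dynamic type, so the downcast is known to be valid.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mock event type codes, for illustration only.
enum Event_type { OTHER_EVENT = 0, QUERY_EVENT = 2, ROTATE_EVENT = 4 };

// Mock of the abstract base class for all events.
struct Binary_log_event {
    virtual ~Binary_log_event() {}
    virtual Event_type get_event_type() const = 0;
};

// Mock Query_event carrying the query text.
struct Query_event : Binary_log_event {
    std::string query;
    explicit Query_event(const std::string &q) : query(q) {}
    Event_type get_event_type() const override { return QUERY_EVENT; }
};

// Mock Rotate_event, which we want to skip.
struct Rotate_event : Binary_log_event {
    Event_type get_event_type() const override { return ROTATE_EVENT; }
};

// Collect the text of every Query_event, ignoring all other event types.
std::vector<std::string> collect_queries(
        const std::vector<Binary_log_event*> &events) {
    std::vector<std::string> queries;
    for (Binary_log_event *ev : events)
        if (ev->get_event_type() == QUERY_EVENT)
            // Safe downcast: the type code guarantees the dynamic type.
            queries.push_back(static_cast<Query_event*>(ev)->query);
    return queries;
}
```

The same pattern extends to any other event type: check the type code first, then cast to the matching subclass to reach its specific fields.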
  3. Building applications to support heterogeneous replication in real time:
    Heterogeneous replication is replication in which the two datastores participating in the process are different. There are multiple scenarios where this is beneficial. For example, Apache Kafka captures data from different streams and processes them; imagine a scenario where we want to establish a replication system between MySQL Server and Apache Kafka. Using MySQL Binlog Events we can create an application that forms a pipeline between MySQL Server and Apache Kafka, and replicates the data from MySQL to Apache Kafka in real time.

These are just some of the many places where MySQL Binlog Events can be used. It is a very generic framework, so there will be many other use cases. Let us know how you are using this library; it's always exciting to hear about real use cases. If you have any comments or questions, please get in touch.
If you happen to find a bug, please file the bug at bugs.mysql.com.
This library is available at labs.mysql.com; choose MySQL Binlog Events 1.0.0 from the drop-down menu.

Looking forward to the feedback.

MySQL Binlog Events – reading and handling information from your Binary Log

MySQL replication is among the top features of MySQL. In replication, data is replicated from one MySQL server (also known as the master) to another MySQL server (also known as the slave). MySQL Binlog Events is a set of libraries which work on top of replication and open directions for a myriad of use cases, like extracting data from binary log files, building applications to support heterogeneous replication, filtering events from binary log files, and much more.
All this in REAL TIME.
INTRODUCTION
I have already defined what MySQL Binlog Events is. To deliver on any of the above use cases, you first need to read the events from the binary log. This can be done using two types of transports:

1) TCP transport: Here your application will connect to an online MySQL Server and receive events as and when they occur.

2) File transport: As the name suggests the application will connect to an already existing file and read the events from it.

In the MySQL framework we always receive one event at a time, irrespective of the transport. After the connection is established, we start reading the events from the binary log files and hold the information in a buffer.

To do something useful with an event it needs to be decoded, and then the relevant information can be stored or processed based on the requirements of the application.
Event processing can be implemented in a very sophisticated manner using the content handlers in MySQL Binlog Events, which are designed specifically for that purpose.

The libraries in MySQL Binlog Events are all C++ libraries; here is a brief introduction to them:

– libmysqlstream: methods related to connecting via the transport, receiving the buffer from the connection, and processing the events using content handlers.
– libbinlogevents: methods for decoding the events after receiving them from whichever transport the application is connected via.

GETTING AND COMPILING THE SOURCE CODE
You can get a copy of the source code from labs.mysql.com; select MySQL Binlog Events 1.0.0 from the drop-down menu. After selecting it you will get the option to download either the source code or the binaries for the respective platforms.

If you download the source code, these are the prerequisites you need on your machine in order to compile it successfully:

  • MySQL binaries of 5.7.6 or greater
    • To link to the libmysqlclient library.
    • The header files in the include directory are also used.
  • MySQL source code of the same version as the binaries
    • To get a few header files which are not part of the binaries and are being used
  • CMake 2.6 or greater
  • GCC 4.6.3 or greater

Steps for compiling MySQL Binlog Events source:

  1. Run the ‘cmake’ command on the parent directory of the MySQL Binlog Events source. This will generate the necessary Makefiles. Options to set while running CMake:
    1. Make sure to set the CMake option ENABLE_DOWNLOADS=1; this will install Google Test, which is required to run the unit tests.
    2. The option MYSQLCLIENT_STATIC_LINKING:BOOL=TRUE will make the linking with libmysqlclient static.
    3. You need to point to the MySQL binary directory in order to make the libmysqlclient library and other header files
      visible; this is done by setting MYSQL_DIR={parent directory of the MySQL binaries}.
    4. As already discussed, we also need to include header files from the MySQL source code; for that, set
      MYSQL_SOURCE_INCLUDE_DIR={parent directory of the MySQL source}/include.
      The full CMake command is:
      cmake . -DMYSQLCLIENT_STATIC_LINKING:BOOL=TRUE
      -DMYSQL_DIR={parent directory of the MySQL binaries}
      -DMYSQL_SOURCE_INCLUDE_DIR={parent directory of the MySQL source}/include
      -DENABLE_DOWNLOADS=1
  2. Run make and make install to build and install; this will create the libraries libmysqlstream and libbinlogevents.

That's all; you are now ready to create your own application using MySQL Binlog Events.

For more details about the use cases and how to create a small application using MySQL Binlog Events refer to the  second blog in this series.

MySQL Cluster 7.4 New Features Webinar Replay

I recently hosted a webinar introducing MySQL Cluster and then looking into what's new in the latest version (MySQL Cluster 7.4) in some more detail. The replay of the MySQL Cluster 7.4 webinar is now available here. Alternatively, if you just want to skim through the charts, then scroll down.

Abstract

MySQL Cluster powers the subscriber databases of major communication services providers as well as next generation web, cloud, social and mobile applications. It is designed to deliver:

  • Real-time, in-memory performance for both OLTP and analytics workloads
  • Linear scale-out for both reads and writes
  • 99.999% High Availability
  • Transparent, cross-shard transactions and joins
  • Update-Anywhere Geographic replication
  • SQL or native NoSQL APIs
  • All that while still providing full ACID transactions.

Understand some of the highlights of MySQL Cluster 7.4:

  • 200 Million queries per minute
  • Active-Active geographic replication with conflict detection and resolution
  • 5x faster on-line maintenance activities
  • Enhanced reporting for memory and database operations

Charts

Questions and Answers

  • Can all data be stored on disk rather than in memory? Any column that isn’t part of an index can be stored on disk if you want it to be. There is then an in-memory cache for the disk-based data.
  • Is in-memory data safe from the whole system shutting down (e.g. power loss for the whole data center)? The in-memory data is (asynchronously) checkpointed to disk (so that there is persistence but latency isn’t impacted).
  • I need to store more than 14K non BLOB/TEXT data in a single row – has this been addressed? As you say the options are to use TEXT/BLOB columns (or of course to split the data over multiple rows).
  • Can you comment on improvements for virtualized deployments in the 7.4 version? Only to say that more and more people are deploying on VMs and we're not seeing issues caused – if we do then they'll be fixed.
  • Can I upgrade from the previous version (7.3) to MySQL Cluster 7.4 or do I have to reinstall the product with the new version (7.4)? You can perform a rolling upgrade from MySQL Cluster 7.3 to MySQL Cluster 7.4 – the database stays up throughout the process and you don't lose any data or have to stop writing changes.

MySQL Cluster 7.4.6 is now available


The binary and source versions of MySQL Cluster 7.4.6 have now been made available at http://www.mysql.com/downloads/cluster/.

MySQL Cluster NDB 7.4.6 is a new maintenance release of MySQL Cluster, based on MySQL Server 5.6 and including features from version 7.4 of the NDB storage engine, as well as fixing a number of recently discovered bugs in previous MySQL Cluster releases.

This release also incorporates all bugfixes and changes made in previous MySQL Cluster releases, as well as all bugfixes and feature changes which were added in mainline MySQL 5.6 through MySQL 5.6.24.

The MySQL Cluster 7.4.6 Release Notes are available here.

MySQL 5.7.6: It is easier to switch master now!

Introduction

One of the primary objectives of MySQL replication is providing an easy failover process i.e. switching to a redundant system if the primary MySQL server fails. In MySQL this translates to switching to the most appropriate slave server in the eventuality of a failure of the master server.

The promotion of a candidate slave to become a new master is based on a lot of factors, undoubtedly the most important factor being that the chosen slave should be most up to date with the primary server (the old master) before we lost it! This blog explains how to use new features in MySQL 5.7.4 and later to make failover easier.

To find the most up-to-date slave server, a failover script looks at the set of transactions received by each slave and compares the sets to find the one that has received the largest set of transactions. There could be more sophisticated ways of doing this; for instance, you could choose one preferred slave that you want to promote (perhaps it has a better hardware configuration, no filters, a better physical location, etc.) and make sure it receives every transaction that has been received by all the other slaves. For simplicity, though, let's narrow down our definition of the most appropriate slave to promote to the one that is most up to date with the lost master.

How to failover using GTID based replication

To denote a set of transactions, MySQL uses GTID sets (a set of global transaction identifiers). To read more about GTIDs, you can refer to our official documentation or developer blogs. To find the set of transactions received by a MySQL slave server, you simply execute:

mysql> SELECT RECEIVED_TRANSACTION_SET FROM performance_schema.replication_connection_status;
+------------------------------------------+
| RECEIVED_TRANSACTION_SET                 |
+------------------------------------------+
| 4D8B564F-03F4-4975-856A-0E65C3105328:1-4 |
+------------------------------------------+
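Comparing these sets across slaves can be sketched with a small helper (hypothetical, not part of the server or the replication libraries) that counts the transactions in a RECEIVED_TRANSACTION_SET string; the slave whose set yields the highest count is the candidate for promotion. An interval such as "1-4" contributes four transactions, a lone number contributes one, and sets for different source UUIDs are separated by commas.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Count the transactions in a GTID set string such as
// "4D8B564F-03F4-4975-856A-0E65C3105328:1-4". Hypothetical helper for
// comparing slaves; it assumes a well-formed set with no whitespace.
uint64_t count_gtids(const std::string &gtid_set) {
    uint64_t total = 0;
    std::stringstream per_uuid(gtid_set);
    std::string uuid_part;
    // Each comma-separated part is "UUID:interval[:interval...]".
    while (std::getline(per_uuid, uuid_part, ',')) {
        std::stringstream fields(uuid_part);
        std::string field;
        std::getline(fields, field, ':');         // skip the UUID itself
        while (std::getline(fields, field, ':')) {
            size_t dash = field.find('-');
            if (dash == std::string::npos)
                total += 1;                        // single transaction number
            else                                   // interval "start-end"
                total += std::stoull(field.substr(dash + 1)) -
                         std::stoull(field.substr(0, dash)) + 1;
        }
    }
    return total;
}
```

A failover script would run the SELECT above on every slave, feed each result through a helper like this, and pick the slave with the largest count.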

Execute this on every slave and compare the sets to find the slave with the largest received transaction set – that's your candidate slave for promotion. Let us call this slave the new master. Before you switch a slave to replicate from the new master, you previously had to make sure all the transactions the slave had received were executed. In versions of MySQL prior to 5.7.4, you needed to do the following steps:

  1. stop slave.
  2. start slave to replicate until all received transactions are executed.
  3. wait until all received transactions are executed.
  4. switch master to redirect slaves to replicate from new master.
  5. start slave

This would translate to the following MySQL commands:

  1. STOP SLAVE;
  2. START SLAVE UNTIL SQL_AFTER_GTIDS= <received_transaction_set>;
  3. SELECT WAIT_FOR_EXECUTED_GTID_SET(<received_transaction_set>);
  4. CHANGE MASTER TO <new_master_def>;
  5. START SLAVE;

However, in MySQL 5.7.4 we introduced a feature which allows one to selectively stop only the component of replication that requires a change. In the present context, this means that to switch to a new master we only need to stop the receiver module (in technical terms, the I/O thread). The applier threads can continue applying pending transactions, if there are any, while we switch master. Building on this infrastructure we can now reduce the above steps to the following:

  1. Stop the receiver module (STOP SLAVE IO_THREAD).
  2. Switch master (CHANGE MASTER TO <new_master_def>).
  3. Start the receiver module (START SLAVE IO_THREAD).

Note the removal of the wait function (WAIT_FOR_EXECUTED_GTID_SET) used to ensure that the received transactions are executed before you switch master. There is no need for that step anymore!

How to failover using non-GTID based replication

If you are not using GTIDs (though we highly recommend that you do), you can still take advantage of the current improvements. This means you can change the following part of your failover script:

  1. STOP SLAVE
  2. SHOW SLAVE STATUS to get coordinates (Read_Master_Log_Pos, Master_Log_File)
  3. START SLAVE UNTIL coordinates saved in step (2).
  4. SELECT MASTER_POS_WAIT(coordinates, …).
  5. CHANGE MASTER TO <new_master_def>.
  6. START SLAVE.

to the following simpler steps:

  1. STOP SLAVE IO_THREAD;
  2. CHANGE MASTER TO <new_master_def>;
  3. START SLAVE IO_THREAD.

Conclusion

Using the improvements in the newer versions of MySQL 5.7, failover becomes easier. Our effort to improve MySQL high availability continues and we remain committed to easing the processes. Please do try this, and as always let us know your feedback. You can also use mysqlfailover or MySQL Fabric that automate the failover process.

MySQL Cluster 7.4 New Features – upcoming webinar

On Thursday I'll be hosting a webinar introducing MySQL Cluster and then looking into what's new in the latest version (MySQL Cluster 7.4) in some more detail. As always the webinar is free but please register here. I'll be answering on-line Q&A during the presentation.

Even if you can’t join the live webinar, it’s worth registering as you’ll be emailed a link to the replay as soon as it’s available.

Abstract

MySQL Cluster powers the subscriber databases of major communication services providers as well as next generation web, cloud, social and mobile applications. It is designed to deliver:

  • Real-time, in-memory performance for both OLTP and analytics workloads
  • Linear scale-out for both reads and writes
  • 99.999% High Availability
  • Transparent, cross-shard transactions and joins
  • Update-Anywhere Geographic replication
  • SQL or native NoSQL APIs
  • All that while still providing full ACID transactions.

Understand some of the highlights of MySQL Cluster 7.4:

  • 200 Million queries per minute
  • Active-Active geographic replication with conflict detection and resolution
  • 5x faster on-line maintenance activities
  • Enhanced reporting for memory and database operations

When

  • Thu, Apr 09: 09:00 Pacific time (America)
  • Thu, Apr 09: 10:00 Mountain time (America)
  • Thu, Apr 09: 11:00 Central time (America)
  • Thu, Apr 09: 12:00 Eastern time (America)
  • Thu, Apr 09: 13:00 São Paulo time
  • Thu, Apr 09: 16:00 UTC
  • Thu, Apr 09: 17:00 Western European time
  • Thu, Apr 09: 18:00 Central European time
  • Thu, Apr 09: 19:00 Eastern European time
  • Thu, Apr 09: 21:30 India, Sri Lanka
  • Fri, Apr 10: 00:00 Singapore/Malaysia/Philippines time
  • Fri, Apr 10: 00:00 China time
  • Fri, Apr 10: 01:00 Japan
  • Fri, Apr 10: 02:00 NSW, ACT, Victoria, Tasmania (Australia)

SQL/NoSQL – Best of Both Worlds; webinar replay available

Last week I hosted a webinar explaining how you can get the best from the NoSQL world while still getting all of the benefits of a proven RDBMS. The webinar replay is now available to view here.

Abstract

There’s a lot of excitement about NoSQL data stores, with the promise of simple access patterns, flexible schemas, scalability, and high availability. The downside comes in the form of losing ACID transactions, consistency, flexible queries, and data integrity checks. What if you could have the best of both worlds? Join this webinar to learn how MySQL Cluster provides simultaneous SQL and native NoSQL access to your data—whether it’s in a simple key-value API (memcached) or REST, JavaScript, Java, or C++. You will hear how the MySQL Cluster architecture delivers in-memory real-time performance; 99.999 percent availability; online maintenance; and linear, horizontal scalability through transparent autosharding.

Upcoming webinar – SQL & NoSQL – the best of both worlds

On Thursday I'll be hosting a webinar explaining how you can get the best from the NoSQL world while still getting all of the benefits of a proven RDBMS. As always the webinar is free but please register here.

Even if you can’t join the live webinar, it’s worth registering as you’ll be emailed a link to the replay as soon as it’s available.

Abstract

There’s a lot of excitement about NoSQL data stores, with the promise of simple access patterns, flexible schemas, scalability, and high availability. The downside comes in the form of losing ACID transactions, consistency, flexible queries, and data integrity checks. What if you could have the best of both worlds? Join this webinar to learn how MySQL Cluster provides simultaneous SQL and native NoSQL access to your data—whether it’s in a simple key-value API (memcached) or REST, JavaScript, Java, or C++. You will hear how the MySQL Cluster architecture delivers in-memory real-time performance; 99.999 percent availability; online maintenance; and linear, horizontal scalability through transparent autosharding.

When

  • Thu, Mar 26: 09:00 Pacific time (America)
  • Thu, Mar 26: 10:00 Mountain time (America)
  • Thu, Mar 26: 11:00 Central time (America)
  • Thu, Mar 26: 12:00 Eastern time (America)
  • Thu, Mar 26: 13:00 São Paulo time
  • Thu, Mar 26: 16:00 UTC
  • Thu, Mar 26: 16:00 Western European time
  • Thu, Mar 26: 17:00 Central European time
  • Thu, Mar 26: 18:00 Eastern European time
  • Thu, Mar 26: 21:30 India, Sri Lanka
  • Fri, Mar 27: 00:00 Singapore/Malaysia/Philippines time
  • Fri, Mar 27: 00:00 China time
  • Fri, Mar 27: 01:00 Japan
  • Fri, Mar 27: 03:00 NSW, ACT, Victoria, Tasmania (Australia)