Preserving commit order on replicas with binary log disabled

MySQL 8.0.19 introduces Binlogless replicas with commit ordering which means you can deploy asynchronous replicas without binary logs enabled, and commit transactions in the same order they are replicated in. Yes, you can disable binlog (skip-log-bin) and the logging of changes done by the applier (log-slave-updates=FALSE) while at the same preserve commit order (slave-preserve-commit-order=TRUE).

Among other things, preserving commit order can be useful for those relying on WRITESET-based dependency tracking for aggressive transaction parallelization. So, now on your replicas that have no binary logs, you can make use of both parallel worker threads and WRITESETs to boost the applier throughput, all while exporting the same transaction commit order on the replica, as observed in the input replication stream.

Before diving into further details of how this change would benefit users of MySQL, let me provide some background of Slave Preserve Commit Order (SPCO). The readers who are already familiar with concept of Slave Preserve Commit Order, can skip the next section.

Slave Preserve Commit Order

For multithreaded slaves (slave_parallel_workers > 1), enabling slave_preserve_commit_order variable ensures that transactions are externalized on the replica in the same order as they appear in the replica’s relay log, preserving the same transaction history on the slave as on the master. The transactions can be executed in parallel, but the executing thread waits for its turn until all previous transactions are committed before committing.

Example

Suppose we have two transactions T1 and T2 and we assume that T1 appears before T2 in the relay log. The transactions T1 and T2 can begin to execute in parallel, but the executing thread will wait before the commit, until all previous transactions are committed.

The transaction T2 can execute before T1, but as T1 is before T2 in the relay log and as slave_preserve_commit_order is also enabled, it will only get committed after T1.

The Slave Preserve Commit Order functionality where each transaction wait for all previous transaction to commit before committing, works same way whether its binlog or binlogless replica. In the following section, will get deeper in the implementation and check some benchmark results on performance impact.

Slave Preserve Commit Order for Binlogless replicas

In MySQL 5.6, the binary log group commit implementation was added to improve performance for multi threaded slave (MTS), achieved by reducing the number of writes and flushes to disk. Slave Preserve Commit Order for Binlogless replicas uses the Flush Stage of Binary log group commit. Binary log group commit, whose details you can read on WL#5223, includes other binlog related stages (Sync and Commit) that are not relevant for this topic. The Flush stage, where records are written and flushed to the disk, is considered a costly operation and so to improve performance, records of multiple slave threads are flushed together in group.

Performance

To assess the benefit of Slave Preserve Commit Order on Binlogless replica, let’s have a look at the results of benchmarks where we are comparing it with Binlog enabled replica for MySQL 8.0.19 for durable settings. The benchmarks were executed with below mysql configuration:

  1. Master : All the masters irrespective of binlog ON/OFF used below setting:
    • –binlog-transaction-dependency-tracking= COMMIT_ORDER  –innodb_flush_log_at_trx_commit=2 –sync_binlog=0
  2. Binlogless Replica
    • –slave-parallel-workers=128 –slave-parallel-type=LOGICAL_CLOCK –slave-preserve-commit-order=ON –innodb_flush_log_at_trx_commit=1 –sync_binlog=0 –disable-log-bin
  3. Binlog Replica
    • –slave-parallel-workers=128 –slave-parallel-type=LOGICAL_CLOCK –log_bin=mysql-bin.log –log_slave_updates=ON –slave-preserve-commit-order=ON  –innodb_flush_log_at_trx_commit=1 –sync_binlog=1
  4. The below system variable configurations were common for all mysql servers used for performance runs:
    • –gtid-mode=ON –enforce_gtid_consistency –log_slave_updates –innodb_buffer_pool_instances=45 –innodb_buffer_pool_size=92160M –innodb_doublewrite=0  –innodb_io_capacity=4000 –innodb_log_files_in_group=4 –innodb_log_file_size=1024M –innodb_open_files=10000 –innodb_read_io_threads=32 –innodb_thread_concurrency=64 –innodb_undo_log_truncate=OFF –innodb_undo_tablespaces=3 –innodb_write_io_threads=32 –loose-information_schema_stats=latest –max_connections=2050 –max_prepared_stmt_count=1048576 –sort_buffer_size=327680 –sql_mode=NO_ENGINE_SUBSTITUTION –table_open_cache=8000
  5. All the mysql system variables not mentioned above does use there default values.

The benchmarks were executed on a machine with two Intel(R) Xeon(R) CPU E5-2660 v3 processors (20 cores, 40 h/w threads) and 256GB RAM. The workload is applied on master with the applier (sql_thread) on the replica stopped, when all data is loaded in relay log of replica then only applier (sql_thread) is started. The charts below were obtained after executing two Sysbench workloads on the master.

The OLTP Write-only where each transaction consists of one oltp-index-updates, one oltp-non-index-updates, one oltp-ins-del-updates. The 7.5 million transactions are applied using 1024 client threads on master. The applier throughput (transactions/second) is calculated for 128 slave parallel worker threads as shown in below chart:

The OLTP Insert benchmark which uses 10 client threads each executing transactions consisting of 1024 insert query on master. The 78 thousand transaction are applied on master. The applier throughput (transactions/second) is calculated for 128 slave parallel worker threads as shown in below chart:

It can be clearly observed from the above charts that Slave Preserve Commit Order for Binlogless replica is giving better throughput as compared to Slave Preserve Commit Order for Binlog enabled replica for both OLTP Write-Only and Insert benchmark irrespective of durability settings.

Conclusions

Slave Preserve Commit Order for Binlogless replicas is an important enhancements for MySQL Replication in the following aspects:

  • Save disk space on replicas otherwise used for binary logging.
  • Improve throughput of the binary log applier on the replica, as shown in above charts, when compared with binlog enabled replicas.
  • Freely use WRITESET parallelization (–binlog_transaction_dependency_tracking= WRITESET | WRITESET_SESSION) to improve throughput of the binary log applier on the slave. Please check Improving the Parallel Applier with Writeset-based Dependency Tracking to know the improvements it bring to the throughput of the binary log applier.

Reference

 7,701 total views,  24 views today

Leave a Reply