More P_S Instrumentation for Group Replication

To improve the observability and manageability of MySQL Group Replication (GR), we have enhanced its instrumentation in MySQL 8.0.4. With these goals in mind, we extended the use of Performance Schema (P_S) from the plugin itself to the included Group Communication System (GCS) module and its submodule, eXtended Communications (XCom), which together are responsible for all communication involved in disseminating transactions to other MySQL instances.

The introduction of P_S in GCS/XCom implies the instrumentation of several code elements and, therefore, it is a step-wise process. The first step was to instrument the existing threads on GCS, since thread instrumentation is required to enable the instrumentation of other code elements, more specifically, to render their monitoring data visible in the corresponding P_S tables.

The remainder of this blog post describes steps that you can execute individually on a GR server. A working GR setup is required, regardless of the number of cluster nodes; to create one, refer to the Getting Started section of the Group Replication chapter of the MySQL 8.0 Reference Manual.

Threads

Let’s start by having a look at the P_S threads table, which lists all running server threads. Each entry on this table contains information about a thread and indicates whether its monitoring and historical event logging are enabled. The most relevant columns on this table are:

  • THREAD_ID – A unique thread identifier.
  • NAME – The name associated with the thread instrumentation code in the server.
  • TYPE – The thread type, either FOREGROUND or BACKGROUND.
  • INSTRUMENTED – Defines whether events executed by the thread are instrumented.
  • HISTORY – Defines whether historical events for the thread should be logged.

For further detail on the threads table, check the corresponding manual section.

Since all GCS threads are background threads, the value of the INSTRUMENTED and HISTORY columns will be YES by default and can be altered during the thread’s lifetime.

To monitor events of an instrumented thread, thread_instrumentation must be enabled in the setup_consumers table, as must the events instrumentation in the setup_instruments table.
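For example, you can check and, if needed, enable this consumer with queries like the following (a sketch using the standard P_S table names):

```sql
-- Check whether thread instrumentation is being consumed.
SELECT NAME, ENABLED
  FROM performance_schema.setup_consumers
 WHERE NAME = 'thread_instrumentation';

-- Enable it if it is not.
UPDATE performance_schema.setup_consumers
   SET ENABLED = 'YES'
 WHERE NAME = 'thread_instrumentation';
```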

Currently, GCS contains four instrumented threads, which remain alive for the entire lifecycle of GCS, i.e. while GR is active, and they are visible in the P_S threads table.
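You can list them with a query along these lines (the LIKE pattern is an assumption; adjust it to match the thread names shown on your server):

```sql
SELECT THREAD_ID, NAME, TYPE, INSTRUMENTED, HISTORY
  FROM performance_schema.threads
 WHERE NAME LIKE '%group_rpl%';
```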

Here is a brief description of these four threads:

  • Gcs_ext_logger_impl::m_consumer deals with logging events triggered by any code inside GCS, in order to minimize any performance impact due to the building and printing of logging messages.
  • Gcs_xcom_engine::m_engine_thread processes internal GCS events sequentially.
  • Gcs_xcom_control::m_xcom_thread is responsible for running XCom.
  • Gcs_xcom_control::m_suspicions_processing_thread is used to expel from the group any members that the node suspects might have failed.

Mutexes and Conditions

The second step of the process for improving the monitoring of GR lies in the instrumentation of the synchronization objects present in the source code of GCS, namely mutexes and conditions.

The instrumented mutexes and conditions will appear in the setup_instruments table, and the most important columns of this table are:

  • NAME – Contains the instrument’s name, which is composed of the class where it is located and the corresponding variable name, prefixed by ‘wait/synch/mutex/group_rpl/’ if it is a mutex or ‘wait/synch/cond/group_rpl/’ if it is a condition, followed by ‘GCS_‘ to distinguish these from other GR-specific instruments.
  • ENABLED – Determines if the instrument is enabled, and in this case, generated events are captured.
  • TIMED – Determines if the instrument is to be timed. In case timing is disabled, the instrument can still be enabled but no timing values are gathered.

The value of the last two columns can be modified at any time, but changes will not affect existing instances of that instrument.

For more information on the setup_instruments table, see the corresponding manual section.

You can see that no instrumented GCS mutexes or conditions are currently enabled or timed in the setup_instruments table, which is the default behavior.
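You can confirm this with a query such as the following (the LIKE pattern follows the ‘wait/synch/…/group_rpl/GCS_’ naming scheme described above):

```sql
SELECT NAME, ENABLED, TIMED
  FROM performance_schema.setup_instruments
 WHERE NAME LIKE 'wait/synch/%/group_rpl/GCS_%';
```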

Nevertheless, their instances will show up in the mutex_instances and cond_instances tables.

Both of these tables share some columns, such as:

  • NAME – Identifies the instrumented mutex or condition.
  • OBJECT_INSTANCE_BEGIN – The memory address of the instrumented mutex or condition.

The mutex_instances table has an additional column, LOCKED_BY_THREAD_ID, which contains the THREAD_ID of the thread that currently holds the mutex, or NULL if the mutex is not locked.

Let’s check the contents of the mutex_instances and cond_instances tables:
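A sketch of the corresponding queries, filtering on the GCS instrument name prefixes:

```sql
SELECT NAME, OBJECT_INSTANCE_BEGIN, LOCKED_BY_THREAD_ID
  FROM performance_schema.mutex_instances
 WHERE NAME LIKE 'wait/synch/mutex/group_rpl/GCS_%';

SELECT NAME, OBJECT_INSTANCE_BEGIN
  FROM performance_schema.cond_instances
 WHERE NAME LIKE 'wait/synch/cond/group_rpl/GCS_%';
```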

In this case, no mutex instance appears as locked by any thread, but this is because the instrumented GCS mutexes are not enabled in the setup_instruments table. To obtain this information, i.e. the identifier of the thread that locked the mutex, the instrumentation of the mutex must be enabled.

You can use the following command to enable the instrumentation and timing of all the instrumented mutexes and conditions in GCS:
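A sketch of such a command, matching the instrument name prefixes described earlier:

```sql
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES', TIMED = 'YES'
 WHERE NAME LIKE 'wait/synch/%/group_rpl/GCS_%';
```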

After this step, you can verify that these changes were applied to the GCS mutexes and conditions in the setup_instruments table:
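For instance, re-running a query over the GCS instruments should now show ENABLED and TIMED set to YES:

```sql
SELECT NAME, ENABLED, TIMED
  FROM performance_schema.setup_instruments
 WHERE NAME LIKE 'wait/synch/%/group_rpl/GCS_%';
```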

After executing these changes, you must restart GR in order for them to become effective for all mutex and condition instances.

Please refer to the Launching Group Replication section of the manual in order to restart GR properly, and pay attention to the value of the group_replication_bootstrap_group variable, especially if you’re using a single node.

After restarting the GR plugin, there will be some visible differences in the entries for the GCS mutexes and conditions in the mutex_instances and cond_instances tables:
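For instance, to see which GCS mutexes are currently held, and by which thread, you can run a query along these lines:

```sql
SELECT NAME, OBJECT_INSTANCE_BEGIN, LOCKED_BY_THREAD_ID
  FROM performance_schema.mutex_instances
 WHERE NAME LIKE 'wait/synch/mutex/group_rpl/GCS_%'
   AND LOCKED_BY_THREAD_ID IS NOT NULL;
```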

As you can see, all the values in the OBJECT_INSTANCE_BEGIN column of all these entries have changed, as expected since we have restarted GR. Furthermore, in the mutex_instances table, we can now see the identifiers of the threads that locked the Gcs_async_buffer::m_free_buffer_mutex and Gcs_xcom_engine::m_wait_for_notification_mutex, which are the Gcs_ext_logger_impl::m_consumer (THREAD_ID: 95) and the Gcs_xcom_engine::m_engine_thread (THREAD_ID: 96) threads, respectively.

Wait Event Tables

Useful information beyond the memory address or the identity of the thread holding a mutex can be extracted from the P_S tables, namely from the following wait event tables, which contain all the wait events produced by the instrumented synchronization objects:

  • events_waits_current – Stores current wait events.
  • events_waits_history – Stores the most recent N wait events per thread. The value of N is auto-sized by the server during startup, but can also be set explicitly through the performance_schema_events_waits_history_size system variable at server startup.
  • events_waits_history_long – Stores the most recent N wait events globally. The value of N is auto-sized by the server during startup, but can also be set explicitly through the performance_schema_events_waits_history_long_size system variable at server startup.

These three tables share the exact same structure, and we can identify the following relevant columns:

  • THREAD_ID – Identifies the thread associated with the event.
  • EVENT_ID – Identifier of the event.
  • END_EVENT_ID – Contains NULL while the event is in progress and takes the value of EVENT_ID when it ends.
  • EVENT_NAME – The name of the instrumented element that produced the event.
  • SOURCE – Identifies the source file and the line number of the instrumented code element.
  • TIMER_START, TIMER_END, TIMER_WAIT – These columns comprise the timing information for the event, in picoseconds. TIMER_START indicates when the event started, TIMER_END when it ended (or the current timer value if the event hasn’t finished), and TIMER_WAIT corresponds to the elapsed time (TIMER_END – TIMER_START). If timing is disabled for an instrument, all three columns will be NULL.
  • OBJECT_INSTANCE_BEGIN – Memory address of the synchronization object.
  • OPERATION – States the type of operation of the event. In this case, we will observe the values wait for conditions and lock for mutexes.

Let’s start by verifying whether the consumers corresponding to these tables are enabled in the setup_consumers table.
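A sketch of the corresponding query:

```sql
SELECT NAME, ENABLED
  FROM performance_schema.setup_consumers
 WHERE NAME LIKE 'events_waits%';
```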

In this case they are disabled, which is the default. Therefore, we must enable them in order to activate historical event logging for the instrumented code elements that are enabled in the setup_instruments table, and for the instrumented threads whose HISTORY column is enabled in the threads table.
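One way to enable them (a sketch; this enables all three wait-event consumers at once):

```sql
UPDATE performance_schema.setup_consumers
   SET ENABLED = 'YES'
 WHERE NAME LIKE 'events_waits%';
```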

Finally we can verify the contents of the wait event tables:
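A sketch of the queries, filtering on the GCS event names and limiting the output of the history table:

```sql
SELECT THREAD_ID, EVENT_NAME, SOURCE, TIMER_WAIT, OPERATION
  FROM performance_schema.events_waits_current
 WHERE EVENT_NAME LIKE 'wait/synch/%/group_rpl/GCS_%';

SELECT THREAD_ID, EVENT_NAME, OPERATION
  FROM performance_schema.events_waits_history_long
 WHERE EVENT_NAME LIKE 'wait/synch/%/group_rpl/GCS_%'
 LIMIT 10;
```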

Notice that the events_waits_current table contains the three current wait events for GCS, and that the events_waits_history and events_waits_history_long tables hold a large number of rows. For that reason, we show only 10 rows of their contents.

For more information about these tables, for instance to enable the desired instruments and these tables at server startup, you can follow the instructions in the Performance Schema Wait Event Tables section of the manual.

Summary Tables

Since wait events can be very fast and hard to trace using the aforementioned wait event tables, the P_S summary tables are very useful, as they aggregate information for terminated events over time in different ways. The most useful summary tables for our case are the events_waits_summary tables, as they aggregate information on the instrumented synchronization objects.

There are several summary tables that aggregate information, such as the usage count and wait time statistics, for completed wait events, but we will only look at two of them:

  • events_waits_summary_global_by_event_name – Summarizes data from all the instances of a given instrument.
  • events_waits_summary_by_instance – Summarizes data for a given instrument instance.

All wait event summary tables have these summary columns containing aggregated values:

  • COUNT_STAR – Number of summarized events, whether timed or not.
  • SUM_TIMER_WAIT – Total wait time of all summarized timed events.
  • MIN_TIMER_WAIT – Minimum wait time of the summarized timed events.
  • AVG_TIMER_WAIT – Average wait time of the summarized timed events.
  • MAX_TIMER_WAIT – Maximum wait time of the summarized timed events.

In addition to these columns, both tables include the EVENT_NAME column, which is the primary key of the events_waits_summary_global_by_event_name table. The events_waits_summary_by_instance table also includes the OBJECT_INSTANCE_BEGIN column, which is its primary key.

Let’s have a look at the content of these two tables, namely on the entries corresponding to the instrumented mutexes and conditions of GCS:
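A sketch of the two queries, again filtering on the GCS instrument names:

```sql
SELECT EVENT_NAME, COUNT_STAR, SUM_TIMER_WAIT, AVG_TIMER_WAIT
  FROM performance_schema.events_waits_summary_global_by_event_name
 WHERE EVENT_NAME LIKE 'wait/synch/%/group_rpl/GCS_%';

SELECT EVENT_NAME, OBJECT_INSTANCE_BEGIN, COUNT_STAR, SUM_TIMER_WAIT
  FROM performance_schema.events_waits_summary_by_instance
 WHERE EVENT_NAME LIKE 'wait/synch/%/group_rpl/GCS_%';
```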

One can see that the 21 rows selected from the events_waits_summary_global_by_event_name table correspond to the 13 mutexes plus the 8 conditions that are instrumented in GCS.

The 26 selected rows from the events_waits_summary_by_instance table belong to the same instrumented objects, but include 5 more rows than the previous table, which correspond to the 5 additional instances of the XCom_handler::m_lock instrumented mutex.

Conclusion

In this blog post we have presented the new features for monitoring how GCS and XCom are working, namely through the instrumentation of their synchronization objects and threads. Stay tuned for further instrumentation of GCS and XCom in upcoming versions of MySQL, which will provide even more extensive monitoring. In the meantime, please refer to the MySQL Performance Schema chapter of the manual for further information on instrumentation.

We hope that you try this new monitoring feature of GR included in MySQL 8.0.4. Please let us know about your experience, feedback, or doubts through comments on this blog or through MySQL bug reports, but bear in mind that this release candidate is not yet generally available and therefore not production ready. Use it at your own risk.
