MySQL Cluster Manager 1.4.2 released!

MySQL Cluster Manager 1.4.2 is now available for download from My Oracle Support.


In this blog post we will highlight some of the details of the MCM 1.4.2 release.

Progress reporting
First, some eye candy. MCM 1.4.2 adds simple progress reporting, which gives you a sense of how far along an operation is and is especially useful for long-running commands. There are two variants: a plain one, and one that includes an ASCII-art progress bar.

Combine this with something like the Linux/Unix watch command, and you can easily monitor cluster health directly in the CLI. Please note that monitoring with a very short interval, combined with a log level of debug or info, will cause excessive logging.
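As a rough illustration of the polling idea, here is a minimal Python sketch of a watch-style loop that refuses intervals short enough to flood the agent log. The names poll_status and run_command, and the 5 second floor, are assumptions made for this example and are not part of MCM. The status command itself is passed in by the caller, so the sketch makes no claim about the exact client invocation.

```python
import subprocess
import time
from typing import Callable, List

# Assumed floor for the polling interval; tune it to your agent's log level.
MIN_INTERVAL_SECONDS = 5.0


def run_command(argv: List[str]) -> str:
    """Run one status command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def poll_status(
    argv: List[str],
    interval: float,
    iterations: int,
    runner: Callable[[List[str]], str] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run the status command `iterations` times, pausing `interval` seconds
    between runs, and return the collected outputs."""
    if interval < MIN_INTERVAL_SECONDS:
        raise ValueError(
            f"interval {interval}s is below the {MIN_INTERVAL_SECONDS}s floor"
        )
    outputs = []
    for i in range(iterations):
        outputs.append(runner(argv))
        # No pause is needed after the final run.
        if i < iterations - 1:
            sleep(interval)
    return outputs
```

The runner and sleep parameters are injectable only so that the loop can be exercised without a running agent.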

Update process
Like it or not, one of the main motivations for running a distributed system in production is the ability to handle disasters and recover from them. In production, making your service available again can matter more than exactly how you do it. In desperation, the missing process is restarted on the command line to restore availability. When the dust finally settles, you clean up.

In earlier versions there was no way for MCM to (re)discover the restarted process, meaning yet another restart was needed for MCM to resume management of the lost process. With very large databases this could mean an unwanted service and durability degradation, since a datanode may take a significant amount of time to restart.

With MCM 1.4.2 we add the ability to reattach a process started manually on the command line and put it under MCM control once again. The new functionality builds on the import cluster functionality of MCM and imposes similar restrictions on the reattached process. Please have a look at the documentation for details and limitations.

Example:

Assume your mission-critical cluster is up and running in production. You did ensure your backups are good, are actually restorable, and are stored in a safe place, right?

Disaster strikes at the worst possible time, and host shark1 does not survive the carnage. The necessary actions are taken to quickly replace the hardware, and the failed datanode is restarted on the command line to restore availability. All looks well from the ndb_mgm command-line client. Phew!

… but, hang on. MCM doesn’t see it!

You have to let MCM know the pid of the restarted process using the new update process command.

The restarted process has been successfully reattached to MCM, and service is fully restored.

Do note that the failed datanode must be restarted using the --foreground=true option, so that it runs without an angel process.
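To make the reattach step a little more concrete, the following Python sketch checks that a pid refers to a live process and then formats an update process statement for the MCM client. The helper names are hypothetical. The argument order (--pid, then the process id, then the cluster name) is an assumption based on my reading of the command and should be checked against the documentation. The liveness check uses signal 0 and is meant for POSIX systems only.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this pid exists (POSIX only)."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    try:
        # Signal 0 performs the existence check without delivering a signal.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def update_process_statement(pid: int, process_id: int, cluster: str) -> str:
    """Format the statement to paste into the MCM client.

    Assumed syntax: update process --pid=<pid> <process_id> <cluster>;
    """
    if process_id <= 0:
        raise ValueError("process_id must be a positive integer")
    if not cluster or any(ch.isspace() for ch in cluster):
        raise ValueError("cluster name must be non-empty, without whitespace")
    return f"update process --pid={pid} {process_id} {cluster};"
```

Checking the pid first guards against handing MCM a stale value copied from an old pid file or terminal session.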

Collect logs
There are some significant improvements to the existing collect logs feature in MCM 1.4.2.

The port number used by the collector thread is now user configurable using the --copy-port=4567 option, or as a setting in the agent’s defaults-file. This enables log collection in setups where a firewall would block access to the random port number designated by previous versions of MCM.
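When the port is set in the defaults-file, it can be handy to confirm what value an agent will actually pick up before asking for a firewall rule. The sketch below parses the text of an agent defaults-file and returns the configured copy port, or None when no port is set. It assumes the option lives in an [mcmd] section under the key copy-port, mirroring the command-line option name. The function name is made up for this example.

```python
import configparser
from typing import Optional


def configured_copy_port(ini_text: str, section: str = "mcmd") -> Optional[int]:
    """Return the copy port set in the defaults-file text, or None if absent."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(ini_text)
    # Assumed key name, mirroring the --copy-port command-line option.
    if not parser.has_option(section, "copy-port"):
        return None
    port = parser.getint(section, "copy-port")
    if not 1 <= port <= 65535:
        raise ValueError(f"copy-port {port} is outside the valid TCP port range")
    return port
```

A return value of None means no fixed port is configured in that file, so the firewall rule would have nothing stable to match.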

With MCM 1.4.2 the sender threads on each agent send more metadata to the collector thread about the files being collected. This improves progress tracking and detection of partially transferred file sets, or partially transferred files.

Reinitializing mysqld datadir
Assume your mysqld datadir was corrupted, and mysqld refuses to start. Earlier versions required manual intervention to reinitialize the mysqld's datadir, but with MCM 1.4.2 this can now be done using start process --initial. The datadir must be empty for this command to succeed. If there are files left in the datadir, MCM will refuse the reinitialization and return an error. Do note that mcmd does not rerun any restore operation on non-NDB tables, so you still need to manually restore any non-system InnoDB tables that you need, from your safely stored backups…
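Since the command is refused when anything is left in the datadir, a quick pre-check can save a round trip. This is a minimal Python sketch of such a check. The function name is an assumption, and the sketch only inspects the directory. It never deletes anything.

```python
import os


def datadir_is_empty(path: str) -> bool:
    """Return True if the directory exists and contains no entries.

    Raises FileNotFoundError if the directory does not exist, and
    NotADirectoryError if the path is not a directory.
    """
    with os.scandir(path) as entries:
        # Hidden files count as entries too, so any single entry
        # makes the directory non-empty.
        return next(entries, None) is None
```

Moving the old contents aside rather than deleting them keeps the corrupted files available for later inspection.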

MCM 1.4.2 also applies the same automation if you fully replace a mysqld host or disk. When mcmd rejoins an existing site without any configuration metadata, it automatically recovers the configuration metadata from the other members and restarts any processes that should run on the replaced host. MCM 1.4.2 will automatically reinitialize any missing mysqld datadirs on the replaced host, as if you ran start process --initial.

More details
We have also made a number of smaller improvements and, as always, fixed bugs. More details are available in the MCM 1.4.2 Release Notes.

And updated documentation is available here.

Enjoy!

About Thomas Nielsen

Thomas Nielsen is a software engineer and the Team Lead for MySQL Cluster Manager at Oracle. He has worked on different high availability products and technologies in databases, as well as JavaDB, before joining the MySQL Cluster team. He holds an engineering degree from the Norwegian University of Science and Technology, and his pastime passion is gliding instruction and cross-country soaring.
