This is a description of how to prevent mysqldump failures that are hard to explain and happen only sporadically.
When taking backups of a database containing large rows—usually with BLOB columns—you might sporadically experience error messages like:
mysqldump: Error 2013: Lost connection to MySQL server during query when dumping table `tablename` at row: 1342
One of the strange symptoms of this problem is that the row number may vary between runs of the identical statement on an unchanged set of data, and that the records named in the error message turn out to be perfectly fine.
We ran into this problem time and again, but very infrequently, over the course of several months. Often restarting the dump would make the problem disappear, only to have it come up again after a seemingly unpredictable number of successful runs.
The problem was finally diagnosed and identified as documented in MySQL Bug #46103 at http://bugs.mysql.com/bug.php?id=46103.
When mysqldump runs, it connects to the MySQL server over a network connection, just like any other MySQL client. As such, it is subject to the usual settings, in particular the various network timeouts and the max_allowed_packet setting.
What may now happen with large table rows is that net_write_timeout is set to a time limit too short for a whole data packet of up to max_allowed_packet bytes to be transferred from the server to the client and written to disk there. In our experience, this might even happen on a loaded machine when mysqldump is connecting via localhost.
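A quick back-of-envelope calculation illustrates the point. The figures below are illustrative assumptions, not measurements from the original text: a packet near the 1 GiB maximum, moving at an effective 10 MiB/s on a loaded machine, already needs far longer than the 60-second default timeout:

```shell
# Time to move a 1024 MiB packet at an assumed 10 MiB/s: about 102 seconds,
# well beyond the 60-second default value of net_write_timeout.
echo $(( 1024 / 10 ))
```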
To the MySQL server, this looks as if the client has stopped responding, and it terminates the connection after net_write_timeout seconds, causing the error message shown earlier. Because the problem depends on server and network load, the error message can contain varying row numbers, making it even harder to understand at first glance.
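Before changing anything, it can help to check which timeout is currently in effect. The following is a minimal check against a running server; it assumes the stock mysql command-line client and sufficient privileges:

```shell
# Display the current global write timeout; the stock default is 60 seconds.
$ mysql -e "SHOW GLOBAL VARIABLES LIKE 'net_write_timeout';"
```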
The fix is quite easy: before running mysqldump, set net_write_timeout to a value large enough for a full data packet to be transferred over the network and its contents written to the SQL dump file:
$ mysql -e "SET GLOBAL net_write_timeout=120;"
This will give mysqldump two minutes to retrieve and store a single data packet, which should be plenty even for large BLOB columns.
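If you raise the global timeout only for the backup, you may want to restore the previous value afterwards. The sketch below is one possible wrapper, not part of the original recipe; the database name mydb and the dump file name are placeholders:

```shell
#!/bin/sh
# Remember the current global timeout, raise it for the dump, restore it after.
OLD_TIMEOUT=$(mysql -N -e "SELECT @@GLOBAL.net_write_timeout;")
mysql -e "SET GLOBAL net_write_timeout=120;"
mysqldump mydb > mydb_dump.sql    # "mydb" is a placeholder database name
mysql -e "SET GLOBAL net_write_timeout=${OLD_TIMEOUT};"
```

Note that SET GLOBAL only affects connections opened afterwards, which is exactly what mysqldump does here, and it requires an account with the appropriate administrative privilege.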
At the time of writing, bug report #46103 is being kept open as a feature request, asking that mysqldump request a long enough timeout automatically. Until that gets implemented, you can use the workaround presented here.