Elusive problem on AWS RDS MySQL


I have a provisioned, single-master RDS DB (5.6.mysql_aurora.1.23.0) with a writer and a reader. The clients are a number of Jetty applications, each with a pool of up to 250 connections.

For some time I have been struggling with intermittent problems, mostly connection attempts that start to time out. They seem to correlate with high load. I have started copying the entire information_schema.processlist to a log table at regular intervals (with a timestamp), and the problem appears to become most pronounced when the number of connections reaches 2000. It never gets near max_connections or max_user_connections:
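For readers who want to reproduce the periodic snapshot described above, here is a minimal sketch. The table name processlist_log and its column names are hypothetical (the original post does not show them), and the column types are assumptions chosen to be wide enough for the processlist fields.

```sql
-- Hypothetical log table; names and column widths are illustrative assumptions.
CREATE TABLE IF NOT EXISTS processlist_log (
    captured_at  DATETIME        NOT NULL,  -- when the snapshot was taken
    conn_id      BIGINT UNSIGNED NOT NULL,
    conn_user    VARCHAR(128),
    conn_host    VARCHAR(255),
    conn_db      VARCHAR(64),
    conn_command VARCHAR(16),
    conn_time    INT,
    conn_state   VARCHAR(64),
    conn_info    LONGTEXT,
    KEY idx_captured_at (captured_at)
);

-- Run this at regular intervals (for example from a scheduled job).
INSERT INTO processlist_log
SELECT NOW(), ID, USER, HOST, DB, COMMAND, TIME, STATE, INFO
FROM information_schema.processlist;

-- Connection count per snapshot, to line up against the time-out incidents.
SELECT captured_at, COUNT(*) AS connections
FROM processlist_log
GROUP BY captured_at
ORDER BY captured_at;
```

Grouping by the snapshot timestamp makes it easy to see whether the time-outs really cluster around 2000 connections or only appear to.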

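mysql> show variables like "%max%conn%";
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| aurora_max_connections_limit | 16000 |
| max_connect_errors           | 100   |
| max_connections              | 3000  |
| max_user_connections         | 0     |
+------------------------------+-------+
4 rows in set (0.03 sec)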

So, which parameters should I concentrate on to get past this problem? I haven't found anything that plausibly explains why 2000 connections seem to cause this. Of course, it may simply be a coincidence that such a round number pops up.

How to solve:


Method 1

The most likely cause is memory exhaustion. Every connection to the database requires some memory. If each connection needs 1 MB (a low estimate), then 2000 connections use about 2 GB of memory for the connections alone.

This SQL query gives a rough upper bound on the memory a single connection can use (these buffers are only allocated when a query actually needs them):

SELECT ( @@read_buffer_size
       + @@read_rnd_buffer_size
       + @@sort_buffer_size
       + @@join_buffer_size
       + @@binlog_cache_size
       + @@thread_stack
       + @@tmp_table_size
       + 2*@@net_buffer_length
       ) / (1024 * 1024) AS MEMORY_PER_CON_MB;

Check that this number isn't too high. What counts as too high is subjective and depends on experience with your specific databases. If the database regularly exhausts its memory, you may need either a memory-optimized instance or stricter limits on the resources each connection may consume.
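To put the per-connection figure in context, a sketch like the following scales it by the current number of connections. The aliases (per_conn_mb, connections, worst_case_gb, buffer_pool_gb) are illustrative, and the result is a pessimistic estimate, since it assumes every connection allocates all of its buffers at once. The buffer pool size is shown only for comparison.

```sql
-- Sketch only: worst-case connection memory at the current connection count.
SELECT ROUND(t.per_conn_mb, 1)                            AS per_conn_mb,
       t.connections                                      AS connections,
       ROUND(t.per_conn_mb * t.connections / 1024, 2)     AS worst_case_gb,
       ROUND(@@innodb_buffer_pool_size / POW(1024, 3), 2) AS buffer_pool_gb
FROM (
    SELECT ( @@read_buffer_size
           + @@read_rnd_buffer_size
           + @@sort_buffer_size
           + @@join_buffer_size
           + @@binlog_cache_size
           + @@thread_stack
           + @@tmp_table_size
           + 2 * @@net_buffer_length
           ) / (1024 * 1024) AS per_conn_mb,
           -- Current connection count, taken from the live process list.
           (SELECT COUNT(*) FROM information_schema.processlist) AS connections
) AS t;
```

If worst_case_gb plus buffer_pool_gb comes close to the instance's total memory at around 2000 connections, that would be consistent with the memory-exhaustion explanation, though it does not prove it.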


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
