MySQL: memory used by query joins does not get released

I have a MySQL database running on an 8-CPU machine with 24 GB of RAM.
There are multiple scripts writing or reading from it at any point in time, often in parallel.

My settings:

[mysqld]
innodb_buffer_pool_instances = 6
innodb_buffer_pool_size = 9663676416
innodb_log_file_size = 1073741824
read_rnd_buffer_size = 134217728
join_buffer_size = 8000000
group_concat_max_len = 100000
sort_buffer_size = 256870912
binlog_expire_logs_seconds = 21600
max_connections = 70

So that means 9 GB for the buffer pool, about 256 MB for the sort buffer, and 8 MB for the join buffer.
I know these settings are risky: given max_connections, I can easily overshoot the RAM. MySQLTuner returns:

Total buffers: 9.0G global + 445.0M per thread (70 max threads)
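
To make the overshoot concrete, here is a rough back-of-the-envelope sketch of the arithmetic behind that MySQLTuner line. It only uses the values from the settings above plus the 445.0M per-thread figure MySQLTuner reports; it assumes every one of the 70 connections allocates its full set of buffers at the same time, which is a pessimistic bound rather than what the server necessarily does.

```python
# Values copied from the [mysqld] section above (all in bytes).
GIB = 1024 ** 3
MIB = 1024 ** 2

innodb_buffer_pool_size = 9663676416
per_connection_buffers = {
    "sort_buffer_size": 256870912,
    "read_rnd_buffer_size": 134217728,
    "join_buffer_size": 8000000,
}
max_connections = 70
installed_ram_gib = 24

# Per-thread total as reported by MySQLTuner; it covers more buffers
# than the three listed explicitly in the config.
mysqltuner_per_thread_mib = 445.0


def listed_per_thread_mib(buffers):
    """Sum of the per-connection buffers set explicitly in the config, in MiB."""
    return sum(buffers.values()) / MIB


def worst_case_gib(pool_bytes, per_thread_mib, connections):
    """Buffer pool plus every connection using its full per-thread budget, in GiB."""
    return pool_bytes / GIB + per_thread_mib * connections * MIB / GIB


worst = worst_case_gib(innodb_buffer_pool_size, mysqltuner_per_thread_mib, max_connections)
print(f"buffer pool:            {innodb_buffer_pool_size / GIB:.1f} GiB")
print(f"listed per-thread sum:  {listed_per_thread_mib(per_connection_buffers):.1f} MiB")
print(f"pessimistic worst case: {worst:.1f} GiB (installed: {installed_ram_gib} GiB)")
```

The three buffers I set explicitly account for most of the per-thread figure, and the pessimistic total is well above the installed RAM.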

This worked well so far because the queries in use were not heavy on the join and sort buffers.

Last week we needed to add some functionality and heavily modified one of the critical queries, which runs every 20 minutes.

The original query looked like this:

insert into mytable
    (column1, column2, column3, column4, column5, column6, column7, column7, column8, column9, column10, column11, column12)
    (SELECT * FROM 
         (select 
             column1, column2, 
             group_concat(DISTINCT column3 SEPARATOR ';') as new_column3, 
             column4, 
             date_add(column4, interval max(column5) second) as new_column5, 
             group_concat(column6 order by duration asc SEPARATOR ';') as new_column6,
             min(column7) as column7,
             0 as column8,
             .....
             from my_initial_table t 
             group by column1, column2, column4  
             having date_add(column4, interval max(column5) second) <= date_sub(now(), interval 1800 second)
             and column4 > date_sub(CURDATE(), interval 3 day)) p
    )

After refactoring, we build two temporary result sets (CTEs) on top of window functions that use PARTITION BY, and join them to the initial select:

insert into mytable
(column1, column2, column3, column4, column5, column6, column7, column8, column7_first_5mins, column9, column10, column11, column12, column13)

WITH source AS (
    SELECT a.*
        ,max(column6) over(partition by column1,column3) as max_per_3
        ,row_number() over(partition by column1,column3 order by column6 asc) as per_3_order 
        ,row_number() over(partition by column1 order by column6 asc) as per_1_order
        ,max(column6) over(partition by column1,x) as max_per_x
        ,row_number() over(partition by column1,x order by column6 asc) as per_x_order 
    FROM my_initial_table a
) , 1_3_history AS  (
    SELECT column1
        , JSON_ARRAYAGG(
            JSON_OBJECT('column3',column3
                ,'column4',CASE WHEN per_1_order = 1 THEN column4 ELSE DATE_ADD(column4,INTERVAL max_per_3 SECOND) END
            )
    ) as history
    FROM source 
    WHERE per_3_order = 1
    GROUP BY column1
)                        
, 1_x_history AS (
    SELECT column1
        , JSON_ARRAYAGG(
            JSON_OBJECT('x', x
                ,'column4',CASE WHEN per_1_order = 1 THEN column4 ELSE DATE_ADD(column4,INTERVAL max_per_x SECOND) END
            )
    ) as history
    FROM source 
    WHERE per_x_order = 1
    GROUP BY column1
)

(SELECT * FROM 
    (select 
        a.column1, a.column2, 
        group_concat(DISTINCT column3 SEPARATOR ';') as new_column3, 
        column4, 
        date_add(column4, interval max(column6) second) as new_column5, 
        max(column6) as new_column6, 
        avg(v_count) as new_column7,
        max(v_count) as new_column8,
        null as new_column7_first_5mins,
        group_concat(v_count order by column6 asc SEPARATOR ';') as new_column9,
        min(column10) as column10,
        0 as column11,
        hist.history as column12,
        hist2.history as column13
    FROM source a
    LEFT JOIN 1_3_history hist on hist.column1 = a.column1
    LEFT JOIN 1_x_history hist2 on hist2.column1 = a.column1
    group by a.column1, a.column2,  column4  
    having date_add(column4, interval max(column6) second) <= date_sub(now(), interval 10800 second)
    and column4 > date_sub(CURDATE(), interval 3 day)) p
) 

Since we deployed this new query, RAM usage builds up quite quickly and reaches the 24 GB installed on the machine.
I strongly suspect that it comes from the sort or join buffers. However, this query never runs in parallel: it runs every 20 minutes and takes about 2 to 3 minutes. So I would expect it to release the connection and the sort and join buffers to be freed, but instead the memory seems to build up.

I monitored my table open cache and it does not grow in relation to the memory consumption, so it does not look like the culprit.

How can I free the sort and join buffers held by closed connections?
I fear that reducing the sort and join buffer sizes will hurt the performance of other queries. Any help is appreciated.

EDIT: corrected the query to be an exact replica of production, and added RAM measurements from tests and the EXPLAIN of the whole query in JSON.

JSON Explain:

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "207154.15"
    },
    "table": {
      "insert": true,
      "table_name": "mytable",
      "access_type": "ALL"
    },
    "insert_from": {
      "table": {
        "table_name": "p",
        "access_type": "ALL",
        "rows_examined_per_scan": 1841348,
        "rows_produced_per_join": 1841348,
        "filtered": "100.00",
        "cost_info": {
          "read_cost": "23019.35",
          "eval_cost": "184134.80",
          "prefix_cost": "207154.15",
          "data_read_per_join": "210M"
        },
        "used_columns": [
          "column1",
          "column2",
          "new_column3",
          "column4",
          "new_column5",
          "new_column6",
          "new_column7",
          "new_column8",
          "new_column7_first_5mins",
          "new_column9",
          "column10",
          "column11",
          "column12",
          "column13"
        ],
        "materialized_from_subquery": {
          "using_temporary_table": true,
          "dependent": false,
          "cacheable": true,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "2859846.11"
            },
            "grouping_operation": {
              "using_filesort": true,
              "cost_info": {
                "sort_cost": "1841348.00"
              },
              "nested_loop": [
                {
                  "table": {
                    "table_name": "a",
                    "access_type": "ALL",
                    "rows_examined_per_scan": 460337,
                    "rows_produced_per_join": 460337,
                    "filtered": "100.00",
                    "cost_info": {
                      "read_cost": "5756.71",
                      "eval_cost": "46033.70",
                      "prefix_cost": "51790.41",
                      "data_read_per_join": "491M"
                    },
                    "used_columns": [
                      "column1",
                      "column2",
                      "column3",
                      "v_count",
                      "column4",
                      "column6",
                      "column10",
                      "title",
                      "max_per_3",
                      "per_3_order",
                      "per_1_order",
                      "max_per_x",
                      "per_x_order"
                    ],
                    "materialized_from_subquery": {
                      "using_temporary_table": true,
                      "dependent": false,
                      "cacheable": true,
                      "query_block": {
                        "select_id": 3,
                        "cost_info": {
                          "query_cost": "2353167.20"
                        },
                        "windowing": {
                          "windows": [
                            {
                              "name": "<unnamed window>",
                              "definition_position": 1,
                              "using_temporary_table": true,
                              "using_filesort": true,
                              "filesort_key": [
                                "`column1`",
                                "`column3`"
                              ],
                              "frame_buffer": {
                                "using_temporary_table": true,
                                "optimized_frame_evaluation": true
                              },
                              "functions": [
                                "max"
                              ]
                            },
                            {
                              "name": "<unnamed window>",
                              "definition_position": 2,
                              "using_temporary_table": true,
                              "using_filesort": true,
                              "filesort_key": [
                                "`column1`",
                                "`column3`",
                                "`column6`"
                              ],
                              "functions": [
                                "row_number"
                              ]
                            },
                            {
                              "name": "<unnamed window>",
                              "definition_position": 3,
                              "using_temporary_table": true,
                              "using_filesort": true,
                              "filesort_key": [
                                "`column1`",
                                "`column6`"
                              ],
                              "functions": [
                                "row_number"
                              ]
                            },
                            {
                              "name": "<unnamed window>",
                              "definition_position": 4,
                              "using_temporary_table": true,
                              "using_filesort": true,
                              "filesort_key": [
                                "`column1`",
                                "`x`"
                              ],
                              "frame_buffer": {
                                "using_temporary_table": true,
                                "optimized_frame_evaluation": true
                              },
                              "functions": [
                                "max"
                              ]
                            },
                            {
                              "name": "<unnamed window>",
                              "definition_position": 5,
                              "last_executed_window": true,
                              "using_filesort": true,
                              "filesort_key": [
                                "`column1`",
                                "`x`",
                                "`column6`"
                              ],
                              "functions": [
                                "row_number"
                              ]
                            }
                          ],
                          "cost_info": {
                            "sort_cost": "2301685.00"
                          },
                          "table": {
                            "table_name": "a",
                            "access_type": "ALL",
                            "rows_examined_per_scan": 460337,
                            "rows_produced_per_join": 460337,
                            "filtered": "100.00",
                            "cost_info": {
                              "read_cost": "5448.50",
                              "eval_cost": "46033.70",
                              "prefix_cost": "51482.20",
                              "data_read_per_join": "470M"
                            },
                            "used_columns": [
                              "column1",
                              "column2",
                              "column3",
                              "v_count",
                              "column4",
                              "column6",
                              "column10",
                              "x"
                            ]
                          }
                        }
                      }
                    }
                  }
                },
                {
                  "table": {
                    "table_name": "hist",
                    "access_type": "ref",
                    "possible_keys": [
                      "<auto_key0>"
                    ],
                    "key": "<auto_key0>",
                    "used_key_parts": [
                      "column1"
                    ],
                    "key_length": "8",
                    "ref": [
                      "a.column1"
                    ],
                    "rows_examined_per_scan": 2,
                    "rows_produced_per_join": 920674,
                    "filtered": "100.00",
                    "cost_info": {
                      "read_cost": "230168.50",
                      "eval_cost": "92067.40",
                      "prefix_cost": "374026.31",
                      "data_read_per_join": "28M"
                    },
                    "used_columns": [
                      "column1",
                      "history"
                    ],
                    "materialized_from_subquery": {
                      "using_temporary_table": true,
                      "dependent": false,
                      "cacheable": true,
                      "query_block": {
                        "select_id": 4,
                        "cost_info": {
                          "query_cost": "3.50"
                        },
                        "grouping_operation": {
                          "using_filesort": true,
                          "table": {
                            "table_name": "source",
                            "access_type": "ref",
                            "possible_keys": [
                              "<auto_key0>"
                            ],
                            "key": "<auto_key0>",
                            "used_key_parts": [
                              "per_3_order"
                            ],
                            "key_length": "8",
                            "ref": [
                              "const"
                            ],
                            "rows_examined_per_scan": 10,
                            "rows_produced_per_join": 10,
                            "filtered": "100.00",
                            "cost_info": {
                              "read_cost": "2.50",
                              "eval_cost": "1.00",
                              "prefix_cost": "3.50",
                              "data_read_per_join": "10K"
                            },
                            "used_columns": [
                              "column1",
                              "column2",
                              "column3",
                              "v_count",
                              "column4",
                              "column6",
                              "column10",
                              "title",
                              "max_per_3",
                              "per_3_order",
                              "per_1_order",
                              "max_per_x",
                              "per_x_order"
                            ],
                            "materialized_from_subquery": {
                              "sharing_temporary_table_with": {
                                "select_id": 3
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                },
                {
                  "table": {
                    "table_name": "hist2",
                    "access_type": "ref",
                    "possible_keys": [
                      "<auto_key0>"
                    ],
                    "key": "<auto_key0>",
                    "used_key_parts": [
                      "column1"
                    ],
                    "key_length": "8",
                    "ref": [
                      "a.column1"
                    ],
                    "rows_examined_per_scan": 2,
                    "rows_produced_per_join": 1841348,
                    "filtered": "100.00",
                    "cost_info": {
                      "read_cost": "460337.00",
                      "eval_cost": "184134.80",
                      "prefix_cost": "1018498.11",
                      "data_read_per_join": "56M"
                    },
                    "used_columns": [
                      "column1",
                      "history"
                    ],
                    "materialized_from_subquery": {
                      "using_temporary_table": true,
                      "dependent": false,
                      "cacheable": true,
                      "query_block": {
                        "select_id": 6,
                        "cost_info": {
                          "query_cost": "3.50"
                        },
                        "grouping_operation": {
                          "using_filesort": true,
                          "table": {
                            "table_name": "source",
                            "access_type": "ref",
                            "possible_keys": [
                              "<auto_key1>"
                            ],
                            "key": "<auto_key1>",
                            "used_key_parts": [
                              "per_title_order"
                            ],
                            "key_length": "8",
                            "ref": [
                              "const"
                            ],
                            "rows_examined_per_scan": 10,
                            "rows_produced_per_join": 10,
                            "filtered": "100.00",
                            "cost_info": {
                              "read_cost": "2.50",
                              "eval_cost": "1.00",
                              "prefix_cost": "3.50",
                              "data_read_per_join": "10K"
                            },
                            "used_columns": [
                              "column1",
                              "column2",
                              "column3",
                              "v_count",
                              "column4",
                              "column6",
                              "column10",
                              "x",
                              "max_per_3",
                              "per_3_order",
                              "per_1_order",
                              "max_per_x",
                              "per_x_order"
                            ],
                            "materialized_from_subquery": {
                              "sharing_temporary_table_with": {
                                "select_id": 3
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
}

RAM measurements:
I ran the query manually 3 times in a row, waited 20 minutes, and ran it again, which gave (before -> after, with the peak during execution):

> 12.4 -> 12.7, peak during exec at 16.4
> 12.7 -> 12.9, peak during exec at 13.8
> 12.9 -> 13.0, peak during exec at 14.1
> 
> wait 20 mins
> 
> 13.1 -> 13.3, peak during exec at 16.3
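
To read these numbers more easily, here is a small sketch that separates what each run keeps from what it only borrows while executing. The readings are the ones listed above; the helper names are made up for illustration, and the unit is whatever the monitoring tool reported (the values are not converted).

```python
# (before, after, peak) readings copied from the measurements above.
RUNS = [
    (12.4, 12.7, 16.4),
    (12.7, 12.9, 13.8),
    (12.9, 13.0, 14.1),
    (13.1, 13.3, 16.3),
]


def retained_per_run(runs):
    """Memory still held after each run finished, rounded to one decimal."""
    return [round(after - before, 1) for before, after, _peak in runs]


def transient_per_run(runs):
    """Memory used only during execution and given back afterwards."""
    return [round(peak - after, 1) for _before, after, peak in runs]


retained = retained_per_run(RUNS)
transient = transient_per_run(RUNS)
average_retained = round(sum(retained) / len(retained), 2)

print("retained per run: ", retained)
print("transient per run:", transient)
print("average retained: ", average_retained)
```

Most of the peak is given back after each run, but a small amount stays allocated every time, which matches the slow build-up described above.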

How to solve:

Method 1

Vincent, in the data provided by SHOW GLOBAL STATUS:

Com_group_replication_stop  0
Com_stmt_execute    1849
Com_stmt_close  0
Com_stmt_fetch  0
Com_stmt_prepare    173
Com_stmt_reset  0
Com_stmt_send_long_data 0
Com_truncate    1

it appears you forgot to CLOSE your prepared statements when done with execution: Com_stmt_prepare is 173 while Com_stmt_close is 0.
Closing a prepared statement releases the resources it uses.
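
As a quick illustration of how to read those counters, the sketch below parses the excerpt and subtracts closes from prepares. The helper names are hypothetical, and the result is only an upper bound on what is still open: prepared statements are also released when their session ends, so the current number may be lower (the Prepared_stmt_count status variable reports it directly).

```python
# Counter values copied from the SHOW GLOBAL STATUS excerpt above.
STATUS_EXCERPT = """
Com_stmt_execute    1849
Com_stmt_close  0
Com_stmt_fetch  0
Com_stmt_prepare    173
Com_stmt_reset  0
"""


def parse_status(text):
    """Turn 'name value' lines into a {name: int} dictionary."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def never_closed(counters):
    """Prepared statements that were never explicitly closed (hypothetical helper)."""
    return counters["Com_stmt_prepare"] - counters["Com_stmt_close"]


counters = parse_status(STATUS_EXCERPT)
print("prepared but never explicitly closed:", never_closed(counters))
```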

Method 2

Suggestions for your first query:

** Remove the extra layer:

 SELECT * FROM ( 
               )

** Move this from HAVING to WHERE:

  column4 > date_sub(CURDATE(), interval 3 day)

** Have this index (with the columns in this order):

 INDEX(column1, column2, column4)

Your second version seems to have a different GROUP BY; are you sure it delivers the same values?

For further discussion, please provide EXPLAIN SELECT .... This will show, for example, whether the "join buffer" is being used at all. If EXPLAIN does not tell us enough, I will ask for EXPLAIN FORMAT=JSON SELECT ....

If there is no "swapping", then those settings are "not too big", at least for now.

As for the original question ("memory not being released")… Check the memory after several cycles (20 minutes each). Is the memory jumping up the first time, then staying steady? Is memory jumping up by the same amount every 20 minutes? Or something in between? (A graph would be nice, but a list of values would be OK.)

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
