I have two tables, each with a primary-key index:
CREATE TABLE PART ( P_PARTKEY INTEGER NOT NULL,
P_NAME VARCHAR(55) NOT NULL,
P_MFGR CHAR(25) NOT NULL,
P_BRAND CHAR(10) NOT NULL,
P_TYPE VARCHAR(25) NOT NULL,
P_SIZE INTEGER NOT NULL,
P_CONTAINER CHAR(10) NOT NULL,
P_RETAILPRICE DECIMAL(15,2) NOT NULL,
P_COMMENT VARCHAR(23) NOT NULL );
CREATE TABLE LINEITEM ( L_ORDERKEY INTEGER NOT NULL,
L_PARTKEY INTEGER NOT NULL,
L_SUPPKEY INTEGER NOT NULL,
L_LINENUMBER INTEGER NOT NULL,
L_QUANTITY DECIMAL(15,2) NOT NULL,
L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL,
L_DISCOUNT DECIMAL(15,2) NOT NULL,
L_TAX DECIMAL(15,2) NOT NULL,
L_RETURNFLAG CHAR(1) NOT NULL,
L_LINESTATUS CHAR(1) NOT NULL,
L_SHIPDATE DATE NOT NULL,
L_COMMITDATE DATE NOT NULL,
L_RECEIPTDATE DATE NOT NULL,
L_SHIPINSTRUCT CHAR(25) NOT NULL,
L_SHIPMODE CHAR(10) NOT NULL,
L_COMMENT VARCHAR(44) NOT NULL);
ALTER TABLE LINEITEM
ADD PRIMARY KEY (L_ORDERKEY,L_LINENUMBER);
ALTER TABLE PART
ADD PRIMARY KEY (P_PARTKEY);
And this query:
SELECT Sum(l_extendedprice * ( 1 - l_discount )) AS revenue
FROM lineitem,
part
WHERE ( p_partkey = l_partkey
AND p_brand = 'Brand#52'
AND p_container IN ( 'SM CASE', 'SM BOX', 'SM PACK', 'SM PKG' )
AND l_quantity >= 4
AND l_quantity <= 4 + 10
AND p_size BETWEEN 1 AND 5
AND l_shipmode IN ( 'AIR', 'AIR REG' )
AND l_shipinstruct = 'DELIVER IN PERSON' )
OR ( p_partkey = l_partkey
AND p_brand = 'Brand#11'
AND p_container IN ( 'MED BAG', 'MED BOX', 'MED PKG', 'MED PACK' )
AND l_quantity >= 18
AND l_quantity <= 18 + 10
AND p_size BETWEEN 1 AND 10
AND l_shipmode IN ( 'AIR', 'AIR REG' )
AND l_shipinstruct = 'DELIVER IN PERSON' )
OR ( p_partkey = l_partkey
AND p_brand = 'Brand#51'
AND p_container IN ( 'LG CASE', 'LG BOX', 'LG PACK', 'LG PKG' )
AND l_quantity >= 29
AND l_quantity <= 29 + 10
AND p_size BETWEEN 1 AND 15
AND l_shipmode IN ( 'AIR', 'AIR REG' )
AND l_shipinstruct = 'DELIVER IN PERSON' );
I want to know:
- how many reads are performed on the indexes.
- how many reads are performed on the table data.
- how much of each index is in the buffer (memory) and how much is on disk.
- how much of the table data is in the buffer (memory) and how much is on disk.
Thanks!
How to solve :
Method 1
Metrics
It is possible to find the number of rows a single query touched in the tables. The Handler counters below are kept per connection and count index rows and data rows together, not separately. If no other queries are running on the server, the global counters can also be attributed to your query.
SHOW SESSION STATUS LIKE 'Handler%';
Here is a discussion of using that to help with optimization: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#handler_counts
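A minimal sketch of that measurement. It assumes your account may run FLUSH STATUS (this typically needs the RELOAD privilege) and uses a shortened stand-in for the real query; substitute the full query from the question.

```sql
-- Zero this connection's status counters before the measurement.
FLUSH STATUS;

-- The query being measured (abbreviated stand-in for the question's query).
SELECT SUM(l_extendedprice * (1 - l_discount)) AS revenue
FROM lineitem
JOIN part ON p_partkey = l_partkey
WHERE p_brand = 'Brand#52';

-- Row-level handler calls made by this connection since FLUSH STATUS.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

As a rough reading guide: a large Handler_read_rnd_next suggests a table scan, while Handler_read_key and Handler_read_next reflect index lookups and index range reads.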
From
SHOW GLOBAL STATUS LIKE 'Innodb_%';
you can get certain global counts for disk accesses, etc. for all InnoDB statements from all connections. (This is not precisely what you are asking for, but may be interesting.)
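For example, the following counters separate logical reads from reads that had to go to disk. This is only a sketch: the values are server-wide, so sample them before and after the query on an otherwise idle server and compare the differences.

```sql
-- Logical read requests made against the InnoDB buffer pool.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';

-- Read requests that could not be served from memory and went to disk.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';

-- How many buffer pool pages exist and how many currently hold data.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_%';
```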
Another source is the system database performance_schema. (In older MySQL versions, information_schema had some of these tables, but they have mostly moved to performance_schema now.)
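As a hedged sketch of what those sources can show, the queries below assume a MySQL 5.7 or 8.0 style server with performance_schema enabled and that the tables live in the current database; table and column availability varies by version and by MariaDB vs. MySQL. Note that in InnoDB the table data itself is stored in the PRIMARY index, so "data" and "primary key" pages are the same thing here.

```sql
-- Reads per index since the server started (or since the table was truncated).
-- A NULL index_name row counts accesses that used no index.
SELECT object_name, index_name, count_read, count_fetch
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE object_schema = DATABASE()
  AND LOWER(object_name) IN ('lineitem', 'part');

-- Pages of each index currently cached in the buffer pool.
-- Caution: scanning this table is expensive; avoid it on a busy production server.
-- The LIKE patterns are loose and may match other tables with similar names.
SELECT table_name, index_name, COUNT(*) AS pages_in_buffer_pool
FROM information_schema.INNODB_BUFFER_PAGE
WHERE table_name LIKE '%lineitem%' OR table_name LIKE '%part%'
GROUP BY table_name, index_name;

-- Estimated total pages per index on disk (from persistent statistics),
-- for comparison with the cached page counts above.
SELECT table_name, index_name, stat_value AS pages_on_disk
FROM mysql.innodb_index_stats
WHERE database_name = DATABASE()
  AND LOWER(table_name) IN ('lineitem', 'part')
  AND stat_name = 'size';
```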
If you elaborate on the goals of your question, we might be able to help in more detail.
Optimizing that query
Start by using the modern JOIN syntax:
FROM lineitem AS l
JOIN part AS p ON p.p_partkey = l.l_partkey
The join condition was repeated in every OR branch; stating it once in ON lets the Optimizer do the JOIN more efficiently.
In case the Optimizer decides to start with part, have this index:
lineitem: INDEX(l_partkey) -- (but see below)
Now I see that l.l_shipinstruct = 'DELIVER IN PERSON' is common across the ORs, so pull it out. Then this would probably be useful:
lineitem: INDEX(l_shipinstruct) -- (but see below)
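Put together, the query might look like the sketch below. Two things here are my own assumptions rather than part of the advice above: the l_shipmode test is also pulled out because it is identical in all three branches, and the quantity ranges are written with BETWEEN using the precomputed upper bounds (4 + 10 = 14, and so on).

```sql
SELECT SUM(l.l_extendedprice * (1 - l.l_discount)) AS revenue
FROM lineitem AS l
JOIN part AS p ON p.p_partkey = l.l_partkey
WHERE l.l_shipinstruct = 'DELIVER IN PERSON'      -- common to all branches
  AND l.l_shipmode IN ('AIR', 'AIR REG')          -- also common (assumption: safe to hoist)
  AND (
        (     p.p_brand = 'Brand#52'
          AND p.p_container IN ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
          AND l.l_quantity BETWEEN 4 AND 14
          AND p.p_size BETWEEN 1 AND 5 )
     OR (     p.p_brand = 'Brand#11'
          AND p.p_container IN ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
          AND l.l_quantity BETWEEN 18 AND 28
          AND p.p_size BETWEEN 1 AND 10 )
     OR (     p.p_brand = 'Brand#51'
          AND p.p_container IN ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
          AND l.l_quantity BETWEEN 29 AND 39
          AND p.p_size BETWEEN 1 AND 15 )
      );
```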
But all of that probably still won't do as much optimization as turning the OR into UNION ALL:
( SELECT ... Brand#52 ... )
UNION ALL
( SELECT ... Brand#11 ... )
UNION ALL
( SELECT ... Brand#51 ... )
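Written out in full, the UNION ALL form could look like this sketch. Each branch computes its own partial sum and an outer SUM adds them up; because the three branches filter on different brands, no row can be counted twice. The derived-table alias and the omission of the per-branch parentheses are my choices, not part of the outline above.

```sql
SELECT SUM(revenue) AS revenue
FROM (
    SELECT SUM(l.l_extendedprice * (1 - l.l_discount)) AS revenue
    FROM part AS p
    JOIN lineitem AS l ON l.l_partkey = p.p_partkey
    WHERE p.p_brand = 'Brand#52'
      AND p.p_container IN ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
      AND p.p_size BETWEEN 1 AND 5
      AND l.l_quantity BETWEEN 4 AND 14
      AND l.l_shipmode IN ('AIR', 'AIR REG')
      AND l.l_shipinstruct = 'DELIVER IN PERSON'
    UNION ALL
    SELECT SUM(l.l_extendedprice * (1 - l.l_discount))
    FROM part AS p
    JOIN lineitem AS l ON l.l_partkey = p.p_partkey
    WHERE p.p_brand = 'Brand#11'
      AND p.p_container IN ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
      AND p.p_size BETWEEN 1 AND 10
      AND l.l_quantity BETWEEN 18 AND 28
      AND l.l_shipmode IN ('AIR', 'AIR REG')
      AND l.l_shipinstruct = 'DELIVER IN PERSON'
    UNION ALL
    SELECT SUM(l.l_extendedprice * (1 - l.l_discount))
    FROM part AS p
    JOIN lineitem AS l ON l.l_partkey = p.p_partkey
    WHERE p.p_brand = 'Brand#51'
      AND p.p_container IN ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
      AND p.p_size BETWEEN 1 AND 15
      AND l.l_quantity BETWEEN 29 AND 39
      AND l.l_shipmode IN ('AIR', 'AIR REG')
      AND l.l_shipinstruct = 'DELIVER IN PERSON'
) AS branches;  -- alias name is arbitrary
```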
Then we can build some even better indexes:
part: INDEX(p_brand, p_container)
part: INDEX(p_brand, p_size)
lineitem: INDEX(l_shipinstruct, l_quantity)
lineitem: INDEX(l_shipinstruct, l_shipmode)
lineitem: INDEX(l_partkey, l_shipinstruct, l_quantity)
lineitem: INDEX(l_partkey, l_shipinstruct, l_shipmode)
There are many ways the Optimizer might pick to run the query; I think these cover all the bases.
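As DDL, those suggestions could be created roughly as follows. The index names are hypothetical, and building four indexes on a large lineitem table takes time and disk space, so treat this as a starting point for experiments rather than a final design.

```sql
ALTER TABLE part
    ADD INDEX idx_part_brand_container (p_brand, p_container),
    ADD INDEX idx_part_brand_size      (p_brand, p_size);

ALTER TABLE lineitem
    ADD INDEX idx_li_instruct_qty       (l_shipinstruct, l_quantity),
    ADD INDEX idx_li_instruct_mode      (l_shipinstruct, l_shipmode),
    ADD INDEX idx_li_part_instruct_qty  (l_partkey, l_shipinstruct, l_quantity),
    ADD INDEX idx_li_part_instruct_mode (l_partkey, l_shipinstruct, l_shipmode);

-- Check which indexes the Optimizer actually picks for one branch
-- (abbreviated branch shown; repeat with the full query).
EXPLAIN
SELECT SUM(l.l_extendedprice * (1 - l.l_discount)) AS revenue
FROM part AS p
JOIN lineitem AS l ON l.l_partkey = p.p_partkey
WHERE p.p_brand = 'Brand#52'
  AND p.p_size BETWEEN 1 AND 5
  AND l.l_shipinstruct = 'DELIVER IN PERSON';
```

Indexes that never appear in the key column of the EXPLAIN output for any branch are candidates for dropping again.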
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.