I took a look at my old accounting system, and its performance is affecting the daily work of the employees who use it. I found that a subquery is the problem. From my reading and testing, a JOIN seems to be about 100x faster now that our databases hold a huge amount of data.
How can I convert this subquery into a JOIN?
I'm asking for help because I have tried and failed, and I'm starting to think it is not possible.
$sql = "SELECT orders.order_id, orders.order_time, orders.order_user,
               orders.payment_state, orders.order_state, orders.area_name,
               ( SELECT COUNT(*)
                   FROM order_item
                  WHERE order_item.order_id = orders.order_id ) AS items_number
          FROM orders
         WHERE orders.order_state = 1
           AND order_time BETWEEN DATE_SUB(NOW(), INTERVAL 365 DAY) AND NOW()";
Specifically, the query retrieves every row created in the last year from the orders table, plus the number of items purchased in each order. The subquery returns that count as items_number from the order_item table, matching rows where order_id is equal in both tables.
How to solve:
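I would not do what another Answer suggests. Its JOIN plus GROUP BY involves the "explode-implode" (the JOIN multiplies the rows, then the GROUP BY collapses them again), which is even slower.
I would start by checking whether this index helps much: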
orders: INDEX(order_state, order_time)
Your correlated subquery is not necessarily inefficient in this case. (That 100X quote is based on too few examples to be trustworthy.)
The following query avoids the explode-implode. It turns the subquery into a "derived table", which is similar but needs to be executed only once:
SELECT o.order_id, o.order_time, o.order_user,
       o.payment_state, o.order_state, o.area_name,
       i.items_number
  FROM ( SELECT order_id, COUNT(*) AS items_number
           FROM order_item
          GROUP BY order_id ) AS i
  JOIN orders AS o ON o.order_id = i.order_id
 WHERE order_state = 1
   AND order_time >= NOW() - INTERVAL 365 DAY
If you need "zero" values for orders without items, a minor change can achieve that; let me know.
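For the zero-count case, one possible shape of that change is sketched here. It assumes the same tables and columns as the question, and it is not something the answer itself spelled out: orders moves to the left side of a LEFT JOIN so that orders without items survive, and COALESCE turns the missing count into 0.

```sql
-- Sketch only: orders without any order_item rows are kept, with items_number = 0.
SELECT o.order_id, o.order_time, o.order_user,
       o.payment_state, o.order_state, o.area_name,
       COALESCE(i.items_number, 0) AS items_number
  FROM orders AS o
  LEFT JOIN ( SELECT order_id, COUNT(*) AS items_number
                FROM order_item
               GROUP BY order_id ) AS i
         ON i.order_id = o.order_id
 WHERE o.order_state = 1
   AND o.order_time >= NOW() - INTERVAL 365 DAY;
```

The filters stay on columns of orders only, so they do not silently turn the LEFT JOIN back into an inner join.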
The index above is needed here. Also, if order_id is not already indexed in order_item, add that index.
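As a rough sketch, the two indexes could be created like this. The index names idx_orders_state_time and idx_order_item_order_id are made up for illustration; check what already exists first, since a foreign key or a composite primary key may already cover order_id.

```sql
-- See which indexes are already present before adding new ones.
SHOW INDEX FROM orders;
SHOW INDEX FROM order_item;

-- Composite index matching the WHERE clause: equality column first, range column second.
ALTER TABLE orders
  ADD INDEX idx_orders_state_time (order_state, order_time);

-- Lets the per-order COUNT(*) find its rows without scanning order_item.
ALTER TABLE order_item
  ADD INDEX idx_order_item_order_id (order_id);
```

Running EXPLAIN on the query before and after is a simple way to confirm whether the optimizer actually uses them.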
A simple way to do this would be like so:
SELECT ord.order_id, ord.order_time, ord.order_user,
       ord.payment_state, ord.order_state, ord.area_name,
       COUNT(itm.item_id) AS items_number
  FROM orders ord
 INNER JOIN order_item itm ON ord.order_id = itm.order_id
 WHERE ord.order_state = 1
   AND ord.order_time BETWEEN DATE_SUB(NOW(), INTERVAL 365 DAY) AND NOW()
 GROUP BY ord.order_id, ord.order_time, ord.order_user,
          ord.payment_state, ord.order_state, ord.area_name
This query assumes that every orders record has at least one order_item record, which makes sense. Be sure to change item_id to whatever the primary key is for the order_item table.
As an aside, if you would like your BETWEEN range to be a little more fixed in time, you might want to use DATE_FORMAT() to ensure the start time is always 00:00:00 rather than the time of day when the query was run:
and ord.order_time BETWEEN DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 365 DAY), '%Y-%m-%d 00:00:00') AND NOW()
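To see what each start boundary evaluates to, a quick standalone check like the one below can help. The CURDATE() variant is my own assumption of a shorter alternative and is not part of the answer above; a DATE compared against a DATETIME column is treated as midnight of that day.

```sql
-- Compare the rolling boundary with the two midnight-aligned ones.
SELECT NOW() - INTERVAL 365 DAY AS rolling_start,
       DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 365 DAY), '%Y-%m-%d 00:00:00') AS midnight_start,
       CURDATE() - INTERVAL 365 DAY AS midnight_start_alt;
```

Note that DATE_FORMAT() returns a string while the CURDATE() expression returns a DATE, so it is worth confirming with EXPLAIN that the index on order_time is still used with whichever form you pick.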