My query looks like this:
select employees.PK_worker,
concat(employees.name,' ',employees.surname) AS name,
(select coalesce(sum(employees_to_hand.cost),0)
from employees_to_hand
where employees_to_hand.PK_worker = employees.PK_worker
and employees_to_hand.reg_date between @dstart and @dend) AS to_hand_cost,
(select coalesce(sum(employees_delegations.cost),0)
from employees_delegations
where employees_delegations.PK_worker = employees.PK_worker
and employees_delegations.dstart between @dstart and @dend) AS delegation_cost
FROM employees
join (select @dstart := str_to_date('10/01/2017','%m/%d/%Y'),
@dend := str_to_date('11/30/2017','%m/%d/%Y')) d
order by 1;
I am wondering whether it is possible to do some basic summing, for instance (to_hand_cost + delegation_cost) AS simple_sum. I assume it might not be possible, but repeating the subselects just to get the values again seems excessive. Any suggestions?
MySQL playground (MCVE): https://www.db-fiddle.com/f/5FtV24czfAfUiE1dj5EeYt/13
How to solve:
Method 1
The two inner tables should have INDEX(PK_worker, reg_date, cost) on employees_to_hand and INDEX(PK_worker, dstart, cost) on employees_delegations.
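A minimal sketch of how those two indexes might be created. The index names are hypothetical and chosen only for illustration; the column lists are the ones suggested above. The idea is that each correlated subquery filters by equality on PK_worker, then by a range on the date column, and can read cost straight from the index.

```sql
-- Index names are hypothetical; only the column order comes from the suggestion above.
-- Equality column first (PK_worker), then the range column, then the summed column.
ALTER TABLE employees_to_hand
  ADD INDEX idx_to_hand_worker_date_cost (PK_worker, reg_date, cost);

ALTER TABLE employees_delegations
  ADD INDEX idx_delegations_worker_date_cost (PK_worker, dstart, cost);
```

Whether these help in practice depends on table size and data distribution, so it is worth comparing the EXPLAIN output before and after.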
Plan A:
The following may help, and it may depend on what version of MySQL you are running:
JOIN ( SELECT str_to_date('10/01/2017','%m/%d/%Y') AS start,
str_to_date('11/30/2017','%m/%d/%Y') AS end ) AS d
and then use start and end instead of @dstart and @dend.
Rationale: @variables are sometimes not well optimized.
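As a rough sketch, this is how Plan A might look when applied to the query from the question. The table aliases e, h, and g are introduced here for brevity and are not part of the original query; the correlated subqueries refer to the date range as d.start and d.end.

```sql
-- Plan A sketch: the date range lives in derived table d instead of @variables.
-- Aliases e, h and g are assumptions added for readability.
SELECT e.PK_worker,
       CONCAT(e.name, ' ', e.surname) AS name,
       ( SELECT COALESCE(SUM(h.cost), 0)
           FROM employees_to_hand AS h
          WHERE h.PK_worker = e.PK_worker
            AND h.reg_date BETWEEN d.start AND d.end ) AS to_hand_cost,
       ( SELECT COALESCE(SUM(g.cost), 0)
           FROM employees_delegations AS g
          WHERE g.PK_worker = e.PK_worker
            AND g.dstart BETWEEN d.start AND d.end ) AS delegation_cost
FROM employees AS e
JOIN ( SELECT STR_TO_DATE('10/01/2017', '%m/%d/%Y') AS start,
              STR_TO_DATE('11/30/2017', '%m/%d/%Y') AS end ) AS d
ORDER BY 1;
```

This keeps the shape of the original query, so it does not by itself produce simple_sum; it only changes how the date range is supplied.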
Plan B: (This requires MySQL 5.6, perhaps 5.7; see the EXPLAIN for “auto-key”.)
SELECT e.PK_worker,
concat(e.name,' ',e.surname) AS name,
ethc.to_hand_cost -- add the matching column from edc here as well
FROM employees AS e
JOIN ( SELECT PK_worker,
coalesce(sum(cost),0) AS to_hand_cost
FROM employees_to_hand
WHERE ...
AND reg_date BETWEEN start AND end
GROUP BY 1 ) AS ethc
ON e.PK_worker = ethc.PK_worker
JOIN ( ... ) AS edc
ON ...
ORDER BY 1;
Suggested index for Plan B: INDEX(reg_date, PK_worker, cost)
Rationale: Calculating all the costs in one pass over the secondary tables is probably faster than doing it one employee at a time. Hopefully the saving outweighs the cost of automatically creating INDEX(PK_worker) on the ‘derived’ table after it is generated. The different index is a guess as to what would work better for this.
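One way to check whether the optimizer builds that automatic index is to run EXPLAIN on a Plan B style query. The sketch below is an assumption-laden illustration: the index name is hypothetical, only the employees_to_hand half is shown, and the date range is written as literals to keep the statement self-contained. If the derived table is materialized with an automatic index, the key column of the EXPLAIN output typically shows a name such as <auto_key0>.

```sql
-- Hypothetical index name; the column order follows the Plan B suggestion.
ALTER TABLE employees_to_hand
  ADD INDEX idx_to_hand_date_worker_cost (reg_date, PK_worker, cost);

-- Inspect the plan; look at the "key" column for the derived table row.
EXPLAIN
SELECT e.PK_worker,
       ethc.to_hand_cost
FROM employees AS e
JOIN ( SELECT PK_worker,
              SUM(cost) AS to_hand_cost
         FROM employees_to_hand
        WHERE reg_date BETWEEN '2017-10-01' AND '2017-11-30'
        GROUP BY PK_worker ) AS ethc
  ON e.PK_worker = ethc.PK_worker;
```

Note that the inner JOIN used here drops employees with no rows in the date range; that matches the Plan B outline above rather than the original query's output.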
Thanks for the db-fiddle, but in this situation some of the optimizations depend on the size of the table, the distribution of the values (esp. reg_date), and the version of MySQL.
Method 2
One option would be to:
- continue to use the @dstart and @dend variables to designate our date/search range (couldn’t get derived table columns, e.g. d.dstart, to be used in sibling derived tables; this is normal behavior in other RDBMSs, but not sure about MySQL)
- convert the select/sub-queries into derived tables
- left join employees with these derived tables (since not all employees are guaranteed to have all 7x costs for a given date range)
- this gives us summed costs (by worker) at the top level of the query, so from here we can do some simple math on the sums
Pulling this all together, a shortened query for the proposed summation (to_hand_cost + delegation_cost) looks like:
select e.PK_worker,
concat(e.name,' ',e.surname) as name,
coalesce(eth.sum_cost,0) as to_hand_cost,
coalesce(ed1.sum_cost,0) as delegation_cost,
coalesce(eth.sum_cost,0) + coalesce(ed1.sum_cost,0) as simple_sum
from employees e
join (select @dstart := str_to_date('10/01/2017','%m/%d/%Y'),
@dend := str_to_date('11/30/2017','%m/%d/%Y')) as d
left
join (select PK_worker,
sum(cost) as sum_cost
from employees_to_hand
where reg_date between @dstart and @dend
group by PK_worker) as eth
on eth.PK_worker = e.PK_worker
left
join (select PK_worker,
sum(cost) as sum_cost
from employees_delegations
where dstart between @dstart and @dend
group by PK_worker) as ed1
on ed1.PK_worker = e.PK_worker
order by 1;
Here’s an updated db-fiddle for the complete query. A quick review appears to show the same results as the original db-fiddle.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.