I am new to subqueries. Is there a general strategy for how basic subqueries are evaluated and executed (besides from sql engines automatically optimizing them)? For example, consider this select query:
select * from "users" order by "users"."id" asc limit 1000 offset 0;
How would the execution plan change if we rewrite it like this?
select * from (select * from users) as u order by u.id asc limit 1000 offset 0;
More specifically, in the second example, will the SQL engine fetch all the rows first before limiting them (and so be slower)?
I am using Postgres 13.3, if it matters.
How to solve :
Let’s take an analogy with application development. There, a functional spec describes the behaviour of the system. It details, amongst other things, what the outputs should be, how they’re formatted and how they’re sorted. The programmer is free to implement these requirements in any way so long as the specified conditions are met. A functional spec has nothing to say about variable names, memory structures or any of the implementation detail.
In a DBMS the "functional spec" is the SQL. It describes the content and format of the output but has nothing to say about which data structures or files should be read, how, or in what sequence. Those implementation choices are made by the query optimiser (QO). It plays the role of the application programmer in this analogy.
So I’m sorry to say there isn’t much else "besides from sql engines automatically optimizing them." The optimiser is free to rearrange the submitted SQL into any alternative that is guaranteed to return identical results. In your given example it finds that the sub-query is identical to referencing the table directly. This is why the execution plans are the same – because the two queries mean the same thing, are guaranteed to return identical results in all circumstances, and the QO has identified this.
So no, the inner query is not evaluated then filtered. It is eliminated from the runtime execution plan.
Complex sub-queries (say, in the SELECT list, or correlated with the outer query) take more thinking about. The QO does not always have good options when dealing with those. Performance can be significantly different depending on the SQL submitted.
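To make the point about complex sub-queries concrete, here is a minimal sketch of the same result written two ways. The orders table and its user_id column are hypothetical (they are not part of the question), and the two forms are only equivalent on the assumption that users.id is unique, for example a primary key. Whether the planner treats them alike is exactly what you would need to check with EXPLAIN; no particular outcome is claimed here.

```sql
-- Assumption: a hypothetical table orders(user_id) referencing users(id),
-- and users.id is unique.

-- Form 1: correlated sub-query in the SELECT list.
select u.id,
       (select count(*) from orders o where o.user_id = u.id) as order_count
from users u
order by u.id asc
limit 1000;

-- Form 2: the same result expressed as an outer join plus aggregation.
-- count(o.user_id) ignores the NULLs produced for users with no orders,
-- so those users still report 0.
select u.id,
       count(o.user_id) as order_count
from users u
left join orders o on o.user_id = u.id
group by u.id
order by u.id asc
limit 1000;
```

Both statements describe the same output, but the optimiser has different starting points for each, so the chosen plans and their run times may differ.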
Use PostgreSQL’s built-in EXPLAIN (ANALYZE, BUFFERS) select ... to take a look at the execution plan for both queries. In your particular very simple example, PostgreSQL will almost certainly generate an identical plan. As I said above, the more complex a query is, the more likely you are to see variations.
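Applied to the two queries from the question, the comparison could look like the sketch below. It assumes the users table from the question exists. Keep in mind that the ANALYZE option actually executes the statement, so the reported timings and buffer counts are real measurements, not estimates.

```sql
-- Plan for the direct query.
explain (analyze, buffers)
select * from "users" order by "users"."id" asc limit 1000 offset 0;

-- Plan for the version wrapped in a sub-query.
explain (analyze, buffers)
select * from (select * from users) as u order by u.id asc limit 1000 offset 0;
```

If the two outputs show the same tree of plan nodes, the sub-query was folded into the outer query and costs nothing extra. If id is indexed (as a primary key would be), one would typically expect a Limit node sitting on top of an index scan in both cases, so only the first 1000 rows are read.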