There have been a lot of posts on LinkedIn of late about the “logical execution order” of a query that all really miss some really big points. I was corrected myself in some terminology because I mistook the term “processing” to mean the same as “order” in these discussions when I was explaining why logical execution order is not what people expected.
Note: this is mostly SQL Server based in example, but the basic explanations are true for any optimizer that is cost based.
Just a quick search on bing and I see many posts that have it wrong and even AI tried to call it logical execution order. Sigh, because the logical processing order is important, but it may not help you write faster queries in all cases, but it will help you understand a few compilation errors.
Semantics
The crux of the issue is (as always) a matter of semantics. On Microsoft’s SELECT query page, it lists the “Logical Query Processing Order” as:
FROMONJOINWHEREGROUP BYWITH CUBEorWITH ROLLUPHAVINGSELECTDISTINCTORDER BYTOP
(Along with some caveats about it perhaps varying. the key here is what the word “processing” means. This list is prefixed with the following quote:
“he following steps show the logical processing order, or binding order, for a
SELECTstatement.”
This list is an important thing to understand because it explains why when you type:
SELECT columnName as A
You can’t reference A in the FROM, ON. JOIN. WHERE, etc.; whereas you can in the DISTINCT, or ORDER BY. To make this more interesting, I created this query:
SELECT TOP 1000 st.Name AS Territory, YEAR(soh.OrderDate) AS SalesYear, SUM(soh.TotalDue) AS TotalSales, COUNT(DISTINCT soh.SalesOrderID) AS OrderCount, COUNT(DISTINCT soh.CustomerID) AS UniqueCustomers FROM Sales.SalesOrderHeader soh INNER JOIN Sales.SalesTerritory st ON soh.TerritoryID = st.TerritoryID WHERE soh.OrderDate IS NOT NULL GROUP BY st.Name, YEAR(soh.OrderDate) HAVING COUNT(DISTINCT soh.SalesOrderID) > 100 ORDER BY SalesYear;
So in the ORDER BY clause, I was able to use the alias SalesYear instead of the expression YEAR(soh.OrderDate). If you try that in the HAVING expression:
--HAVING COUNT(DISTINCT soh.SalesOrderID) > 1001
HAVING OrderCount > 1001
Processing Order is not Execution Order
The processing order is not the execution order either in a logical sense of the term, or the actual sense of the term. The query presented in the previous section would have this query plan (using SQL Sentry Plan Explore so I can fit the graphical plan on the screen this time for simplicity as I don’t want to dig deep into the details and it has a way to make the plan more compact):

The query plan reads from right to left, and you can see the query ends with SELECT and TOP occurs before that. In every SELECT statement you will see that SELECT is the final operator. And nothing inside the plan will really match up to the list of clauses.
Now, think about this list again:
FROMONJOINWHEREGROUP BYWITH CUBEorWITH ROLLUPHAVINGSELECTDISTINCTORDER BYTOP
Notice that HAVING is below where. So if this were vaguely the execution order, changing the HAVING shouldn’t considerably change the costs of the query, right? It will have to process data though the WHERE, though the GROUP BY, and then process the HAVING clause.
Changing the HAVING clause to be 1=2, the query plan now says “what, do you think I look stupid?”:

It really only evaluates the expression = FALSE and knows “I don’t need to do anything else”. But the processing of the query is the same, because it has to determine the shape of the output of a billion rows or 0 rows. Even an empty set of rows is a formatted nothing, not just nothing.
So you get this as an output:
Territory SalesYear TotalSales OrderCount UniqueCustomers----------------------- ----------- --------------------- ----------- ---------------
My non-technical analogy
So when I was thinking about this, I started thinking about how much the query processing cycle is like a person asking for something, and I landed on a employee asking their manager to go on a trip. It goes a bit like this:
E (Employee): I would like to go to a database conference in Orlando, and I want to use company money to spend 2 days at Disney World. (Query)
M (Manager): Parses this query, starts making a list of thing that would need to be done to plan this trip, and gets an error on “company money to spend 2 days at Disney World” and rejects the query. (Query processing)
E: Ok, I want to go to a database conference in Orlando and take vacation for 2 days.
M: Processes the request again, agrees it is something that can be tried. Takes the simple request and simplifies it to the simplest way of handling it. 1. Makes a plan for what would need to be done. 2. Check budget 3. Make sure time is covered when E is gone 4. Send transportation details to travel team etc. (Query rewriting, turning the query into it’s simplest form, covering all the things that need to be done to answer the query.)
E thanks manager, and waits for the optimized plan from the various parties. Costs are optimized, and a travel plan is conceived. Any and all modes of transportation and deals are considered. If it seems really expensive/complicated, the plan is tweak by the requestor like “Can I not change plans 10 times for what could be a 2 hour flight?” After the plan is approved and the execution begins. (Query planning/optimization, and now we have an estimated plan.)
Cars are rented, plane tickets acquired, etc. (Resource allocation) When it is time for the trip, E executes the plan. The plan doesn’t always work as expected due to traffic, issues with vehicles, cows blocking the road, etc, so the details may change depending on what happens in real time. To paraphrase Mike Tyson, everyone has a plan until all the airlines have a simultaneous failure. (Query execution)
If all goes well, the trip happens, and then E reports what actually occurred to inform future travelers how to execute a travel plan. (Caching the query plan)
Of course this is a very loose example, but hopefully it helps to clarify that there are three big phases in the process that takes the query you make to the database platform to get you your results:
- Parsing and simplifying the query
- Rewriting and optimizing the query to find the plan
- Executing the plan
The logical processing that is so often called logical execution is not at all how a query will be executed.
An admission and more reading
My admission is that I don’t know a tremendous amount about query optimization. I know more than the average T-SQL programmer, I will admit, but not enough that I am comfortable in teaching it. I know enough to take a poorly executing query and pretty quickly get to the heart of the problem and then do a good job in optimizing that code.
But the topic of query optimization itself is really deep and complicated with a lot of math involved to calculate costs. Any math knowledge I had has long since evaporated and I don’t really want it back (certainly not enough to seriously crack a textbook on the subject.)
My point in this blog was to make it clear to people to not take that logical processing/binding order and think this is how any relational engine would process a query.
When I am testing a query before and after I execute it, I generally think of a simplistic execution order to be something like this:
FROM– outputs a set of data (JOINandONwould be done in this step)WHERE– filters those rowsGROUP BY– groups rows together in to “bundles of rows to query”- Compute Aggregates – (make new values that are basically added to the columns from the
FROMclause) HAVING– Filtering groups, typically by aggregates.CUBEandROLLUP– Computing sums for the rollupORDER BY– getting the rows to orderDISTINCT– filtering out duplicate rowsTOP– grab the easiest N RowsSELECT– calculate any values and stream to user
The biggest difference is that I have rewritten the query to be the easiest way I would do the work manually. SELECT is last because that is where I would format the data and calculate any scalar values that aren’t used elsewhere. So I just manually say “does this query do what is expected of it. If FROM returns a billion rows that is fine because I don’t need to SEE those rows. I just need to know what those rows would have in them.
And if I missed the HAVING 1=2 in the query, I know that means no rows will be output, but this is why the optimizer does the work to determine the best way to execute the query.
Do you want to know more?
Because there is so much more. From relational algebra to take you query and find the simplest version to execute; to finding the best plan to execute the query on your database on your server at the time you are running it; to dealing with locks and versions and so much more.
If you want to know more, here are some resources to check out:
Book by the wonderful Grant Fritchey: SQL Server 2025 Query Performance Tuning: Troubleshoot and Optimize Query Performance | Springer Nature Link
Article on Microsoft Learn about query processing: Query Processing Architecture Guide – SQL Server | Microsoft Learn
Hugo Kornelis’ SQLServerFast Site: SQL Server Execution Plan Reference – SQLServerFast
There are probably others, but when I searched, a lot of articles were pretty lean in nature or started out with a lot of gobblety-gook.
The best part?
Most of this isn’t that important. It is nice to know how queries work, but very few SQL users need to know exactly how SQL works. they just need to know enough to realize what is and isn’t bad code. Really the only reason I wrote this blog was there was just too much misinformation about there about how SQL executes queries. If someone truly believes that a SQL query works in the exact order of any of the operators, they will get other confused ideas too.
One of which is my next planned blog.



Leave a Reply