SQL Server’s support for ANSI SQL includes the WITH keyword. With a little effort, this little gem lets you remove awkward sub-queries from the body of a SELECT statement, turning them in to tables that can be joined like any other.
Sub-queries are bad for several reasons. Most of all, they make queries harder to read. There is no obvious way to indent a sub-query, so it kind of sits in the middle of what would otherwise be a well formated statement. This can obscure the true behaviour and intentions of the query. Furthermore, they can severely hurt performance. SQL Server might need to execute a complex sub-query seperately for every single row that is in an outer results set. If we’re interested in performance and scalability, we need SQL Server to be able to execute our entire query with just one plan.
Some queries are often just uneccessary and ill-thought out. I have seen sub-queries where joins should have been used on many an occassion. This kind of subquery is always hurting perofrmance. As a general rule of thumb, if your sub-query is not performing any aggregate calculations (COUNT, AVG, MAX with GROUP BY) then it can probably be re-written as a join.
Here, however, we are not intersted in that kind of mis-use. I am going to show you how to turn a query containing a genuine sub-query in to a far more readable block of code.
The following query, complete with a typical subquery, returns the most expensive product for each category in a sample database (we are, of course, talking about NorthWind):
SELECT c.CategoryName, p.ProductName, p.UnitPrice FROM Categories c INNER JOIN (SELECT CategoryId, MAX(UnitPrice) AS MaxPrice FROM Products GROUP BY CategoryId) maxprice ON maxprice.CategoryId = c.CategoryId INNER JOIN Products p ON p.CategoryId = c.CategoryId AND p.UnitPrice = maxprice.MaxPrice ORDER BY MaxPrice DESC
What we will do, using WITH, is remove the subquery from the main body of our query, and reference it using an alias:
WITH MostExpensiveProducts (CategoryId, MaxUnitPrice) AS ( SELECT CategoryId, MAX(UnitPrice) FROM Products GROUP BY CategoryId ) SELECT c.CategoryName, p.ProductName, p.UnitPrice FROM Categories c INNER JOIN MostExpensiveProducts mep ON mep.CategoryId = c.CategoryId INNER JOIN Products p ON p.CategoryId = mep.CategoryId AND p.UnitPrice = mep.MaxUnitPrice ORDER BY mep.MaxUnitPrice DESC;
In our first query where we had a subquery, we now have the table name “MostExpensiveProducts”. It is much easier to see, at a glance, what this table is doing.
Although it is only a hunch, I suspect that SQL Server is also better able to create a plan for the aliased query that will not become invalid. That’s just a hunch.
About
We are a small British company that produces business-oriented software and solutions. These articles are a product of our daily work - information that we think might be useful to share. We hope you find them useful.
Our Software
These are some of our products. Several are open source, some are web-based and others are proprietary:
Categories
- .NET (10)
- Apple (2)
- Business (5)
- CSS (1)
- HTML (2)
- Innovation (4)
- Java (4)
- Javascript (1)
- Leadership (1)
- MySQL (2)
- Oracle (6)
- Postgres (1)
- Programming (5)
- Rails (4)
- Ruby (10)
- SQL Server (9)
- Subversion (1)
- Web (5)
- Windows Server (2)
Archives
- July 2010 (2)
- September 2009 (5)
- August 2009 (1)
- July 2009 (12)
- June 2009 (16)
- May 2009 (3)