SQL Server’s support for ANSI SQL includes the WITH keyword. With a little effort, this little gem lets you remove awkward sub-queries from the body of a SELECT statement, turning them in to tables that can be joined like any other.

Sub-queries are bad for several reasons. Most of all, they make queries harder to read. There is no obvious way to indent a sub-query, so it kind of sits in the middle of what would otherwise be a well formated statement. This can obscure the true behaviour and intentions of the query. Furthermore, they can severely hurt performance. SQL Server might need to execute a complex sub-query seperately for every single row that is in an outer results set. If we’re interested in performance and scalability, we need SQL Server to be able to execute our entire query with just one plan.

Some queries are often just uneccessary and ill-thought out. I have seen sub-queries where joins should have been used on many an occassion. This kind of subquery is always hurting perofrmance. As a general rule of thumb, if your sub-query is not performing any aggregate calculations (COUNT, AVG, MAX with GROUP BY) then it can probably be re-written as a join.

Here, however, we are not intersted in that kind of mis-use. I am going to show you how to turn a query containing a genuine sub-query in to a far more readable block of code.

The following query, complete with a typical subquery, returns the most expensive product for each category in a sample database (we are, of course, talking about NorthWind):

SELECT c.CategoryName, p.ProductName, p.UnitPrice
    FROM Categories c
    INNER JOIN (SELECT CategoryId, MAX(UnitPrice) AS MaxPrice
        FROM Products GROUP BY CategoryId) maxprice
        ON maxprice.CategoryId = c.CategoryId
    INNER JOIN Products p
        ON p.CategoryId = c.CategoryId
            AND p.UnitPrice = maxprice.MaxPrice
    ORDER BY MaxPrice DESC

What we will do, using WITH, is remove the subquery from the main body of our query, and reference it using an alias:

WITH MostExpensiveProducts (CategoryId, MaxUnitPrice) AS
(
    SELECT CategoryId, MAX(UnitPrice)
        FROM Products
        GROUP BY CategoryId
)
SELECT c.CategoryName, p.ProductName, p.UnitPrice
    FROM Categories c
    INNER JOIN MostExpensiveProducts mep
        ON mep.CategoryId = c.CategoryId
    INNER JOIN Products p
        ON p.CategoryId = mep.CategoryId
            AND p.UnitPrice = mep.MaxUnitPrice
    ORDER BY mep.MaxUnitPrice DESC;

In our first query where we had a subquery, we now have the table name “MostExpensiveProducts”. It is much easier to see, at a glance, what this table is doing.

Although it is only a hunch, I suspect that SQL Server is also better able to create a plan for the aliased query that will not become invalid. That’s just a hunch.


About

We are a small British company that produces business-oriented software and solutions. These articles are a product of our daily work - information that we think might be useful to share. We hope you find them useful.

Our Software

These are some of our products. Several are open source, some are web-based and others are proprietary:

Categories

Archives

Syndicate

ml> ._trackPageview(); } catch(err) {} ml> l> pageTracker._trackPageview(); } catch(err) {} ml> ._trackPageview(); } catch(err) {} ml> l>