SQL – subqueries versus Common Table Expressions (CTE)
For the people that are using SQL quite often, both a subquery and a CTE are known terms. They both kind of do the same thing and you will notice some people using subqueries and other people using CTE’s. Then there’s also people mentioning using either a subquery or a CTE to increase performance. Sometimes this is based on a misunderstanding of how the database engine works. In this blogpost we will find out the differences between a subquery and a CTE, and when to use one or the other. Let’s first start with some basic information about both.
Subquery
A subquery is nothing more than a SQL query nested in another SQL query. Subqueries can be nested in either SELECT, INSERT, UPDATE or DELETE statements, but in this blogpost we will focus solely on SELECT statements. Let’s first look at a fictive practical example. Let’s say we have two tables. One with basic customer data called “Customer”:

And another with sales data per customer called “SalesData:

In the Customer table, we want to add to total sales amount per customer. We can use a subquery for this:
SELECT Customer.customer_id,
TotalSales.TotalAmount
FROM Customer
INNER JOIN
(SELECT customer_id,
SUM(sales_amount) as TotalAmount
FROM
SalesData
GROUP BY customer_id) as TotalSales
on customer.customer_ID = TotalSales.customer_ID
In the example above, we use the subquery to select the total sales per customer which we then join to our customer data.
CTE (Common Table Expression)
A CTE is, just like a subquery, a result of a select statement that is defined in the query itself. In most cases, a CTE can serve the same purpose as a subquery. A CTE always starts with the “WITH” clause, after which the CTE’s select statement is written. Let’s create a CTE to get exactly the same result as the sub query above:
WITH TotalSales AS
(SELECT customer_id,
SUM(sales.sales_amount) as TotalAmount
FROM Sales
GROUP BY customer_id)
SELECT customer.customer_id,
TotalSales.TotalAmount
FROM customer
INNER JOIN TotalSales
on customer.customer_id = TotalSales.customer_id
In the example above, we first create a CTE which generates the total sales amount per customer. In the SELECT statement that follows, we join in the CTE we just created to get the total sales amount per customer. When we need to use multiple CTE’s in one query, we always start with writing all the CTE’s in the beginning of the query and then referring to them afterwards, like so:
WITH CTE1 as (SELECT * FROM TABLE1),
CTE2 as (SELECT * FROM TABLE2),
CTE3 as (SELECT * FROM TABLE3),
CTE4 as (SELECT * FROM TABLE4)
SELECT * FROM CTE1
UNION
SELECT * FROM CTE2
UNION
SELECT * FROM CTE3
UNION
SELECT * FROM CTE4
This standard structure of the query, makes it more readable. However, opinions about readability of CTE’s vary as people are used to their own way of writing SQL queries.
(Functional) differences between sub queries and CTE’s
- Subqueries can be defined anywhere in the query. CTE’s have to be declared in the beginning of the query after which they can be referred to
- Subqueries can be used with WHERE clause in combination with
IN
andEXISTS
keywords to filter the data. CTE’s do not allow you to do this. An example of this will be shared below - CTE’s enable recursive functionality. An example of this will be shared below.
- CTE’s are defined in the beginning of the query, and subqueries are defined inline
- CTE’s always have to be defined with a name (this is not required for subqueries, but it is best practice to do so)
- CTE’s can be referred to multiple times in the SQL query, subqueries can only be used once
- Subqueries can be used to update the contents of the table, CTE’s do not offer this functionality
Subquery exclusive: using IN keyword
We have two tables with employee data. The first table, called EMPMASTER, contains some employee master data:

The second table, called EMPSCALE, contains data related to the payscales of employees:

If we want to filter the first table based on the contents of the second table, without joining, we can use the following code in which we use the WHERE clause in combination with the IN keyword that is using a subquery as the source:
SELECT * FROM EMPMASTER
WHERE EmpNumber IN(
SELECT EmpNumber FROM EMPSCALE WHERE Payscale >13)
This particular example cannot be recreated by using a CTE.
CTE exclusive: recursive queries
A big strength of a CTE is the ability to be recursive. Recursive queries makes it possible to work with hierarchical data. This can be rather useful when looking at the hierarchy in an organization in which every manager has a manager until we reach the CEO. With the recursive CTE, we can indicate the hierarchy levels of the employees within a organization. A full explanation of recursive CTE’s and its use will be shared in a future blogpost.
Conclusion
In this blogpost we have touched upon the functional difference between a CTE and a subquery. Based on those differences, we can make a decision to choose one of the two options. Generally speaking, people tend to find that CTE’s make the query more readable, but this is mostly based on personal preference. The general misconception that CTE’s are are more efficient that subqueries because the data is temporarily stored, is false. If performance is an important aspect, it is better to create a temp table with an index, which both the CTE and subquery cannot offer.
Leave a Reply