Common Table Expressions in SQL

Common Table Expressions (CTE)

CTEs are shortcuts you can build into your SQL scripts to help simplify and condense your analyses. In particular, they help simplify complex subqueries and joins in scripts. Introduced in 2005 in SQL, Microsoft’s descriptoin of CTE is: “Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE or MERGE statement” (source: https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql?view=sql-server-ver15). CTEs are also extremly useful if you’re working in a program that only allows you to run a single query at a time. In these instances, declaring a CTE and then calling it in a select or join query is really helpful to simplify queries and processess. I’ll dive into some examples now that will hopefully clarify the syntax and use of these expressions.

Syntax

A simple example of a CTE is as follows. Let’s say you have sales data of how much a store sold in an online store. The data is from https://www.kaggle.com/vijayuv/onlineretail.

select top 5 * 
from dbo.retail_sales
Table 1: 5 records
InvoiceNoStockCodeDescriptionQuantityInvoiceDateUnitPriceCustomerIDCountry
53636585123AWHITE HANGING HEART T-LIGHT HOLDER62010-12-01 08:26:00.00000002.5517850United Kingdom
53636571053WHITE METAL LANTERN62010-12-01 08:26:00.00000003.3917850United Kingdom
53636584406BCREAM CUPID HEARTS COAT HANGER82010-12-01 08:26:00.00000002.7517850United Kingdom
53636584029GKNITTED UNION FLAG HOT WATER BOTTLE62010-12-01 08:26:00.00000003.3917850United Kingdom
53636584029ERED WOOLLY HOTTIE WHITE HEART.62010-12-01 08:26:00.00000003.3917850United Kingdom

For a simple example just to grasp the syntax, imagine you want to only select item with Stock Code = 22423 and then perform a wide variety of calculations on that sub-population without creating new temp tables or tables.

with cakestand_sales as (
  select top 5 *
  from dbo.retail_sales
  where StockCode = '22423'
)
select * 
from cakestand_sales
Table 2: 5 records
InvoiceNoStockCodeDescriptionQuantityInvoiceDateUnitPriceCustomerIDCountry
53647722423REGENCY CAKESTAND 3 TIER162010-12-01 12:27:00.000000010.9516210United Kingdom
53650222423REGENCY CAKESTAND 3 TIER22010-12-01 12:36:00.000000012.7516552United Kingdom
53652522423REGENCY CAKESTAND 3 TIER22010-12-01 12:54:00.000000012.7514078United Kingdom
53652822423REGENCY CAKESTAND 3 TIER12010-12-01 13:17:00.000000012.7515525United Kingdom
53653022423REGENCY CAKESTAND 3 TIER12010-12-01 13:21:00.000000012.7517905United Kingdom

Now, let’s say you want to calculate the average amount of each item sold per day and compare that to the number sold in a particular sale and you want to do this in one query.


with avg_quantity as (
SELECT InvoiceDate, StockCode, avg(Quantity) as avg_quantity
FROM DBO.RETAIL_SALES
GROUP BY InvoiceDate, StockCode
)
select a.InvoiceDate,a.InvoiceNo, a.StockCode, a.Quantity daily_quantity, b.avg_quantity, abs(a.Quantity - b.avg_quantity) difference
from dbo.retail_sales a 
left join avg_quantity b 
on a.stockCode  = b.stockCode and a.InvoiceDate = b.InvoiceDate
Table 3: Displaying records 1 - 10
InvoiceDateInvoiceNoStockCodedaily_quantityavg_quantitydifference
2010-12-01 08:26:00.000000053636585123A660
2010-12-01 08:26:00.000000053636571053660
2010-12-01 08:26:00.000000053636584406B880
2010-12-01 08:26:00.000000053636584029G660
2010-12-01 08:26:00.000000053636584029E660
2010-12-01 08:26:00.000000053636522752220
2010-12-01 08:26:00.000000053636521730660
2010-12-01 08:28:00.000000053636622633660
2010-12-01 08:28:00.000000053636622632660
2010-12-01 08:34:00.00000005363678487932320

In these two examples, we build one CTE and then applied it to a query. However, you can build as many as you want. Let’s take our previous query but say we also want to add two columns indicating the max and min quantities sold in a day for all products across the store. Then, we could write the following query:

with avg_quantity as (
SELECT InvoiceDate, StockCode, avg(Quantity) as avg_quantity
FROM DBO.RETAIL_SALES
GROUP BY InvoiceDate, StockCode
),
min_max_all_products as (

SELECT InvoiceDate, max(Quantity) as max_quantity, min(Quantity) as min_quantity
FROM dbo.retail_sales 
group by InvoiceDate

)
select a.InvoiceDate,a.InvoiceNo, a.StockCode, a.Quantity daily_quantity, b.avg_quantity, abs(a.Quantity - b.avg_quantity) difference, c.max_quantity, c.min_quantity
from dbo.retail_sales a 
left join avg_quantity b 
on a.stockCode  = b.stockCode and a.InvoiceDate = b.InvoiceDate
left join min_max_all_products c 
on a.InvoiceDate = c.InvoiceDate
Table 4: Displaying records 1 - 10
InvoiceDateInvoiceNoStockCodedaily_quantityavg_quantitydifferencemax_quantitymin_quantity
2010-12-01 08:26:00.000000053636585123A66082
2010-12-01 08:26:00.00000005363657105366082
2010-12-01 08:26:00.000000053636584406B88082
2010-12-01 08:26:00.000000053636584029G66082
2010-12-01 08:26:00.000000053636584029E66082
2010-12-01 08:26:00.00000005363652275222082
2010-12-01 08:26:00.00000005363652173066082
2010-12-01 08:28:00.00000005363662263366066
2010-12-01 08:28:00.00000005363662263266066
2010-12-01 08:34:00.00000005363678487932320322
Next
Previous