Maximize Aggregates
C. Langdon
Aggregates
Database platforms do a lot of things. So many things, we sometimes lose sight of maximizing their capabilities.
One common operation is to aggregate data. That is, combining data in similar groups and summarizing it by using various aggregate functions such as MIN
and MAX
. For example:
SELECT MAX(OrderDate)
FROM Order
WHERE 1=1
AND CustomerId = 'ACME'
In this post, we briefly discuss the MIN
/MAX
aggregate functions and how you might be able to use them beyond their simplest form.
Common Query
We can use the MAX
function on a table column and retrieve the maximum value within the column. In the example below, the query will return the maximum OrderDate
where CustomerId
is 'ACME'.
SELECT MAX(OrderDate)
FROM Order
WHERE 1=1
AND CustomerId = 'ACME'
Most of the time, we're wanting to process our data vertically when using aggregate functions. For example, aggregating the values in a column that results in the SUM(Quantity)
or the SUM(Amount)
or the MAX(OrderDate)
.
CustomerId | OrderDate | Quantity | Amount |
ACME | 2018-11-01 | 10 | 100 |
ACME | 2018-11-03 | 5 | 50 |
ACME | 2018-11-05 | 100 | 1000 |
Table - Order
SELECT MAX(OrderDate)
FROM Order
WHERE 1=1
AND CustomerId = 'ACME'
Result:
2018-11-05
SELECT SUM(Quantity) AS TotalOrders
FROM Order
WHERE 1=1
AND CustomerId = 'ACME'
Result:
115
SELECT SUM(Amount) AS TotalRevenue
FROM Order
WHERE 1=1
AND CustomerId = 'ACME'
Result:
1150
In all these queries, notice how we're aggregating the values of a column. Wouldn't it be interesting if we could do similar functionality across multiple columns of a row?
Less Common Query
Let's assume an order has multiple processing dates. When a customer submits an order, it is known as the OrderDate
. The order is then processed and shipped which is referred to as the ShipmentDate
. Once shipped, the order is invoiced and known as the InvoiceDate
.
OrderId | CustomerId | OrderDate | ShipmentDate | InvoiceDate | Quantity | Amount |
1 | ACME | 2018-11-01 | 2018-11-10 | 2018-11-12 | 10 | 100 |
2 | ACME | 2018-11-03 | 2018-11-27 | NULL | 5 | 50 |
3 | ACME | 2018-11-05 | NULL | NULL | 100 | 1000 |
Table - Order
To evaluate the MAX processing date on an order, use MAX(OrderDate, ShipmentDate, InvoiceDate)
.
SELECT OrderId, MAX(OrderDate, ShipmentDate, InvoiceDate) AS MaxDate
FROM Order
WHERE 1=1
AND CustomerId = 'ACME'
GROUP BY OrderId
Result:
OrderId MaxDate
------- -------
1 2018-11-12
2 2018-11-27
3 2018-11-05
Wrap It Up
It isn't everyday you'll need to use this functionality. And, frankly, it will be on rare occasions. Most of the time, half the battle in solving a problem is having a solution. Don't lose sight on maximizing your aggregates. It could one day help you solve a problem.