SQL CUME_DIST Function

The CUME_DIST function returns the cumulative distribution within a specified group of values.

The values returned by CUME_DIST are greater than 0 and less than or equal to 1.

CUME_DIST calculates the cumulative position of each value in a group of values.

CUME_DIST is similar to the PERCENT_RANK function.

Example

#

Calculate the cumulative distribution of the product prices.

SELECT ProductName, UnitPrice,
       CUME_DIST() OVER(ORDER BY UnitPrice) AS 'Price Distribution'
  FROM Product
 ORDER BY UnitPrice
Result:  78 records
ProductName UnitPrice Price Distribution
Geitost 2.50 0.0128205128205128
Guaraná Fantástica 4.50 0.0256410256410256
Konbu 6.00 0.0384615384615385
Filo Mix 7.00 0.05128205128205128
Sir Rodney's Marmalade 81.00 0.9615384615384616
Mishi Kobe Niku 97.00 0.974358974358974
Thüringer Rostbratwurst 123.79 0.987179487179487
Côte de Blaye 263.50 1

Note: The cumulative distribution starts low and goes up to 1 (the last row).


Syntax

#

Syntax of the CUME_DIST function.

CUME_DIST() 
  OVER( PARTITION BY expression, ...
        ORDER BY expression, ... )

PARTITION BY expression -- optional. Organizes rows into different groups where the CUME_DIST will be applied. If not specified, CUME_DIST treats the result set rows as a single group.

ORDER BY expression -- specifies the sort order of the result set.

More Examples

CUME_DIST with PARTITION

ORDER
Id
OrderDate
OrderNumber
CustomerId
TotalAmount
CUSTOMER
Id
FirstName
LastName
City
Country
Phone
Problem: List the orders for the year 2013. Include the cumulative distribution of the order amounts within each month sorted by amount.
SELECT MONTH(OrderDate) AS Month, 
       CONCAT(FirstName, ' ', LastName) AS Customer, TotalAmount, 
       CUME_DIST() OVER(PARTITION BY MONTH(OrderDate)
                        ORDER BY TotalAmount) AS 'Sales Distribution'
  FROM [Order] O
  JOIN Customer C ON O.CustomerId = C.Id
 WHERE YEAR(OrderDate) = 2013
 ORDER BY MONTH(OrderDate), TotalAmount
Result:  408 records
Order Month Customer TotalAmount Sales Distribution
1 Paolo Accorti 49.80 0.030303030303030304
1 Yoshi Latimer 102.40 0.06060606060606061
1 Maurizio Moroni 192.00 0.09090909090909091
1 Roland Mendel 5796.00 0.9393939393939394
1 Jytte Petersen 11283.20 0.9696969696969697
1 Jean Fresnière 11493.20 1
2, 3, 4, ... 11
12 Patricio Simpson 12.50 0.0208333333333333
12 Paolo Accorti 18.40 0.0416666666666667
12 Pascale Cartrain 28.00 0.0625
12 Martín Sommer 4035.80 0.958333333333333
12 Marias Larsson 4337.00 0.979166666666667
12 Roland Mendel 6984.50 1

CUME_DIST with GROUP BY

ORDER
Id
OrderDate
OrderNumber
CustomerId
TotalAmount
Problem: List the monthly sales for the year 2013. Include the cumulative distribution values for each month sorted by month.
SELECT MONTH(OrderDate) AS 'Month', 
       SUM(TotalAmount) AS 'Monthly Sales', 
       CUME_DIST() OVER(ORDER BY SUM(TotalAmount)) AS CumeDist
  FROM [Order]
 WHERE YEAR(OrderDate) = 2013
 GROUP BY MONTH(OrderDate)
 ORDER BY MONTH(OrderDate)
Result:  12 records
Month Monthly Sales CumeDist
1 66692.80 0.833333333333333
2 41207.20 0.25
3 39979.90 0.166666666666667
4 55699.39 0.583333333333333
5 56823.70 0.666666666666667
6 39088.00 0.0833333333333333
7 55464.93 0.5
8 49981.69 0.416666666666667
9 59733.02 0.75
10 70328.50 0.916666666666667
11 45913.36 0.333333333333333
12 77476.26 1

Note: 67% of the sales from year 2013 are greater than 50,000.


You may also like



Guides


vsn 3.1