SQL PERCENT_RANK and CUME_DIST


Example

The PERCENT_RANK function calculates the ranking of a row relative to the row set. The percentage is based on the number of rows in the group that have a lower value than the current row.

The first value in the result set always has a percent rank of zero. The value for the highest-ranked – or last – value in the set is always one.


The CUME_DIST function calculates the relative position of a specified value in a group of values, by determining the percentage of values less than or equal to that value. This is called the cumulative distribution.

SELECT BusinessEntityID, JobTitle, SickLeaveHours,
PERCENT_RANK() OVER(PARTITION BY JobTitle ORDER BY SickLeaveHours DESC)
       AS "Percent Rank",
CUME_DIST() OVER(PARTITION BY JobTitle ORDER BY SickLeaveHours DESC)
       AS "Cumulative Distribution"
FROM Employee;

In this example, you use an ORDER clause to partition – or group – the rows retrieved by the SELECT statement based on employees' job titles, with the results in each group sorted based on the numbers of sick leave hours that employees have used.

BusinessEntityIDJobTitleSickLeaveHoursPercent RankCumulative Distribution
267Application Specialist5700.25
268Application Specialist560.3333333333333330.75
269Application Specialist560.3333333333333330.75
272Application Specialist5511
262Assitant to the Cheif Financial Officer4801
239Benefits Specialist4501
252Buyer5000.111111111111111
251Buyer490.1250.333333333333333
256Buyer490.1250.333333333333333
253Buyer480.3750.555555555555555
254Buyer480.3750.555555555555555

The PERCENT_RANK function ranks the entries within each group. For each entry, it returns the percentage of entries in the same group that have lower values.

The CUME_DIST function is similar, except that it returns the percentage of values less than or equal to the current value.