BLOGS BY TOPIC
BLOGS BY AUTHOR
BLOGS BY YEAR
Posted by Andy Brown on 12 January 2016
You need a minimum screen resolution of about 700 pixels width to see our blogs. This is because they contain diagrams and tables which would not be viewable easily on a mobile phone or small laptop. Please use a larger tablet, notebook or desktop computer, or change your screen resolution settings.
Using the RANKX function in DAX to sort data
This blog explains how you can sort data in DAX. Note that in addition to the RANKX function, there are two other possibilities I know of:
|RANK.EQ||This function is included to provide compatibility with Excel. It provides a subset of the functionality of RANKX, and is rarely used.|
|EARLIER||This function provides an alternative way of ranking data, and is if anything even harder to understand than RANKX!|
Syntax of the RANKX function
Here's how you can write the RANKX function:
You can miss out all but the first two arguments. The arguments are:
|Argument||What it is||What it means|
|1||Table||The table you are ranking over.|
|2||Expression||The expression you're ranking by.|
|3||Value||A bizarre chance to do one-off value substitutions (see foot of page).|
|4||Sort order||Whether you're sorting in ascending or descending order.|
|5||Ties||How to handle ties.|
Arguments 3, 4 and 5 are explained under separate headings further down this page.
Ranking using a calculated column
This is nice and simple! The example below sorts the products in list price order, with the most expensive item coming first:
The Emperor is the most expensive product; Wol the cheapest.
Setting sort order
The Order argument can take one of two values:
|Value||What it means|
|0 or FALSE||Sort into descending order|
|1 or TRUE||Sort into ascending order|
Note that the default is to use descending order, which is the opposite of usual Microsoft practice (not to mention common sense).
So you could rewrite our formula like this to put Wol first and The Emperor last:
Note the comma as a placeholder for the third argument.
Ranking using measures
Things get a bit more complicated when you create a measure to rank data. To illustrate this, I'll show getting it wrong first.
Let's suppose that you've already created a measure to show total sales, and you want to use this to rank species:
First create a measure called TotalSales, to sum the quantity for transactions.
The obvious thing to do is to create this measure in the Animal table:
This measure ranks species by total sales. What could possibly go wrong?
Here's how this measure shows up in a typical pivot table:
The rank is 1 for every cell.
The reason for this is that a measure will aggregate data for the records in the underlying query context. This formula is showing (for example) that the total sales for the East quadrant for the Amphibian species is the first (and only) value in the set of figures for the relevant cell.
The solution is to widen the query context to rank over all species:
ALL( Species ),
This will then give the correct results:
I've included the total quantity sold to show the results aren't fiddled! Mammals are always the best-sellers, and amphibians always the worst.
Dealing with ties
Suppose that you have the following measure in the transactions table, showing for any set of results the highest single transaction amount (usually 4, but occasionally less):
Also, in the Product table (which you'll need to unhide from client tools) you create the following measure:
You then create a pivot table showing these measures by product for the West quadrant:
There are 6 products which share top ranking for this measure (with 4), so the next ranking begins at number 7.
This is the default (and most sensible) way of dealing with ties. It's called Skip, but you can also use Dense for the 5th argument:
The same rankings but using the Dense method of dealing with ties.
Here's the amended measure (note the commas acting as placeholders for missing arguments):
Whether it's worth knowing this is another question! And on this theme (and purely for the sake of completeness), I'll finish this blog by showing what I call ...
To complete the picture, the third argument of the RANKX function allows you (bizarrely) to substitute a different value for each cell of a pivot table. The only way to understand this is by example!
I'm going to try to explain why in this pivot table the WeirdRank statistic shows 6 for Owl.
To start with, I've created (as can you) two measures in the Transaction table:
In the Animal table, I've then created this measure:
What this will do is rank each animal's highest quantity into the correct order, but using the lowest price for the animal and the highest quantity for all of the others. Here's what this means for the Owl animal:
|Animal||HighestQuantity||LowestPrice||What to use|
Here's the table for the Cat animal:
|Animal||HighestQuantity||LowestPrice||What to use|
Here's another example pivot table, to try to explain:
Any lowest price more than 4 (the biggest HighestQuantiy) ranks as number 1.
This use of RANKX blows my mind, to be honest; just as well it's not that useful!
And with that, I'm done with ranking!