Recently I needed to create a calculated column in Power Pivot that calculated a running total. Although this seemed pretty straight forward, in practice I found the DAX expressions to be a little counter intuitive. The blogs I looked at had great code samples, but none quite explained the how and why, so I figured I would shed a little light on the subject.
I’ve got a fact table that has Date, Resource Name, Project Name, Role & Hours as columns. This table contains the number of hours a resource spent on a project and what role they played on what date.
In order to create a burn down chart, I needed to be able to hook a chart up to a data source that could tell me how many hours were spent on a given project by day, with the ability to break that down by project and discipline.
Given there are many duplicate dates, you can’t connect the chart without aggregating by date, and you need to make sure the aggregation doesn’t add up values for different projects.
The formula I used is this:
Let’s break this down into it’s parts so you can understand what each piece is doing, and how it actually works.
Calculate is the Swiss Army Knife of DAX. The purpose of Calculate, is to evaluate an expression within the context of the filters you’ve specified.
This basically means, you give it the math, and it will make sure that it’s performing or excluding any filters that are being passed along via your Slicers, Query Parameters, etc.
We’re wrapping what we want in a CALCULATE function so that we can give it the time context that we need, and tell it how to aggregate the values.
SUM is pretty self explanatory. If you haven’t used Excel, I really have no idea how you got here, but there is something interesting to note here. Although we’re telling it to summarize the Hours column in the ResourceFacts table, it’s the CALCULATE container that’s passing what values to summarize into the expression. That’s important to remember. The SUM expression is very vague, so that it is flexible enough to summarize the correct values based on the context that’s being applied.
This is where I was a little thrown off. The FILTER function is passed along with a secondary function that defines the scope of the filter. There are quite a few options here, but we’re going to focus on two: ALL and ALLEXCEPT.
ALL(ResourceFacts) would remove all filters from the table. With this in the filter, the sum would calculate every row in the table period. That’s useful if you would like a grand total on your row to use in a calculated column, but it’s useless for our scenario.
ALL(ResourceFacts[Hours]) would remove any filters that are applied to the Hours column. That means if you have a slicer or value filter that’s restricting the results based on the hours, the SUM expression would still contain all values that are being filtered out of your result set. You can overload ALL with multiple columns as well. Again, useful in some scenarios, but not ours.
ALLEXCEPT(ResourceFacts,RoleDisciplineList[Discipline],ResourceFacts[Project]) This is the filter function I used. While I understand it now, it did take me a few minutes to “get” it. As it says, it removes all context filters except the filters that are applied to the specified columns.
My initial understanding was, that if I was passing a context that was filtering based on Discipline of the Resource or Project the were working on, that would be ignored. Which sounds correct, but doesn’t quite explain the result in practice.
This is the result, which is what I wanted, I just didn’t quite understand why I was getting it! Notice the RunningTotal column resets when the Role changes. I’m not passing any filters related to Role, we’re just looking at the Calculated Column in Power Pivot. By telling the CALCULATE function to filter ALLEXCEPT Discipline and Project, I get a calculated column that’s aggregating hours for the Resource, by Project AND the Discipline of the Resource. (Note that there is a 2nd table in Power Pivot that is assigning each role to a discipline. This table has a relationship with the fact table, allowing aggregations on Discipline as opposed to the more granular Role)
If I wanted this calculation to keep a running total of on person’s hours by project, regardless of the Discipline they are in, I would just remove “RoleDisciplineList[Discipline] from the ALLEXCEPT function.
Calculate Part 2, The Reckoning
We’re summarizing what we want, and it’s now set up to accept context correctly, but we still have to attack this whole “Running Total” thing. We’ll do this by adding one more filter to the CALCULATE function.
The syntax, CALCULATE(<expression>,<filter1>,<filter2>…) means we give it an expression, and we can keep adding filters, comma separated, to further refine the calculation.
We’ve got our expression for SUM, we’ve applied a filter to aggregate our columns the way we’d like, but we still have to tackle the aggregation that builds on date.
We’re using the EARLIER function here to keep our total running. It’s important to mention that if you are doing calculations against monstrous data sets, EARLIER isn’t the most performant tool in your belt. It can be kind of slow. Power Pivot does do some optimization to help, but it’s always good practice to use recursion in calculations with some care.
There’s one logical gotcha here as well. The EARLIER function doesn’t reference earlier rows or earlier values/dates. The EARLIER function “holds” the value for the current row in the table so you can use it for evaluation. Because of this, we’re using the Less Than, not the Greater Than sign. A bit counter intuitive, but once you know how it works, you can put it to work for you.
We’re telling the CALCULATE function to add all rows to the SUM of [Hours] if the [Date] of the row is less than or equal to the [Date] of the current row.
That’s really the key thing to remember. Our expression is always calculating a value for the current row. You aren’t telling the expression what to do with the current row, you’re telling the expression what to do with all of the rows so you can use that value in the current row.
So let’s take a look at this expression in pseudo code:
The Running Total for the Current Row should be the Sum of the Hours for All Rows that match the Project and Discipline of the Current Row, if the Date is Less Than or Equal To the Date of the Current Row.
and here’s the actual expression again:
Hopefully this sheds a little light into not only creating running totals in Power Pivot, but the DAX language itself.
If you’ve been salivating over charts like this:
But couldn’t figure out how to get there using standard Pivot Charts in Excel, this is your ticket. The only other thing I didn’t mention in this article, is that I added a StartOfWeek column to my Time Dimension table that always returns the Monday of the week any given date is in. That allowed me to have this chart summarizing hours by week instead of day.
Since we didn’t add a filter to our date in our ALLEXCEPT clause, this aggregation passes through from our Pivot Chart and everything calculates just like we would expect.