Group Multiple Rows to Single Delimited Row in PowerPivot

When PowerPivot was announced a couple of years back, I was one among the many who started feeling insecure on having to learn a new trade. I could have sat there and kept on cribbing, or picked myself up and started my baby steps towards the new technology. As Hillary Clinton said, the challenges of change are always hard. It is important that we begin to unpack those challenges and realize that we each have a role that requires us to change and become more responsible for shaping our own future. I am glad that I gave PowerPivot a chance and kept on learning (it sure helps sitting next to a passionate PowerPivot supporter like Javier Guillen!), else I wouldn’t have realized how much useful it is to me. Agreed that it has it’s fair sets of limitations, but as the community grows and the technology matures, we will see that many of it gets resolved or have workarounds. For eg, if you had asked me before on whether there was any way in PowerPivot to group all the row values into a single delimited row, I would have said no. But seeing this question in the PowerPivot forum, I decided to spend a little more time in the weekend and found a workaround, which I am sharing through this post.

Group Multiple Rows to Single Delimited Row in PowerPivot

Scenario
Let us say that we have a table which shows the sales by Country and State. The result should be a table which shows the sum of sales by country along with a delimited row of states as shown below

Scenario

Solution
There is only one function (atleast as far as I know) in PowerPivot that returns a delimited list from multiple rows – PATH(). But the issue is that PATH() works only with a parent child hierarchy. So the trick for getting this piece done would be to build an artificial parent child hierarchy. Follow the steps below to get the solution:- 1) Make a calculated column called Rnk which will generate a running number of the states within a country.

=RANKX (
  FILTER ( All ( Table1 ), Table1[Country] = EARLIER ( Table1[Country] ) ),
    Table1[State],    ,    1,   DENSE
  )

This will allow us to uniquely identify each state value within a country.

Rank States within country using RANKX

2) Make a calculated column called ParRnk with the formula below

=IF (
  Table1[Rnk] <> 1,
  Table1[Rnk] – 1
)

This will allow us to get the parent node id of the parent child hierarchy.

Parent Rank

3) Now, make a calculated column called ParState which will get the Parent State for the current row (based on the ParRnk column).

=CALCULATE (
FIRSTNONBLANK ( Table1[State], 1 ),
FILTER (
ALLEXCEPT ( Table1, Table1[Country] ),
    Table1[Rnk] = EARLIER ( Table1[ParRnk] )
  )
)

Get value of State for the parent

4) The final part of the solution is to make the delimited list column by using the path function where State is the child node and ParState is the parent node. But if we directly apply the PATH function in each row, we will not get the intended results as you can see in the image below.

PATH() without passing the relevant context

What we need is the delimited list value for the row with the maximum rank in each country applied across all other rows in that country. For that, make a calculated column called ConcatStates with the formula below

=CALCULATE (
PATH ( Table1[State], Table1[ParState] ),
CALCULATETABLE (
FILTER ( Table1, Table1[Rnk] = Max ( Table1[Rnk] ) ),
ALLEXCEPT ( Table1, Table1[Country] )
  )
)

Now the results should look like below

PATH() with the correct table context

Now make a new pivot table and drag and drop the Country, ConcatStates and Sales fields to see the intended results

solution

Wasn’t that a fascinating workaround? If you want to play around, you can download the excel file from here.

Posted by SQLJason

33 comments

This is brilliant! Well done, Jason!

Yes, this is a good one. In fact you've written some great posts recently, thanks!

christianwade

Jason, I don't mean to repeat what has already been said, but thanks for a great read!

Brent Greenwood

Very clever solution Jason. Nice work, and thanks for sharing.

U save`d my life!

Great solution

This works brilliantly if Column2 is a unique subset of Column1, but what if Column2 was language for example.

Is there any clever way to concatenate languages(s) that are not unique to country?

Niels Balsgaard Mortensen

Mr Parsnip, if you're still looking for a solution to this, take a look at my replies to Elissa Lappenga.

Elissa Lappenga

Thank you for this! One problem, though – Excel keeps crashing when I try to add the ConcatStates column. Any idea why and how to fix that?

Unfortunately, I don't know why this would crash Excel, it doesn't on my side. Could be an excel installation related issue, or some excel patch/bug

Elissa Lappenga

Figured it out! My "States" column has a couple of values that repeat across "Countries" (in your example, it would be as if India and the US both had States called "NC"). This must create the issue. This was so close to being exactly what I need – just wish I knew how to get over this last hump.

Don't think that would cause an issue, but maybe I didn't understand your scenario real well. Did you follow each step one by one? If yes, at what step does the logic start breaking?

Elissa Lappenga

The very last step is where I have trouble. I can get the first three to work fine, the last one crashes it. I uploaded a sample set to this document along with an example of what I'm trying to achieve: https://ucla.box.com/s/g1puezz09azp7s7kuqypc5ke5t3mc30f
I renamed my CourseID and Instructor columns as Country and State so you could see how I was trying to apply your fix. I would really appreciate any help you can offer. Thanks!

Elissa Lappenga

Hi again, just checking in to see if you had any thoughts on my last comment. Thanks in advance!

Niels Balsgaard Mortensen

I haven't tried myself, but could you perhaps create and additional calculated column CONCATENATE([Country],[State]) and use that as your state column, and then somewhere in the final steps remove the Country names using SUBSTITUTE or similar?

Niels Balsgaard Mortensen

So I tested it today and it works!
Create the aforementioned CountryState column, but add an identifier character to be safe (I used "/"):
="/" & Table1[Country] & "/" & Table1[State]
Then replace [State] with [CountryState] in all the formulas.
Finally, wrap the ConcatStates formula in a SUBSTITUTE:
=SUBSTITUTE( >Original ConcatStates formula< , "/" & Table1[Country] & "/", "" )

Hi Jason,

Your post was really helpful. However, I needed your suggestion to resolve the following scenario: I have 2 tables A & B. Table A has unique Country ID's & Country Names. Table B has columns: Country Names & State Names, country names of which are redundant. Hence, Table A to B is a one to many relationship. What I am wanting to do is get the concatenated state names as a column in Table A. I am able to get the concatenated result on the same sheet, but to move this to Table A is a challenge I am facing. Is this something you can help me with please?

Much appreciated.

Thank you.
Regards,
sweta

Once you get the right results in a column of Table B, you can move it to table A by using a calculated column in A with formula like FIRSTNONBLANK of TableB[new column]

Thanks Jason. It worked perfectly. However, I am getting an error off late, and it seems appears when there is more than 1 parent child hierarchy. The error is at step 4 of your solution (I do not have a way to attach the screenshot, hence trying to explain this based on your table)

Error Message:

Calculation error in column 'Table1'[]: Each value in 'Table1'[State] must have the same value in 'Table1'[ParState]. The value '' has multiple values.

Is this something you can help me with? Please let me know if the error did not make sense, and where can I send you the screenshot of the error.

Appreciate your support. Thank you.
Sweta

Dan Anderson

Jason – this is fantastic. Its exactly what I've been looking for. However my situation is like Sweta's. I have duplicates in my table because there's an additional Date column dimension. I get the same 'multiple values' error.

Do you know how to fix that?

The problem is that the arguments in the PATH() must be unique identifiers. So if you have a state that belongs to 2 countries, you will get the above mentioned error (note that having duplicates of states with the same country will not cause the error). That said, to solve this error, you must make your State unique. A way to do it is to create the following calculated columns

Mstate="#" & [Country] & "#" & [State
MParState=IF(ISBLANK([ParState]),BLANK(),"#" & [Country] & "#" & [ParState])]
FInal=SUBSTITUTE(CALCULATE (
PATH ( Table1[MState], Table1[MParState] ),
CALCULATETABLE (
FILTER ( Table1, Table1[Rnk] = Max ( Table1[Rnk] ) ),
ALLEXCEPT ( Table1, Table1[Country] )
)
), "#" & [Country] & "#","")

Note how we make the state unique by prefixing the country within a delimiter. Also, note at the end that we substitute the delimiter + country with an empty space, thus giving the result that we need

That was awesome! Thank you, Jason!

Brett Castro

Jason, you are amazing! How 'bout another twist though!

What if I only want to concatenate specific 'states' based on if they meet one of three criteria?

Hi Jason,
This is a great post.
I have a similar scenario that maybe you can help me with:
Order | State | Quantity
A | CA | 10
A | NY | 20
B | CA | 30
C | CA | 40
C | FL | 50

and the result would be:
ConcatState | Units
CA&NY | 30
CA | 30
CA&FL | 90

Can you please help?
Many thanks

Thank you for the great post. I’d appreciate your advice in solving a problem. I have a rolling forecast file in which I collect comments of analyst on product sales by region, sub region, product category, product . I to display comments in a measure for variance explanation in scenarios say, actual vs budget, forecast vs budget, this year vs next year. I want to get top 3 or 5 comments and ignore the comments on insignificant variances. Thank

Hi Jason,
How would you do the reverse of this? For example under the ‘Scenario’ section, how would you go from table 2 to table 1 without the [Sales] column?
Thanks!
YG

Jason, Great solution! Thanks for sharing.

Hi this was really brilliant
Thanks for this post
I was simulating the formulas on my data, but my tables’ children are not unique for each parent. I have Order_Numbers as parent and Categories as Child.
I am trying to get results like : Order_1: CategoryName_5|CategoryName_2|…
Order_2:CategoryName_3|CategroyName_6|
actually getting the different categories that each order has.
I have searched everywhere for the last week for SQL or M solutions. and nothing up till now
can you help me with this case please ?

With latest version of SQL Server (2016), Power Bi Desktop or Excel 2016 you could use

CONCATENATEX(RELATEDTABLE([table Name],[Colum Name],”; “)

Hi,

I have a similar problem in power BI.

I have 2 tables – 1st one have a job number, start date & time and finish date & time. 2nd table has an incident and its date and time.

in excel I can use sumproduct to get the incident to relate to the job number. however I need to create this in power bi and I am struggling any ideas?

Hi Jason,

You post is amazing! But i have one question.

I tried to see what’s the table return for:
Table = CALCULATETABLE(filter(Table1, Table1[Rnk]=max(Table1[Rnk])), ALLEXCEPT(Table1, Table1[Country]))

It turns out only return one record for both USA and Canada with the highest Rnk, which is 4. But no India record.

So how we get the india result as our filter context do not include it?

Jason.. This is great . Although I do have a question.. Is it possible for you to filter concatenated states when you only select certain states instead of showing dynamic ..
When I select USA it shows me all the sales aggregated and all states ,,, but if I select just florida and USA .. Can I get a list only showing florida for some as that can become confusing to user when you show sales for one yet the list shows all the states.. Thoughts ?

Sanjeev Soni

Hi
in RANS WE need to use Expression in the send argument, but you have mentioned as Table1[State], It is giving the error; please suggest do we need to try like MAX, Min …

Sanjeev Soni

How to use Table1[State] in the second argument ;since it is expecting expression.
=RANKX (
FILTER ( All ( Table1 ), Table1[Country] = EARLIER ( Table1[Country] ) ),
Table1[State], , 1, DENSE
)
please assist to fix my error

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.