Visualizing Merge Join Types in Power BI

July 27, 2017

Over the last couple of years, I have been actively involved in delivering Power BI training for my clients. Because of that, I am always looking for new and easier ways to explain Power BI concepts to my attendees. Yesterday, I saw a blog post from Reza Rad on Merge Types in Power BI and realized that this is one concept I always have to explain by drawing on a whiteboard during my training sessions. That is when I started thinking: maybe I could create a Power BI report to explain the join types used when merging queries. Being able to click through the different join types and see the results would definitely make it easier to understand than me drawing or just talking about it. Also, to make it more useful, I wanted to keep the ability to add / modify / delete records in the two tables so that my attendees could see in real time how it affects the resulting merged table.

To demonstrate the solution, I used 2 simple tables: Table A consists of Customer ID and Customer Name, and Table B consists of Customer ID and Email. We will merge these two tables on the Customer ID column. In the report, Table A is on the left side, Table B is on the right side, and the resulting merged table is at the bottom. We also have a slicer at the top to choose the Join Type, and just under it, a description of the Join Type as well as a Venn Diagram. Both Table A and Table B have an IsJoined column, which denotes whether the corresponding row is present in the Merged table for the selected Join Type. I have embedded the report below. Feel free to click around and see for yourself (or click on this link to see the full page view).

* The initial version of the report had only the 6 join types available out of the box in Power BI. After I shared this version of the report on Twitter, Imke Feldmann (t | b) said that it would be nice to display the Full Anti Join also, which is not available by default but can be easily added with the help of simple M code (something along the lines of Table.Combine({LeftAntiJoin, RightAntiJoin})). So I added that to the latest version of the report, along with a message that it requires custom code when the Full Anti Join option is selected.
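For readers who prefer code to diagrams, here is a minimal Python sketch of the seven join kinds discussed above. It is not how the report is built (that uses Power Query and DAX); the sample rows, the `merge` helper and the flat (id, name, email) output shape are all assumptions made up for illustration.

```python
# Hypothetical sample data mirroring Table A (Customer ID, Customer Name)
# and Table B (Customer ID, Email); the values are made up for illustration.
table_a = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
table_b = [(2, "bob@example.com"), (3, "carol@example.com"), (4, "dave@example.com")]


def merge(left, right, kind):
    """Return merged rows as (customer_id, name, email) for the given join kind."""
    left_ids = {cid for cid, _ in left}
    right_ids = {cid for cid, _ in right}

    # Rows whose key exists on both sides (one output row per matching pair,
    # so duplicate IDs multiply the number of output rows).
    inner = [(cid, name, email)
             for cid, name in left
             for rid, email in right if rid == cid]
    # Rows that exist on one side only.
    left_anti = [(cid, name, None) for cid, name in left if cid not in right_ids]
    right_anti = [(cid, None, email) for cid, email in right if cid not in left_ids]

    results = {
        "Inner": inner,
        "Left Outer": inner + left_anti,
        "Right Outer": inner + right_anti,
        "Full Outer": inner + left_anti + right_anti,
        "Left Anti": left_anti,
        "Right Anti": right_anti,
        # Not available out of the box: the union of the two anti joins.
        "Full Anti": left_anti + right_anti,
    }
    return results[kind]


for kind in ("Inner", "Left Anti", "Full Anti"):
    print(kind, merge(table_a, table_b, kind))
```

The "Full Anti" entry follows the same idea as the Table.Combine({LeftAntiJoin, RightAntiJoin}) expression mentioned above: it simply appends the two anti join results.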

 

Now, if you are reading this, most probably you are interested in learning how this was done. To be honest, this report ended up being a little more tricky than I thought it would be and it has some hidden tips and tricks.

– Table A and Table B data is entered through Power BI. So you can add more records and see what the merged table would look like for the changed data. For example, what happens when I have duplicate customer IDs, and how will that affect my merged table?

– How does the slicer have images under it? – Chiclet slicer

– How did I make the Venn Diagram change based on the filter selection? – Use Synoptic Designer to create the Venn diagram, and then link it to your dataset using a DAX measure that will highlight the appropriate area.

– How does the IsJoined column show whether the source table’s record is present in the merged table? – DAX Measure

– How do I display the right results in the Merged table? – A combination of using Power Query and DAX measures.

I know I have answered the questions only at a high level, but if there is enough interest in knowing more about this, please let me know and I might end up writing a follow-up post detailing all the techniques. Let me know if you have any further questions apart from what I have listed, or any feedback / bug reports, and I will try to accommodate / fix them as much as I can. Meanwhile, feel free to download the report and play with it yourself. Also, you know where to point the next time someone asks you about the different join types in Merge Queries within Power BI.

I have also published this report to the Power BI Community Data Stories Gallery. Feel free to comment / like / simply interact with the other reports and users out there.

Update

The report has been updated with a second page explaining the join types using Join Diagrams. The join diagrams are inspired by this post – https://blog.jooq.org/2016/07/05/say-no-to-venn-diagrams-when-explaining-joins/, which was pointed out to me on Twitter by @thesqlgrrrl.

Posted by SQLJason in Power BI, 15 comments
Visio Custom Visual (Preview) for Power BI – Quick Look

June 22, 2017

A week back, I was at the Data Insights Summit, where I got to hear many exciting Power BI updates in person. One of the updates was the release of a preview version of a new custom visual – Visio for Power BI. I registered right then to try out the new custom visual, but it took almost another week for the team to send me the download files (pretty sure they were flooded with preview requests from excited users like me). That said, I have been trying out the visual for the last 2 days and decided to write down a quick review of the preview version.

How to get the Visio Custom Visual for Power BI

You can request the private preview for the Power BI Visio custom visual by clicking on this link – aka.ms/visio-new and filling in the form.

How to use the Custom Visual in Power BI

1) For the purpose of this report, I created a simple Excel file (OrgData.xlsx) containing Name, Title, Reports To and Salary.

sample data

I also added some pictures of the employees in a folder.

employee headshots for org chart

2) I imported this data into Visio to create an Org Chart (follow the steps from this link).

Org chart in Visio

3)  Save the Visio diagram to One Drive for Business or SharePoint Online where your team also has access.

saved visio file in One Drive for Business

Click on the Visio diagram and then copy the link into a text file for future use.

4) Now open Power BI desktop and import the Excel file with the org data. After that, import the Visio custom visual and select it on the reporting canvas. Add Name to the ID field, and then you should see a dialog box to input the Visio diagram’s URL that we copied in the previous step. Click on connect after that, and also add the Salary in the Values field, so that we can see the Org Chart display the colors. Check out the gif below for more details.

Visio Custom visual in Power BI

5) Notice that the Visio diagram does not show up in Power BI Desktop. This is a limitation of the current preview version, and the diagram will only be visible when you view it in Power BI Web. Add a simple table with Name and Salary next to the Visio custom visual and then publish the report. Now you should see the Visio diagram in the report.

Visio custom visual in Power BI Web 

Note that you can click on the org chart and see the table getting filtered for the selection. However, it is not possible to make multiple selections using CTRL+Click in the Visio diagram, as we can do in the other native charts.

My Thoughts – The Good & The Bad

1) This visual provides a great way to make some cool visuals easily. Apart from the Org charts, I also experimented with Flow charts, network diagrams, floor plans and it was great to see how easy it was to make those charts in Visio and integrate them within Power BI.

2) This is more of a Visio feedback rather than for the Visio custom visual for Power BI. You can use Visio to make some charts that are not available natively in Power BI like Org Charts, Flow charts, etc. from Excel data (or other sources) automatically. So if something changes, it is easy to create a new one by importing the data again and then saving it in the same location in One Drive for Business / SharePoint Online. The Power BI report seems to pick up the latest version of the Visio diagram every time the browser is refreshed (even though the official documentation says that you might need to re-insert the custom visual sometimes).

However, it would have been better if the shapes were automatically added or deleted in Visio based on changes in data, rather than manually adding them or recreating them. Even though this feature is not present in most charts, I did notice that there are some like the “Cross Functional Flowchart using Data Visualizer” in Visio where the shapes get added/deleted by just clicking the Refresh button in Visio.

3) I am pretty sure this is just a limitation of the Preview version – the visual gets displayed only on Power BI Web version and not in the desktop.

4) Currently, it looks like you can’t do multiple select (using CTRL+Click) on the shapes with the Visio custom visual. It would have been nice if we could do that just like we do in all the other native visuals in Power BI.

5) The usefulness of this visual can be greatly enhanced if there was a way to automate the refresh of Visio diagrams based on the change of data, saving the changed Visio diagram to One Drive for Business/SharePoint Online and then seeing the latest version without any issues in Power BI. I am still investigating if there is a way for it.

Apart from what I have mentioned, the official documentation also mentions the following things about the Preview

1. Visio custom visual needs to access the Visio diagram so in cases where Power BI user’s sign-in information can’t be accessed via Single Sign-on, the user might be presented with a sign in prompt and they need to sign-in to authenticate themselves.

2. If clicking on sign in button doesn’t do anything then it could be due to a known IE/Edge browser behavior when Power BI and SharePoint are in different security zones, please add both the Power BI domain and the SharePoint domain to the same security zone and try again.

3. Data graphics applied to Visio diagram from Visio client are removed.

4. In case your diagram has complex styles, themes, fill patterns etc., you might notice some visual differences between the Visio diagram in the Visio client and the diagram rendered in the Visio custom visual.

5. Large diagrams with shape count over 2000 are not supported.

6. In case you need to add new shapes that map to your Power BI Data, or remove shapes that have been previously mapped please verify the report. In case you observe any issues, you might need to re-insert the Visio custom visual and map the shapes again.

It is pretty exciting to see all these features in the Preview version of this custom visual, and I can’t wait to see what else is going to be available once this is no longer in Preview. Also, the general trend of trying to integrate different products like Visio and Power Apps into Power BI is extremely heartening.

Posted by SQLJason in Office 365, Power BI, 4 comments
Dynamic Grouping in Power BI using DAX

March 1, 2017

It has been quite a while since I posted something and was already thinking of dusting up my tools. That was when I was going through the Power BI Community forums, and found an interesting question –

Requirement: The user wants a report with a column chart. The X axis will have Subcategory Name and the value will be the sum of Internet Sales. Along with this chart, the user will have a slicer where they can select the Subcategory Names. The column chart should “update” showing one column for each selected subcategory, and another column named “Others” with the summed amount of the rest of the unselected categories.

Basically, they wanted a dynamic group called “Others” and the members within this group should change based on what is selected on the slicer.

This would be a good time to show a visual representation of what the requirement means.

1 Requirements

You can see that there is one individual (green) column for every selected Subcategory and also one (orange/red) column called “Other” which has the summed up value for the rest of the unselected categories.

For solving this, follow the steps below:-

1) The “Other” member is not available in any existing column. So we will have to create a new table having a column for all the subcategories, as well as an additional member for Others. For this, I made a new calculated table in Power BI using the formula below

ProdSubCat_List =
UNION (
    -- get the existing values of subcategory name
    VALUES ( ProductSubcategory[Product Subcategory Name] ),
    -- add the other member
    ROW ( "SubCategoryName", "Other" )
)

The Subcategory column from this table has to be used in the charts, since this is the only column which has the “Other” member. At the same time, this table is a disconnected table (which means that there is no relationship between this table and the rest of the fact/dimension tables), so we will not get any proper values if we just use the Sales measure with this column in a column chart. For that, we will have to create a custom measure.

2) The next step is to make a measure which will display the values

NewSalesMeasure =
VAR SelectedSales =
    CALCULATE (
        [Sales Amount],
        INTERSECT (
            VALUES ( ProductSubcategory[Product Subcategory Name] ),
            VALUES ( ProdSubCat_List[Product Subcategory Name] )
        )
    )
VAR UnSelectedSales =
    CALCULATE (
        [Sales Amount],
        EXCEPT (
            ALL ( ProductSubcategory[Product Subcategory Name] ),
            VALUES ( ProductSubcategory[Product Subcategory Name] )
        )
    )
VAR AllSales =
    CALCULATE (
        [Sales Amount],
        ALL ( 'ProductSubcategory'[Product Subcategory Name] )
    )
RETURN
    IF (
        HASONEVALUE ( ProdSubCat_List[Product Subcategory Name] ),
        SWITCH (
            VALUES ( ProdSubCat_List[Product Subcategory Name] ),
            "Other", UnSelectedSales,
            SelectedSales
        ),
        AllSales
    )

 

Note that we are making use of 3 variables – SelectedSales, UnSelectedSales and AllSales to handle the 3 conditions that can arise.

SelectedSales will match the member values in our calculated table (ProdSubCat_List) with the Subcategory names in the original Subcategory table and get their corresponding Sales Amount.

UnSelectedSales will get the Sales Amount for all the unselected Subcategory names, and we make use of the EXCEPT function for this.

AllSales is the total Sales Amount for all the Subcategories, and is used for showing the grand total.

3) Create a column chart with ProdSubCat_List[Product Subcategory Name] on axis and NewSalesMeasure on values. Put a slicer which has ProductSubcategory[Product Subcategory Name]. Now you can see the required end result.
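To see what the measure is doing outside of DAX, here is a small Python sketch of the same grouping logic. The sales figures and the `dynamic_groups` helper are made up for illustration and are not part of the report; the two comprehensions play the roles of SelectedSales and UnSelectedSales.

```python
# Made-up sales per subcategory, for illustration only.
sales = {"Road Bikes": 500, "Mountain Bikes": 300, "Helmets": 50, "Gloves": 20}


def dynamic_groups(sales_by_subcategory, selected):
    """One entry per selected subcategory plus an 'Other' bucket for the rest."""
    # Plays the role of SelectedSales: keep only members picked in the slicer.
    result = {name: amount
              for name, amount in sales_by_subcategory.items()
              if name in selected}
    # Plays the role of UnSelectedSales: everything EXCEPT the slicer selection.
    result["Other"] = sum(amount
                          for name, amount in sales_by_subcategory.items()
                          if name not in selected)
    return result


print(dynamic_groups(sales, {"Road Bikes", "Helmets"}))
```

Changing the `selected` set is the equivalent of clicking a different combination in the slicer: the individual columns change and "Other" absorbs whatever is left.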

2 End Result

Posted by SQLJason in DAX, Power BI, 11 comments
Hex Tile Grid Maps for Power BI

April 21, 2016

I have always been fascinated by maps as a child, and could spend endless hours looking at the globe my parents got me as a present for my 5th birthday. I was so hooked on to it that my parents even considered removing it from my room fearing that it could hamper my social development (and this was in spite of my  parents being extremely proud that I could tell most of the countries and their capitals around that time!). Even though maps don’t intrigue me to that level anymore, I still follow them as part of my job and have written quite a number of blogs on getting spatial information in the Microsoft stack, starting from SSRS 2008 R2. So it was kind of natural that when I saw a couple of hex grid maps floating around my twitter feed a couple of months ago, I thought of reproducing it in Power BI as I knew it could be done.

Hex tile grid maps for Power BI

First of all, let us start with an introduction to hex maps and why they could be useful. A regular choropleth map is a tried and tested visualization for area maps, but it carries the risk of under-representing some areas. For example, in a regular choropleth map of the US, DC is hardly visible, along with some of the North-Eastern states. A hex tile map solves this issue by giving each state equal weight. However, it comes with its own set of problems, like balancing between depicting unique geographical features (like Texas and Florida being the southernmost part of the country) and depicting bordering states accurately. For this reason, you will find more than one version of hex grid maps, and it is perfectly ok to choose the one that suits your need best. Now you can follow the steps below to reproduce a hex tile grid map in Power BI (and don’t forget to check out the Power BI report that I made with this technique at the bottom of my post):-

1) Choose a version of the hex tile grid map that you like from the internet. Or you can even make one easily in PowerPoint or any other image processing software (as it is just a collection of hexagons) based on the image that you get from the internet and save it as an image.

Making hexagons in powerpoint

2) Go to http://synoptic.design/ and upload the image to the synoptic designer by dragging and dropping the image to the designer.

Uploading to Synoptic designer

3) Ensure that the second icon on the bottom left is enabled (which helps us to automatically discover new areas). Now you can just click on the hexagons and the synoptic designer automatically discovers the areas for you, which is super cool.

Using the automatic discovery of areas icon in Synoptic designer

Now, for most people, this should be more than enough and the results come out really good. In my case, I decided to take a step further as I was planning to share the file for the community. If you notice carefully, you can see that more than 6 vertices are being plotted by the designer automatically (check out the multiple vertices in the section I highlighted).

Multiple vertices being recorded

To avoid this, I wrote a handful of formulas in plain old Excel to calculate the vertices, and then manually copied the 6 coordinate pairs for each of the 51 hexagons (the 50 states plus DC).

Replacing it with just 6 vertices

Make sure that you map the areas to the appropriate state name / code also.
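The vertex calculation itself is simple trigonometry. My Excel formulas are not reproduced here; the Python sketch below shows one way to get the same kind of result, assuming a pointy-top hexagon described by a center point and a radius (the function name, the orientation and the sample numbers are assumptions for illustration).

```python
import math


def hexagon_vertices(cx, cy, radius):
    """Return the 6 (x, y) corners of a pointy-top regular hexagon."""
    vertices = []
    for i in range(6):
        # Corners sit every 60 degrees. The -90 degree offset puts the first
        # corner straight above the center in image coordinates (y grows down).
        angle = math.radians(60 * i - 90)
        vertices.append((round(cx + radius * math.cos(angle), 2),
                         round(cy + radius * math.sin(angle), 2)))
    return vertices


# Example: a hexagon centered at (100, 100) with a radius of 10 pixels.
for x, y in hexagon_vertices(100, 100, 10):
    print(x, y)
```

Each hexagon then needs exactly 6 coordinate pairs, which can replace the extra vertices recorded by the automatic discovery.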

4) After this, your map is ready and you can just export it to Power BI, which would save the map data as a SVG file in your computer.

Export as SVG file for Power BI use

I would also request that if you make some interesting maps / shapes, please consider submitting it to the gallery so that other community members can also reuse it. I have submitted my map to the gallery and hopefully it will be approved by the SQLBI team (who created and still supports this wonderful tool).

5) Now open Power BI, download the Synoptic Panel from here (if you don’t already have it) and import it into Power BI. Once you have done that, click on the Synoptic Panel to add it to the report, and add the state code (which is the field we are going to use to bind our dataset to the map) and a measure (like Total Votes) to it. Then click on the “Select Map” icon.

Synoptic Panel in Power BI

Then browse to the SVG file we just downloaded from Synoptic Designer and you should have your basic version of the hex tile map ready. Feel free to experiment by adding measure values to the Saturation or State values.

Basic hex tile grid map in Power BI

Now as a reward for making it till here, I thought of letting you play with this simple report that I created using this hex map. In this report, you can select any year from 1916 onwards and see the winning party of each state (I only included the data for the Republican and Democratic parties), as well as the nominees of the election. You can also see the number of electoral votes they won along with the popular vote %, which gives some pretty interesting insights. For example, it is interesting to see that George W Bush won the election even though he got fewer popular votes than Al Gore in 2000. Click on the Expand icon to see the report in full screen.

Posted by SQLJason in Power BI, Spatial / Map Reports, 8 comments
NBA style Shot Charts in Power BI

February 11, 2016

Recently, I created an NBA shot chart in Power BI as part of my entry to the Microsoft Power BI Best Report contest, and I got a lot of questions on how I made the visual. So I decided to write a quick post on how I made the shot chart, as well as use this opportunity to present my entry, which got selected as one of the Top 10 finalists in the contest.

Note

My interactive contest entry is embedded in this post (thanks to the new Publish to Web feature in Power BI) and a full screen version of the same report can be obtained from here.

My entry is an analysis of the first 35 games played by Stephen Curry from the Golden State Warriors in NBA. The main feature of the entry is a Shot chart which shows the position from which he attempted his shots and the color denotes whether he made or missed it.

1 Shot Chart

To make a similar shot chart, follow the steps below:-

1) The most important part of any report is getting the data. I had a few sources for my data (www.nbastats.com, www.nbasavant.com, www.datavizdoneright.com) from where I directly got my (X,Y) position data. But if you are really serious, you might want to look at the following blog posts which show how to scrape data directly –

How to create NBA Shot Charts in R

How to create NBA Shot charts in Python

2) Once you get your (X,Y) location data, you can import the results into Power BI and then create a scatter chart from the data.

2 Scatter chart

3) Get a background image for the basketball court (I used one that I found from  www.datavizdoneright.com since it had the logo of Golden State Warriors and looked nice). Now you can import the image and place it behind the scatter chart. Make sure that you send the image to the back, as you need the scatter chart on top so that you can interact with the dots by clicking.

3 Arrange

4) Now the hard part is resizing the scatter chart to the size of the court. I turned off the X and Y axis, and then turned on the X,Y reference lines so that I know where the center needs to be.

4 Resizing court

5) Once you have found the right fit, you can turn off the reference lines also and then add the shot result to differentiate between made and missed attempt. You can also add a chiclet slicer with the opponent images to see the shots by teams as shown below.

4 Final shot chart

That said, there is already a custom visual called the Enhanced Scatter Plot which allows you to put an image behind a scatter plot chart. I couldn’t make my data line up with the image, and hence I had to do it the hard way. You might find it easier to use the Enhanced scatter plot directly. Hope you liked my version of the shot chart!

Posted by SQLJason in Power BI, 11 comments
Custom Indicators in Power BI using Chiclet Slicers

November 6, 2015

First of all, happy Friday! As we get ready to enjoy the weekend, I thought of noting down a quick tip on how to use the totally awesome Chiclet Slicer to display custom indicators in Power BI. If you are hearing about the Chiclet Slicer for the first time, please do check out the official Microsoft blog on this as it is a very useful viz. Those of you who follow my blog may remember that I had already written down a technique to create Indicators in Power BI before. But the main drawback of that approach was that there was no way to color the indicators, and we were also limited by the set of Unicode characters that could be used as indicators. With the advent of the chiclet slicer, we can now dynamically display any image as our indicator, and this post will show you precisely how to do it.

Custom Indicators in Power BI using Chiclet slicers

For this demo, let’s say I want to display a green up arrow or a red down arrow based on whether my measure is positive or negative. For that, follow the steps below:-

1) Open the Power BI Desktop file where you want to add the indicator, and then go to the Data tab. Click on the New Table button.

Calculated table in Power BI

2) It is important to understand that the chiclet slicer, just like the regular slicer, can only display table fields or calculated columns (and not measures). So we have to create a table with a list of all the “states” or possible values. In my case, we can have only 2 states – Up and Down. Use DAX to create a table with 2 rows – Up and Down. Also, add the image url for each of the state (in my case, an image url for the up and down arrows).

Indicator =
UNION (
    ROW ( "Indicator", "Up",
    "ImgURL", "http://www.clipartbest.com/cliparts/nTX/EGB/nTXEGBLTB.png" ),
    ROW ( "Indicator", "Down",
    "ImgURL", "http://www.clker.com/cliparts/D/8/S/c/z/3/red-down-arrow-md.png" )
)

DAX for calculated table in Power BI for Indicator states

Note that we are making use of the calculated table feature in Power BI to create a table with a list of states.

3) Let us say that I have a measure called Metric which shows either positive or negative value. Right now, I am just hardcoding it to -30.

Add metric

4) Now create a new measure which will display 1 for Up if the measure Metric is >=0 or display 1 for Down if the measure Metric is < 0

LinkedMeasure =
SUMX (
    VALUES ( Indicator ),
    IF (
        (
            [Metric] >= 0
                && VALUES ( Indicator[Indicator] ) = "Up"
        )
            || (
                [Metric] < 0
                    && VALUES ( Indicator[Indicator] ) = "Down"
            ),
        1
    )
)

Add measure to display Indicator

5) On the Report tab, add the Indicator column and the Linked Measure to the canvas, and then convert it into a chiclet slicer (make sure you download and import this custom visualization from the Power BI Visuals Gallery before this step). Also add the ImgURL field to the Image field. You can change the Image Split property under the Image section to 100 from the default 50, so that the Image occupies 100% of the space

1 Add Chiclet Slicer

6) Hide the borders and also turn off the headers, so that only the image is visible.

2 Hide Borders

7) Now you can add a textbox beside the chiclet slicer to display the metric. Go ahead and change the values of the metric, and you can see the chiclet slicer automatically update itself with the right indicator.

3 Dynamic indicator

The chiclet slicer is pretty good on its own as a way to slice data, but the ability to display custom images takes it to the next level. You can use it for a lot of tips and tricks, and I hope this post gets you thinking about everything you can do with it. And there goes your weekend, BOOM!

Note

As usual, make sure you look at the date on which this post was published and the version of Power BI. Since Power BI has a rapid release cycle, I would expect some of the features to change. Hence, always check whether a new feature makes it easier to implement scenarios like this one. The version I used is given below.

image

Posted by SQLJason, 6 comments
Performance Problems with IF statement execution in SSAS Tabular

November 4, 2015

Due to the high compression rates and stellar in-memory architecture of SSAS Tabular, most people with smaller models do not experience performance problems (in spite of employing bad data modeling techniques and inefficient DAX measures). However, as the size of your models increases, you will start to see performance issues creep up, especially if you are not paying attention to data modeling and DAX measures. Last week, I gave a presentation at the PASS Summit 2015 on my experience of building a 150 GB Tabular model in SSAS 2012. In it, I shared my story of how some of the DAX measures with IF statements were causing performance issues, and how to work around that issue by rewriting your DAX measures. That same day, Microsoft also announced that they had resolved this issue in SQL 2016, so I thought of demonstrating the issue, the workaround and also the fix in SSAS 2016.

Performance problems with IF statement in SSAS Tabular

Issue in SSAS 2014 (& older versions)

For demonstrating the issue, I will be writing queries against the Adventure Works model in SSAS 2014 and using MDX Studio to show the server timings. Let me start with the below query

WITH MEASURE 'Date'[test] = If ( 1 = 2, [Internet Total Sales], [Reseller Total Sales] )
SELECT NON EMPTY { [MEASURES].[Test] } ON COLUMNS,
NON EMPTY (
{ [Date].[Calendar Year].Children },
{ [Product].[Product ID].Children },
{ Geography.[Country Region Name].Children } ) ON ROWS
FROM [Model]

The above MDX query defines a DAX measure called Test which, depending on the condition, displays either Internet Total Sales or Reseller Total Sales (to keep it simple, I just used a static condition 1=2, but that can be replaced by any dynamic condition). The query results should display the Test measure for Year, Product ID and Country. Now, normally we would expect the Test measure to execute only the branch of the IF statement selected by the condition (here, Reseller Total Sales, since 1=2 is false). But let us execute this in MDX Studio and see what actually happens.

  Storage Engine scans against SSAS 2014 (Original query)

You can see that both branches of the IF statement are being executed, even though we expect only the selected branch to be executed. For smaller models, it might not make a difference, but for large models with expensive measures, this might cause severe performance issues.

Workaround in SSAS 2014 (& older versions)

The workaround for this issue is to rewrite your DAX such that we ensure that the measures get executed only if the condition is true.

WITH MEASURE 'Date'[test] = CALCULATE([Internet Total Sales], FILTER(Currency, 1=2)) + CALCULATE( [Reseller Total Sales], FILTER(Currency, 1<>2))
SELECT NON EMPTY{[MEASURES].[Test]} ON COLUMNS,
NON EMPTY({[Date].[Calendar Year].children}, {[Product].[Product ID].children},{Geography.[Country Region Name].children}) ON ROWS
FROM [Model]

Note that the measure has been rewritten as the sum of two CALCULATE functions. The key is to use a table in the filter clause within the CALCULATE that satisfies the below conditions

  • Is related to the fact table of the measure
  • Is low in cardinality (you can also use a low cardinality column instead of a table)
  • Is not being used in the calculations for the measure/condition. If yes, do some extra testing to make sure the performance is not worse

The table has to be related to the fact table because the CALCULATE() with the false condition must evaluate to zero / BLANK, so that the result of the Test measure is only the one appropriate measure. If the table is not related, you will end up with the sum of both measures. A low cardinality table or column is preferred because this technique sends 2 additional queries to the storage engine, which evaluate the FILTER part for the two measures. If the tables have high cardinality, the FILTER queries will take more time. The table or column should not be used in the measure calculations or the condition because I have seen that, in certain cases, this can actually make the performance worse and still execute both branches. So just make sure you do some extra testing.

Storage Engine scans against SSAS 2014 (Workaround query)

That said, let us look at the scans for the above query. You can see that only the Reseller Sales measure is executed. Also, if you look carefully, there are 2 extra scans which basically check the filter condition for Currency. In large models, these scans for low cardinality dimensions will be almost negligible, and the time for these extra scans will be much less than the time taken to execute the other measure. In this case, the Adventure Works model is just 18 MB, so you won’t see much of a difference.

New Optimization in SSAS 2016

SSAS 2016 CTP2.3 (and newer versions) has a new optimization for this issue: strict evaluation of IF / SWITCH statements. A branch whose condition is false will no longer result in storage engine queries. Previously, branches were eagerly evaluated and their results discarded later on. To prove this, let us execute the original query against SSAS 2016 and see the results.

Storage Engine scans against SSAS 2016

Now we can see only the relevant measure is being executed. Also, it is faster compared to SSAS 2014 versions of both the original query as well as the workaround. I hope this article will help people who are not on SSAS 2016 to optimize IF statements, and also help understand what the new optimization in SSAS 2016 – Strict evaluation of IF / SWITCH statements actually means. There are also a bunch of new features and optimizations in SSAS 2016 and you should check them out!

Posted by SQLJason, 2 comments
Quick Intro to Power BI Visuals Gallery

October 19, 2015

I don’t usually look forward to Mondays (especially after spending a very exhausting though rewarding weekend organizing SQL Saturday Charlotte), but then today was different. Amir Netz had already spoiled my weekend by putting out a teaser on Power BI and I was actually waiting for Monday to come so that I could remind him of his promise.

Twitter conversation - Amir Netz

And well, he didn’t disappoint :) This is such great news for the world of dataviz. Let me quote his announcement below:

I’ll admit it. I am very excited… So deep breath. Here is exactly what we are introducing today:

  1. Custom visuals in the Power BI service and Desktop: The ability to upload and incorporate a custom visual, whether a broadly useful visual from our community gallery or a completely bespoke visual tailored for the needs of a single user, into the report and then share it with others. This is available in the Power BI service today, and in the Desktop next week.
  2. The Power BI visuals gallery: A community site (visuals.powerbi.com) that allows creators to upload new Power BI visuals and for users to browse, select and download those visuals.
  3. Power BI developer tools: With our developer tools every web developer can code, test and package new visuals directly in the Power BI service to be loaded to the gallery.

You can read more on this in the official blog post here. Let me use this moment to give a quick intro to the Power BI visuals gallery and how you can use some of the community examples to enhance your visualizations.

Quick introduction to Power BI Visuals Gallery

First of all, note that this functionality is only available in the Power BI service as of now and will be available in Power BI Desktop next week. Also, custom visuals cannot be pinned to a dashboard as of now, but that feature should also be coming soon. That said, follow the steps below:

1) Head over to https://app.powerbi.com/visuals and feel free to choose any of the awesome visualizations created by our community. For now, I am going to choose Hexbin Scatterplot (which was created by my colleague David Eldersveld and won the third prize in the Power BI custom viz contest; you might also want to check out his thoughts on Power BI Custom Visualization here) and KPI Indicator with status.

Choosing custom visuals from Power BI Visuals gallery

2) For each of the selected visuals, click on the visual icon and you will be presented with the Download Visual window. Click on the Download Visual button.

Download Power BI custom visual

Read the terms of use in the next screen and then press the I agree button.

Agree to terms and conditions

3) This will begin the download of your pbiviz files (Power BI custom visualization files). Once the download is over, sign in to the Power BI service and then open a new report. Click on the Ellipsis symbol (…) to import the two pbiviz files that you downloaded.

Import Power BI custom visual file (pbiviz) in Power BI Service

4) Now you can use those custom visualizations just like the existing ones. For example, I can create a hexbin scatter plot chart by selecting Sales Amount, Sales Quantity and Store name. Note how I change the default visualization to a regular scatterplot and then to a hexbin scatterplot. Also look at the benefits that a hexbin scatterplot gives over a regular scatterplot: you can easily see where the concentration is higher, and you also have rug marks on your axis to show where the dots are. Feel free to explore the chart; you can watch how it works from the video in this link.

Hexbin Scatterplot in PowerBI service

5) I added a regular bar chart for Sales Amount by Year on the bottom left to show the interactive features of the new charts. Then, I went ahead and added the Sales Amount and Sales Amount LY by Calendar Month and chose the KPI Indicator visual. Note how smoothly all the charts work with each other!

Adding KPI Status Indicator to Power BI service

6) Feel free to explore further. For example, I also added a slicer for ClassName to check out the interactivity.

testing out the interactivity for custom visuals in Power BI Service

7) You can save this report and share it with others now. When you share a report that contains a custom visualization, you may be greeted with a warning that the report contains custom visuals. Click on the Enable custom visuals button to see the report.

Enable custom visuals warning in Power BI service

You can see how easy it was for someone like me, who doesn’t know how to code, to incorporate these visualizations in my report. And for those who know how to code, the possibilities are endless. As the community grows, we are going to get more and more of these awesome visualizations and this will greatly impact the lives of people in the data analytics industry. As for me, I can’t wait to see what awesome stuff is going to come from the community and also what other surprises the Power BI team has in store for us (it would definitely be tough to top this one though!).

Posted by SQLJason, 0 comments
My Thoughts on Cross Filtering in Power BI

September 30, 2015

It’s amazing how much of an influence your upbringing can have on you and your preferences. I was the youngest in my family for almost 10 years (till my brother came along) and, needless to say, growing up, I was very much pampered by my mom. However, my dad was more of a proponent of what I call tough love, and back in those days, it was still legal to spank your child to set him straight. (Sometimes, I even feel like my father was the one who coined the proverb: Spare the rod and spoil the child.) Looking back now, I feel those were the moments that really formed my character and helped me reach where I am, and I am really thankful to God almighty for giving me the perfect family. I know some of you might be nodding your head in agreement with me reading this post, while a lot of you might be getting really angry at what you are reading. This is perfectly understandable, and it’s ok, because we live in a free country, right? You might also be wondering about the reason for such a lengthy introduction to my blog. Well, the reason is that today’s post is kind of tough love, as I am criticizing (constructively) the way Cross Filtering works in Power BI, and also showing how I think cross filtering should work in an efficient BI tool, using Tableau as an example.

My thoughts on cross filtering in Power BI

Action Item – Please do it

For those who don’t have time to read this post, I would request two things:

1) Please vote for this issue (link given below) and tell others to also do it https://support.powerbi.com/forums/265200-power-bi/suggestions/6709520-drill-down-should-drill-or-cross-filter-other-visu

2) If you are reading this before 12 PM PST Oct 2, 2015, please take the below survey which will give the feedback directly to the Microsoft Power BI research team (and if you are in the US, you might also win a $50 Amazon gift card). Make sure that you mention “making cross-filtering more intuitive” and “ability to hold selections on more than one chart for cross-filtering” as two points for this question in the survey: “What would make Power BI Desktop a better experience for you?”

Please please please do it :)

How Cross Filtering in Power BI works

Before I start this post, I have to say that I am one of the biggest fans of Power BI, and I have never been as optimistic about a Microsoft BI product as I am right now. (and it’s not just me being a fan boy, the most recent Forrester Wave report shows Microsoft leading the BI pack)

Forrester Wave BI report

The Power BI team is also one of the most responsive teams, and you can regularly see the product managers as well as the product team members interacting with the general community on Twitter and the Power BI community. But they don’t have an infinite number of resources and time, and hence will be making changes to the product based on priority, and votes are one way we can help the team prioritize the feature requests. The more the votes, the higher the priority and the faster we can get this feature implemented by the product team, and this is where I really need all of you to pitch in.

Let’s start by taking a look at how cross filtering in Power BI works today. Let’s say I have 3 charts: a bar chart for sales by business lines, a bar chart for sales by country and a bubble chart which shows some KPIs by countries.

Power BI dashboard

You can click on any chart and see the rest of the charts refresh for it. For example, I can click on the Nutrition business line and see the other charts refresh for it. However, there are a couple of things I cannot do. For example:

1) I want to see the bubble chart for business line Nutrition and country Japan. Normally you would expect to click on Nutrition in the first chart, and Japan in the bottom chart, to see the cross-filtered bubble chart. However, Power BI currently does not allow us to hold selections on multiple charts. The workaround is to add slicers or filters for the needed fields, but that breaks the flow when I am trying to find insights from my dashboard.

1 Power BI not able to hold multiple selections

2) I want to see where a particular country is in the bubble chart along with the rest of the countries, so I can assess its performance with respect to the other countries. For example, when I click on Japan in the bottom bar chart, I want Japan to be highlighted in the bubble chart. Right now, you can see from the above image that when I click on a particular country, only that country appears in the bubble chart because it is filtered. What we need is an option to specify whether we need a chart to be filtered or, in this case, highlighted.

3) Cross-filtering during drill downs is the most counter-intuitive feature for me. I had raised this issue in 2013 for Power View and, despite getting a lot of votes for the Connect issue, the issue is still active. For example, let’s say the bottom chart shows the sales by Region, which can be drilled down into countries. When I drill down into the Greater China region (which only has 3 countries), I expect the top chart to cross filter for the Greater China region, which it doesn’t. But if I manually select all the countries, it will cross-filter the top chart appropriately. So in a way, I can take a screenshot of the exact same report showing 2 different data points, which would be very confusing for end users. What we need is for the cross-filtering to work intuitively when we are drilling down.

2 Cross-filtering during Drill down is counter-intuitive

And it looks like I am not the only one, as I can see 4 comments on this in the last month.

Comment from Microsoft Connect

Comment from Power BI Community

How I feel Cross Filtering should work

I didn’t want to sound pompous by saying this is how it should work; everyone has their own ideas and most of the time, no one is wrong. But this is why it is important for a BI tool to give options to the end user, so that they can manage the options and choose to use it the way they like. That said, let’s take a look at another popular BI tool, Tableau, and see how it handles the above scenarios.

1) Note how the entire report cross-filters as I keep on holding multiple selections.

3 Tableau cross filtering

This experience is very important, as I can see what the top countries are for each business line, and then I can also choose a particular country and see the information for the selected country and business line to analyze in detail.

2) Also note how the country in the bubble chart gets highlighted, when I click on a country in the bar chart. As the number of bubbles increases, it is difficult to see where the selected bubble is unless we have a highlighting feature. The reason why highlighting is important is because it will help us identify patterns by comparing with the rest of the categories.

3) As far as the drill down is concerned, it is not that straightforward in Tableau, but we have options that will help us achieve the end result, as I have shown below. You can see that when I click on a particular product category, it drills into the subcategory and all the other charts (the bar chart on the right and the table under it) are also filtered appropriately.

5 Tableau drill down

In reality, they are 2 different dashboard sheets, and clicking on the first sheet takes us to the second sheet with the drill down parameters intact (just like with SSRS). But an end user will not notice this difference and still gets the functionality.

Conclusion

It’s amazing how much ground has been covered in Power BI ever since the release, and there have been some really great decisions (like the ability to add custom visualizations; you just have to look at some of the contest entries to see some really great dataviz) as well as features (44 new features in the last monthly release!!!). For all we know, the team might have already made this change in their next monthly release, or maybe it is still not in the priority list because not enough customers want this. Either way, I just wanted to put this out in case you also think the same way, and if yes, make your voice heard in the Survey. Now would be a good time to scroll up to the Action Item part! :)

Note

As I mentioned before, Power BI versions change rapidly and there are a lot of new features coming in monthly. So it is important to check your version and see if there are any changes. The version at the time of writing this blog is given below:

image

Posted by SQLJason, 5 comments
My Thoughts on Calculated Tables in Power BI

September 24, 2015

Yesterday was a terrific day for all Microsoft Power BI fans. Microsoft released updates for Power BI Service, Power BI Mobile and Power BI Desktop (with an unbelievable 44 new features), which basically means that whether you are a developer, BI professional or an end user, all of you got something new to play with in this release. The blogs give a lot of details on what those new features are, so I won’t be going over them. But I wanted to take a moment to pen down my thoughts on a new modeling feature within this release: Calculated Tables.

Calculated tables in Power BI

Chris Webb has already posted his thoughts on Calculated Tables in Power BI and I am pretty sure Marco Russo / Alberto Ferrari will post some on the same pretty soon (if not, this is an open request from my side for Marco & Alberto to post something on the topic, pretty please :)) [Update 9/28/2015: Read Transition Matrix using Calculated Tables]. As usual, a combination of posts from these folks is going to be the best resource for any topic on the Modeling side, and I learn most of my stuff from them. So you would be thinking: what exactly am I trying to achieve in this post? Well, I would like to add my 2 cents on the same and try to see if the community in general agrees with what I think and if not, to learn from the comments and the interaction this might generate.

I) How to build Calculated Tables in Power BI

Before I start on my thoughts on calculated tables, it might be a good idea to quickly see how we can create calculated tables.

1) First make sure that the version of Power BI is equal to or higher than 2.27.4163.351. (I am making a fair assumption that this feature will also be enabled in all higher versions released in the future.) If not, download it from here.

Power BI version

2) Now open any of your existing models in Power BI (or get some new data), and after that, click on the Data tab. You should be able to see the New Table icon in the Modeling tab on the top.

New table - calculated table

3) Click on the New Table icon, and then enter any DAX expression that returns a table, in the format

    TableName = DAX Expression that returns a table

Once you do that, you should be able to see the resultant columns in the new table.

Calculated table

II) When is the data in a Calculated Table processed

The official blog says quite a few things about this:

    • A Calculated Table is like a Calculated Column.
    • It is calculated from other tables and columns already in the model and takes up space in the model just like a calculated column.
    • Calculated Tables are re-calculated when the model is re-processed.

So based on this information, I am going to go a step further and assume that the data in a calculated table is processed during the ProcessRecalc phase of processing. Also, this means that every time any of the source tables changes (like a new calculated column or new data), the data in the calculated table will also change. To prove this, let us try a simple experiment:

1) Make a calculated table called Test which will be the same as the Date table (which currently has just the Year column).

make same calculated table

Note that measures from the source table are not brought along to the calculated table, which is as expected.

2) Now go to the Date table (which is our source table in this case) and then add a new calculated column called TestColumn with 1 as the value.

column replicated in calculated table

Note that when we added a calculated column in the source table, the column was also replicated in the calculated table with the same name. The only difference is that the source table shows an icon for the calculated column. This shows that the ProcessRecalc that happens in the source table when a new calculated column is made also recalculates the calculated table.
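For reference, the two expressions in this experiment can be sketched as follows. The exact formulas live in the screenshots, so treat these as an approximation that assumes the source table is named Date.

```dax
-- Step 1: calculated table that is simply a copy of the Date table
Test = 'Date'

-- Step 2: calculated column added to the Date table afterwards
TestColumn = 1
```

Because Test is defined directly in terms of 'Date', any column that exists on 'Date' at recalculation time shows up in Test as well.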

III) My thoughts on Calculated Tables

Based on my understanding so far, there are times when I think I should use calculated tables and times when I should not use calculated tables. So here it goes –

a) When NOT to use calculated tables

If you have a way of replicating the calculated table in some form of ETL or source query (even a SQL View), you should not use a calculated table. Why? A couple of reasons:

  • If done from ETL / source query, the engine will see the result as a regular table, which means parallel processing of the tables can be done (unlike now, where the ProcessData phase of the source tables has to finish first before the calculated tables can be processed). So calculated tables could lead to slower data processing time.
  • A ProcessRecalc happens every time you do a Process Full, and adding more calculated tables (as well as calculated columns) unnecessarily will increase the processing time. Also, during development of very large models, you might have to wait for a long time after each calculation is made for the data to appear, since all dependent tables will also have to be recalculated (unless you turn off the automatic calculation mode).
  • This one is more an assumption and further research will be needed to validate this, but I am putting it forward anyways. Just like a calculated column is not that optimized for data storage compared to a regular column, I suspect that a calculated table will also not be optimized for storage compared to a regular table. If this is true, this might affect query performance.

b) When to use calculated tables

There are a lot of scenarios where you would want to use calculated tables and I have listed a few of them below:

  • While debugging complex DAX expressions, it might be easier for us to store the intermediate results in a calculated table and see whether the expressions are behaving as expected.
  • In a lot of self-service BI scenarios or prototype scenarios, it is more important to get the results out there faster and hence, it might be difficult or not worth the effort to fiddle with the source queries. Calculated tables can be a quick and easy method to get the desired table structures.
  • A kind of scenario that I see very often during prototyping / PoCs is when 2 facts in an Excel file are given for making reports. As seasoned users of Power Pivot / Power BI, we know that we will have to create a proper data model with the dimensions. Now, I might need to create a dimension which gets me the unique values from both the tables. For example, in the case below, I need the Country dimension to get values from both the tables (USA, Canada, Mexico). It might be easier to write an expression for a calculated table like the one shown below:
    Country = DISTINCT(UNION(DISTINCT(Table1[Country]), DISTINCT(Table2[Country]))) 

2 fact tables with no dimension data

  • There are times (like in some role playing dimensions) where you will need to replicate the columns / calculated columns (or even delete them) in another table also, even if the columns are built at a later phase in the project. Calculated tables are a perfect match for that, as you will only need to make the changes in one table, and the changes will flow down to the calculated table during the ProcessRecalc phase (remember our previous example where we created a TestColumn in the Date table, and the change was reflected in our calculated table also).
  • Calculated tables can help in speeding up DAX expressions. This is very similar to the technique of using aggregate tables in a SQL database to speed up the calculations in SQL.
  • Quick and easy way to create calendar tables (like Chris showed in his blog)
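To make the last two bullets a bit more concrete, here is a minimal sketch of an aggregate table and a calendar table written as calculated tables. The Sales table, its SalesAmount column, its relationship to 'Date' and both new table names are hypothetical, and the second expression assumes the CALENDAR function is available in your version.

```dax
-- Aggregate table: one row per year with a pre-computed total
-- (assumes a hypothetical Sales table related to 'Date')
SalesByYear =
ADDCOLUMNS (
    SUMMARIZE ( Sales, 'Date'[Year] ),
    "Total Sales", CALCULATE ( SUM ( Sales[SalesAmount] ) )
)

-- Calendar table: one row per day of 2015, in a single Date column
Calendar2015 =
CALENDAR ( DATE ( 2015, 1, 1 ), DATE ( 2015, 12, 31 ) )
```

Measures that only need yearly totals could then read from SalesByYear instead of scanning the detailed fact table, at the cost of the extra processing time discussed above.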

This is still a very early stage as far as calculated tables are concerned, and I am pretty sure we are going to see some very innovative uses as well as benefits of calculated tables in the days to come. I might also learn that some of my assumptions are wrong, and if I do, I will come back and update this post as best I can. Meanwhile, feel free to comment with your thoughts / concerns / feedback.

Update – 9/28/2015

Transition Matrix using Calculated Tables – Alberto Ferrari
Use Calculated Table to Figure Out Monthly Subscriber Churn – Jeffrey Wang

Posted by SQLJason, 0 comments