Uncategorized

Power BI for Office 365

Power BI for Office 365

Great news!!! Today, at the Worldwide Partner Conference, Microsoft announced a new offering – Power BI for Office 365. Quoted from Microsoft – “Power BI for Office 365 is a cloud-based business intelligence (BI) solution that enables our customers to easily gain insights from their data, working within Excel to analyze and visualize the data in a self-service way. It works with Office 365 to help customers share insights, find answers and stay connected to their data from their favorite mobile devices.” Announcing Power BI for Office 365 For those of you who have the time, I would recommend going through the links that I have collected below. And for those who don’t have the time, let me do a very quick summary:- 1) Discover, analyze, and visualize with Power BI for Office 365, which includes Power Query (formerly known as Data Explorer), Power Map (formerly known as GeoFlow), Power Pivot (formerly known as PowerPivot – notice the space between the words) and Power View (for a change, I like that the names are standardized with a space after Power). 2) Create Power BI Sites to share live interactive reports. Power BI Sites are dedicated collaborative BI workspaces in Office 365 for sharing data and insights with colleagues. 3) Ask questions and get back instant answers with the new Q&A feature that uses natural language query technology to provide you with immediate answers in the form of interactive charts and graphs based on the questions you type into the speech bubble. 4) Browse Excel and Power View reports with HTML 5. Stay connected with the touch-optimized Power BI mobile app. BI users can access and receive live updates on their reports through their browser with HTML5 or through a mobile application designed for their tablet or touch-enabled device, either Microsoft Power BI for Windows or Microsoft Power BI for iPad. 5) Refresh your data on demand or define a schedule. The Data Management Gateway allows your IT department to enable data refresh from Office 365 to your favourite on-premises data sources. 6) No announcement on pricing or preview date (except for “later this summer”). The natural language feature looks pretty impressing even though I doubt whether the experience is going to be as seamless as shown in the demo. Also, the bubble chart shown in the demo (also shown in the image below) looked extremely jazzy but may end up just being eye-candy in real life. I would love to finally see the Mobile (and ofcourse iPad) solution come through with Power BI, although I am a tad bit disappointed that this feature is not there for the SharePoint version. Bubble chart from Amir Netz keynote demo - Power BI There are a couple of official posts on this offering and I have collated them below:- 1) What powers Power BI in Office 365?
2) Introducing Power BI for Office 365
3) Announcing Power BI for Office 365
4) Register for Power BI Preview Watch the Day 1 Vision Keynote from WPC 2013 on demand here! You can also read some of the initial reactions to Power BI below:- 1) Some Thoughts About Power BI – Chris Webb
2) Microsoft Office 365 Cloud Power BI – Jen Underwood
3) Microsoft Announces Power BI for Office 365 – Andrew Brust
4) Power Business Intelligence for Office365 – Jen Stirrup Exciting times indeed for Microsoft BI and Self Service BI in general. I can’t wait to get my hands on the preview.

Posted by SQLJason, 2 comments
Pie Charts on SSRS Map Reports

Pie Charts on SSRS Map Reports

Ok, I think I am getting a few puzzled looks now. All of you must be wondering why I am talking about pie charts (isn’t pie chart the untouchable outcast in the family of visualization techniques?). Also, you must be thinking that there is no possible way you can make pie charts in SSRS map reports (unless I am referring to Power View instead of SSRS). Well, to answer your questions, I am not recommending pie charts but ever since Power View gave the feature for displaying pie charts in map reports (read section III in Creating Maps in Excel 2013 using Power View), a lot of people have been asking whether it is possible to replicate the same in SSRS. You do have the ability to display bubble charts in SSRS map reports but there is no default way of achieving pie charts in SSRS map reports. That is when yours truly decided to take a shot at it and came up with this workaround (oh yes, you heard it right – there is a workaround!)

Pie charts on SSRS Map reports

To demonstrate this workaround, I am using the data from 2000 US Presidential Election (which I got using the online search feature in Data Explorer). Follow the steps below:-

1) Create a new report and then create a datasource pointing to your SQL database. After that, create a new dataset query like shown below

SELECT     State, Gore, Bush, Others, TotalVotes FROM EData

EData is the table which contains the election data that I pulled up before. This table contains the State name, the percentage of votes for Gore, percentage of votes for Bush, percentage of vote for all Others and finally the total number of votes.

Create dataset

2) The key to getting the pie charts on the SSRS map reports is to make use of the Pie Charts feature in Google chart. You can get an image of a pie chart as long as you pass in the required parameters in an url format. The basic syntax of the url we are going to use is

http:chart.googleapis.com/chart
?chs=300×225 — size of the image
&cht=p — type of chart, in this case, pie chart
&chd=t:20,30,50 — Data to be used in pie chart
&chco=FF0000|00FF00|0000FF — color to be used in pie chart

Google chart

To get a pie chart for each row / state, we need to construct the url for each row in the table. This can be done by the following query

SELECT State,
Gore,
Bush,
Others,
TotalVotes,
http://chart.googleapis.com/chart?chs=300×300&cht=p&chd=t:’ + CAST (Gore AS VARCHAR (20)) + ‘,’ + CAST (Bush AS VARCHAR (20)) + ‘,’ + CAST (Others AS VARCHAR (20)) + ‘&chco=06A2CB|DD1E2F|EBB035’ AS MarkerUrl
FROM   EData

3) Create a map report of US State from the map gallery and then choose the map visualization as Color Analytical Map. Choose the analytical dataset as Edata and then map the statename in the spatial dataset with the state field in the analytical dataset.

map state name

Click on Next and then finish to create a basic map report.

4) Even though it is not part of this requirement, I decided to color the states as blue or red based on whether Gore or Bush had got more votes respectively. For this, right click on the Polygon in Map Layers and then click on Polygon Color Rule.

polygon color rule

Write a simple expression to display blue or red based on who’s got more votes like shown below.

Color polygon rule

Now your map report should look like this when previewed

polygon color rule result

5) Right click on the Polygon layer and then enable the Show Center Points option.

show center points

6) Now select the Center Point Marker Rule.

center point marker rule

Now choose the Marker Type as Image, image source as External and the image as the Marker Url field as shown below.

change marker type rules

7) Also, we need to change the size of the pie chart based on Total votes field. For this, change the Center Point Size Rule.

Center Point size rule

Change the size rule option as shown below.

change size rule

Now we should get the basic version of the end result as shown below.

end reult - alpha version

8) After deleting the legends as well as modifying the colors so that my eyes don’t get blinded, this is what I came up with.

end result - final version

So aren’t you amazed at the flexibility SSRS gives you? As usual, comment and let me know what your thoughts are on this technique. Meanwhile, feel free to download the completed SSRS 2012 report that I made from here.

Posted by SQLJason, 9 comments
Split a Delimited Row into Multiple Rows using DAX Queries

Split a Delimited Row into Multiple Rows using DAX Queries

Recently, I got a question from one of my readers on whether there is a way to split a single delimited row into multiple rows using DAX. I had kind of demonstrated the opposite to Group Multiple Rows to Single Delimited Row in PowerPivot here and this was another interesting challenge. Read on for the solution.

Split a Delimited Row into Multiple Rows using DAX Queries

Business Requirement

Suppose there is a table which has two columns – StudentName and Subject.

Source Table

The end result should be

Required end result

Solution

We will be using DAX queries to generate a table that will have the desired output. For this demonstration, I will be creating a PowerPivot model and running DAX queries in DAX Studio on top of the Excel PowerPivot model.

1) Create a linked table for the Student.

Student table

2) Create a table called DummyTbl which is just a list of running numbers from 1 till n. The value of n should be equal to the max number of subjects that a student can have. In this demo, I am assuming that a student can have a max of 10 subjects, so my Dummy table will consist of numbers 1 – 10.

Dummy table

3) Now let us create the DAX queries in DAX Studio. First, we will create a computed column called SubCnt which will give the number of subjects for each row.

EVALUATE
SUMMARIZE (
    Student,
    [StudentName],
    [Subject],
“SubCnt”,
    1 + Len ( [Subject] ) – Len ( Substitute ( [Subject], “,”, “” ) )
  )

find number of subjects

4) The only way we can increase the number of rows of a table is by using the function CrossJoin. So let us CrossJoin the Student table with the DummyTbl.

EVALUATE
Crossjoin (
SUMMARIZE (
      Student,
      [StudentName],
      [Subject],
“SubCnt”,
      1 + Len ( [Subject] ) – Len ( Substitute ( [Subject], “,”, “” ) )
    ),
    DummyTbl
  )

crossjoined tables

5) Now filter the resultant table where SubCnt is less than or equal to Dummy column. Now we have got the exact number of rows needed for the end result.

EVALUATE
Filter (
Crossjoin (
SUMMARIZE (
        Student,
[StudentName],
[Subject],
“SubCnt”,
1 + Len ( [Subject] ) – Len ( Substitute ( [Subject], “,”, “” ) )
      ),
      DummyTbl
    ),
DummyTbl[Dummy] <= [SubCnt]
  )

filtered table

6) The only thing left is to split the delimited subjects to the single subject for each row. And the only way to split delimited values is to use the PATHITEM function. To use the PATHITEM function, we should substitute the commas (,) with the pipe (|) symbol.

EVALUATE
ADDCOLUMNS (
FILTER (
CROSSJOIN (
SUMMARIZE (
          Student,
          [StudentName],
          [Subject],
“SubCnt”,
          1 + Len ( [Subject] ) – Len ( Substitute ( [Subject], “,”, “” ) )
        ),
        DummyTbl
      ),
      DummyTbl[Dummy] <= [SubCnt]
    ),
“SubName”,
PATHITEM (
SUBSTITUTE ( Student[Subject], “,”, “|” ),
      DummyTbl[Dummy]
    )
  )

splitting delimitted subjects to individual subjects

7) Now all we need to do is to select the two required columns.

EVALUATE
SUMMARIZE (
ADDCOLUMNS (
FILTER (
CROSSJOIN (
SUMMARIZE (
            Student,
            [StudentName],
            [Subject],
“SubCnt”,
            1 + Len ( [Subject] ) – Len ( Substitute ( [Subject], “,”, “” ) )
          ),
          DummyTbl
        ),
        DummyTbl[Dummy] <= [SubCnt]
      ),
“SubName”,
pathitem (
Substitute ( Student[Subject], “,”, “|” ),
        DummyTbl[Dummy]
      )
    ),
    [StudentName],
    [SubName]
  )
ORDER BY [StudentName]

end result

Hopefully this post will help you when you encounter a similar situation!

Posted by SQLJason, 6 comments
A Sample SSRS Dashboard and some Tips & Tricks

A Sample SSRS Dashboard and some Tips & Tricks

The other day, I was browsing for some sample SSRS dashboards and one of the top images that came up made me cringe. You might wonder what could be so bad in a dashboard as to make me cringe. Well, apart from the factor that the dashboard was openly flouting all the best practices for visualizations and that it would have misled countless people in virtue of being one of the top five images for the search term – ‘SSRS Dashboard’, it was made by ME. To be precise, this dashboard was made by me around 3.5 years back, a time when I believed more in ‘eye-candy’ rather than effective visualizations. That was when I felt that I owed a proper SSRS dashboard to the community.

SSRS Sample dashboard and some tips and tricks

First of all, let me start by saying that this dashboard was inspired (a polite way of saying – a complete rip-off Smile) from this Dundas Dashboard example. Personally, I felt that this was a very good dashboard example and I decided to reproduce it in SSRS. The result is given below:-

SSRS Sample dashboard

Let me tell you some of the things that I like in this dashboard:-

1) The colours are carefully chosen. Note how the dashboard is not a kaleidoscopic combination of different colours like it’s predecessor. Light tones of grey are used to depict the axis as well as the gridlines, while the charts are more darker which gives them more attention.Also, in spite of having 4X times more information than the previous dashboard, the new one doesn’t look crowded (thanks to the right choice of colours)

2) The dashboard focuses on the three main KPIs but also has an executive summary on the right. I also like the way each line represents a story for that KPI.

3) I particularly like the KPI boxes on the left. Each box has a current value (which is highlighted by the size), difference from YTD (which is the small box on the top right), a sparkline on the bottom as well as conditional formatting to show whether the KPI is above or below the target. I think it gives a sexy eye-candy sort of feel to the dashboard without bending the visualization best practices. Also, I prefer the faded red and green colours to the normal traffic light red and green colours that we see in usual dashboards.

4) The labels are all formatted appropriately (for eg, 3M instead of 3,000,000). This ensures that we don’t waste a lot of space for labels. With that said, let me go ahead and show you some of the tips and tricks I employed for making this dashboard:-

I) Alternating Background for the sparkline

sparkline

You can see the above sparkline (which is for the four quarters) has an alternating background for the four quarters. By default, you can have alternating colours in SSRS chart background if you set the background color of the chart area to a color (say light grey) and the InterlacedColor property of the X axis to another color (say dark grey). Now the problem is that the line charts will start at the meeting point of the two colours (like shown below) instead of at the centre.

Normal alternating background for sparklines

One of the ways you can get this done is by adding 4 striplines to the X-axis. For the first and third stripline, ensure that the BackgroundColor is the same (lets say light grey) and the second and fourth should have the same BackgroundColor (which is dark grey). All the striplines should have a Stripwidth of 1. The IntervalOffset property should be 0.5, 1.5, 2.5, 3.5 respectively for the four striplines.

Stripline property

Click on the Y axis and press the delete button to remove it. For the X axis, click on it and set the LineStyle property to None, Major TickMarks–>LineStyle to None and HideLabels to True. This will give us the required result below. For use in the dashboard, remove the chart border also.

required alternate background for sparklines

II) Pretty KPI boxes

KPI boxes

The pretty KPI boxes are just a combination of textboxes and the sparkline we made above. This is how it actually looks in design mode

KPI boxes in design mode

Right now, I have hardcoded most of the stuff. But you can change it as required. For eg, to get the conditional formatting, you can write a simple expression in the Color property of the textbox. The only other thing is to have different font sizes / font styles for the $ and numeric value. To demonstrate that, I pull in a textbox and type $ in it. After that, I drag and drop a field from my dataset (SAC) to the textbox.

ssrs textbox

Now I can individually select $ and give it a different font style / font size and also do the same for SAC (whose expression I modified to =Sum(Fields!SAC.Value, “G1_B1”) from =Fields!SAC.Value)

ssrs textbox - selected text properties

III) Formatting the labels

To format your labels in thousandss, you can use the format – 0,’K’

Label format - thousands

Similarly, if you want to format in millions, you can use – 0,,’M’. Note that you are just increasing a comma.

Label format - millions

IV) Correlated column charts

correlated column charts in SSRS

For this, add a normal column chart with 3 measures.

column chart

Now change the PointWidth property of the second measure to 0.6 and third measure to 0.3.

PointWidth property

Now right click on the Chart Area, and click on Properties. Check the ‘Enable 3D option and set the rotation, inclination and wall thickness to 0.

3D option

After that, set the BackgroundColor property of the chart area to No Color. Now you should have the required correlated column chart.

final correlated chart

Well, there are a few more tips in the dashboard that I made (even though I have hardcoded a LOT of stuff, since this was just a demo) and if you want, you can download the rdl file of this report from here. Hopefully, I have redeemed myself by creating a better dashboard than the original one! Smile

Posted by SQLJason, 44 comments
And I’m Speaking at the PASS Summit 2013!

And I’m Speaking at the PASS Summit 2013!

10 days back, I was attending the speaker dinner for SQL Saturday Atlanta when I got a mail with the subject line – PASS Summit 2013 – Call for Speakers Results. Speaking at the PASS Summit was one of my goals for this year and I had submitted three abstracts. Holding my breath, I opened up my mail and guess what?  SQLJason - I'm speaking at the PASS Summit 2013 Yes, I am speaking at the PASS Summit 2013!!! And to make it doubly sweeter, I got two of my sessions accepted. Being a business intelligence professional, it was hard for me to ignore the parallels between SQL Saturday Atlanta and the PASS Summit
I was speaking at SQL Saturday Atlanta for the first time and I will be speaking at the PASS Summit also for the first time
I submitted 3 session abstracts for SQL Saturday Atlanta as well as the PASS Summit 2013.
Both SQL Saturday Atlanta and PASS Summit selected two abstracts, the only ones from over 15+ events that I have done so far. PASS Summit is the world’s largest, most-focused, and most-intensive conference for Microsoft SQL Server and BI professionals. Organized by and for SQL Server and BI users, PASS Summit delivers the most technical sessions, the largest number of attendees, the best networking, and the highest-rated sessions and speakers of any SQL Server event in the world. For a SQL Server speaker, the PASS Summit is the mother of all events and every year, top speakers from all over the world submit abstracts for this annual event. To cut it short, I am honoured to be among the chosen ones for this year (although I am a little sad that some of my favourite speakers are not there in the list) and am extremely thankful to the abstract review team as well as PASS for choosing my sessions. Now that the sessions have been made public, feel free to visit this link to see the amazing line-up of speakers and sessions for the event. Also, if you haven’t registered for this event, it is still not late – Register for PASS Summit 2013. My session abstracts are given below

Geospatial Analytics Using Microsoft BI

Speaker(s)Jason Thomas Duration: 75 minutes Track: BI Information Delivery

In this age of Big Data, location matters more than many people imagine. Almost every set of data has some geographic information tied to it, and it’s all the more pertinent to understand and unravel the patterns and trends behind that data. This session will start by explaining why geospatial analytics is important in today’s world and then continue by exploring how we can share, analyze, and visualize data using the comprehensive Microsoft BI toolset, including Power View, GeoFlow, SQL Server Reporting Services, and more. The session will feature practical demos for each of these tools, during which we’ll work on some of the publicly available data sets to uncover some interesting facts.
Visualizing John Snow’s Cholera Map Using Microsoft BI

Speaker(s)Jason Thomas Duration: 75 minutes Track: BI Information Delivery

Back in 1854, London was developing into one of the world's first modern cities, although without the essential sanitary infrastructures. This along with the increase in population made it the perfect breeding ground for many diseases, such as cholera. Around the same time, physician John Snow explored the correlation between diseases and contaminated water supplies and came up with one of the most famous maps in history. Relive the adventure of John Snow as we follow his journey in this session using Microsoft BI tools. As we recreate his map, we’ll also talk about the  steps and best practices for exploring geospatial data.
Posted by SQLJason, 7 comments
Ehh, What’s up, SqlJason?

Ehh, What’s up, SqlJason?

It’s been more than a month since the last post and you might be wondering – what’s up with SqlJason? Well, let me tell you that the last one month has been really busy and I thought of filling you guys in with some of the juicy stuff that’s been happening! Ehh, What’s up, SqlJason? 1) Speaking Events
Last year April was when I started my first speaking assignment in the USA and it was great to have so many different opportunities to present last year. This year April also turned out to be the starting month for my presentations, and I can’t believe that I already presented at five different places this month. Speaking events in April The PASS BA Conference was the first stop, and this was my first experiencing presenting in a big stage. Looks like it went good from the session evaluations (got a 4.6, my session ranks in the top 15 out of the 60+ sessions in the conference) Session Evals sorted by session ratings The attendance was also pretty decent, managed to pull in 98 people which is the highest I have ever got in any of my sessions. Session Evals sorted by attendance I had lots of fun with my #sqlfamily and also got to meet a lot of new people. Overall, a very fun experience and I am definitely going to attend PASS BA Conference next year. Next stop was SQL Saturday Chicago and this was just the next day after PASS BA Con. After all the partying and the awesome speaker dinner at the previous night, speaking was always going to be difficult task but luckily, everything including the demos went fine. I also got to see some great sessions by Mark V & Doug Lane, Julie Koesmarno, Leonard Murphy. In short, another great experience and a very well organized event. Next on the list was SQL Saturday Jacksonville which took place 2 weeks after the Chicago event. The Jacksonville event was one of my favourites from last year and it turned out to be even better this year as it was the first time all four of us CBIG buddies (me, Javier, Melissa and Rafael) got selected to speak at the same event. And I must say Jacksonville didn’t disappoint me, great crowd as usual and highly interactive (and very generous with the session evals too hehe). A couple of days later, I started off for my first user group speaking event of the year – Triad SQL BI User Group at Winston Salem. A big thanks to the user group leaders (Katherine Fraser and Doug Purnell) for giving me the opportunity, really enjoyed interacting with the user group members there. The next speaking event was the Carolina Code Camp 2013 which was in Charlotte itself (finally a home event! Open-mouthed smile). Unlike the other four events, the majority of the attendees was from a non-SQL Server background. So it was fun speaking to a different profile of attendees and I must say that this was one of those sessions where the attendees laughed at all of my jokes (either they got all of my jokes or were being REALLY nice with me Smile). I remember someone saying that event speaking is really addictive, I couldn’t agree more! Just to prove that, I have 2 more events coming up in the next week:- a) SQL Saturday Atlanta – I will be speaking on the topic “GeoSpatial Analytics using Microsoft BI” at 1 pm, this weekend. This is going to be another really great event with registrations hitting record levels for a SQL Saturday (800+ including waiting list). If you are coming for the event, definitely drop by and introduce yourself, I would love to meet you. b) Mariner Webinar – If you are not able to attend my session at SQL Saturday Atlanta, don’t worry. I am doing the same session (“GeoSpatial Analytics using Microsoft BI”) as a Webinar for my company – Mariner on May 21, 2013 12:00 PM-01:00 PM Eastern. You can register for this FREE event by clicking on this link – Click to Register  2) CBIG Updates As some of you recall, me and three others (Javier, Melissa and Rafael) started a new BI user group this year in Charlotte – CBIG. We are having a great run with attendees steadily increasing from 35 in the first meeting to 50+ in the last meeting (65 being the highest as of now). Even though five meetings are too early to call the user group a success, the popularity of CBIG is evident from the 87 registrations (including waiting list) we received in our fifth meeting (where our seating capacity is only 60). We had already moved from the initial meeting room after our second meeting (where our seating capacity was 30) and now, due to the increasing number of registrations, we are moving our venue from New Horizons to the Microsoft Campus (which has a seating capacity of 85) in July. That said, the numbers are secondary. The primary aim for us is to cultivate a culture in CBIG which mainly revolves around the community spirit as well as a passion for technology. If you are in or around Charlotte, don’t miss out on our user group meetings on the first Tuesdays of every month. We have got meetings lined up till September as of now and the RSVP links can be found at http://charbigroup.com/. Also, don’t hesitate to contact me if you want to speak at our user group, we are always looking for speakers! 3) SQL Saturday Charlotte – BI Edition Announced And yes, we have also announced the date for SQL Saturday Charlotte – BI Edition. The event will take place on Saturday, October 19, 2013 (which happens to be the Saturday after the PASS Summit 2013, which also is in Charlotte). The planning is going on full steam and we are confident that we will have a great event laid out for all of you guys. Ensure that you register today to extend your PASS Summit 2013 experience.   4) And now the BIG one – Personal Updates For those of you who know me personally, this might be old news. And for the others, you might recall me saying at the start of the year that I might not match up to my previous year’s activities this year due to personal reasons. Well, the personal reason is given below. Me & my wife with the baby bump Yups, I am going to be a father to a baby girl this August 12 and I have no shortage of adjectives to describe my condition right now – excited, anxious, nervous, hyperactive, charged and so on. I am looking forward to those sleepless nights and all the other joys that come with parenthood. For the moment, I am going to stop writing and enjoy a good night’s sleep when I can.

Posted by SQLJason, 1 comment
Download GeoFlow Preview for Excel 2013

Download GeoFlow Preview for Excel 2013

Over the last couple of weeks, many people have been asking me for the download links for GeoFlow. Well, I am pleased to say that GeoGlow has made it’s public preview today and is available for download now. GeoFlow Preview for Excel 2013 Find below some useful links (including the download link)

And I guess it is time to cut down on some of the earlier limitations that I had mentioned about the beta release of GeoFlow:-
Related tables are not supported in this release.All the data should be in one table.
Backend data changes are not reflected in the visualization in this release and this will mean deleting and recreating the visualizations.
No Undo/Redo is supported in this release.

No way to slice and dice data within the visualization
No drill down functionalities from a higher level (say, states) to a lower level (say, counties) which is there in tools like Power View. Not bad huh? Some new features are also there like the Top / Bottom 100 Chart shown below. top / bottom 100 chart in geoflow It is interesting to note that GeoFlow is not available to all versions of Excel 2013. The supported Microsoft Office versions are:

  • Office Professional Plus 2013
  • Office 365 ProPlus
  • Office 365 Midsize
  • Office 365 E3
  • Office 365 E4

Enough of me telling, it’s time to explore this add-in on your own now. Download GeoFlow today and take your data for a 3-D drive! Smile

Posted by SQLJason, 6 comments
GeoSpatial Analytics, Microsoft BI & John Snow’s Cholera Map

GeoSpatial Analytics, Microsoft BI & John Snow’s Cholera Map

Throughout my childhood, I have always been fascinated by the idea of maps. It could be attributed to the number of pirate and treasure hunter stories that I used to voraciously devour. My dreams were filled with the protagonists in those books and their endless struggles to decipher the coded information in the maps. What used to strike me was that the location of the treasure was always present in the maps but still countless men were misled by them. Most of the times, it took the right person or the right context to get to the bottom of the map. As I grew older, my association with pirate / treasure hunter stories decreased (apart from watching the occasional Pirates of the Caribbean releases) but my bond with maps just went on getting stronger. Though my friends might joke that the reason for this is because I am so spatially disoriented (I still use a navigation system to travel from my home to the office which is just a mile away for fear of getting lost), the real rationale is because I see a lot of value in geospatial analytics as a BI consultant. In today’s world, when more and more product companies are bringing forth tools to easily analyse location based data, the excuses for not trying to analyse and explore geographic patterns in your business data are running dry. The Microsoft BI platform has a very able set of tools for geospatial analysis which includes SQL Server Reporting Services, Power View and Geo Flow (this is without including some of the map apps that are present in Office 2013) and helps me a lot in evangelizing geospatial analytics amongst my clients. While I was reading the news, I came to know that the last week was the 200th birth anniversary of one of the pioneers of geospatial analysis as well as the father of modern epidemiology – John Snow. I decided to pay my homage to him by following his adventure and reproducing his famous 1854 Broad Street Cholera Outbreak map using Microsoft BI.

GeoSpatial Analytics, Microsoft BI & John Snow’s Cholera Map

Before we start, it might be interesting to set the context to the 19th century London. London was evolving into a big city with a large influx of people, but without the modern day facilities and sanitary services. Houses used to have cesspools below their basements and human as well as animal wastes were accumulated there. Initially, people seemed to forget about the cesspools below and somehow thought that the wastes would get disposed magically, till the cesspools started overrunning. That is when the London government decided to dispose the waste into River Thames (which was also the source for water supply). You can imagine London at that time – stinking with all the human and animal wastes and it was no wonder that diseases were rampant at that time. The reason for the diseases were all attributed to the pollution and foul air emanating from those wastes. John Snow was a big sceptic of this theory though he had no evidence to prove otherwise. That is around when there was a major outbreak of Cholera in the Soho district of London and he decided to investigate it further. I am not 100% sure of how he actually did his investigation, but I am pretty sure that he must have at least gone through some variations (if not exact) of the four main sections in any treasure-hunting plot :-

1) Collecting the Data (or acquiring the Map!)
The first section of most treasure hunter stories weave through the part where the protagonist goes through shady alleys and dilapidated shops to find the treasure map from some ancient hawker. Likewise, John Snow might have gone through the disease laden streets to pick up the statistics on the cholera deaths and their locations. Luckily for me, I just had to search the internet and got the data (thanks to Robin Wilson). You can download a copy of the data from here. I did some simple changes to the source data to extract the longitude and the latitude columns from the geometry column as shown below (pro tip: use the Flash Fill feature of Excel 2013).

Cholera death stats

2) Exploring the data patterns (or deciphering the Map!)
The plot continues where the protagonist tries to make sense out of the map. Treasure maps are never straightforward and he takes a lot of time to see whether there are any hidden clues or markers in the map. Similarly, John Snow must have spent a lot of time studying the statistics as well as exploring the data visually for geographic patterns. I decided to visualize the data using Power View initially to check for any patterns.

Cholera death visualization using Power View

Just as the protagonist tries different approaches, I decided to also use GeoFlow to visualize the same data.

Cholera Deaths layer

The heatmap visualization in GeoFlow indicated something that Power View didn’t show that explicitly – unusually large number of deaths around the red area.

3) Investigating the Outliers / Patterns (or Validating the potential Treasure locations)
This is one of the most crucial part of the story, where the protagonist tries to cross check and re-validate the patterns or code that he has found within the map. This is what is going to distinguish him from the countless men that have lost their lives in search of the treasure. He makes sure that the clues that he has got from the map is not a red-herring. This is what John Snow did too, as is his evident from his writings – “On proceeding to the spot, I found that nearly all the deaths had taken place within a short distance of the [Broad Street] pump. There were only ten deaths in houses situated decidedly nearer to another street-pump. In five of these cases the families of the deceased persons informed me that they always sent to the pump in Broad Street, as they preferred the water to that of the pumps which were nearer. In three other cases, the deceased were children who went to school near the pump in Broad Street… With regard to the deaths occurring in the locality belonging to the pump, there were 61 instances in which I was informed that the deceased persons used to drink the pump water from Broad Street, either constantly or occasionally…”. As for all of us, we already know the cause and hence all I had to do currently was to display the pumps also as another layer in GeoFlow and to see the correlation of deaths and proximity of pump.

Cholera Deaths layer with the Pumps layer

Isn’t it amazing to see how accurately GeoFlow has plotted the problem causing Broad Street pump right in the area of the maximum deaths!

4) Visualizing and Sharing the Results (or finding the Treasure!)
This is the final part of the story where the protagonist undertakes the arduous journey to find (and hopefully claims) the treasure. The journey is not easy and there are every chance that he might be misled like the countless many before him. Similarly, in geospatial analysis, it is very easy to get carried away and share wrong results.

xkcd

However, John Snow’s map (shown below) was very instrumental in convincing the London authorities to shut down the problem causing Broad Street Pump and history says that this helped in containing the outbreak.

File:Snow-cholera-map-1.jpg

If done correctly, geospatial analysis is extremely powerful and with technologies like Power View and GeoFlow, this can be done pretty easily and quickly. If you are interested in geospatial analysis or Business Analytics in general, it’s still not late to register for the PASS BA Conference happening on Apr 10-12, 2013. And if you do register, don’t forget to attend my session on GeoSpatial Analytics using Microsoft BI.

GeoSpatial Analytics using Microsoft BI

You can download the completed Excel 2013 file with the Power View and GeoFlow visualizations from here.

Posted by SQLJason, 9 comments
Unpivoting Data in Data Explorer

Unpivoting Data in Data Explorer

Data Explorer is generating a lot of positive buzz within the community and everyone I speak to seems to be really excited about this. (If you are wondering what Data Explorer is, you might want to quickly catch up on my previous post – Introduction to Data Explorer Preview for Excel). And just today, the Data Explorer team announced A new build of “Data Explorer”, and an Auto Update feature (must say I really like the idea of Update button). This was followed by a post from Jamie Thomson aka SSIS Junkie (blog | twitter) on the query language ( M ) in Data Explorer and you can read more about that here. All this made me really excited and I also decided to contribute something through this post. Unpivot rows in data explorer It is a pretty common requirement to unpivot data, especially when you are scourging the net for open data. Data Explorer currently does not have any feature to unpivot data but the query language looked pretty solid and I decided to give it a try (actually, this exercise started as a way to test the new query editing functionality in Data Explorer). Well, I did succeed though it might not look that easy (if you do find an easier way, please share it!). Read on for the solution:- 1) The source for my post is given below source data I just made a simple table in the excel spreadsheet which gives the Sales by State for the years 2010 to 2012. I then click on the From Table option in Data Explorer tab to get the query window as shown below. Import data from excel 2) I add a new column called JCol with a value of 1 using the expression below

= Table.AddColumn(Excel.CurrentWorkbook(){[Name=”Table37″]}[Content], “JCol”, each 1)

Add join column to first table  Now this is my first table, and I store the expression with me. 3) My next objective is to make a transpose of the original table. This can be done with the expression below

= Table.PromoteHeaders(Table.Transpose(Table.DemoteHeaders(Excel.CurrentWorkbook(){[Name=”Table37″]}[Content])))

Transpose table 4) Now we need to rename the State Column to Year.

= Table.RenameColumns(Table.PromoteHeaders(Table.Transpose(Table.DemoteHeaders(Excel.CurrentWorkbook(){[Name=”Table37″]}[Content]))), {“State”,”Year”})

Rename column 5) Now, we need to add a column to this resultant table called JCol (just as we did to the first table in Step 2).

= Table.AddColumn(Table.RenameColumns(Table.PromoteHeaders(Table.Transpose(Table.DemoteHeaders(Excel.CurrentWorkbook(){[Name=”Table37″]}[Content]))), {“State”,”Year”}), “JCol”, each 1)

Add join column to second table 6) You might be wondering at this stage why we created the JCol column in both the tables. The reason is that we need to cross join both of these tables so that we get extra rows, and the join column is going to be JCol. Since the values for the join column are all the same, we get a cross join. To do the join of both the tables, delete the existing expression and enter the expression below

= Table.Join(Table.AddColumn(Excel.CurrentWorkbook(){[Name=”Table37″]}[Content], “JCol”, each 1), “JCol”, Table.AddColumn(Table.RenameColumns(Table.PromoteHeaders(Table.Transpose(Table.DemoteHeaders(Excel.CurrentWorkbook(){[Name=”Table37″]}[Content]))), {“State”,”Year”}), “JCol”, each 1), “JCol”)

Join both the table expressions 7) Now add a new calculated column which will give you the value in NC column if State is NC, SC column if State is SC and FL column if State is FL. Add custom column for Sales The expression for the calculated column is given below

if [State] = “NC” then [NC] else if [State]= “SC” then [SC] else [FL]

custom column expression Now your result should look like shown below end result of custom calculation 8) Now all you have to do is to rename the Custom column to Sales and hide the unnecessary columns. Then you will be able to see the unpivoted data as shown below rename and hide unnecessary columns 9) Also, have a look at how the actual query looks like in the advanced query editor Advanced query editor I would like to see a word-wrap option for the advanced query editor so that I don’t have to scroll over to the right to see long formulas. Apart from that, looks great! Don’t forget to share your comments as well as your posts / experiences with Data Explorer and it’s query language. Update I got quite some requests asking for the actual query. So here it is

let
    Source = Table.Join(Table.AddColumn(Excel.CurrentWorkbook(){[Name=”Table37″]}[Content], “JCol”, each 1), “JCol”, Table.AddColumn(Table.RenameColumns(Table.PromoteHeaders(Table.Transpose(Table.DemoteHeaders(Excel.CurrentWorkbook(){[Name=”Table37″]}[Content]))), {“State”,”Year”}), “JCol”, each 1), “JCol”),
   InsertedCustom = Table.AddColumn(Source, “Custom”, each if [State] = “NC” then [NC] else if [State]= “SC” then [SC] else [FL]),
    RenamedColumns = Table.RenameColumns(InsertedCustom,{{“Custom”, “Sales”}}),
    HiddenColumns = Table.RemoveColumns(RenamedColumns,{“2010”, “2011”, “2012”, “JCol”, “NC”, “SC”, “FL”})
in
    HiddenColumns

Remember to change the name of the table name when you use it.

Posted by SQLJason, 5 comments
Introduction to Data Explorer Preview for Excel

Introduction to Data Explorer Preview for Excel

It’s just been a couple of days since this add-in has been released and I can’t seem to stop using it. I guess I haven’t been this excited about a Microsoft offering since SSRS 2008 R2 (well, I am really passionate about Tabular and PowerPivot now but I wasn’t that hooked onto those technologies when they were released). As a BI consultant, blogger and speaker, I traverse through lots and loads of open data – sometimes trying to get that extra piece of open data which will add value to my clients’ existing data or to get some interesting stuff for my blogs and presentations. Data Explorer might not be the perfect solution, but it sure does a splendid job of reducing my efforts in searching, shaping and preparing my data. I thought of sharing my experiences in the form of a quick introduction to Data Explorer Preview for Excel. Introduction to Data Explorer Preview for Excel 1) What is Data Explorer Preview for Excel and where can I get it from? Microsoft “Data Explorer” Preview for Excel is an add-in for Excel 2013 which provides an intuitive user interface for data discovery, data transformation and enrichment. You can download the add-in from here. 2) Basic Tutorial for Data Explorer You can import data into excel from a wide variety of sources using Data Explorer and the complete list of sources is given here. For the purpose of this demo, I am going to be connecting to my favourite source which is the ‘web page’ source. a) After installing the add-in, you should be able to see the Data Explorer tab in Excel (if not, go to File—>Options—>Add-Ins—>Com Add-Ins and enable the Data Explorer add-in). Click on the From Web option and enter the url – http://www.nuforc.org/webreports/ndxloc.html as shown below. Import From Web b) Now click on Table 0 in the Navigator pane and you should be able to see the UFO sightings by states. The table has some non-US data and so to filter them, click on the dropdown in the reports column and unselect them. Click on OK when you are done. Unselect unnecessary data c) Rename the columns as States and UFO Sightings respectively. Notice that the Steps pane on the right shows the modifications you are making and you can expand the pane. rename columns d) Click on Done and now you can see the data in Excel. This data can be used as a source for your charts / tables and in the image below, I have used the “Geographic Heat Map” Office app to visualize the same. Looks like the aliens have taken a liking for California! Data visualized using Heat Map 3) Interesting Features a) Online Search:- I spend a lot of time trying to search for open datasets and this feature is surely going to reduce that time. You can search for data right from Excel by clicking on the Online Search button. For eg, if I search for ‘richest states’, the results as shown below. Online Search option Just scroll your mouse over the results and click on Use to add the data. b) Filter & Shape Data:- The ability to filter and transform your data is what makes this so useful. You can just click on the Filter & Shape button to start the process as shown below. Filter & shape data For eg, if you just want to show the income for 2011, you can hide the rest of the columns as shown below. Hide unnecessary columns You can also do a lot of other operations like Splitting columns, removing duplicates, replacing values, changing types, group-by, etc. Types of transforms available You can read the complete list from here. Hopefully, there will be more additions to this in the future (I could definitely use a Unpivot / Pivot option). c) Merging & Appending from Multiple Sources:- This is another killer feature in Data Explorer. You can merge or append from multiple sources. For eg, if we have to merge the two sheets that we just created, click on the Merge button and then select the primary and secondary tables from the dropdown. Merge data from two sources Then select the matching columns from both the tables and click on Apply. select matching columns Now, to display the additional columns, click on the expand icon on the New Column and select the UFO sightings. Click on OK. Expand columns Now you have combined both the sources and can use it for your visualization. I have visualized the same data using GeoFlow below. Data visualized using GeoFlow 4) Further Reading There’s lot of interesting stuff you can do with Data Explorer and if this post caught your interest, make sure to check the below ones too

1) Data Explorer Team – Announcing Microsoft “Data Explorer” Preview for Excel

2) Chris Webb – Importing Data From Multiple Log Files Using Data Explorer

3) Chris Webb – Calling A Web Service From Data Explorer, Part 1

4) Jamie Thomson – Traversing the Facebook Graph using Data Explorer

5) Matt Masson – Access the Windows Azure Marketplace from Data Explorer

6) Jake Smillie – Best Oscar winning Film? My first Data Explorer adventure…

7) Ian Morrish – SharePoint OData and the Excel Data Explorer

8) Data Explorer Help

Updates

9) Dan English – Installing Data Explorer Preview & Demo with IMDB Data

10) Alan Koo – Introduction to Microsoft Data Explorer Preview for Excel 2013 – Part 1

Posted by SQLJason, 1 comment