Interview Questions

Performing a Right Outer Join with a Left Outer Join clause

Performing a Right Outer Join with a Left Outer Join clause

August 24, 2010

Most of the interviewers I have seen tend to prefer people with better problem solving skills rather than better knowledge on a particular technology. No wonder considering the fact that technologies change with each fleeting moment, while a good problem solving mentality stays forever. With this prelude, I put forward this question that I came to hear from one of friends recently – How do you perform a right outer join with a left outer join clause? And no, you can’t swap the tables here. Disclaimer : This question is purely to test your knowledge of how the said Joins work and to test your problem solving skills. It may not serve any real purpose in a practical project scenario. Question Lets say, I have the following tables Demo and Demo2 Query : select * from Demo Output Query Result - Demo Query : select * from Demo2 Output Query Result - Demo2 Now using a left outer join, the result should be equivalent to the output of SELECT A.attribute,
       B.attribute
FROM   Demo2 AS A
       RIGHT OUTER JOIN
       Demo AS B
       ON A.attribute = B.attribute; Output Demo2 Right outer join Demo
And you can’t swap the order of the tables i.e. Demo2 should be on the left hand side and Demo should be on the right side. Answer Now this exercise is basically to inculcate the importance of problem solving and hence I will try to detail out the steps as much as possible. Lets go step by step:- 1) The thing to understand here is that we can’t use the left outer join with the normal join condition like A.attribute = B.attribute, the reason being that we would get all the rows of the left side table (Demo2). Eg: SELECT A.attribute,
       B.attribute
FROM   Demo2 AS A
       LEFT OUTER JOIN
       Demo AS B
       ON A.attribute = B.attribute; Output Demo2 left outer join Demo So somehow we need to bring the required rows in. 2) Now that we have understood that the above join condition won’t work, we need to think of another join condition. Let’s say, what would happen if we give a condition that is always true on the ON clause, something like 1=1? SELECT A.attribute,
       B.attribute
FROM   Demo2 AS A
       LEFT OUTER JOIN
       Demo AS B
       ON 1=1; Output join on 1=1 Cool, now we have got a cross product. 3) Now the thing to do is to understand what a right outer join is really. If the column on the right hand side matches with the column on the left hand side based on the join condition, then both the columns are displayed. Else, a NULL will be displayed on the left hand side with the correct column on the right hand side. For this first we can write a subquery in the columns like shown below SELECT (SELECT attribute
        FROM   Demo2 AS c
        WHERE  c.attribute = b.attribute) AS attr,
       B.attribute
FROM   Demo2 AS A
       LEFT OUTER JOIN
       Demo AS B
       ON 1 = 1; Output Subquery result 4) Now all we need to do is to take the distinct and then we get the output as required. SELECT DISTINCT (SELECT attribute
                 FROM   Demo2 AS c
                 WHERE  c.attribute = b.attribute) AS attr,
                B.attribute
FROM   Demo2 AS A
       LEFT OUTER JOIN
       Demo AS B
       ON 1 = 1; Output Demo2 Right outer join Demo So, all is well? Not really. This particular query would break when we get duplicates. Suppose the TH row was duplicated in Demo table. Now, when we take the distinct after applying the subquery on the cross product, we will still get only one row for TH when a right outer join would have given 2 rows. So I came up with the below query to solve this particular scenario:- SELECT attr,
       attribute
FROM   (SELECT row_number() OVER (PARTITION BY CASE WHEN A.attribute = b.attribute THEN A.attribute ELSE NULL END, b.attribute ORDER BY a.attribute) AS rnk,
               CASE WHEN A.attribute = b.attribute THEN A.attribute ELSE NULL END AS attr,
               B.attribute,
               row_number() OVER (PARTITION BY b.attribute ORDER BY CASE WHEN A.attribute = b.attribute THEN A.attribute ELSE NULL END DESC) AS rnk2
        FROM   Demo2 AS A
               LEFT OUTER JOIN
               Demo AS B
               ON 1 = 1) AS Outr
WHERE  (Outr.rnk = 1
        AND Outr.rnk2 = 1)
       OR (attr = attribute); Please post your queries in the comments if you have got alternate solutions, I would be pleased to have a look at them.

Posted by SQLJason, 0 comments
SSAS Dimension Attribute Properties : Part 2

SSAS Dimension Attribute Properties : Part 2

June 23, 2010

Yesterday, I was reading through some articles and happened to glance through a very memorable quote by  Berthold Auerbach – “ The little dissatisfaction which every artist feels at the completion of a work forms the germ of a new work “. That was when I realized I haven’t yet completed my dimension attribute article and so here I am with the second part. In SSAS Dimension Attribute Properties : Part 1, we reviewed the Advanced and Basic properties of dimension attributes. This article would be continuing with the Misc, Parent-Child and Source properties.

Dimension Attribute Property

The properties are explained below:-

MISC

1) AttributeHierarchyOrdered : This particular property specifies whether the members of the attribute are ordered or not. The values of this property can just be True or false. If the order of the members do not matter, setting this property to false for attributes where the attribute hierarchy is enabled or high cardinality attributes can significantly increase the processing performance.

2) GroupingBehavior :  This property is used to give a hint to the client application whether to encourage or discourage users to group on this attribute and does not affect any of the structures on disk. The values are EncurageGrouping and DiscourageGrouping.

3) InstanceSelection : This property also is used to give a hint to the client application’s UI on the recommended means of selection of a member from the attribute. The options available are

  • None – (Default) no selection used.
  • DropDown – Appropriate for situations where the number of items is small enough to display within a dropdown list.
  • List – Appropriate for situations where the number of items is too large for a dropdown list, but not large enough to require filtering.
  • Filtered List – Most useful in scenarios where the number of items is large enough to require users to filter the items to be displayed.
  • Mandatory Filter – Appropriate for situations where the number of items is so large that the display must always be filtered.

4) MemberNamesUnique : This property indicates whether the member names are unique across the attribute hierarchy and this property affects the way member unique names are generated. The available options are True or False.

Parent-Child

1) MembersWithData : In a parent-child hierarchy, some of the non-leaf members may also have data associated with them (unlike normal hierarchies, where the non-leaf members have a value equal to the sum of it’s leaf values). These members are called data members and are present only for parent-child hierarchies. This particular property is used to set the visibility of the data members in parent-child hierarchies and the available options are NonLeafDataHidden and NonLeafDataVisible. This MSDN article – Working with Attributes in Parent-Child Hierarchies does a very nice job of explaining this property with an example.

2) MembersWithDataCaption : This particular property is used as a naming template for the system generated data members. For eg, if we have the MembersWithData property set to NonLeafDataVisible, then a leaf member representation of the data member is added. For eg, if Jason is a data member, with Thomas and Tom as his leaf members, then there would be an additional Jason added as a leaf member. Now to differentiate between the leaf member and the data member (in this case, both are Jason) would be difficult and hence we can use a template like *(leaf member) as the value of this property. The asterisk symbol is a placeholder for the original name. So our example would become Jason for the data member and Jason(leaf member) for the leaf member.

3) NamingTemplate : This property specifies how levels in a particular parent-child hierarchy would be named. Click the ellipsis button (..) in this property’s cell and then you should be able to view a popup window as shown below:-

Level Naming Template

You can specify a name for the level by clicking on the Name column of the second row and entering for eg, Employee *. This will ensure that instead of Level 02, Level 03, etc., you will be getting Employee 02, employee 03 and so on. For more details, refer to the MSDN article – Defining Parent Attribute Properties in a Parent-Child Hierarchy.

4) RootMemberIf : This property is used to specify the criteria by which we can identify the members of the highest level (excluding the ALL level). Again quoting from an article of William Pearson – “ The four selection options include the following:

  • ParentIsBlankSelfOrMissing – (Default) Only members that meet one or more of the conditions described for ParentIsBlank, ParentIsSelf, or ParentIsMissing are treated as root members.
  • ParentIsBlank – Only members with a null, a zero, or an empty string in the key column or columns are treated as root members.
  • ParentIsSelf – Only members with themselves as parents are treated as root members.
  • ParentIsMissing – Only members with parents that cannot be found are treated as root members.

The behavior of the RootMemberIf property in determining how the root or topmost members of a parent-child hierarchy are identified, is, therefore, dependent upon which of the selections above is made. The default, as noted above, is ParentIsBlankSelfOrMissing.

5) UnaryOperatorColumn : We can control how members on a hierarchy (usually parent/child, but not necessarily) aggregate up to their parents by defining unary operators. To do this, you must create a new column in your dimension table to hold the unary operator values and then set the attribute’s UnaryOperatorColumn property to point to it. This property specifies the column which holds the unary operator. You will find a very good example under the Unary Operators and Weights heading of this article – Measures and Measure Groups in Microsoft Analysis Services: Part 1. The values are (none) and (new) for this property. On clicking New, a dialog box opens which will prompt us to select the binding type, source table and the source column.

Source

1) CustomRollupColumn : Unary operators do not give enough flexibility for rollup, and in such cases, we can write our own rollup formulas as MDX expressions. This property is used to specify a column which will contain the custom rollup formula. A valid expression will ensure that the aggregation logic defined in the AggregateFunction property of the measure would be overridden for this attribute.

2) CustomRollupPropertiesColumn : This property is used to contain the properties of a custom rollup column. Refer Custom Member Formulas heading of this article – Measures and Measure Groups in Microsoft Analysis Services: Part 1 to learn more about the above two properties.

3) KeyColumns : This property contains the column/columns that constitute the key for the attribute, which is the column in the underlying relational table in the data source view to which the attribute is bound. If the NameColumn property is not defined, the value of KeyColumns property would be used to display the attribute members.

4) NameColumn : In most of the cases, the key of the attribute would be a integer value, and this would not make any sense to the user who is viewing the attribute members. For this, we can specify a column in this property which will have the user friendly name of the attribute member.

5) ValueColumn : This property identifies the column that provides the value of the attribute. If the NameColumn element of the attribute is specified, the same DataItem values are used as default values for the ValueColumn element. If the NameColumn element of the attribute is not specified and the KeyColumns collection of the attribute contains a single KeyColumn element representing a key column with a string data type, the same DataItem values are used as default values for the ValueColumn element. In KeyColumns and NameColumn property, there are additional properties which can be got on expanding the plus symbol on the left. This article gives a pretty good overview on them. With this info, you should be all set and ready to give that killer answer in your next interview. Godspeed and good luck to you! 🙂

Posted by SQLJason, 6 comments
SSAS Dimension Attribute Properties : Part 1

SSAS Dimension Attribute Properties : Part 1

June 13, 2010

There could not be any SSAS interview where at least one question is not asked about the dimension, attribute or measure properties. Many at times, I myself have kept on searching the web to find more information about a particular property only to find that the information is pretty scattered and you really have to spend some effort to consolidate all of them. Through this post, I will consolidate all relevant information regarding the attribute properties and try to put it under one roof. A dimension attribute contains a limited list of key values. Each unique key is associated with a dimension member and these dimension members included in the attribute are called attribute members. In the multidimensional model, the attribute also contains properties that define different characteristics of the attribute members called attribute properties. All attribute members have the same properties. To see all the properties of an attribute, click on a particular attribute and press F4. On doing so, you would be able to see the property panel as shown below

SSAS Dimension Properties

Before you go forward and read, you might want to understand some of the terms that are going to keep coming in the explanations below:-

  • Discrete Attributes and Contiguous Attributes : Quoted from Introduction to Attribute Discretization (William Pearson) – “ Whenever we work with attributes, we can expect to encounter two general types of values:
    • Discrete attributes: Discrete values stand apart distinctly, and have clearly defined logical “boundaries” between themselves. The possible values are naturally discrete for the lion’s share of attributes occurring in the business world.
      Example: The Gender attribute, within the Customer dimension of the Adventure Works sample UDM, is (at least for purposes of the sample cube) considered to have only one of two discrete values, female or male.
    • Contiguous attributes: Contiguous values do not stand apart distinctly, but flow along, as if in a continuous line. Contiguous values, especially within large populations, can have very large numbers of possible values. Information consumers can find it difficult to work effectively and efficiently within such wide ranges of values.
      Example: the Vacation Hours attribute, within the Employee dimension of the Adventure Works sample UDM, can have a wide range of possible values, depending upon how many employees are involved, whether there are limits on how many vacation days they can accumulate, and considerations of this nature. The member values are based directly upon the unique values contained within the VacationHours column of the DimEmployee table (with many of the unique values shared among multiple employees). The sheer number of values might make working with them cumbersome for information consumers, if they are simply made available in their existing state.

The properties are explained below:-

ADVANCED

1) AttributeHierarchyDisplayFolder : This property is used to group attribute hierarchies into a particular folder. For eg, if you assign this particular property to Stocking for the attributes Class and Color in Product dimension, you would be able to see as shown below when you browse through the dimension

AttributeHierarchyDisplayFolder

2) AttributeHierarchyEnabled : The values of this property can just be True or False and is used to determine the absence or presence of attribute hierarchies. By default, all the attributes in the dimension would be assigned a value of true which would mean that attribute hierarchies would be enabled by default. From a performance point of view, this property is pretty important which is quite evident from the quote given below from OLAP Design Best Practices for Analysis Services 2005 Attributes add to the complexity and storage requirements of a dimension, and the number of attributes in a dimension can significantly affect performance.  This is especially of attributes which have AttributeHierachyEnabled set to True. Although SQL Server 2005 Analysis Services can support many attributes in a dimension, having more attributes than are actually used decreases performance unnecessarily and can make the end-user experience more difficult. It is usually not necessary to create an attribute for every column in a table. Even though the wizards do this by default in Analysis Services 2005, a better design approach is to start with the attributes you know you’ll need, and later add more attributes.  Adding attributes as you discover they are needed is generally better a better practice than adding everything and then removing attributes.

3) AttributeHierarchyOptimizedState : The values of this property can be FullyOptimized and NotOptimized (beats me why it shouldn’t have been a True or False instead) and is used to enable or disable the optimization of the hierarchy. Turning this property to a NotOptimized state would save resources when the hierarchy is being built during the processing and hence, should be turned off for those attributes which wouldn’t be used frequently for analysis. Turning this off would mean a slowdown in the requests that reference this attribute hierarchy which is the reason why we should just turn this off for less frequently used attributes.

4) AttributeHierarchyVisible :The values of this property can be True or False and is used to set the visibility of the attribute to a client application. If it is set as false, the client application would not be able to determine the presence of this attribute hierarchy (which means it would be invisible when the dimension is browsed through the application) but it would still be able to use the attribute hierarchy in queries. It is a best practice to make this property as False for those attributes which are also used in user defined hierarchies as anyways, you would be able to access that attribute from the user defined hierarchy and you can eliminate redundancy.

5) DefaultMember : This property can be used to specify the default member of the attribute. If no value is specified, then by default the ALL level is used. MDX expressions can also be used to set the default member of the attribute to any attribute member by clicking on the button besides the property which will open up the pop-up window as shown below

DefaultMember

6) DiscretizationBucketCount : Discretization is basically the process of grouping contiguous values into sets of discrete values.

Attribute Discretization

Analysis Services supports several variations of attribute discretizations based on different algorithms and to enable it, DiscretizationBucketCount is one of the two properties that you have to set. As the name suggests, this property defines the number of groups that the values will be placed in.

7) DiscretizationMethod : This property basically determines which discretization algorithm should be applied. The values of this property are

  • None – Discretization would not be done.
  • Automatic – Automatically chooses the best grouping technique among the available values.
  • EqualArea – If the distribution of contiguous values are plotted as a curve, then the areas under the curve covered by wach range would be equal.
  • Cluster – Uses the K-Means algorithm to find ranges on the input values.

Apart from the above algorithms present in the Dimension Editor, it is also possible to specify a user defined discretization method through Data Definition Language (DDL) by defining the boundaries of every group.

8) EstimatedCount : This property specifies the estimated number of members in the attribute. This is usually populated when aggregations are designed (which could mean that when you change environments, the counts could be of the previous environment unless you count again) or specified by the user. This property is used when aggregations are being built and helps the server to make decisions on how to make the aggregations.

9) IsAggregatable : This property is used to determine whether the data associated with the attribute members can be aggregated. If True, then the system will define an ALL level which will have the aggregated data which will be used as the default member when queries which doesn’t reference a particular member of this attribute are executed. For the same reason, if this property is set to False, then a default member should be specified.

10) OrderBy : This property specifies the method by which the attribute members should be sorted and the options are Name, Key, AttributeName and AttributeKey.

11) OrderByAttribute : This property is used to select the attribute on which the ordering should happen if AttributeKey or AttributeName has been used in the OrderBy property.

BASIC

1) Description : This property specifies the description of the attribute and any free text can be entered.

2) ID : This property specifies the unique identifier of the dimension and is non-editable usually in the dimension editor.

3) Name : This property specifies the name of the attribute.

4) Type : I found the best explanation of this property from an article of William Pearson which is quoted below :- “ The value of the Type property for an attribute determines the attribute type – and specifies the type of information contained by – that attribute. Within Analysis Services 2005, attribute types help to classify an attribute based upon its business utility or functionality. Many of the available options represent types which are used by client applications to display or support an attribute. However, some attribute types also have specific meaning to Analysis Services. Some attribute types identify attributes that represent time periods in various calendars for time dimensions. Many attribute types for dimensions or attributes are set via the associated wizard that we use when defining these objects. Attribute types can also be set when we employ wizards, such as the Business Intelligence Wizard, to add functionality to dimensions within Analysis Services. A good example is the application of various attribute types to attributes in a dimension when we use the Business Intelligence Wizard to add Account Intelligence to a given dimension: the wizard applies several attribute types to attributes in the affected dimension, for example, to identify attributes that contain the names, codes, numbers, and structure of accounts within the dimension. Attribute types in Analysis Services fall in into five categorized groups. These attribute type groups include:

  • General: These values are available to all attributes, and exist only to enable classification of attributes for client application purposes.
  • Account Dimension: These values identify an attribute that belongs to an account dimension.
  • Currency Dimension: These values identify an attribute that belongs to a currency dimension.
  • Slowly Changing: These values identify an attribute that belongs to a slowly changing dimension.
  • Time Dimension: These values identify an attribute that belongs to a time dimension. “

5) Usage : This property defines whether the attribute is a key attribute, an additional attribute for the dimension or a parent attribute. (The rest of the properties would be continued in SSAS Dimension Attribute Properties : Part 2)

Posted by SQLJason, 5 comments
Static Named Sets v/s Dynamic Named Sets

Static Named Sets v/s Dynamic Named Sets

May 24, 2010

It is no big secret that most of the questions that are asked in interviews today are repeated. One of those questions which interviewers just love to ask again and again is the difference between  a static named set and a dynamic named set. The popularity of this question is further aided by the fact that this particular feature of dynamic named sets was newly introduced in SQL Server 2008. So here I continue my interview question series on this topic. Many times, it can happen that we end up writing very complex MDX expressions revolving sets. To simplify such expressions or to improve the performance of some queries, we can extract the definitions of some sets into a separate named set expression which are called named sets. A named set can be created either using a CREATE SET statement or a WITH SET statement (For syntax, refer Building Named Sets in MDX). In SQL Server 2005, there was only the Static named sets feature but in SQL Server 2008, both Dynamic and Static named sets are present. Static Named Sets The value of a static named set is evaluated either when the CREATE SET statement is executed (if it is defined inside the cube) or right after the WHERE clause is resolved (if the set is defined within a WITH clause). Hence, when a query references a static named set, the set is not resolved again in the context of the current coordinates which explains why it is known as a static named set. To explain this better, let us consider a scenario where we want to count the number of years for a particular country where the order is greater than 4000. WITH 
  MEMBER [measures].[country count] AS 
    Count
    (
      Filter
      (
        [Date].[Calendar Year].[Calendar Year].ALLMEMBERS
       ,
          (
            [Customer].[Customer Geography].CurrentMember
           ,[Measures].[Internet Order Quantity]
          )
        > 4000
      )
    ) 
SELECT 
  {
    [Measures].[Internet Order Quantity]
   ,[measures].[country count]
  } ON 0
,{
      [Customer].[Customer Geography].[Country].ALLMEMBERS
    * 
      [Date].[Calendar Year].Children
  } ON 1
FROM [Adventure Works]; This will give us the following output result 1  Now let us simulate creating a static named set in the Adventure Works cube and see the result by running the following query in SQL Server Management Studio (SSMS). CREATE 
  STATIC SET [Adventure Works].[Filtered Year] AS 
    Filter
    (
      [Date].[Calendar Year].[Calendar Year].ALLMEMBERS
     ,
        (
          [Customer].[Customer Geography].CurrentMember
         ,[Measures].[Internet Order Quantity]
        )
      > 4000
    ) ; GO
WITH 
  MEMBER [measures].[country count] AS 
    Count([Filtered Year]) 
SELECT 
  {
    [Measures].[Internet Order Quantity]
   ,[measures].[country count]
  } ON 0
,{
      [Customer].[Customer Geography].[Country].ALLMEMBERS
    * 
      [Date].[Calendar Year].Children
  } ON 1
FROM [Adventure Works]
WHERE 
  [Customer].[Country].&[Canada]; Note that we are using GO statement in between, else you will get an error in SSMS saying that multiple statements are not allowed. As per the explanation above, the set would be evaluated when the CREATE SET statement is executed and the coordinates at that context for Customer Geography is ALL. So now when the set is referenced in the query below, it will the count for the ALL member even if Canada is selected. Static set result To verify this, I queried the cube to just return the count of countries when no country is selected and the country count matches with the above result. Total result for all countries Now what do we do if I just wanted to consider the coordinates in the where condition and return the result based on it (in this eg, consider the filter condition on Canada and return just the periods having order greater than 4000 for Canada which is 1)? That is where Dynamic sets come to our aid. Dynamic Named Sets With the help of dynamic named sets, we can create a named set which would be revaluated in context of each query that references it. It is evaluated in the context of WHERE clause and SubSelect of every query but are not evaluated in the context of every cell. The most visible difference would be in the case of dynamic named sets created in the cube. Let us use the previous example and see what happens when we use a dynamic set instead of a static set. CREATE 
  DYNAMIC SET [Adventure Works].[Filtered Year] AS 
    Filter
    (
      [Date].[Calendar Year].[Calendar Year].ALLMEMBERS
     ,
        (
          [Customer].[Customer Geography].CurrentMember
         ,[Measures].[Internet Order Quantity]
        )
      > 4000
    ) ; GO
WITH 
  MEMBER [measures].[country count] AS 
    Count([Filtered Year]) 
SELECT 
  {
    [Measures].[Internet Order Quantity]
   ,[measures].[country count]
  } ON 0
,{
      [Customer].[Customer Geography].[Country].ALLMEMBERS
    * 
      [Date].[Calendar Year].Children
  } ON 1
FROM [Adventure Works]
WHERE 
  [Customer].[Country].&[Canada]; The output is shown below Dynamic Set result Finally we get the result we want taking the filter conditions in the WHERE clause (for Canada). We can also test what happens when we remove Canada from the WHERE clause and introduce it on the rows. CREATE 
  DYNAMIC SET [Adventure Works].[Filtered Year] AS 
    Filter
    (
      [Date].[Calendar Year].[Calendar Year].ALLMEMBERS
     ,
        (
          [Customer].[Customer Geography].CurrentMember
         ,[Measures].[Internet Order Quantity]
        )
      > 4000
    ) ; GO
WITH 
  MEMBER [measures].[country count] AS 
    Count([Filtered Year]) 
SELECT 
  {
    [Measures].[Internet Order Quantity]
   ,[measures].[country count]
  } ON 0
,{
      [Customer].[Customer Geography].[Country].&[Canada]    * 
      [Date].[Calendar Year].Children
  } ON 1
FROM [Adventure Works] As mentioned, the set will be evaluated using the current coordinates which would be ALL for Country as there is nothing on the WHERE clause. The output is shown below Dynamic set result with Canada on rows  Now that you have understood that the difference between a static named set and dynamic named set is not only the difference in their syntax, you might want to go to the following blogs by Mosha (who is called the father of MDX) and understand how a named set improves performance

 

Posted by SQLJason, 6 comments
Converting a single comma separated row into multiple rows

Converting a single comma separated row into multiple rows

May 20, 2010

Ever since I wrote Converting multiple rows into a single comma separated row, I was trying to find a SQL command which will do the reverse, which is converting the single comma separated row back to multiple rows. I checked up a few blogs and I found out that it was possible to do with the help of custom functions or stored procedures, but I was not interested in all of them. Finally I got the answer that I was looking for, and now that I did, I did not waste precious time in noting it down here, lest I forget. Let me try to show you what we are trying to achieve here with the help of an image

Convert comma seperated row to multiple rows

This can be done with the help of the below query

 SELECT A.[State],  
     Split.a.value('.', 'VARCHAR(100)') AS String  
 FROM  (SELECT [State],  
         CAST ('<M>' + REPLACE([City], ',', '</M><M>') + '</M>' AS XML) AS String  
     FROM  TableA) AS A CROSS APPLY String.nodes ('/M') AS Split(a);

Again, no stored procedures or function, just plain old SQL 🙂

Update (23/05/2010) : I saw this blog by Brad Schulz (MVP) explaining this scenario and the performance implications in detail. So people who are serious about getting this in their code, please read his blog before you do so – Delimited String Tennis Anyone?

Posted by SQLJason, 58 comments
How to add a Document Map to your reports

How to add a Document Map to your reports

May 12, 2010

One of the most common things I do during taking interviews is to give real life scenarios and assess the person on how he tackles the situation. Most of the times, what I really look for is how he goes forward in solving the problem – the approach and the way of thinking rather than the answer itself. Another thing that I look for is whether he tries to put some extra effort to implement the solution (especially the reporting part) in an aesthetic and user-friendly way. Most of the candidates would know the answer to some simple question like how to add data labels to their charts but they stutter when I ask what are smart labels or how can we prevent the cluttering of data labels. I would easily give brownie points to someone who details his solution with some extra attention to details. It must be my frequent rendezvous with creating big reports that make me extra fond of document maps in SSRS and an essential part of my SSRS interview questions. It is hard to imagine going through a big book that has no table of contents. But many users and developers have no problem at all in designing big reports without any navigational aid (although at some point of their wretched life, the users are going to curse the person who developed those reports), even though Microsoft has provided this feature. Document Map is a navigational feature in SSRS that, when implemented, allows the user to navigate through the document and its hierarchies. With this feature, you can add a panel to the left of the reports where you can have the list of “contents” of the report. What makes it more special is that on clicking the “content” in the list, you will be directly taken to the page where the content is present. Cool feature, huh? Requirement Suppose you already have a matrix or a table with some groupings as shown below Matrix - Subcategory grouping Suppose the number of subcategories are very large and spans into multiple pages and the requirement is such that each user has to specifically search for a set of subcategories at a particular time of the day. It would make his life a lot easier if there was a document map listing the subcategories on clicking which he would be directly taken to the clicked subcategory Solution 1) Go to the design mode of the report and click on the group properties of subcategory. Subcategory group properties 2) Go to the advanced tab of the properties and set the Document map as the field Subcategory (you can select the appropriate field name from the drop-down). Advanced group properties 3) Save the report, deploy it and then preview it . Report DM Preview Already you can see the panel at the left hand side which contains the list of subcategories. To hide/unhide this panel, you can click on the icon that has been highlighted in red in the above image. Notice that the “Document Map” text highlighted in blue is the report name. 4) Now you can click on any of the subcategories, say Handlebars and you would be taken to the page where the clicked subcategory is. On clicking Handlebars 5) (a) You can also make document map labels for each of your report item. For eg, if you have 2 tables, you can click on the table and press F4 to see the properties. Table document map label 5) (b) You can give the required name in the DocumentMapLabel property as shown above and then see the required result when you preview the report. DM Tables 6) Now that you have seen the toggle symbol, you would have already guessed that it is possible to create hierarchies also in the document map label, for e.g., Subcategory—>Product. For that, all you need to do is to enable Document Map property for the Product group also. Subcategory - Product Hierarchy DM Note Quoting a section from Packtub Learning SQL Server 2008 Reporting Services – “ Document Map is mostly for HTML rendering. Other renderers render differently.

  • PDF: Uses Bookmarks.
  • Excel: Uses named worksheet with hierarchical links. Report sections appear in other sheets.
  • Word document also has a document map like the table of contents.
  • Tiff, CSV, and XML ignore this setting.
  • Recursive data is related to the idea of self joins where the relationship between parent and child is represented by fields in the dataset. “

So there ends the part 3 of my interview series. In my next blog, I would be taking a momentary break from my interview question series and presenting a very interesting application of Document Maps, wait for it 🙂

Posted by SQLJason, 5 comments
OR operation in MDX

OR operation in MDX

May 10, 2010

A seemingly harmless question which makes you ignore it thinking you know the answer, but the moment you give it a second look, you start scratching your head. This was my first reaction when I was asked to implement an OR operation in MDX and I really had to think to come up with an answer, quite contrary to the no-brainer I thought this question was. Ever since, this has been in my list of interview questions and this blog would be the second part of the series where I unveil some of my favourite questions. Problem Lets face a scenario where you have a relation – Month<—Date & Week<—Date. Now the requirement is such that the Date field should be displayed such that it should either belong to the Month selected or the Week selected. Answer Follow the steps below to get the answer 1) The query below will display the Date member captions for all dates WITH 
  MEMBER [measures].[nam] AS 
    [Date].[Date].CurrentMember.Member_Caption 
SELECT 
  [measures].[nam] ON 0
,[Date].[Date].Children ON 1
FROM [Adventure Works]; 2) Suppose we need to display all the Dates such that it either belongs to the month March,2007 or Week 27 CY 2005. P.S. : Just giving a where clause with a set like the query given below will not work as it will perform an AND operation rather than an OR operation. WITH 
  MEMBER [measures].[nam] AS 
    [Date].[Date].CurrentMember.Member_Caption 
SELECT 
  [measures].[nam] ON 0
,[Date].[Date].Children ON 1
FROM [Adventure Works]
WHERE 
  (
    {[Date].[Calendar Weeks].[Calendar Week].&[27]&[2005]}
   ,{[Date].[Calendar].[Month].&[2007]&[3]}
  ); 3) The correct way is to bring both the conditions inside a set but as of now, they are of different dimensionality and can’t be used together as the members of the same set. Hence we create a tuple of the calendar week and month such that the valid condition of one attribute would be used with the default member of the other dimension (which is usually the All member). WITH 
  MEMBER [measures].[nam] AS 
    [Date].[Date].CurrentMember.Member_Caption 
SELECT 
  [measures].[nam] ON 0
,[Date].[Date].Children ON 1
FROM [Adventure Works]
WHERE 
  {
    (
      [Date].[Calendar].DefaultMember
     ,[Date].[Calendar Weeks].[Calendar Week].&[27]&[2005]
    )
   ,(
      [Date].[Calendar].[Month].&[2007]&[3]
     ,[Date].[Calendar Weeks].DefaultMember
    )
  }; This query would give the correct output as shown below OR operation in MDX This query makes use of the fact that ([Date].[Calendar].DefaultMember,[Date].[Calendar Weeks].[Calendar Week].&[27]&[2005]) = (Date].[Calendar Weeks].[Calendar Week].&[27]&[2005]) We use this technique to make the dimensionality of both the week and month conditions the same. Once this concept is understood, implementing OR conditions in MDX would be a piece of cake. 

Posted by SQLJason, 0 comments
MDX : Non Empty v/s NonEmpty

MDX : Non Empty v/s NonEmpty

May 6, 2010

Now that the recession pains have allayed, most of my colleagues are having a very busy time either taking interviews or giving interviews. As for myself, I thought of sharing some thoughts on my interview questions through my blogs and here goes my first attempt where I have tried to differentiate between Non Empty and NonEmpty. Find the link to my blog here – MDX : Non Empty v/s NonEmpty Non Empty vs NonEmpty Title MDX

Posted by SQLJason, 1 comment