VSJ
Visual Studio 2010 available now - click for details
The independent source for software developers
Home
Email Newswire
.NET Zone
Java Zone
XML & Web Services Zone
Database Development Zone
Architecture Zone
BlackBerry Zone
News
Articles
Free Downloads
Training Courses
Books
Institution of Analysts & Programmers
Code Bin
DevWeek & SQL Server DevCon
About VSJ
Advertising Information
Contacts
Follow VSJ on Twitter
Database Development Zone
Beyond the SQL Server basics

Developing SQL Server 2005 OLAP applications with ADO MD.NET.

By Kay Ewbank

Published: 6 March 2006

If you’re a database developer, you’re probably pretty familiar with the techniques and problems that you need to solve in order to create the ‘usual’ order processing type of database. Change the language, change the underlying database, and given a few hours of coding and configuration you can probably turn out a working system.

SQL Server 2005 offers a lot more than simply being a good relational database, though, and sooner or later the chances are that some of your users will want to take advantage of its more sophisticated facilities. Many of these can be used ‘out of the box’, and you can probably get away with ignoring them. However, if you want to put together an application that really impresses your users, it’s worth remembering that those same advanced facilities are available for you to use as a developer.

For example, ask a database administrator why SQL Server 2005 is a good idea and one of the phrases that will crop up quite quickly is SQL Server’s “BI” facilities. Business Intelligence sounds impressive, the sort of thing that managers are likely to leap on enthusiastically. Ask them just what BI really means, and things might get a little vaguer, but offer them BI facilities in your custom applications, and you’re on to a winner.

A manager can be vague about what BI means, but writing an application that works means you need to know exactly what you’re talking about, and the first thing you need to make sure you understand is OLAP. OnLine Analytical Processing is usually defined as the processing and analysis of shared multidimensional data. In real life, OLAP systems take data from some form of relational database, and aggregate and organise the business information into an accessible, easy to use multidimensional structure. This means that ‘real’ work can carry on live while the analyses are happening to the aggregated data. OLAP systems store some or all of this aggregated information either within tables in a relational database (relational OLAP, or ROLAP) or in multidimensional databases (multidimensional OLAP, or MOLAP). Queries that are run against the OLAP system hopefully run more quickly than a similar query against the original relational data because the OLAP system has already carried out the underlying data aggregation, so you’re working with the derived values. The data in the OLAP systems are arranged in arrays to allow rapid analysis, and the arrays are called cubes. The structures are called cubes because the user can view three dimensions of the data at any one time, can rotate the view to alter which ‘face’ they are viewing, and can swap other dimensions in and out to get other views of the data.

Anyone who has written a complex query that brought a production database to a halt will appreciate that working with large amounts of data can be expensive in terms of processor time and resources. To avoid this, OLAP systems don’t run against online transaction processing (OLTP) systems. Instead, OLAP systems are designed to make use of data warehouses – essentially copies of the data in the OLTP systems that are updated only as necessary to keep the data in the warehouse ‘enough’ up to date.

SQL Server 2000 Analysis Services provided a range of OLAP features that was very good, but a bit tricky to develop against. You could develop applications that contained an OLAP front-end, but you had to make use of ADO MD, which was the original COM API for OLAP, and XML For Analysis (XMLA), a Web Service interface that returned data in the form of raw XML. In SQL Server 2005, this has been superseded by ADO MD.NET, which is a managed wrapper for XMLA so you can avoid the dependency on COM.

Those of you who have been following OLAP may be looking in disbelief at the mention of XMLA in this article. For several years, XMLA looked like a technology that was dying quietly in the corner. It was originally dreamed up by Microsoft and Hyperion in 2001, and the SAS Institute joined a little later. However, everything went very quiet, and the lack of activity did make it seem that XMLA was going to go the way of so many technologies.

What’s changed is the fact that SQL Server 2005 provides its BI via Analysis Services, and XMLA is the only protocol supported by Analysis Services, and all the client object models and applications communicate with Analysis server through XMLA. Analysis Services has native support for XMLA, so there’s no need for any middle tier services, or for an XMLA SDK.

You can play with the Adventure Works cubes within Visual Studio
You can play with the Adventure Works cubes within Visual Studio

Sample cubes

One of the tricky parts of learning to program against OLAP cubes is that by definition, you need to have a cube available. If you have SQL Server 2005 Analysis Services, there’s a sample Adventure Works database that has cubes that you can use. If you’ve not installed the samples, run the scripts in the Samples folder. You’ll also need to create the AdventureWorks Analysis Services Project, and run the Visual Studio solution file that will most probably be located at:
C:\Program Files\Microsoft SQL Server
	\90\Samples\Analysis Services\
	Programmability\AMO\
	AMOAdventureWorks\CS\
	AmoAdventureWorks.sln
If you don’t have SQL Server Analysis Services, you can still get the sample database to play with, but you’ll need to download it.

XML for Analysis (XMLA)

XMLA is the standard that defines the interaction between the client application and the analytical data provider. It’s a SOAP-based XML protocol that is based on OLE DB, and it has just two methods – Discover and Execute. Essentially, Execute runs a query and gets an XML result set.

Execute is used, as the name suggests, for executing commands. XMLA defines just one command for use in Execute – Statement. Statement is then used to execute a query, which can be formed using MDX, DMX, or SQL. Execute returns either an MDDataSet or a Rowset. This is pretty much the heart of OLAP reporting using XMLA, and that’s fine if you know exactly what ‘questions’ your users will want to ask when you’re creating your application.

If, on the other hand, you want to create a user interface that you can provide for your users so they can construct their own queries without learning MDX, DMX or SQL, you need to be able to find out what data is available from your OLAP server, then offer the options to your users. To find out what’s available, you use the Discover method to find the list of available data sources, and information about the data source providers. Discover has many parameters, which you can use to find information about your databases, cubes, and the cube dimensions. You can also discover information about your OLEDB (for OLAP/DM) schema rowsets. Discover methods return a tabular resultset.

The syntax of the Discover method is:

<Discover>
	<RequestType>...</RequestType>
	<Restrictions>...</Restrictions>
	<Properties>...</Properties>
</Discover>
For example, the following code gets a list of the cubes from the Adventure Works data warehouse sample discussed earlier:
<Discover xmlns=”urn:schemas-microsoft-com:xml-analysis”>
	<RequestType>
		MDSCHEMA_CUBES
	</RequestType>
	<Restrictions>
		<RestrictionList>
			<CATALOG_NAME>
				Adventure Works DW
			</CATALOG_NAME>
		</RestrictionList>
	</Restrictions>
	<Properties>
		<PropertyList>
			<DataSourceInfo>
				Provider=MSOLAP;Data Source=local;
			</DataSourceInfo>
			<Catalog>Adventure Works DW</Catalog>
			<Format>Tabular</Format>
			</PropertyList>
	</Properties>
</Discover>
You can alter what data will be obtained by changing the details in the PropertyList element of the code. The elements in the PropertyList let you define pretty much everything about the way data is handled. For instance, you can specify the format of the axes of your data set, with choices for cross products or tuples (a tuple is an ordered collection of members, where a member is an item in a hierarchy that represents one or more records in the underlying relational database. A member is the lowest level of reference used when accessing cell data in a cube). You can specify the catalog, the schema, the name of the data cube, and the connection details. There are also equivalents for the OLE DB properties such as DB_MODE and DBPROP in their various incarnations.

The full list of properties you can set runs over 100 – see the complete list.

While Discover lets you find out things about your data source, Execute can be used in your application to run provider-specific commands against XML for Analysis data sources, in this case to an instance of Microsoft SQL Server 2005 Analysis Services (SSAS). The commands you can send are requests for the retrieval of data from the server or updating of data on the server.

For example, the following code selects all the members of the measures in Adventure Works. In OLAP cubes, a measure is a set of values that are based on a column in the cube’s fact table. Measures are the central values that are aggregated and analysed, and a version of this code is pretty much always used as the starting point for finding out what data is available to you:

<Execute xmlns=”urn:schemas-microsoft-
	com:xml-analysis”>
	<Command>
	<Statement>
		SELECT [Measures].MEMBERS ON
		COLUMNS FROM [Adventure Works]
	</Statement>
	</Command>
	<Properties>
		<PropertyList>
			<DataSourceInfo>
				Provider=MSOLAP;
				Data Source=local;
			</DataSourceInfo>
		<Catalog>Adventure Works
			DW</Catalog>
		<Format>
			Multidimensional</Format>
		<AxisFormat>
			ClusterFormat</AxisFormat>
		</PropertyList>
	</Properties>
</Execute>

ADO MD.NET

While XMLA is the underlying technology used to communicate with SQL Server 2005 Analysis Services, from Visual Studio .NET, you need to use ADO MD.NET. This is the .NET Framework data provider for communicating with Analysis Services. It then uses the XML for Analysis protocol, and can be used with either TCP/IP or HTTP connections to transmit and receive SOAP requests and responses that XMLA will recognise.

You can send ADO MD.NET commands in MDX, DMX, Analysis Services Scripting Language (ASSL), or SQL. You may or may not get data returned, depending on the query you send.

When you’re developing client applications, you need to use the Microsoft.AnalysisServices.AdomdClient object. This sends XMLA requests to Analysis server, and parses XMLA responses into the .NET object model. The AdomdClient is a replacement for ADOMD/OLEDB. If you want to carry on using ADOMD/OLEDB, it is still supported, and is suitable for unmanaged apps. However, the new object model is pretty similar to ADOMD, so you shouldn’t have too much difficulty in swapping. The main difference lies in a collection of new objects that you have available.

To set up and execute a successful query, you first of all need to make a connection to your Analysis Services database. You do this using the AdomdConnection object. The connection object expects a string separated by semicolons, and you can specify some or all of Data source; catalog; user id; and password:

Dim strConnectionString As String
Dim objConnection As AdomdConnection
strConnectionString =
	“Data Source= localhost;
	Catalog = AdventureWorksAS;
	User ID=kay”
objConnection = New AdomdConnection()
objConnection.ConnectionString =
	strConnectionString
objConnection.Open()
Once open, you can retrieve the session ID:
strSessionID = objConnection.SessionID
Once you’ve set up your connection, you can run commands against it, and we’ll attempt to retrieve some data. The first thing to do is to create your command. There are two ways to do this; you can use the AdomdCommand constructor, or you can use the CreateCommand method of the AdomdConnection object. You set the text of the command that you are going to run using the CommandText property:
AdomdCommand cmd = objConnection.CreateCommand()
cmd.CommandText=
“<Execute xmlns=”urn:schemas-microsoft-com:xml-analysis”>
	<Command>
		<Statement>
			SELECT [Measures].MEMBERS ON COLUMNS
				FROM [Adventure Works]
		</Statement>
	</Command>
	<Properties>
		<PropertyList>
			<DataSourceInfo>
				Provider=MSOLAP;Data Source=local;
			</DataSourceInfo>
			<Catalog>Adventure Works DW</Catalog>
			<Format>Multidimensional</Format>
			<AxisFormat>ClusterFormat</AxisFormat>
		</PropertyList>
	</Properties>
</Execute>”
Once you’ve created your command object, you can use a variety of Execute commands depending on the action you want to carry out. In many cases, you would use ExecuteNonQuery, which sends a command but doesn’t return any results.

If you want to return a stream of data, you can use ExecuteReader. ExecuteCellSet gives you a CellSet object, and Execute XmlReader returns an XMLReader object that contains the data in an XMLA compliant format. Of these methods, the CellSet object has the highest overheads but gives you maximum information through its object model. XmlReader is the fastest for data retrieval, but offers least interactivity.

If we assume that we want the information returned as an XML stream, we’d use:

Command.executeXMLReader
In this case, you’d set up an XMLReader object that could then be used to handle the data:
Dim reader As System.Xml.XmlReader =
	command.ExecuteXmlReader
The XMLReader class gives you fast, non-cached, forward-only access to XML data, which you can then use in your application to present the information to your users so they can select which data they want to play with.

Conclusion

Database programming on its own is hard enough; moving to business intelligence applications makes the underlying concepts even trickier, and when you throw in new programming facilities, the whole thing can appear overwhelmingly complex. However, if you play around with the samples, you’ll discover that much of ADO MD.NET is actually quite straightforward, and you should be able to add BI facilities to your applications without spending an eternity making it all work.


Kay Ewbank is the editor of Server Management magazine and is a highly experienced database analyst who has followed the development of database technology from dbase through to today’s SQL servers.

What’s new in ADO MD.NET

If you’ve used ADO MD in its earlier incarnations, there are several new objects to enable you to write better applications. In particular, there are some data mining collections, a NamedSet object, and a KPI object.
  • NamedSet – If you were driven mad by the need to write long set expressions every time you wanted to create MDX queries, you’ll be glad to know that ADO MD.NET supports named sets. These are persisted set expressions that are associated with an alias and defined as part of the structure of a cube. In addition to providing an easier way to manage your set expressions, named sets also offer a way of maintaining the set expression independently of the queries that use it.
  • KPI – Key Performance Indicators (KPIs) are metadata wrappers around measures and Multidimensional Expressions (MDX) expressions. KPIs define up to four expressions for measures, and you can use them to make the information more understandable to your users. The four expressions that you can define for a KPI are the actual value; the goal value; the status and the trend. The status is a normalised value between 1 and 1 that shows the status of the actual value versus the goal, where 1 is ‘very bad,’ and 1 is ‘very good’. The trend is another normalised value between 1 and 1 that shows the trend over time, from 1 (getting a lot worse), to 1 (getting a lot better).
  • Data Mining – ADO MD comes with three collections that you can use to retrieve mining model metadata. These are the MiningModelCollection, which gives you a list of every mining model in the data source; the MiningServiceCollection, which returns provides information about the available mining algorithms; and finally, the MiningStructureCollection, which exposes information about the mining structures on the server.

You can also work with the new Key Performance Indicators from within Visual Studio
You can also work with the new Key Performance Indicators from within Visual Studio


Return to Database Development Zone

NetAdvantage Free Trial - click for details