Understanding Analysis Services


So far in this course you’ve focused on getting data into a SQL Server database and
then later getting the same data out of the database. You’ve seen how to create
tables, insert data, and use SQL statements, views, and stored procedures to retrieve
the data. This pattern of activity, where individual actions deal with small pieces of
the database, is sometimes called online transaction processing, or OLTP.
But there’s another use for databases, especially large databases. Suppose you run an online
book store and have sales records for 50 million book sales. Maybe books on introductory
biology show a strong spike in sales every September. That’s a fact that you could use to your
advantage in ordering stock, if only you knew about it.
Searching for patterns like this and summarizing them is called online analytical
processing, or OLAP. Microsoft SQL Server 2005 includes a separate program called
Microsoft SQL Server 2005 Analysis Services to perform OLAP analysis. In this
chapter you’ll learn the basics of setting up and using Analysis Services.
Understanding Analysis Services
The basic idea of OLAP is fairly simple. Let’s think about that book ordering data for
a moment. Suppose you want to know how many people ordered a particular book
during each month of the year. You could write a fairly simple query to get the
information you want. The catch is that it might take a long time for SQL Server to
churn through that many rows of data.
And what if the data was not all in a single SQL Server table, but scattered around in
various databases throughout your organization? The customer info, for example,
might be in an Oracle database, and supplier information in a legacy xBase database.
SQL Server can handle distributed heterogeneous queries, but they’re slower.
What if, after seeing the monthly numbers, you wanted to drill down to weekly or
daily numbers? That would be even more time-consuming and require writing even
more queries.
This is where OLAP comes in. The basic idea is to trade off increased storage space
now for speed of querying later. OLAP does this by precalculating and storing
aggregates. When you identify the data that you want to store in an OLAP database,
Analysis Services analyzes it in advance and figures out those daily, weekly, and
monthly numbers and stores them away (and stores many other aggregations at the
same time). This takes up plenty of disk space, but it means that when you want to
explore the data you can do so quickly.
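The trade-off can be sketched in a few lines of Python. This is only an illustration of the idea of precalculating aggregates, not a description of how Analysis Services stores data internally. The sample orders and the build_aggregates function are made up for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical order records: (order date, book title, copies sold).
orders = [
    (date(2005, 9, 1), "Intro Biology", 3),
    (date(2005, 9, 2), "Intro Biology", 5),
    (date(2005, 9, 12), "Intro Biology", 2),
    (date(2005, 10, 3), "Intro Biology", 1),
]

def build_aggregates(rows):
    """Scan the detail rows once and store totals at three levels of detail."""
    daily, weekly, monthly = Counter(), Counter(), Counter()
    for day, title, copies in rows:
        iso_year, iso_week, _ = day.isocalendar()
        daily[(title, day)] += copies
        weekly[(title, iso_year, iso_week)] += copies
        monthly[(title, day.year, day.month)] += copies
    return daily, weekly, monthly

# Pay the cost (time and storage) once, up front.
daily, weekly, monthly = build_aggregates(orders)

# Later questions become simple lookups instead of scans of every row.
print(monthly[("Intro Biology", 2005, 9)])  # 10
```

The three counters hold more entries than a single summary would, which is the extra storage the text mentions. In exchange, a monthly, weekly, or daily question is answered without touching the detail rows again.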

Later in the chapter, you’ll see how you can use Analysis Services to extract summary
information from your data. First, though, you need to familiarize yourself with a new vocabulary.
The basic concepts of OLAP include:

Cube
Dimension table
Dimension
Level
Fact table
Measure
Schema
Cube
The basic unit of storage and analysis in Analysis Services is the cube. A cube is a
collection of data that’s been aggregated to allow queries to return data quickly. For
example, a cube of order data might be aggregated by time period and by title,
making the cube fast when you ask questions concerning orders by week or orders
by title.
Cubes are organized into dimensions and measures. Dimensions come from dimension
tables, while measures come from fact tables.
Dimension table
A dimension table contains hierarchical data by which you’d like to summarize.
Examples would be an Orders table that you might group by year, month, week,
and day of receipt, or a Books table that you might want to group by genre and title.
Dimension
Each cube has one or more dimensions, each based on one or more dimension tables. A
dimension represents a category for analyzing business data: time or category in the examples
above. Typically, a dimension has a natural hierarchy so that lower results can be “rolled up”
into higher results. For example, in a geographical dimension you might have city totals aggregated
into state totals, or state totals into country totals.
Level
Each type of summary that can be retrieved from a single dimension is called a level.
For example, you can speak of a week level or a month level in a time dimension.
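To make the idea of levels concrete, here is a minimal Python sketch of rolling city totals up to the state and country levels of a geographical dimension. The city names, the numbers, and the roll_up function are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical dimension table: each city belongs to a state,
# and each state belongs to a country.
geography = {
    "Seattle": ("Washington", "USA"),
    "Spokane": ("Washington", "USA"),
    "Portland": ("Oregon", "USA"),
}

# Hypothetical totals at the city level, the lowest level of the hierarchy.
city_totals = {"Seattle": 120, "Spokane": 30, "Portland": 50}

def roll_up(totals, level):
    """Aggregate city totals to the 'state' or 'country' level."""
    index = {"state": 0, "country": 1}[level]
    result = defaultdict(int)
    for city, amount in totals.items():
        result[geography[city][index]] += amount
    return dict(result)

state_totals = roll_up(city_totals, "state")
country_totals = roll_up(city_totals, "country")
```

Each call produces the summary for one level: state_totals holds one entry per state, and country_totals holds one entry per country.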

Fact table
A fact table contains the basic information that you wish to summarize. This might be
order detail information, payroll records, drug effectiveness information, or
anything else that’s amenable to summing and averaging. Any table that you’ve
used with a Sum or Avg function in a totals query is a good bet to be a fact table.
Measure
Every cube will contain one or more measures, each based on a column in a fact table
that you’d like to analyze. In the cube of book order information, for example, the
measures would be things such as unit sales and profit.
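The relationship between a fact table and its measures can be sketched with an ordinary totals query. The example below uses Python's built-in sqlite3 module rather than SQL Server so that it runs anywhere. The table layout, column names, and rows are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical fact table: one row per order line.
conn.execute(
    "CREATE TABLE OrderDetails (title TEXT, unit_sales INTEGER, profit REAL)"
)
conn.executemany(
    "INSERT INTO OrderDetails VALUES (?, ?, ?)",
    [
        ("Intro Biology", 3, 12.0),
        ("Intro Biology", 5, 20.0),
        ("Organic Chemistry", 2, 9.0),
    ],
)

# A totals query: the grouped column plays the role of a dimension,
# and the summed or averaged columns play the role of measures.
measures = conn.execute(
    "SELECT title, SUM(unit_sales), SUM(profit), AVG(profit) "
    "FROM OrderDetails GROUP BY title ORDER BY title"
).fetchall()
conn.close()
```

A cube stores the results of queries like this one in advance, for many combinations of dimensions at once.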
Schema
Fact tables and dimension tables are related, which is hardly surprising, given that
you use the dimension tables to group information from the fact table. The relations
within a cube form a schema. There are two basic OLAP schemas: star and
snowflake. In a star schema, every dimension table is related directly to the fact table.
In a snowflake schema, some dimension tables are related indirectly to the fact table.
For example, if your cube includes OrderDetails as a fact table, with Customers
and Orders as dimension tables, and Customers is related to Orders, which in
turn is related to OrderDetails, then you’re dealing with a snowflake schema.
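The snowflake arrangement just described can be sketched as three small tables. The example uses Python's built-in sqlite3 module as a stand-in for SQL Server. The table names come from the text, but the columns and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Country TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY,
                     CustomerID INTEGER REFERENCES Customers(CustomerID));
CREATE TABLE OrderDetails (OrderID INTEGER REFERENCES Orders(OrderID),
                           Quantity INTEGER);
INSERT INTO Customers VALUES (1, 'USA'), (2, 'Canada');
INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO OrderDetails VALUES (10, 4), (11, 1), (12, 7);
""")

# Snowflake: Customers reaches the fact table only through Orders,
# so grouping by a customer attribute takes two joins.
by_country = conn.execute("""
    SELECT c.Country, SUM(d.Quantity)
    FROM OrderDetails AS d
    JOIN Orders AS o ON o.OrderID = d.OrderID
    JOIN Customers AS c ON c.CustomerID = o.CustomerID
    GROUP BY c.Country
    ORDER BY c.Country
""").fetchall()
conn.close()
```

In a star schema, OrderDetails would carry its own CustomerID column, and the same question would need only a single join from the fact table to Customers.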

There are additional schema types besides the star and snowflake
schemas, including parent-child schemas and data-mining
schemas. However, the star and snowflake schemas are the most
common types in normal cubes.
Introducing Business Intelligence Development Studio
Business Intelligence Development Studio (BIDS) is a new tool in SQL Server 2005
that you can use for analyzing SQL Server data in various ways. You can build three
different types of solutions with BIDS:

Analysis Services projects
Integration Services projects (you’ll learn about SQL Server Integration Services in Chapter 16)
Reporting Services projects (you’ll learn about SQL Server Reporting Services in Chapter 18)
To launch Business Intelligence Development Studio, select Microsoft SQL Server
2005 > SQL Server Business Intelligence Development Studio from the Programs
menu. BIDS shares the Visual Studio shell, so if you have Visual Studio installed on
your computer, this menu item will launch Visual Studio complete with all of the
Visual Studio project types (such as Visual Basic and C# projects).
Creating a Data Cube
To build a new data cube using BIDS, you need to perform these steps:

Create a new Analysis Services project
Define a data source
Define a data source view
Invoke the Cube Wizard
We’ll look at each of these steps in turn.
You’ll need to have the AdventureWorksDW sample database
installed to complete the examples in this chapter. This database is
one of the samples that’s available with SQL Server.
Creating a New Analysis Services Project
To create a new Analysis Services project, you use the New Project dialog box in
BIDS. This is very similar to creating any other type of new project in Visual Studio.
Try It!
To create a new Analysis Services project, follow these steps:
1. Select Microsoft SQL Server 2005 > SQL Server Business Intelligence
Development Studio from the Programs menu to launch Business Intelligence
Development Studio.
2. Select File > New > Project.
3. In the New Project dialog box, select the Business Intelligence Projects project
type.
4. Select the Analysis Services Project template.
5. Name the new project AdventureWorksCube1 and select a convenient location to
save it.
6. Click OK to create the new project.
Figure 15-1 shows the Solution Explorer window of the new project, ready to be
populated with objects.
Figure 15-1: New Analysis Services project
Defining a Data Source
To define a data source, you’ll use the Data Source Wizard. You can launch this
wizard by right-clicking on the Data Sources folder in your new Analysis Services
project. The wizard will walk you through the process of defining a data source for
your cube, including choosing a connection and specifying security credentials to be
used to connect to the data source.
Try It!
To define a data source for the new cube, follow these steps:
1. Right-click on the Data Sources folder in Solution Explorer and select New Data
Source.
2. Read the first page of the Data Source Wizard and click Next.
3. You can base a data source on a new or an existing connection. Because you
don’t have any existing connections, click New.
