Lecture 2 - Power Excel BI Workflow

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 24

CSC4304: Data Management II

Power Excel (In-Memory) BI Workflow

Abdullahi Ahamad Shehu


[email protected]
“Lecture Time”: Wednesdays (2-4, TH A)
“Office” : Faculty of Computing Extension Wing A

1
The DW Development Cycle
1. Building the DW
• Requirements analysis
• DW Design
• Conceptual (Multidimensional) modelling: facts, dimensions…
• Logical modelling: star and snowflake schemas (Fact tables, dimension tables)
• Physical modelling: indexing, partitioning…
2. Loading Data into the DW: ETL (Extraction, Transformation, Loading)
3. Exploiting the DW: Querying, OLAP, Data Mining…
• Query languages: MDX (MultiDimensional eXpressions), DAX (Data Analysis
Expressions), SQL…

2
Conceptual Design: Multidimensional Model

• Data is, conceptually, viewed in an n-dimensional space: A data cube


• A data cube is composed of dimensions and facts (measures)
• Example: A three-dimensional cube for sales data with dimensions Product, Time, and Customer, and
a measure Quantity
• Dimensions: Perspectives used to analyse the data
• Attributes describe dimensions
• Product dimension may have attributes ProductNumber, Name, Category and UnitPrice
• The cells or facts of a data cube have associated numeric values called measures
• Example: each cell represents Quantity of units sold and amount of Sales (only Quantity shown in figure)

3
Conceptual Design: Multidimensional Model
• Data granularity: level of detail at which measures are represented for each dimension of the cube
• Example: sales figures aggregated to granularities Category, Quarter, and City
• Dimension Hierarchies: allow viewing data at several levels of granularity:

4
Example OLAP Operations

5
From Conceptual to Logical Design: Relational DW (Star Schema)
Dimension
Table
Fact
Table

Measures

6
Power Excel Landscape
Power Query/Power Pivot workflow

20 December 20 7
24
Motivation: Why Excel BI? (Real-World Case)
• A Bayero University, Kano conducts 3 yearly student surveys the results of
which are stored in separate Excel files in folders (a type of survey stored in
one folder). The University will then prepare reports and disseminate them
to various stakeholders. This process is manual and time-consuming (1
person takes 2 to 3 weeks) because it involves copying new data into a
master spreadsheet that has all the formulae for data transformation,
calculations…
• The University does not have the resources to purchase Enterprise SQL
Server licence (or equivalent)

20 December 20 8
24
Motivation: Why Excel BI? (Real-World Case)

20 December 20
24
Source: Karen Jewel (2020) MSc Project Report. RGU. 9
Excel in the news!
• Covid: Test error 'should never have happened' - BBC News (5 Oct 2020)
• Developers used xls format instead of xlsx, limiting the number of rows to
65,000 instead of 1m+
• https://www.bbc.co.uk/news/uk-54422505

20 December 20 10
24
Power Excel Workflow
• BI tools available directly in Excel (provided you have compatible version); no additional
software required. Many organisations have licences for Microsoft Office, including Excel.

RAW DATA POWER QUERY DATA MODEL POWER PIVOT & DAX
Flat files (csv, txt), Excel tables, (aka “Get & Transform”) Explore and analyze the entire
Create table relationships, add
databases (SQL, Azure), folders, Connect to sources, import calculated columns, define data model, and create powerful
streaming sources, web data, etc. data, and apply shaping and hierarchies and perspectives, etc. measures using Data Analysis
transformation tools (ETL) Expressions (DAX)

Source: Excel Maven & SecondLens Consulting, LLC.


20 December 20 11
24
Power Excel Workflow - Example
• Superstore keeps orders records in single Excel sheet.

• Requirement: analyse sales data by customer, product and date.


• Identify measures and dimensions
• Measures: Profit, Quantity, Sales
• Dimensions and their Hierarchies:
• Customers - (customer name, city, state and region)
• Products - (product name, sub-category and category)
• Dates - (day, month, quarter and year)
20 December 20 12
24
Power Query
• Connect to, and import, source data
• Shape the data: create queries that transform source data
• Normalise your tables

20 December 20 13
24
Power Pivot
• Create Data Model:
• Relationships between tables
• Hierarchies within Dimension tables
• Note the filter directions (shown as arrows) in each
relationship; these will point from the “one” side of the
relationship (dimension table) to the “many” side (fact
table)
• Keep fact tables narrow
• ideally only contain quantitative values
(measures) and foreign keys; any extra
descriptive columns should live in a related
dimension table
20 December 20 14
24
Pivot Tables / Pivot Charts

20 December 20 15
24
Power Query
Types of Data Connections

From File From Database FromAzure From Online Services From Other Sources

20 December 20 Source: Excel Maven & SecondLens Consulting, LLC. 17


24
Data Loading Options

When you load data from Power Query, you have several options:
• Table
• Stores the data in a new or existing worksheet
• Requires relatively small data sets (<1mm rows)

• Connection Only
• Saves the data connection settings and applied steps
• Data does not load to a worksheet

• Add to Data Model


• Compresses and loads data to Excel’s Data Model
• Makes data accessible to Power Pivot for further analysis

20 December 20 18
24
Query Editor Tools
The HOME tab includes general settings and common table transformation tools

The TRANSFORM tab includes tools to modify existing columns (splitting/grouping, transposing, extracting text, etc.

The ADD COLUMN tools create new columns based on conditional rules, text operations, calculations, dates, etc.

20 December 20 Source: Excel Maven & SecondLens Consulting, LLC. 19


24
Modifying Queries
• Merging queries allows you to join tables based on a common column
• Merging adds columns to an existing table
• Just because you can merge tables, doesn’t mean you should.
• In general, it’s better to keep tables separate and define relationships
between them

• Appending queries allows you to combine (or stack) tables that share a
common structure and set of columns
• Appending adds rows to an existing table
• Use the “From Folder” query option to automatically append all files from
within the same folder

20 December 20 20
24
DAX - Data Analysis Expressions

20 December 20 21
24
DAX
• Useful formula language that drives Power Pivot, Power BI and SQL Server
Analysis Services
• Adds calculated columns and measures to your data model
• Calculated columns: add new columns to tables, typically in dimension tables.
Example: Year column
• Only create them when absolutely necessary (they increase file size). Alternative is to build them
at the source or using Power Query.
• Measures: add aggregated values, typically to fact tables. Example: Sum of Quantity

20 December 20 22
24
Common DAX functions
MATH & STATS LOGICAL TEXT FILTER DATE & TIME
Functions Functions Functions Functions Functions
Basic aggregation Functions for returning Functions to manipulate Lookup functions based Basic date and time
functions as well as information about values text strings or control on related tables and functions as well as
“iterators” evaluated at in a given conditional formats for dates, times filtering functions for advanced time
the row-level expression or numbers dynamic calculations intelligence operations

Common Examples: Common Examples: Common Examples: Common Examples: Common Examples:
• SUM • IF • CONCATENATE • CALCULATE • DATEDIFF
• AVERAGE • IFERROR • FORMAT • FILTER • YEARFRAC
• MAX/MIN • AND • LEFT/MID/RIGHT • ALL • YEAR/MONTH/DAY
• DIVIDE • OR • UPPER/LOWER • ALLEXCEPT • HOUR/MINUTE/SECOND
• COUNT/COUNTA • NOT • PROPER • RELATED • TODAY/NOW
• COUNTROWS • SWITCH • LEN • RELATEDTABLE • WEEKDAY/WEEKNUM
• DISTINCTCOUNT • TRUE • SEARCH/FIND • DISTINCT
• FALSE • REPLACE • VALUES Time Intelligence Functions:
Iterator Functions: • REPT • EARLIER/EARLIEST • DATESYTD
• SUBSTITUTE • HASONEVALUE • DATESQTD
• SUMX
• TRIM • HASONEFILTER • DATESMTD
• AVERAGEX
• UNICHAR • ISFILTERED • DATEADD
• MAXX/MINX
• USERELATIONSHIP • DATESINPERIOD
• RANKX
• COUNTX

20 December 20 Source: Excel Maven & SecondLens Consulting, LLC. 23


24
Time to practice
• No point covering each DAX function here. Better to learn this
through practice.
• So please jump into Lab 1 and 2…

20 December 20 24
24

You might also like