Lecture 2 - Power Excel BI Workflow
Lecture 2 - Power Excel BI Workflow
Lecture 2 - Power Excel BI Workflow
1
The DW Development Cycle
1. Building the DW
• Requirements analysis
• DW Design
• Conceptual (Multidimensional) modelling: facts, dimensions…
• Logical modelling: star and snowflake schemas (Fact tables, dimension tables)
• Physical modelling: indexing, partitioning…
2. Loading Data into the DW: ETL (Extraction, Transformation, Loading)
3. Exploiting the DW: Querying, OLAP, Data Mining…
• Query languages: MDX (MultiDimensional eXpressions), DAX (Data Analysis
Expressions), SQL…
2
Conceptual Design: Multidimensional Model
3
Conceptual Design: Multidimensional Model
• Data granularity: level of detail at which measures are represented for each dimension of the cube
• Example: sales figures aggregated to granularities Category, Quarter, and City
• Dimension Hierarchies: allow viewing data at several levels of granularity:
4
Example OLAP Operations
5
From Conceptual to Logical Design: Relational DW (Star Schema)
Dimension
Table
Fact
Table
Measures
6
Power Excel Landscape
Power Query/Power Pivot workflow
20 December 20 7
24
Motivation: Why Excel BI? (Real-World Case)
• A Bayero University, Kano conducts 3 yearly student surveys the results of
which are stored in separate Excel files in folders (a type of survey stored in
one folder). The University will then prepare reports and disseminate them
to various stakeholders. This process is manual and time-consuming (1
person takes 2 to 3 weeks) because it involves copying new data into a
master spreadsheet that has all the formulae for data transformation,
calculations…
• The University does not have the resources to purchase Enterprise SQL
Server licence (or equivalent)
20 December 20 8
24
Motivation: Why Excel BI? (Real-World Case)
20 December 20
24
Source: Karen Jewel (2020) MSc Project Report. RGU. 9
Excel in the news!
• Covid: Test error 'should never have happened' - BBC News (5 Oct 2020)
• Developers used xls format instead of xlsx, limiting the number of rows to
65,000 instead of 1m+
• https://www.bbc.co.uk/news/uk-54422505
20 December 20 10
24
Power Excel Workflow
• BI tools available directly in Excel (provided you have compatible version); no additional
software required. Many organisations have licences for Microsoft Office, including Excel.
RAW DATA POWER QUERY DATA MODEL POWER PIVOT & DAX
Flat files (csv, txt), Excel tables, (aka “Get & Transform”) Explore and analyze the entire
Create table relationships, add
databases (SQL, Azure), folders, Connect to sources, import calculated columns, define data model, and create powerful
streaming sources, web data, etc. data, and apply shaping and hierarchies and perspectives, etc. measures using Data Analysis
transformation tools (ETL) Expressions (DAX)
20 December 20 13
24
Power Pivot
• Create Data Model:
• Relationships between tables
• Hierarchies within Dimension tables
• Note the filter directions (shown as arrows) in each
relationship; these will point from the “one” side of the
relationship (dimension table) to the “many” side (fact
table)
• Keep fact tables narrow
• ideally only contain quantitative values
(measures) and foreign keys; any extra
descriptive columns should live in a related
dimension table
20 December 20 14
24
Pivot Tables / Pivot Charts
20 December 20 15
24
Power Query
Types of Data Connections
From File From Database FromAzure From Online Services From Other Sources
When you load data from Power Query, you have several options:
• Table
• Stores the data in a new or existing worksheet
• Requires relatively small data sets (<1mm rows)
• Connection Only
• Saves the data connection settings and applied steps
• Data does not load to a worksheet
20 December 20 18
24
Query Editor Tools
The HOME tab includes general settings and common table transformation tools
The TRANSFORM tab includes tools to modify existing columns (splitting/grouping, transposing, extracting text, etc.
The ADD COLUMN tools create new columns based on conditional rules, text operations, calculations, dates, etc.
• Appending queries allows you to combine (or stack) tables that share a
common structure and set of columns
• Appending adds rows to an existing table
• Use the “From Folder” query option to automatically append all files from
within the same folder
20 December 20 20
24
DAX - Data Analysis Expressions
20 December 20 21
24
DAX
• Useful formula language that drives Power Pivot, Power BI and SQL Server
Analysis Services
• Adds calculated columns and measures to your data model
• Calculated columns: add new columns to tables, typically in dimension tables.
Example: Year column
• Only create them when absolutely necessary (they increase file size). Alternative is to build them
at the source or using Power Query.
• Measures: add aggregated values, typically to fact tables. Example: Sum of Quantity
20 December 20 22
24
Common DAX functions
MATH & STATS LOGICAL TEXT FILTER DATE & TIME
Functions Functions Functions Functions Functions
Basic aggregation Functions for returning Functions to manipulate Lookup functions based Basic date and time
functions as well as information about values text strings or control on related tables and functions as well as
“iterators” evaluated at in a given conditional formats for dates, times filtering functions for advanced time
the row-level expression or numbers dynamic calculations intelligence operations
Common Examples: Common Examples: Common Examples: Common Examples: Common Examples:
• SUM • IF • CONCATENATE • CALCULATE • DATEDIFF
• AVERAGE • IFERROR • FORMAT • FILTER • YEARFRAC
• MAX/MIN • AND • LEFT/MID/RIGHT • ALL • YEAR/MONTH/DAY
• DIVIDE • OR • UPPER/LOWER • ALLEXCEPT • HOUR/MINUTE/SECOND
• COUNT/COUNTA • NOT • PROPER • RELATED • TODAY/NOW
• COUNTROWS • SWITCH • LEN • RELATEDTABLE • WEEKDAY/WEEKNUM
• DISTINCTCOUNT • TRUE • SEARCH/FIND • DISTINCT
• FALSE • REPLACE • VALUES Time Intelligence Functions:
Iterator Functions: • REPT • EARLIER/EARLIEST • DATESYTD
• SUBSTITUTE • HASONEVALUE • DATESQTD
• SUMX
• TRIM • HASONEFILTER • DATESMTD
• AVERAGEX
• UNICHAR • ISFILTERED • DATEADD
• MAXX/MINX
• USERELATIONSHIP • DATESINPERIOD
• RANKX
• COUNTX
20 December 20 24
24