B561 Advanced Database Concepts: 0 Introduction
§0 Introduction
Qin Zhang
Self introduction: my research interests
How to represent data?
How to operate on data?
Product
PName        Price   Category     Manufacturer
Gizmo        19.99   Gadgets      GizmoWorks
Powergizmo   29.99   Gadgets      GizmoWorks
SingleTouch  149.99  Photography  Canon
MultiTouch   203.99  Household    Hitachi

Company
CName       StockPrice  Country
GizmoWorks  25          USA
Canon       65          Japan
Hitachi     15          Japan
Q: Find all products priced at most 200 that are manufactured in Japan.
How to operate on data? (cont.)
• SQL
    SELECT x.PName, x.Price
    FROM Product x, Company y
    WHERE x.Manufacturer = y.CName
      AND y.Country = 'Japan'
      AND x.Price <= 200
• Relational Algebra
    $\pi_{\text{PName},\,\text{Price}}\big(\sigma_{\text{Price} \le 200 \,\wedge\, \text{Country} = \text{'Japan'}}(\text{Product} \bowtie_{\text{Manufacturer} = \text{CName}} \text{Company})\big)$
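As a rough check that the SQL query and the relational algebra expression above describe the same result, the following Python sketch evaluates both on the example rows. It is only an illustration: the helper names `run_sql` and `run_algebra`, the column types, and the use of an in-memory SQLite database are assumptions made here, not part of the slides. The algebra is evaluated naively (join, then select, then project) on plain tuples.

```python
import sqlite3

# Rows copied from the Product and Company tables on the slide.
PRODUCTS = [
    ("Gizmo", 19.99, "Gadgets", "GizmoWorks"),
    ("Powergizmo", 29.99, "Gadgets", "GizmoWorks"),
    ("SingleTouch", 149.99, "Photography", "Canon"),
    ("MultiTouch", 203.99, "Household", "Hitachi"),
]
COMPANIES = [
    ("GizmoWorks", 25, "USA"),
    ("Canon", 65, "Japan"),
    ("Hitachi", 15, "Japan"),
]


def run_sql():
    """Evaluate the SQL query in an in-memory SQLite database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Product "
        "(PName TEXT, Price REAL, Category TEXT, Manufacturer TEXT)"
    )
    conn.execute(
        "CREATE TABLE Company (CName TEXT, StockPrice REAL, Country TEXT)"
    )
    conn.executemany("INSERT INTO Product VALUES (?, ?, ?, ?)", PRODUCTS)
    conn.executemany("INSERT INTO Company VALUES (?, ?, ?)", COMPANIES)
    rows = conn.execute(
        """
        SELECT x.PName, x.Price
        FROM Product x, Company y
        WHERE x.Manufacturer = y.CName
          AND y.Country = 'Japan'
          AND x.Price <= 200
        """
    ).fetchall()
    conn.close()
    return rows


def run_algebra():
    """Evaluate the algebra expression step by step: join, select, project."""
    # Join Product and Company on Manufacturer = CName.
    joined = [p + c for p in PRODUCTS for c in COMPANIES if p[3] == c[0]]
    # Selection: Price <= 200 AND Country = 'Japan'.
    # After the join, Price is column 1 and Country is column 6.
    selected = [t for t in joined if t[1] <= 200 and t[6] == "Japan"]
    # Projection onto (PName, Price).
    return [(t[0], t[1]) for t in selected]


sql_result = run_sql()
algebra_result = run_algebra()
print(sql_result)      # [('SingleTouch', 149.99)]
print(algebra_result)  # [('SingleTouch', 149.99)]
```

Only SingleTouch qualifies: MultiTouch is made in Japan but costs more than 200, and both GizmoWorks products are made in the USA. Strictly speaking, relational algebra projection works on sets and removes duplicates, while this sketch keeps lists; the two coincide here because the result has a single row.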
How to speed up the operation?
How to deal with transactions?
Summary
Database =
• Logic (express the query): Data Representation, Relational Algebra, SQL (Datalog), etc.
• System (implementation)
• Algorithm (solve the query): Indexing, Query Optimization, Concurrency Control, etc.
Concept (our focus)
Implementation (see B662 Database System and Internal Design)
And you need math!!
What’s more in this course?
Advanced topics
Other important topics in databases
Tentative course plan
Part 0 : Introductions
Part 1 & 2 : Basics
– SQL, Relational Algebra
– Data Models, Storage, Indexing
Part 3 : Optimization
Part 4 : Transactions
Part 5 : Data Privacy
Part 6 : I/O-Efficient Algorithms
Part 7 : Streaming Algorithms
Part 8 : Data Integration
Part 9 : MapReduce
We will also have some student presentations at the end of the course
Resources
ccontrol.aspx
Resources (cont.)
Associate Instructors:
• Erfan Sadeqi Azer
• Le Liu
• Yifan Pan
• Ali Varamesh
• Prasanth Velamala
Office hours: Posted on course website
Grading
LaTeX
Prerequisite
Frequently asked questions
The goal of this course
Big Data
Magazine covers
Source and challenge
Source
• Retailer databases: Amazon, Walmart
• Logistics, financial & health data: Stock prices
• Social networks: Facebook, Twitter
• Pictures by mobile devices: iPhone
• Internet traffic: IP addresses
• New forms of scientific data: Large Synoptic Survey Telescope
Challenge
• Volume    }
• Velocity  } The main technical challenges
• Variety (Documents, Stock records, Personal profiles, Photographs, Audio & Video, 3D models, Location data, ...)
What does Big Data really mean?
Big Data:
A marketing buzzword??
A good reading topic
Popular models for big data
(see the separate slides)
Summary of the introduction
Thank you!
Questions?
A few introductory slides are based on Rasmus Pagh’s slides:
http://www.itu.dk/people/pagh/ADBT06/