15. Various commercial parallel RDBs
Database                        Vendor     Since
Teradata                        Teradata   1983
Teradata Aster                  Teradata   2005
PureData System for Analytics   IBM        2000
Exadata                         Oracle     2008
Greenplum                       Pivotal    2003
SQL Server PDW                  Microsoft  around 2010
Redshift                        Amazon     2012
Various parallel RDBs
21. MapReduce
Input:         (k1, v1), (k2, v2), (k3, v3), (k4, v4), (k5, v5), (k6, v6)
  | Map
Intermediate:  (k'1, v'1), (k'1, v'2), (k'1, v'3), (k'2, v'4), (k'3, v'5), (k'3, v'6)
  | Reduce
Output:        (k''1, v''1), (k''2, v''2), (k''3, v''3)
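The data flow in this slide can be sketched in a few lines of Python. This is only a single-process illustration, not how any particular MapReduce engine is implemented. The names `map_reduce`, `page_mapper`, and `count_reducer`, and the sample records, are hypothetical. The sample is chosen so that six input pairs produce six intermediate pairs under three keys, as in the figure.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over (k, v) pairs, group by intermediate key, then reduce."""
    groups = defaultdict(list)
    for key, value in records:
        # Map: each input pair yields zero or more intermediate pairs (k', v').
        for out_key, out_value in mapper(key, value):
            # Shuffle: collect every value that shares the same intermediate key.
            groups[out_key].append(out_value)
    # Reduce: each intermediate key and its value list yields one output pair.
    return {key: reducer(key, values) for key, values in groups.items()}


def page_mapper(_click_id, page_type):
    # Emit one intermediate pair per click, keyed by page type.
    yield page_type, 1


def count_reducer(_page_type, ones):
    # Sum the per-click counts collected for one page type.
    return sum(ones)


# Hypothetical input: six (click_id, page_type) pairs.
records = [
    (1, "home"), (2, "home"), (3, "home"),
    (4, "search"),
    (5, "product"), (6, "product"),
]

result = map_reduce(records, page_mapper, count_reducer)
print(result)
```

In a real cluster the map and reduce calls run on many nodes and the grouping step moves data over the network. Here a dictionary stands in for that shuffle.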
38. You can call MapReduce from SQL
select count(distinct user_id)
from npath(
    on clicks
    partition by user_id
    order by timestamp
    mode(overlapping)
    pattern('H.S.P')
    symbols(
        page_type = 'home' as H,
        page_type = 'search' as S,
        page_type = 'product' as P)
    result(first(user_id of H) as user_id)
);
Recently, npath was also added to Hive (as MatchPath).
You can combine MapReduce with SQL
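To make the intent of the npath query above concrete, here is a rough Python emulation. It is a sketch under assumptions, not the Aster or Hive implementation. It assumes that the pattern `H.S.P` means a home click immediately followed by a search click and then a product click within one user's time-ordered clicks. The function name `users_matching_pattern` and the sample rows are hypothetical. Because the query only counts distinct users, the overlapping mode does not change the result, so it is not modeled.

```python
from itertools import groupby
from operator import itemgetter

# symbols(...): map each page_type to its pattern symbol.
SYMBOLS = {"home": "H", "search": "S", "product": "P"}
# pattern('H.S.P'): H, then S, then P on consecutive rows (assumed meaning).
PATTERN = ["H", "S", "P"]


def users_matching_pattern(clicks):
    """clicks: iterable of (user_id, timestamp, page_type) tuples."""
    matched = set()
    # partition by user_id, order by timestamp
    ordered = sorted(clicks, key=itemgetter(0, 1))
    for user_id, rows in groupby(ordered, key=itemgetter(0)):
        # Rows whose page_type has no symbol become None and break a match.
        symbols = [SYMBOLS.get(page_type) for _, _, page_type in rows]
        n = len(PATTERN)
        if any(symbols[i:i + n] == PATTERN for i in range(len(symbols) - n + 1)):
            matched.add(user_id)
    return matched


# Hypothetical sample data for the clicks table.
clicks = [
    ("u1", 1, "home"), ("u1", 2, "search"), ("u1", 3, "product"),
    ("u2", 1, "home"), ("u2", 2, "product"),
    ("u3", 3, "product"), ("u3", 1, "home"), ("u3", 2, "search"),
    ("u4", 1, "home"), ("u4", 2, "other"), ("u4", 3, "search"), ("u4", 4, "product"),
]

matched_users = users_matching_pattern(clicks)
# select count(distinct user_id)
distinct_user_count = len(matched_users)
print(distinct_user_count)
```

Each user's rows are independent of every other user's, which is why this kind of function parallelizes naturally: the partitioning by `user_id` plays the role of the shuffle, and the per-user scan plays the role of the reduce step.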