This project includes the data set for a Redshift benchmark.
- SlideShare
- Test Data
- s3://hapyrus-examples/redshift-benchmark/ad-network-examples/case-01/ad_campaigns
- s3://hapyrus-examples/redshift-benchmark/ad-network-examples/case-01/advertisers
- s3://hapyrus-examples/redshift-benchmark/ad-network-examples/case-01/publishers
- s3://hapyrus-examples/redshift-benchmark/ad-network-examples/case-01/imp_logs
- s3://hapyrus-examples/redshift-benchmark/ad-network-examples/case-01/click_logs
- Redshift cluster
- you need to launch a Redshift cluster
- the smallest instance type is sufficient
- dw.hs1.xlarge single node
- see Amazon Redshift Getting Started Guide
- Local environment
- PostgreSQL client
- Create tables
- run sql/create_tables_redshift.sql on your Redshift cluster
- Copy the test data set from our S3 bucket to your Redshift cluster (it took over 17 hours in our case)
- edit sql/copy_all_[data-size].sql and replace [aws-access-key-id] and [aws-secret-access-key] with your own credentials
- run sql/copy_all_[data-size].sql on your Redshift cluster
- Run the test SQL
- run sql/test-query.sql on your Redshift cluster
- See sql/create_tables_hadoop_hive.sql to create Hive tables.
- Run the test SQL
- see sql/test-query.sql
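
The three Redshift steps above (create tables, copy data, run the test SQL) can be scripted. The following is a minimal sketch, not part of this project, that assumes the PostgreSQL client is `psql` and only builds the command lines. The host, user, and database names are hypothetical placeholders, and `sql/copy_all_example.sql` stands in for whichever `sql/copy_all_[data-size].sql` file you edited.

```python
import shlex

# Hypothetical connection settings; replace them with your own cluster's values.
HOST = "example-cluster.abc123.us-east-1.redshift.amazonaws.com"
PORT = 5439  # Redshift's default port
USER = "masteruser"
DBNAME = "dev"


def psql_command(sql_file, host=HOST, port=PORT, user=USER, dbname=DBNAME):
    """Return the psql argument list that runs one SQL file on the cluster."""
    return [
        "psql",
        "-h", host,
        "-p", str(port),
        "-U", user,
        "-d", dbname,
        "-f", sql_file,
    ]


def benchmark_commands(copy_file):
    """Return the psql commands for the three Redshift steps, in order.

    copy_file is the copy_all_[data-size].sql file you edited with your keys.
    """
    steps = [
        "sql/create_tables_redshift.sql",  # create tables
        copy_file,                         # copy the test data from S3
        "sql/test-query.sql",              # run the test SQL
    ]
    return [psql_command(path) for path in steps]


if __name__ == "__main__":
    # "sql/copy_all_example.sql" is a placeholder, not a real file name.
    for cmd in benchmark_commands("sql/copy_all_example.sql"):
        print(shlex.join(cmd))
```

The script prints the commands instead of running them, so you can review them first. To execute a command, pass its list to `subprocess.run(cmd, check=True)`; `psql` will prompt for the password unless one is supplied through `PGPASSWORD` or a `.pgpass` file.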
This project is licensed under the Apache License, Version 2.0 and powered by Hapyrus Inc.