Finance 🏦 Data Builder 🛠️ @ postgres 🐘
The finance data builder extracts data from several sources, loads it into a PostgreSQL database, and transforms it into models with dbt.
The data sources include Google News, Yahoo! Finance, and PayPal.
I use Airflow to manage the whole ELT process:
For Google News:
For Yahoo! Finance:
For PayPal:
I use dbt to transform the data into models:
To run this project, add a .env file to the project root directory and fill it with the following environment variables:
DBT_POSTGRES_HOST=fdb_dbt_db
DBT_POSTGRES_USER=dbt
DBT_POSTGRES_PASSWORD=dbt
DBT_POSTGRES_DB=dbt
DBT_POSTGRES_PORT=5432
AIRFLOW_POSTGRES_HOST=fdb_airflow_db
AIRFLOW_POSTGRES_USER=airflow
AIRFLOW_POSTGRES_PASSWORD=airflow
AIRFLOW_POSTGRES_DB=airflow
AIRFLOW_POSTGRES_PORT=5432
AIRFLOW_USER=airflow
AIRFLOW_PASSWORD=airflow
and then run it via docker-compose:
docker-compose up -d
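To make the role of the variables concrete, here is a minimal sketch of how one group of them could be combined into a PostgreSQL connection URI. The helper name `postgres_uri` is hypothetical and not part of this project. It also assumes that host names such as fdb_dbt_db are container names that only resolve inside the docker-compose network.

```python
import os
from urllib.parse import quote


def postgres_uri(prefix, env=os.environ):
    """Build a postgresql:// URI from the <prefix>_POSTGRES_* variables.

    `prefix` is "DBT" or "AIRFLOW", matching the variable names in .env.
    This helper is illustrative only; the project itself may wire the
    variables up differently.
    """
    def get(key):
        return env[f"{prefix}_POSTGRES_{key}"]

    # Percent-encode credentials so special characters do not break the URI.
    user = quote(get("USER"), safe="")
    password = quote(get("PASSWORD"), safe="")
    return f"postgresql://{user}:{password}@{get('HOST')}:{get('PORT')}/{get('DB')}"
```

With the example values above, `postgres_uri("DBT")` would describe the dbt database and `postgres_uri("AIRFLOW")` the Airflow metadata database.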
NOTE: To retrieve PayPal data you must authenticate. First create a PayPal App and enable Transaction Search under LIVE APP SETTINGS, then add an Airflow connection with the following information:
Conn Id: http_paypal
Conn Type: HTTP
Host: https://api.paypal.com
Login: <enter-your-CLIENT-ID-here>
Password: <enter-your-SECRET-here>
You should then be able to retrieve your personal PayPal transactions.
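For orientation, here is a rough sketch of the two HTTP requests that the connection details above make possible: an OAuth2 client-credentials token request, followed by a Transaction Search request. It only builds the requests and sends nothing. The function names are hypothetical, and whether the Airflow DAG in this project issues the requests in exactly this way is an assumption.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

PAYPAL_HOST = "https://api.paypal.com"  # Host of the http_paypal connection


def build_token_request(client_id, secret, host=PAYPAL_HOST):
    """Prepare the OAuth2 client-credentials request (nothing is sent here)."""
    # Login (client ID) and Password (secret) go into an HTTP Basic header.
    credentials = base64.b64encode(f"{client_id}:{secret}".encode()).decode()
    return Request(
        f"{host}/v1/oauth2/token",
        data=b"grant_type=client_credentials",
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def build_transactions_request(access_token, start_date, end_date, host=PAYPAL_HOST):
    """Prepare a Transaction Search request for the given date range."""
    # Dates are expected as ISO 8601 timestamps; urlencode escapes them.
    query = urlencode({"start_date": start_date, "end_date": end_date})
    return Request(
        f"{host}/v1/reporting/transactions?{query}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )
```

Either request can be passed to `urllib.request.urlopen`. The token response is JSON, and its `access_token` field is the value to pass to `build_transactions_request`.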
I am using a storage folder to store data files locally. In production you would probably want remote storage that is designed for large amounts of data, such as Amazon S3, Google Cloud Storage, or Azure Blob Storage.