DataHub is LinkedIn's generalized metadata search & discovery tool. To learn more about DataHub, check out our LinkedIn blog post and Strata presentation. This repository contains the complete source code needed to build DataHub's frontend & backend services.
- Install docker and docker-compose.
- Clone this repo and make sure you are on the datahub branch.
- Run the command below to download and run all Docker containers on your local machine:
cd docker/quickstart && docker-compose pull && docker-compose up --build
- After all Docker containers are running on your machine, run the command below to ingest the provided sample data into DataHub:
./gradlew :metadata-events:mxe-schemas:build && cd metadata-ingestion/mce-cli && pip install --user -r requirements.txt && python mce_cli.py produce -d bootstrap_mce.dat
Note: Make sure that you're using Java 8; the build has a strict dependency on Java 8.
- Finally, you can access DataHub by typing http://localhost:9001 in your browser. You can sign in with datahub as both username and password.
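Two points in the steps above are easy to get wrong: the Java 8 requirement for the build, and knowing when the frontend at http://localhost:9001 is actually ready. The following is a minimal preflight sketch, not part of this repository. It assumes Python 3.7+ and that `java` is on your PATH; the function names (`parse_java_major`, `installed_java_major`, `wait_for_url`) and the timeout values are hypothetical choices made for illustration.

```python
import http.client
import re
import subprocess
import time
import urllib.error
import urllib.request


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report "1.8.0_xxx"; Java 9+ report "11.0.2", "17", ...
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version` (which prints to stderr) and parse the result."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None  # no java executable on PATH
    return parse_java_major(result.stderr + result.stdout)


def wait_for_url(url, timeout_s=300.0, interval_s=5.0):
    """Poll url until it returns any HTTP response or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            return True  # the server answered, even if with an error status
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            pass  # not reachable yet; retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


if __name__ == "__main__":
    major = installed_java_major()
    if major != 8:
        print(f"Warning: the build expects Java 8, found: {major}")
    if wait_for_url("http://localhost:9001"):
        print("DataHub frontend is reachable at http://localhost:9001")
    else:
        print("Timed out waiting for http://localhost:9001")
```

Run the script after `docker-compose up` has started. It only reports whether the port answers HTTP requests; it does not verify that the sample data was ingested.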
- Add user profile page
- Deploy DataHub to Azure Cloud