refactor(docker): make docker files easier to use during development. #1777

Merged
merged 9 commits into from
Aug 6, 2020
3 changes: 0 additions & 3 deletions .gitignore
@@ -17,11 +17,8 @@ metadata-events/mxe-registration/src/main/resources/**/*.avsc
.java-version

# Python
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
.mypy_cache/
27 changes: 0 additions & 27 deletions docker/README.md

This file was deleted.

1 change: 1 addition & 0 deletions docker/README.md
6 changes: 6 additions & 0 deletions docker/broker/env/dev.env
@@ -0,0 +1,6 @@
KAFKA_BROKER_ID=1
KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1
KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS=0
File renamed without changes.
16 changes: 16 additions & 0 deletions docker/datahub-frontend/README.md
@@ -0,0 +1,16 @@
# DataHub Frontend Docker Image

[![datahub-frontend docker](https://github.com/linkedin/datahub/workflows/datahub-frontend%20docker/badge.svg)](https://github.com/linkedin/datahub/actions?query=workflow%3A%22datahub-frontend+docker%22)

See [DataHub Frontend Service](../../datahub-frontend) for a quick overview of the architecture and
responsibilities of this service within DataHub.

## Checking out DataHub UI

After starting the Docker container, open the following URL in a web browser:

```
http://localhost:9001
```

Sign in with `datahub` as both the username and the password.
5 changes: 5 additions & 0 deletions docker/datahub-frontend/env/dev.env
@@ -0,0 +1,5 @@
DATAHUB_GMS_HOST=datahub-gms
DATAHUB_GMS_PORT=8080
DATAHUB_SECRET=YouKnowNothing
DATAHUB_APP_VERSION=1.0
DATAHUB_PLAY_MEM_BUFFER_SIZE=10MB
6 changes: 3 additions & 3 deletions docker/gms/Dockerfile → docker/datahub-gms/Dockerfile
@@ -1,7 +1,7 @@
FROM openjdk:8 as builder
COPY . /datahub-src
RUN cd /datahub-src && ./gradlew :gms:war:build \
- && cp gms/war/build/libs/war.war /gms.war
+ && cp gms/war/build/libs/war.war /war.war


FROM openjdk:8-jre-alpine
@@ -10,8 +10,8 @@ RUN apk --no-cache add curl tar \
&& curl https://repo1.maven.org/maven2/org/eclipse/jetty/jetty-runner/9.4.20.v20190813/jetty-runner-9.4.20.v20190813.jar --output jetty-runner.jar \
&& curl -L https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz | tar -C /usr/local/bin -xzv

- COPY --from=builder /gms.war .
- COPY docker/gms/start.sh /start.sh
+ COPY --from=builder /war.war /datahub/datahub-gms/bin/war.war
+ COPY --from=builder /datahub-src/docker/datahub-gms/start.sh /start.sh
RUN chmod +x /start.sh

EXPOSE 8080
22 changes: 22 additions & 0 deletions docker/datahub-gms/README.md
@@ -0,0 +1,22 @@
# DataHub Generalized Metadata Store (GMS) Docker Image
[![datahub-gms docker](https://github.com/linkedin/datahub/workflows/datahub-gms%20docker/badge.svg)](https://github.com/linkedin/datahub/actions?query=workflow%3A%22datahub-gms+docker%22)

See [DataHub GMS Service](../../gms) for a quick overview of the architecture and
responsibilities of this service within DataHub.

## Other Database Platforms

While GMS defaults to using MySQL as its storage backend, it is possible to switch to any of the
[database platforms](https://ebean.io/docs/database/) supported by Ebean.

For example, the following command starts a GMS that connects to a PostgreSQL backend:

```
(cd docker/ && docker-compose -f docker-compose.yml -f docker-compose.postgre.yml -p datahub up)
```

Or, for a MariaDB backend:

```
(cd docker/ && docker-compose -f docker-compose.yml -f docker-compose.mariadb.yml -p datahub up)
```
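The two commands above differ only in the override file. A minimal sketch of a wrapper that picks the override from a backend name; the function names `compose_override_for` and `compose_command_for` are hypothetical, and the file names are copied verbatim from the commands above rather than checked against the repository.

```shell
# Hypothetical helper: map a backend name to the override file named above.
compose_override_for() {
  case "$1" in
    postgres) echo "docker-compose.postgre.yml" ;;
    mariadb)  echo "docker-compose.mariadb.yml" ;;
    *)        echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print the full command instead of running it,
# so it can be inspected before use.
compose_command_for() {
  override="$(compose_override_for "$1")" || return 1
  echo "docker-compose -f docker-compose.yml -f $override -p datahub up"
}
```

Calling `compose_command_for mariadb` from inside `docker/` prints the command to run; an unknown backend name returns a non-zero status.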
9 changes: 9 additions & 0 deletions docker/datahub-gms/debug/Dockerfile
@@ -0,0 +1,9 @@
FROM openjdk:8-jre-alpine
ENV DOCKERIZE_VERSION v0.6.1
RUN apk --no-cache add curl tar \
&& curl https://repo1.maven.org/maven2/org/eclipse/jetty/jetty-runner/9.4.20.v20190813/jetty-runner-9.4.20.v20190813.jar --output jetty-runner.jar \
&& curl -L https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz | tar -C /usr/local/bin -xzv

EXPOSE 8080

CMD cd /datahub/datahub-gms/scripts && chmod +x start.sh && ./start.sh
13 changes: 13 additions & 0 deletions docker/datahub-gms/env/dev.env
@@ -0,0 +1,13 @@
EBEAN_DATASOURCE_USERNAME=datahub
EBEAN_DATASOURCE_PASSWORD=datahub
EBEAN_DATASOURCE_HOST=mysql:3306
EBEAN_DATASOURCE_URL=jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8
EBEAN_DATASOURCE_DRIVER=com.mysql.jdbc.Driver
KAFKA_BOOTSTRAP_SERVER=broker:29092
KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
ELASTICSEARCH_HOST=elasticsearch
ELASTICSEARCH_PORT=9200
NEO4J_HOST=neo4j:7474
NEO4J_URI=bolt://neo4j
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=datahub
13 changes: 13 additions & 0 deletions docker/datahub-gms/env/dev.mariadb.env
@@ -0,0 +1,13 @@
EBEAN_DATASOURCE_USERNAME=datahub
EBEAN_DATASOURCE_PASSWORD=datahub
EBEAN_DATASOURCE_HOST=mariadb:3306
EBEAN_DATASOURCE_URL=jdbc:mariadb://mariadb:3306/datahub
EBEAN_DATASOURCE_DRIVER=org.mariadb.jdbc.Driver
KAFKA_BOOTSTRAP_SERVER=broker:29092
KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
ELASTICSEARCH_HOST=elasticsearch
ELASTICSEARCH_PORT=9200
NEO4J_HOST=neo4j:7474
NEO4J_URI=bolt://neo4j
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=datahub
13 changes: 13 additions & 0 deletions docker/datahub-gms/env/dev.postgres.env
@@ -0,0 +1,13 @@
EBEAN_DATASOURCE_USERNAME=datahub
EBEAN_DATASOURCE_PASSWORD=datahub
EBEAN_DATASOURCE_HOST=postgres:5432
EBEAN_DATASOURCE_URL=jdbc:postgresql://postgres:5432/datahub
EBEAN_DATASOURCE_DRIVER=org.postgresql.Driver
KAFKA_BOOTSTRAP_SERVER=broker:29092
KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
ELASTICSEARCH_HOST=elasticsearch
ELASTICSEARCH_PORT=9200
NEO4J_HOST=neo4j:7474
NEO4J_URI=bolt://neo4j
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=datahub
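The three GMS env files differ only in their datasource lines, where `EBEAN_DATASOURCE_HOST` and `EBEAN_DATASOURCE_URL` have to name the same host and port. A small sketch of a check for that, assuming plain `KEY=VALUE` lines as in the files above; `env_value` and `datasource_consistent` are hypothetical helper names, not part of this PR.

```shell
# env_value FILE KEY (hypothetical helper)
# Print the value of KEY from a KEY=VALUE env file (first match only).
env_value() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}

# datasource_consistent FILE (hypothetical helper)
# Succeed only if the JDBC URL contains "//<host:port>/" for the configured host.
datasource_consistent() {
  host="$(env_value "$1" EBEAN_DATASOURCE_HOST)"
  url="$(env_value "$1" EBEAN_DATASOURCE_URL)"
  [ -n "$host" ] || return 1
  case "$url" in
    *"//$host/"*) return 0 ;;
    *)            return 1 ;;
  esac
}
```

Running `datasource_consistent docker/datahub-gms/env/dev.postgres.env` would catch a host that was changed in one line but not the other.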
2 changes: 1 addition & 1 deletion docker/gms/start.sh → docker/datahub-gms/start.sh
100644 → 100755
@@ -6,4 +6,4 @@ dockerize \
-wait http://$ELASTICSEARCH_HOST:$ELASTICSEARCH_PORT \
-wait http://$NEO4J_HOST \
-timeout 240s \
- java -jar jetty-runner.jar gms.war
+ java -jar /jetty-runner.jar /datahub/datahub-gms/bin/war.war
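In this script `dockerize -wait ... -timeout 240s` holds back the `java` command until each dependency responds. The same idea in plain shell, as a rough sketch and not dockerize's actual implementation; `wait_for` and its arguments are hypothetical.

```shell
# wait_for TIMEOUT_SECONDS CMD [ARGS...] (hypothetical helper)
# Retry CMD about once per second until it succeeds or the timeout elapses.
wait_for() {
  timeout="$1"; shift
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}
```

For example, `wait_for 240 curl -sf "http://$ELASTICSEARCH_HOST:$ELASTICSEARCH_PORT"` followed by `&& java -jar ...` would approximate one of the `-wait` flags above.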
Original file line number Diff line number Diff line change
@@ -10,10 +10,10 @@ ENV DOCKERIZE_VERSION v0.6.1
RUN apk --no-cache add curl tar \
&& curl -L https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz | tar -C /usr/local/bin -xzv

- COPY --from=builder /mae-consumer-job.jar /mae-consumer-job.jar
- COPY docker/mae-consumer/start.sh /start.sh
- RUN chmod +x /start.sh
+ COPY --from=builder /mae-consumer-job.jar /datahub/datahub-mae-consumer/bin/
+ COPY docker/datahub-mae-consumer/start.sh /datahub/datahub-mae-consumer/scripts/
+ RUN chmod +x /datahub/datahub-mae-consumer/scripts/start.sh

- EXPOSE 9091
+ EXPOSE 9090

- CMD /start.sh
+ CMD /datahub/datahub-mae-consumer/scripts/start.sh
5 changes: 5 additions & 0 deletions docker/datahub-mae-consumer/README.md
@@ -0,0 +1,5 @@
# DataHub MetadataAuditEvent (MAE) Consumer Docker Image
[![datahub-mae-consumer docker](https://github.com/linkedin/datahub/workflows/datahub-mae-consumer%20docker/badge.svg)](https://github.com/linkedin/datahub/actions?query=workflow%3A%22datahub-mae-consumer+docker%22)

See [DataHub MAE Consumer Job](../../metadata-jobs/mae-consumer-job) for a quick overview of the architecture and
responsibilities of this service within DataHub.
8 changes: 8 additions & 0 deletions docker/datahub-mae-consumer/debug/Dockerfile
@@ -0,0 +1,8 @@
FROM openjdk:8-jre-alpine
ENV DOCKERIZE_VERSION v0.6.1
RUN apk --no-cache add curl tar \
&& curl -L https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz | tar -C /usr/local/bin -xzv

EXPOSE 9090

CMD chmod +x /datahub/datahub-mae-consumer/scripts/start.sh && /datahub/datahub-mae-consumer/scripts/start.sh
8 changes: 8 additions & 0 deletions docker/datahub-mae-consumer/env/dev.env
@@ -0,0 +1,8 @@
KAFKA_BOOTSTRAP_SERVER=broker:29092
KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
ELASTICSEARCH_HOST=elasticsearch
ELASTICSEARCH_PORT=9200
NEO4J_HOST=neo4j:7474
NEO4J_URI=bolt://neo4j
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=datahub
2 changes: 1 addition & 1 deletion docker/mae-consumer/start.sh → docker/datahub-mae-consumer/start.sh
100644 → 100755
@@ -5,4 +5,4 @@ dockerize \
-wait http://$ELASTICSEARCH_HOST:$ELASTICSEARCH_PORT \
-wait http://$NEO4J_HOST \
-timeout 240s \
- java -jar mae-consumer-job.jar
+ java -jar /datahub/datahub-mae-consumer/bin/mae-consumer-job.jar
Original file line number Diff line number Diff line change
@@ -10,10 +10,10 @@ ENV DOCKERIZE_VERSION v0.6.1
RUN apk --no-cache add curl tar \
&& curl -L https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz | tar -C /usr/local/bin -xzv

- COPY --from=builder /mce-consumer-job.jar /mce-consumer-job.jar
- COPY docker/mce-consumer/start.sh /start.sh
- RUN chmod +x /start.sh
+ COPY --from=builder /mce-consumer-job.jar /datahub/datahub-mce-consumer/bin/
+ COPY docker/datahub-mce-consumer/start.sh /datahub/datahub-mce-consumer/scripts/
+ RUN chmod +x /datahub/datahub-mce-consumer/scripts/start.sh

EXPOSE 9090

- CMD /start.sh
+ CMD /datahub/datahub-mce-consumer/scripts/start.sh
5 changes: 5 additions & 0 deletions docker/datahub-mce-consumer/README.md
@@ -0,0 +1,5 @@
# DataHub MetadataChangeEvent (MCE) Consumer Docker Image
[![datahub-mce-consumer docker](https://github.com/linkedin/datahub/workflows/datahub-mce-consumer%20docker/badge.svg)](https://github.com/linkedin/datahub/actions?query=workflow%3A%22datahub-mce-consumer+docker%22)

See [DataHub MCE Consumer Job](../../metadata-jobs/mce-consumer-job) for a quick overview of the architecture and
responsibilities of this service within DataHub.
8 changes: 8 additions & 0 deletions docker/datahub-mce-consumer/debug/Dockerfile
@@ -0,0 +1,8 @@
FROM openjdk:8-jre-alpine
ENV DOCKERIZE_VERSION v0.6.1
RUN apk --no-cache add curl tar \
&& curl -L https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz | tar -C /usr/local/bin -xzv

EXPOSE 9090

CMD chmod +x /datahub/datahub-mce-consumer/scripts/start.sh && /datahub/datahub-mce-consumer/scripts/start.sh
4 changes: 4 additions & 0 deletions docker/datahub-mce-consumer/env/dev.env
@@ -0,0 +1,4 @@
KAFKA_BOOTSTRAP_SERVER=broker:29092
KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
GMS_HOST=datahub-gms
GMS_PORT=8080
2 changes: 1 addition & 1 deletion docker/mce-consumer/start.sh → docker/datahub-mce-consumer/start.sh
100644 → 100755
@@ -4,4 +4,4 @@
dockerize \
-wait tcp://$KAFKA_BOOTSTRAP_SERVER \
-timeout 240s \
- java -jar mce-consumer-job.jar
+ java -jar /datahub/datahub-mce-consumer/bin/mce-consumer-job.jar
17 changes: 17 additions & 0 deletions docker/dev.sh
@@ -0,0 +1,17 @@
#!/bin/bash

# Launches dev instances of DataHub images. See documentation for more details.
# YOU MUST BUILD VIA GRADLE BEFORE RUNNING THIS.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
# Quote $DIR so a checkout path containing spaces still works.
cd "$DIR" && \
  docker-compose \
    -f docker-compose.yml \
    -f docker-compose.override.yml \
    -f docker-compose.dev.yml \
    pull \
&& \
  docker-compose -p datahub \
    -f docker-compose.yml \
    -f docker-compose.override.yml \
    -f docker-compose.dev.yml \
    up
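Because the dev setup bind-mounts build output instead of building inside the image, a preflight check can fail fast when Gradle has not been run yet. A minimal sketch: `check_artifacts` is a hypothetical helper, and the artifact paths are assumptions pieced together from the bind mounts in `docker-compose.dev.yml` and the jar and war names used by the start scripts.

```shell
# check_artifacts REPO_ROOT (hypothetical helper)
# List every expected build output that is absent; return non-zero if any is missing.
# The paths below are assumptions based on the bind mounts and start scripts in this PR.
check_artifacts() {
  root="$1"
  missing=0
  for artifact in \
    gms/war/build/libs/war.war \
    metadata-jobs/mae-consumer-job/build/libs/mae-consumer-job.jar \
    metadata-jobs/mce-consumer-job/build/libs/mce-consumer-job.jar
  do
    if [ ! -f "$root/$artifact" ]; then
      echo "missing build artifact: $artifact" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Something like `check_artifacts "$DIR/.." || exit 1` near the top of `dev.sh` would stop before any image is pulled.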
38 changes: 38 additions & 0 deletions docker/docker-compose.dev.yml
@@ -0,0 +1,38 @@
# Default overrides for running local development.

# Every app that can be debugged needs A) a debug Dockerfile, to create a minimal image that doesn't build any code, and
# B) bind mounts that mount the built code into the container. Note this assumes code is portable (JS, Java). There can
# be other debugging solutions; this is just the general pattern (e.g. Python doesn't build a binary, so just mount the
# source code).

# To make a JVM app debuggable via IntelliJ, add JVM debug flags to its env file, and then add the JVM debug
# port to this file.
---
# TODO mount + debug docker file for frontend
version: '3.8'
services:
  datahub-gms:
    image: linkedin/datahub-gms:debug
    build:
      context: datahub-gms/debug
      dockerfile: Dockerfile
    volumes:
      - ./datahub-gms/start.sh:/datahub/datahub-gms/scripts/start.sh
      - ../gms/war/build/libs/:/datahub/datahub-gms/bin

  datahub-mae-consumer:
    image: linkedin/datahub-mae-consumer:debug
    build:
      context: datahub-mae-consumer/debug
      dockerfile: Dockerfile
    volumes:
      - ./datahub-mae-consumer/start.sh:/datahub/datahub-mae-consumer/scripts/start.sh
      - ../metadata-jobs/mae-consumer-job/build/libs/:/datahub/datahub-mae-consumer/bin/

  datahub-mce-consumer:
    image: linkedin/datahub-mce-consumer:debug
    build:
      context: datahub-mce-consumer/debug
      dockerfile: Dockerfile
    volumes:
      - ./datahub-mce-consumer/start.sh:/datahub/datahub-mce-consumer/scripts/start.sh
      - ../metadata-jobs/mce-consumer-job/build/libs/:/datahub/datahub-mce-consumer/bin
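The header comment of this file describes making a JVM app debuggable by adding debug flags to its env file. One way that could look, as a sketch under assumptions: the JVM reads `JAVA_TOOL_OPTIONS` from the environment, so a single line in the env file can enable the JDWP agent. The helper name `enable_jvm_debug` and the default port 5005 are illustrative choices, not part of this PR.

```shell
# enable_jvm_debug ENV_FILE [PORT] (hypothetical helper; port 5005 is an assumed default)
# Append a JDWP agent setting to an env file unless one is already present.
enable_jvm_debug() {
  env_file="$1"
  port="${2:-5005}"
  if grep -q '^JAVA_TOOL_OPTIONS=' "$env_file" 2>/dev/null; then
    return 0
  fi
  echo "JAVA_TOOL_OPTIONS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=$port" >> "$env_file"
}
```

After `enable_jvm_debug datahub-gms/env/dev.env`, the same port would still need to be published under that service's `ports:` in this file. The images here run Java 8; on Java 9 or later the agent listens on localhost only unless the address is written as `*:5005`.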
23 changes: 23 additions & 0 deletions docker/docker-compose.mariadb.yml
@@ -0,0 +1,23 @@
# Override to use MariaDB as a backing store for datahub-gms.
# NOTE: the service defined below is still postgres, while dev.mariadb.env sets EBEAN_DATASOURCE_HOST=mariadb:3306.
---
version: '3.8'
services:
  postgres:
    container_name: postgres
    hostname: postgres
    image: postgres:12.3
    env_file: postgres/env/dev.env
    restart: always
    ports:
      - '5432:5432'
    volumes:
      - ./postgres/init.sql:/docker-entrypoint-initdb.d/init.sql

  datahub-gms:
    env_file: datahub-gms/env/dev.mariadb.env
    depends_on:
      - postgres

networks:
  default:
    name: datahub_network
24 changes: 24 additions & 0 deletions docker/docker-compose.mysql.yml
@@ -0,0 +1,24 @@
# Override to use MySQL as a backing store for datahub-gms.
---
version: '3.8'
services:
  mysql:
    container_name: mysql
    hostname: mysql
    image: mysql:5.7
    env_file: mysql/env/dev.env
    restart: always
    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
    ports:
      - "3306:3306"
    volumes:
      - ./mysql/init.sql:/docker-entrypoint-initdb.d/init.sql
      - mysqldata:/var/lib/mysql

  datahub-gms:
    env_file: datahub-gms/env/dev.env
    depends_on:
      - mysql

volumes:
  mysqldata: