Terrafrom Amundsen Deployment

This stack deploys Amundsen solution and consist of tree Terraform modules:

VPC - baseline infrastructure stack
Elasticsearch - single instance (dev) of AWS Elasticsearch deployment
ECS - ECS Fargate cluster deployment which deploys tree Amundsen services:
- Search
- Metadata
- Frontend
- Neo4j

All Amundsen componenets are deployed using docker-compose files form official Amundsen GitHub repository.

Deployment

Terrafrom stacks need to be deployed in the following order:

VPC
Elasticsearch
ECS

Prerequisites

Create a versioned S3 bucket in your AWS account.

Modify provider.tf file and replace S3 bucket in Terraform backend configuration with the one you just created. This will allow Terraform to store stacks deployment state in remote S3 bucket in your account.

VPC

To deploy a baseline VPC use the following commands:

cd vpc
terraform init
terraform apply

Elasticsearch

cd elasticsearch
terraform init
terraform apply

ECS

ECS stack consists of the following services:

Search
Metadata
Frontend
Neo4j

Search, Metadata and Frontend are independent containers which are configured automatically.

Neo4j database configuration defaults to the following folder structure which needs to be created at EFS filesystem:

/neo4j/data
/backup
/conf

To create this folder structure, create an Amazon Linux 2 EC2 instances, set the Network to amundsen-prod-vpc, set subnet to amundsen-prod-vpc-public-subnet-001. After launch, edit security group, add amundsen-prod-sg-amundsen.

SSH to the instance host and mount EFS share. The efs file_system_id is taken from your efs_file_system_id. You can find that by going to Amazon EFS then replacing that id with the one below:

sudo su -
yum install -y amazon-efs-utils
mkdir efs
mount -t efs fs-12345678:/ efs
mkdir -p efs/conf
mkdir -p efs/backup
mkdir -p efs/neo4j/data

Place neo4j.conf file to efs/conf/neo4j.conf.

Change ownership for all created folders and files to UID=1000 and GUID=1000:

chown -R 1000:1000 efs/*

Now you can deploy ECS cluster:

cd ecs
terraform init
terraform apply

Test data upload

From the same EC2 instance clone Amundsen repository:

yum install git
git clone --recursive https://github.com/amundsen-io/amundsen
cd amundsen/amundsendatabuilder/

Deploy required Python libs:

python3 -m venv venv
source venv/bin/activate
pip3 install --upgrade pip
pip3 install -r requirements.txt
python3 setup.py install

Now, you need to patch example/scripts/sample_data_loader.py file. Modify Elasticsearch client:

es = Elasticsearch([
    {'host': es_host, 'port': 443, 'scheme': 'https'},
])

Now, you can upload test data, update your ES endpoint in the command, which you can find here:

python example/scripts/sample_data_loader.py vpc-amundsen-prod-es-addshp7cgl2jt66zg5flg33zge.us-east-1.es.amazonaws.com neo4j.prod.amundsen.local

Now you can connect to Frontend service using ALB and play with the data.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
ecs		ecs
elasticsearch		elasticsearch
vpc		vpc
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Terrafrom Amundsen Deployment

Deployment

Prerequisites

VPC

Elasticsearch

ECS

Test data upload

About

Releases

Packages

Languages

iblaine/amundsen-terraform

Folders and files

Latest commit

History

Repository files navigation

Terrafrom Amundsen Deployment

Deployment

Prerequisites

VPC

Elasticsearch

ECS

Test data upload

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages