Welcome to this ready-to-run repository to get started with the Apache Airflow Kafka provider! 🚀
This repository assumes you have basic knowledge of Apache Kafka and Apache Airflow. You can find resources on these tools in the Resources section below.
This repository contains 3 DAGs:
- `produce_consume_treats`: This DAG will produce `NUMBER_OF_TREATS` messages to a local Kafka cluster. Run it manually to produce and consume new messages.
- `listen_to_the_stream`: This DAG will continuously listen to a topic in a local Kafka cluster and run the `event_triggered_function` whenever a message causes the `apply_function` to return a value. Unpause this DAG to have it run continuously.
- `walking_your_pet`: This DAG is the downstream DAG that the given `event_triggered_function` in the `listen_to_the_stream` DAG will trigger. Unpause this DAG to have it ready to be triggered by a TriggerDagRunOperator in the upstream DAG.
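For reference, the listener pattern in `listen_to_the_stream` can be sketched with the Kafka provider's `AwaitMessageTriggerFunctionSensor`. This is a minimal sketch, assuming the `apache-airflow-providers-apache-kafka` package and an Airflow connection with the ID `kafka_default`; the topic name, module path, and helper function names are illustrative, not the exact ones in this repository's DAG files:

```python
# Minimal sketch of the listen_to_the_stream pattern (illustrative names).
import json

from pendulum import datetime

from airflow.decorators import dag
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from airflow.providers.apache.kafka.sensors.kafka import (
    AwaitMessageTriggerFunctionSensor,
)


def wait_for_mood(message):
    """apply_function: evaluated in the triggerer for every message on the topic.

    Returning a truthy value causes the event_triggered_function to run."""
    content = json.loads(message.value())
    if content.get("pet_mood_post_treat") == "needs a walk":  # illustrative key
        return content


def walk_the_pet(event, **context):
    """event_triggered_function: runs once wait_for_mood returns a value."""
    TriggerDagRunOperator(
        task_id="triggered_downstream_dag",
        trigger_dag_id="walking_your_pet",  # the downstream DAG in this repo
    ).execute(context)


@dag(
    start_date=datetime(2023, 4, 1),
    schedule="@continuous",  # start a new run as soon as the previous one ends
    max_active_runs=1,
    catchup=False,
)
def listen_to_the_stream_sketch():
    AwaitMessageTriggerFunctionSensor(
        task_id="listen_for_mood",
        kafka_config_id="kafka_default",  # assumed connection ID
        topics=["my_topic"],  # illustrative topic name
        # the apply_function must be an importable dot-notation string,
        # because it is serialized and executed in the Airflow triggerer
        apply_function="dags.listen_to_the_stream_sketch.wait_for_mood",
        event_triggered_function=walk_the_pet,
    )


listen_to_the_stream_sketch()
```

Because the `apply_function` runs in the triggerer, the sensor can wait for messages indefinitely without occupying a worker slot.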
This repository is designed to spin up both a local Kafka cluster and a local Astro project and connect them automatically. Note that the Kafka cluster can sometimes take a minute longer than Airflow to be fully started.
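The connection happens through the `docker-compose.override.yml` file, which adds the Kafka services to the Astro project's Docker network. As a rough illustration only (this is a hedged sketch, not this repository's exact file; the image, service name, and listener settings are assumptions), a single-node Kafka broker can be added like this:

```yaml
# Illustrative sketch of a docker-compose.override.yml -- not this repo's exact file.
# Adds a single-node Kafka broker in KRaft mode to the Astro project's
# Docker network, reachable by Airflow tasks at kafka:9092.
version: "3.1"
services:
  kafka:
    image: bitnami/kafka:3.4
    environment:
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_NODE_ID=0
      - KAFKA_CFG_PROCESS_ROLES=broker,controller
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - ALLOW_PLAINTEXT_LISTENER=yes
    ports:
      - "9092:9092"
```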
Run this Airflow project without installing anything locally.
- Fork this repository.
- Create a new GitHub Codespaces project on your fork. Make sure it uses at least 4 cores!
- After creating the Codespaces project, the Astro CLI will automatically start up all necessary Airflow components as well as the local Kafka cluster, using the instructions in the `docker-compose.override.yml` file. This can take a few minutes.
- Once the Airflow project has started, access the Airflow UI by clicking on the Ports tab and opening the forwarded URL for port 8080. You can log in with `admin` as both the username and the password.
- Unpause all DAGs. Manually run the `produce_consume_treats` DAG to see the pipeline in action. Note that a random function generates part of the message sent to Kafka, which determines whether the `listen_for_mood` task will trigger the downstream `walking_your_pet` DAG. You might need to run the `produce_consume_treats` DAG several times to see the full pipeline in action! A sketch of the producing side, including the random element, follows this list.
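The random element mentioned above lives in the producer function. Here is a minimal sketch of the producing half, under the same assumptions as before (`apache-airflow-providers-apache-kafka` package, a `kafka_default` connection); the topic name, moods, and treat count are illustrative:

```python
# Minimal sketch of the producing half of produce_consume_treats (illustrative).
import json
import random

from pendulum import datetime

from airflow.decorators import dag
from airflow.providers.apache.kafka.operators.produce import ProduceToTopicOperator

NUMBER_OF_TREATS = 5  # illustrative value


def produce_treats(number_of_treats):
    """producer_function: a generator yielding one (key, value) pair per message."""
    for i in range(number_of_treats):
        # the random choice here decides whether the listener DAG fires later
        mood = random.choice(["happy", "zoomy", "needs a walk"])
        yield (json.dumps(i), json.dumps({"pet_mood_post_treat": mood}))


@dag(start_date=datetime(2023, 4, 1), schedule=None, catchup=False)
def produce_treats_sketch():
    ProduceToTopicOperator(
        task_id="produce_treats",
        kafka_config_id="kafka_default",  # assumed connection ID
        topic="my_topic",  # illustrative topic name
        producer_function=produce_treats,
        producer_function_args=[NUMBER_OF_TREATS],
    )


produce_treats_sketch()
```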
Download the Astro CLI to run Airflow locally in Docker. `astro` is the only package you will need to install.
- Run `git clone https://github.com/astronomer/airflow-quickstart.git` on your computer to create a local clone of this repository.
- Install the Astro CLI by following the steps in the Astro CLI documentation. Docker Desktop/Docker Engine is a prerequisite, but you don't need in-depth Docker knowledge to run Airflow with the Astro CLI.
- Run `astro dev start` in your cloned repository.
- After your Astro project has started, view the Airflow UI at `localhost:8080`. You can log in with `admin` as both the username and the password.
- Unpause all DAGs. Manually run the `produce_consume_treats` DAG to see the pipeline in action. Note that a random function generates part of the message sent to Kafka, which determines whether the `listen_for_mood` task will trigger the downstream `walking_your_pet` DAG. You might need to run the `produce_consume_treats` DAG several times to see the full pipeline in action! A sketch of the consuming side of the pipeline follows this list.
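For completeness, the consuming half of the pipeline uses the provider's `ConsumeFromTopicOperator`. A minimal sketch under the same assumptions (`kafka_default` connection; topic name and message limits are illustrative):

```python
# Minimal sketch of the consuming half of produce_consume_treats (illustrative).
from pendulum import datetime

from airflow.decorators import dag
from airflow.providers.apache.kafka.operators.consume import ConsumeFromTopicOperator


def print_message(message):
    """apply_function: called once for every message the task consumes."""
    print(f"Consumed: {message.value()}")


@dag(start_date=datetime(2023, 4, 1), schedule=None, catchup=False)
def consume_treats_sketch():
    ConsumeFromTopicOperator(
        task_id="consume_treats",
        kafka_config_id="kafka_default",  # assumed connection ID
        topics=["my_topic"],  # illustrative topic name
        apply_function=print_message,
        max_messages=20,  # stop after at most 20 messages
        poll_timeout=10,  # seconds to wait for new messages before finishing
    )


consume_treats_sketch()
```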