Data science and engineering teams at Lyft maintain several big data pipelines that serve as the foundation for various types of analysis throughout the business. \n\n
Apache Airflow sits at the center of this big data infrastructure, allowing users to “programmatically author, schedule, and monitor data pipelines.” Airflow is an open source tool, and “Lyft is the very first Airflow adopter in production since the project was open sourced around three years ago.” \n\n
There are several key components of the architecture. A web UI allows users to view the status of their queries, along with an audit trail of any modifications to the query. A metadata database stores things like job status and task instance status. A multi-process scheduler handles job requests and triggers the executor to run those tasks. \n\n
Airflow supports several executors, though Lyft uses CeleryExecutor to scale task execution in production. Airflow is deployed to three Amazon Auto Scaling Groups, each associated with a Celery queue. \n\n
Audit logs supplied to the web UI are powered by the existing Airflow audit logs as well as Flask signals. \n\n
Datadog, StatsD, Grafana, and PagerDuty are all used to monitor the Airflow system.","rawContent":"Data science and engineering teams at Lyft maintain several big data pipelines that serve as the foundation for various types of analysis throughout the business. \n\nApache Airflow sits at the center of this big data infrastructure, allowing users to “programmatically author, schedule, and monitor data pipelines.” Airflow is an open source tool, and “Lyft is the very first Airflow adopter in production since the project was open sourced around three years ago.” \n\nThere are several key components of the architecture. A web UI allows users to view the status of their queries, along with an audit trail of any modifications to the query. A metadata database stores things like job status and task instance status. A multi-process scheduler handles job requests and triggers the executor to run those tasks. \n\nAirflow supports several executors, though Lyft uses CeleryExecutor to scale task execution in production. Airflow is deployed to three Amazon Auto Scaling Groups, each associated with a Celery queue. \n\nAudit logs supplied to the web UI are powered by the existing Airflow audit logs as well as Flask signals. 
\n\nDatadog, Statsd, Grafana, and PagerDuty are all used to monitor the Airflow system.\n","publishedAt":"2018-12-22T12:00:00Z","commentsCount":0,"private":false,"upvotesCount":6,"upvoted":false,"flagged":false,"bookmarked":false,"viewCount":542043,"draft":false,"createdAt":"2018-12-22T12:00:00Z","decisionType":null,"showAutoGeneratedTag":false,"permissions":{"type":"id","generated":true,"id":"$StackDecision:102366436228454685.permissions","typename":"Permissions"},"subjectTools":[],"fromTools":[],"toTools":[],"link":{"type":"id","generated":true,"id":"$StackDecision:102366436228454685.link","typename":"Link"},"company":{"type":"id","generated":false,"id":"Company:101231709499257139","typename":"Company"},"topics":[],"stack":null,"services":[{"type":"id","generated":false,"id":"Tool:101231773840406851","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231774519122717","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231773813299908","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231774778599369","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231773456856070","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231773703014119","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231773871944512","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231776031457655","typename":"Tool"}],"user":{"type":"id","generated":false,"id":"User:102366402938755636","typename":"User"},"rootComments":[],"__typename":"StackDecision","answers({\"first\":2})":{"type":"id","generated":true,"id":"$StackDecision:102366436228454685.answers({\"first\":2})","typename":"StackDecisionConnection"}},"$StackDecision:102366436228454685.permissions":{"edit":false,"delete":false,"__typename":"Permissions"},"$StackDecision:102366436228454685.link":{"url":"https://eng.lyft.com/running-apache-airflow-at-lyft-6e53bb8fccff","title":"Running Apache Airflow At Lyft â Lyft 
Engineering","imageUrl":"https://cdn-images-1.medium.com/fit/c/304/304/1*[email protected]","__typename":"Link"},"Tool:101231774519122717":{"id":"101231774519122717","name":"Grafana","slug":"grafana","title":"Open source Graphite & InfluxDB Dashboard and Graph Editor","verified":false,"imageUrl":"https://img.stackshare.io/service/2645/default_8f9d552b144493679449b16c79647da5787e808b.jpg","canonicalUrl":"/grafana","path":"/grafana","votes":415,"fans":17459,"stacks":17864,"followingTool":false,"followContext":null,"__typename":"Tool"},"Tool:101231774778599369":{"id":"101231774778599369","name":"Airflow","slug":"airflow","title":"A platform to programmaticaly author, schedule and monitor data pipelines, by Airbnb","verified":true,"imageUrl":"https://img.stackshare.io/service/3130/airflow.png","canonicalUrl":"/airflow","path":"/airflow","votes":128,"fans":3081,"stacks":1695,"followingTool":false,"followContext":null,"__typename":"Tool"},"Tool:101231773456856070":{"id":"101231773456856070","name":"PagerDuty","slug":"pagerduty","title":"Incident management with powerful visibility, reliable alerting, and improved collaboration","verified":true,"imageUrl":"https://img.stackshare.io/service/107/GtwgsQj5_400x400.jpg","canonicalUrl":"/pagerduty","path":"/pagerduty","votes":119,"fans":926,"stacks":1015,"followingTool":false,"followContext":null,"__typename":"Tool"},"Tool:101231773703014119":{"id":"101231773703014119","name":"Datadog","slug":"datadog","title":"Unify logs, metrics, and traces from across your distributed infrastructure.","verified":true,"imageUrl":"https://img.stackshare.io/service/669/default_34b3b9b42d07c33ac47ecdff75dd6f4f82aa70ee.jpg","canonicalUrl":"/datadog","path":"/datadog","votes":860,"fans":9903,"stacks":9358,"followingTool":false,"followContext":null,"__typename":"Tool"},"Tool:101231773871944512":{"id":"101231773871944512","name":"Celery","slug":"celery","title":"Distributed task 
queue","verified":false,"imageUrl":"https://img.stackshare.io/service/1075/celery.png","canonicalUrl":"/celery","path":"/celery","votes":280,"fans":2024,"stacks":1592,"followingTool":false,"followContext":null,"__typename":"Tool"},"Tool:101231776031457655":{"id":"101231776031457655","name":"AWS EC2","slug":"aws-ec2","title":null,"verified":false,"imageUrl":"https://img.stackshare.io/service/5283/no-img-open-source.png","canonicalUrl":"/aws-ec2","path":"/aws-ec2","votes":0,"fans":2,"stacks":4,"followingTool":false,"followContext":null,"__typename":"Tool"},"User:102366402938755636":{"id":"102366402938755636","username":"stackbot","path":"/stackbot","imageUrl":"https://img.stackshare.io/user/299852/default_e6c5c079eeed2f178ff111094f79fc759338903b.png","displayName":"StackShare Editors","title":null,"companyName":null,"__typename":"User"},"$StackDecision:102366436228454685.answers({\"first\":2})":{"count":0,"pageInfo":{"type":"id","generated":true,"id":"$StackDecision:102366436228454685.answers({\"first\":2}).pageInfo","typename":"PageInfo"},"edges":[],"__typename":"StackDecisionConnection"},"$StackDecision:102366436228454685.answers({\"first\":2}).pageInfo":{"hasNextPage":false,"endCursor":null,"__typename":"PageInfo"},"$StackProfile:101231778534857432.stackDecisions({\"currentStackOnly\":true,\"first\":5}).edges.0":{"node":{"type":"id","generated":false,"id":"StackDecision:102366436228454685","typename":"StackDecision"},"__typename":"StackDecisionEdge"},"StackDecision:102366433133631527":{"id":"102366433133631527","publicId":"102366433133631527","htmlContent":"
By 2014, the DevOps team at Lyft decided to port their infrastructure code from Puppet to Salt. At that point, the Puppet code base included around “10,000 lines of spaghetti-code,” which was unfamiliar and challenging for the relatively new members of the DevOps team. \n\n
“The DevOps team felt that the Puppet infrastructure was too difficult to pick up quickly and would be impossible to introduce to [their] developers as the tool they’d use to manage their own services.” \n\n
To determine a path forward, the team assessed both Ansible and Salt, exploring four key areas: simplicity/ease of use, maturity, performance, and community. \n\n
They found that “Salt’s execution and state module support is more mature than Ansible’s, overall,” and that “Salt was faster than Ansible for state/playbook runs.” And while both have high levels of community support, Salt exceeded expectations in terms of friendliness and responsiveness to opened issues. ","rawContent":"By 2014, the DevOps team at Lyft decided to port their infrastructure code from Puppet to Salt. At that point, the Puppet code base included around “10,000 lines of spaghetti-code,” which was unfamiliar and challenging for the relatively new members of the DevOps team. \n\n“The DevOps team felt that the Puppet infrastructure was too difficult to pick up quickly and would be impossible to introduce to [their] developers as the tool they’d use to manage their own services.” \n\nTo determine a path forward, the team assessed both Ansible and Salt, exploring four key areas: simplicity/ease of use, maturity, performance, and community. \n\nThey found that “Salt’s execution and state module support is more mature than Ansible’s, overall,” and that “Salt was faster than Ansible for state/playbook runs.” And while both have high levels of community support, Salt exceeded expectations in terms of friendliness and responsiveness to opened issues. 
","publishedAt":"2014-08-13T12:00:00Z","commentsCount":0,"private":false,"upvotesCount":6,"upvoted":false,"flagged":false,"bookmarked":false,"viewCount":457665,"draft":false,"createdAt":"2014-08-13T12:00:00Z","decisionType":null,"showAutoGeneratedTag":false,"permissions":{"type":"id","generated":true,"id":"$StackDecision:102366433133631527.permissions","typename":"Permissions"},"subjectTools":[],"fromTools":[],"toTools":[],"link":null,"company":{"type":"id","generated":false,"id":"Company:101231709499257139","typename":"Company"},"topics":[],"stack":null,"services":[{"type":"id","generated":false,"id":"Tool:101231773700872359","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231773628223844","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231773700412232","typename":"Tool"}],"user":{"type":"id","generated":false,"id":"User:102366402938755636","typename":"User"},"rootComments":[],"__typename":"StackDecision","answers({\"first\":2})":{"type":"id","generated":true,"id":"$StackDecision:102366433133631527.answers({\"first\":2})","typename":"StackDecisionConnection"}},"$StackDecision:102366433133631527.permissions":{"edit":false,"delete":false,"__typename":"Permissions"},"Tool:101231773628223844":{"id":"101231773628223844","name":"Puppet Labs","slug":"puppet","title":"Server automation framework and application","verified":true,"imageUrl":"https://img.stackshare.io/service/421/954f7381089ac290b4690c5ffd9dd7d3.png","canonicalUrl":"/puppet","path":"/puppet","votes":227,"fans":1009,"stacks":1132,"followingTool":false,"followContext":null,"__typename":"Tool"},"Tool:101231773700412232":{"id":"101231773700412232","name":"Ansible","slug":"ansible","title":"Radically simple configuration-management, application deployment, task-execution, and multi-node orchestration 
engine","verified":true,"imageUrl":"https://img.stackshare.io/service/663/ElOjna20.png","canonicalUrl":"/ansible","path":"/ansible","votes":1323,"fans":18684,"stacks":19017,"followingTool":false,"followContext":null,"__typename":"Tool"},"$StackDecision:102366433133631527.answers({\"first\":2})":{"count":0,"pageInfo":{"type":"id","generated":true,"id":"$StackDecision:102366433133631527.answers({\"first\":2}).pageInfo","typename":"PageInfo"},"edges":[],"__typename":"StackDecisionConnection"},"$StackDecision:102366433133631527.answers({\"first\":2}).pageInfo":{"hasNextPage":false,"endCursor":null,"__typename":"PageInfo"},"$StackProfile:101231778534857432.stackDecisions({\"currentStackOnly\":true,\"first\":5}).edges.1":{"node":{"type":"id","generated":false,"id":"StackDecision:102366433133631527","typename":"StackDecision"},"__typename":"StackDecisionEdge"},"StackDecision:102366436051605760":{"id":"102366436051605760","publicId":"102366436051605760","htmlContent":"
Lyft had tried to make the switch to containers a few years before, with a home-grown provisioning system based on Salt, but ultimately waited until Kubernetes came on the scene and then matured.\n\n
Using a lot of Go for infrastructure and development (in addition to some Python), the Lyft engineering team aims to provide a unified developer experience going between dev and ops.\n\n
Lyft cites Kubernetes' ecosystem and ease of understanding as reasons for adoption, and now uses it to support developers, who have the option of deploying many different microservices with full control of the entire stack from the VM up.\n\n
“Now, [developers] just need to know that they can build an image that needs to run, and once they have that, they hand it over to the infrastructure team and then that’s it.”","rawContent":"Lyft had tried to make the switch to containers a few years before, with a home-grown provisioning system based on Salt, but ultimately waited until Kubernetes came on the scene and then matured.\n\nUsing a lot of Go for infrastructure and development (in addition to some Python), the Lyft engineering team aims to provide a unified developer experience going between dev and ops.\n\nLyft cites Kubernetes' ecosystem and ease of understanding as reasons for adoption, and now uses it to support developers, who have the option of deploying many different microservices with full control of the entire stack from the VM up.\n\n“Now, [developers] just need to know that they can build an image that needs to run, and once they have that, they hand it over to the infrastructure team and then that’s it.”","publishedAt":"2018-11-24T12:00:00Z","commentsCount":0,"private":false,"upvotesCount":5,"upvoted":false,"flagged":false,"bookmarked":false,"viewCount":53417,"draft":false,"createdAt":"2018-11-24T12:00:00Z","decisionType":null,"showAutoGeneratedTag":false,"permissions":{"type":"id","generated":true,"id":"$StackDecision:102366436051605760.permissions","typename":"Permissions"},"subjectTools":[],"fromTools":[],"toTools":[],"link":{"type":"id","generated":true,"id":"$StackDecision:102366436051605760.link","typename":"Link"},"company":{"type":"id","generated":false,"id":"Company:101231709499257139","typename":"Company"},"topics":[],"stack":null,"services":[{"type":"id","generated":false,"id":"Tool:101231773842107538","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231774206661614","typename":"Tool"}],"user":{"type":"id","generated":false,"id":"User:102366402938755636","typename":"User"},"rootComments":[],"__typename":"StackDecision","answers({\"first\":2})":{"type":"id",
"generated":true,"id":"$StackDecision:102366436051605760.answers({\"first\":2})","typename":"StackDecisionConnection"}},"$StackDecision:102366436051605760.permissions":{"edit":false,"delete":false,"__typename":"Permissions"},"$StackDecision:102366436051605760.link":{"url":"https://thenewstack.io/why-kubernetes-makes-lyft-rides-what-they-are-today/","title":"Why Kubernetes Makes Lyft Rides What They Are Today - The New Stack","imageUrl":"https://thenewstack.io/favicon.ico","__typename":"Link"},"Tool:101231774206661614":{"id":"101231774206661614","name":"Kubernetes","slug":"kubernetes","title":"Manage a cluster of Linux containers as a single system to accelerate Dev and simplify Ops","verified":false,"imageUrl":"https://img.stackshare.io/service/1885/21_d3cvM.png","canonicalUrl":"/kubernetes","path":"/kubernetes","votes":681,"fans":60087,"stacks":59695,"followingTool":false,"followContext":null,"__typename":"Tool"},"$StackDecision:102366436051605760.answers({\"first\":2})":{"count":0,"pageInfo":{"type":"id","generated":true,"id":"$StackDecision:102366436051605760.answers({\"first\":2}).pageInfo","typename":"PageInfo"},"edges":[],"__typename":"StackDecisionConnection"},"$StackDecision:102366436051605760.answers({\"first\":2}).pageInfo":{"hasNextPage":false,"endCursor":null,"__typename":"PageInfo"},"$StackProfile:101231778534857432.stackDecisions({\"currentStackOnly\":true,\"first\":5}).edges.2":{"node":{"type":"id","generated":false,"id":"StackDecision:102366436051605760","typename":"StackDecision"},"__typename":"StackDecisionEdge"},"StackDecision:102366434434775723":{"id":"102366434434775723","publicId":"102366434434775723","htmlContent":"
Lyft announced a major open source project in 2016, called Envoy. Envoy is a high-performance service bus for SOA that abstracts the details of the network layer and makes it easy to optimize traffic and see hotspots.\n\n
Before Envoy, Lyft used Amazon ELB for service discovery and load balancing, a variety of PHP and Python apps, and some HAProxy where performance was critical. At just 30 services, enough sporadic network issues occurred that the team decided to build something new: a way to get real network transparency even in a variable environment like Amazon EC2.\n\n
In an Envoy setup, each app connects through localhost, where a self-contained Envoy process listens and routes the traffic. In other words, the network is invisible to the app.","rawContent":"Lyft announced a major open source project in 2016, called Envoy. Envoy is a high-performance service bus for SOA that abstracts the details of the network layer and makes it easy to optimize traffic and see hotspots.\n\nBefore Envoy, Lyft used Amazon ELB for service discovery and load balancing, a variety of PHP and Python apps, and some HAProxy where performance was critical. At just 30 services, enough sporadic network issues occurred that the team decided to build something new: a way to get real network transparency even in a variable environment like Amazon EC2.\n\nIn an Envoy setup, each app connects through localhost, where a self-contained Envoy process listens and routes the traffic. In other words, the network is invisible to the app.","publishedAt":"2016-09-08T12:00:00Z","commentsCount":0,"private":false,"upvotesCount":4,"upvoted":false,"flagged":false,"bookmarked":false,"viewCount":58037,"draft":false,"createdAt":"2016-09-08T12:00:00Z","decisionType":null,"showAutoGeneratedTag":false,"permissions":{"type":"id","generated":true,"id":"$StackDecision:102366434434775723.permissions","typename":"Permissions"},"subjectTools":[],"fromTools":[],"toTools":[],"link":{"type":"id","generated":true,"id":"$StackDecision:102366434434775723.link","typename":"Link"},"company":{"type":"id","generated":false,"id":"Company:101231709499257139","typename":"Company"},"topics":[],"stack":null,"services":[{"type":"id","generated":false,"id":"Tool:101231773860248995","typename":"Tool"}],"user":{"type":"id","generated":false,"id":"User:102366402938755636","typename":"User"},"rootComments":[],"__typename":"StackDecision","answers({\"first\":2})":{"type":"id","generated":true,"id":"$StackDecision:102366434434775723.answers({\"first\":2})","typename":"StackDecisionConnection"
}},"$StackDecision:102366434434775723.permissions":{"edit":false,"delete":false,"__typename":"Permissions"},"$StackDecision:102366434434775723.link":{"url":"https://eng.lyft.com/announcing-envoy-c-l7-proxy-and-communication-bus-92520b6c8191","title":"Announcing Envoy: C++ L7 proxy and communication bus","imageUrl":"https://cdn-images-1.medium.com/fit/c/304/304/1*[email protected]","__typename":"Link"},"$StackDecision:102366434434775723.answers({\"first\":2})":{"count":0,"pageInfo":{"type":"id","generated":true,"id":"$StackDecision:102366434434775723.answers({\"first\":2}).pageInfo","typename":"PageInfo"},"edges":[],"__typename":"StackDecisionConnection"},"$StackDecision:102366434434775723.answers({\"first\":2}).pageInfo":{"hasNextPage":false,"endCursor":null,"__typename":"PageInfo"},"$StackProfile:101231778534857432.stackDecisions({\"currentStackOnly\":true,\"first\":5}).edges.3":{"node":{"type":"id","generated":false,"id":"StackDecision:102366434434775723","typename":"StackDecision"},"__typename":"StackDecisionEdge"},"StackDecision:102366435528658523":{"id":"102366435528658523","publicId":"102366435528658523","htmlContent":"
In mid-2017, Lyft decided to introduce a type system to its large JavaScript code base. After assessing the options, they decided to move forward with TypeScript, emphasizing its popularity both among Lyft engineers and among the developer community as a whole. \n\n
Since adopting TypeScript, the team has implemented TSLint for linting all TypeScript projects and has created many TypeScript-only projects, like a custom React JavaScript-to-TypeScript converter and an integration with their internal CSS library. ","rawContent":"In mid-2017, Lyft decided to introduce a type system to its large JavaScript code base. After assessing the options, they decided to move forward with TypeScript, emphasizing its popularity both among Lyft engineers and among the developer community as a whole. \n\nSince adopting TypeScript, the team has implemented TSLint for linting all TypeScript projects and has created many TypeScript-only projects, like a custom React JavaScript-to-TypeScript converter and an integration with their internal CSS library. ","publishedAt":"2017-09-09T12:00:00Z","commentsCount":0,"private":false,"upvotesCount":4,"upvoted":false,"flagged":false,"bookmarked":false,"viewCount":52985,"draft":false,"createdAt":"2017-09-09T12:00:00Z","decisionType":null,"showAutoGeneratedTag":false,"permissions":{"type":"id","generated":true,"id":"$StackDecision:102366435528658523.permissions","typename":"Permissions"},"subjectTools":[],"fromTools":[],"toTools":[],"link":{"type":"id","generated":true,"id":"$StackDecision:102366435528658523.link","typename":"Link"},"company":{"type":"id","generated":false,"id":"Company:101231709499257139","typename":"Company"},"topics":[],"stack":null,"services":[{"type":"id","generated":false,"id":"Tool:101231773849098825","typename":"Tool"},{"type":"id","generated":false,"id":"Tool:101231774098455524","typename":"Tool"}],"user":{"type":"id","generated":false,"id":"User:102366402938755636","typename":"User"},"rootComments":[],"__typename":"StackDecision","answers({\"first\":2})":{"type":"id","generated":true,"id":"$StackDecision:102366435528658523.answers({\"first\":2})","typename":"StackDecisionConnection"}},"$StackDecision:102366435528658523.permissions":{"edit":false,"delete":false,"__typename":"
Permissions"},"$StackDecision:102366435528658523.link":{"url":"https://eng.lyft.com/typescript-at-lyft-64f0702346ea","title":"TypeScript at Lyft â Lyft Engineering","imageUrl":"https://cdn-images-1.medium.com/fit/c/304/304/1*[email protected]","__typename":"Link"},"Tool:101231774098455524":{"id":"101231774098455524","name":"TypeScript","slug":"typescript","title":"A superset of JavaScript that compiles to clean JavaScript output","verified":false,"imageUrl":"https://img.stackshare.io/service/1612/bynNY5dJ.jpg","canonicalUrl":"/typescript","path":"/typescript","votes":502,"fans":83376,"stacks":93669,"followingTool":false,"followContext":null,"__typename":"Tool"},"$StackDecision:102366435528658523.answers({\"first\":2})":{"count":0,"pageInfo":{"type":"id","generated":true,"id":"$StackDecision:102366435528658523.answers({\"first\":2}).pageInfo","typename":"PageInfo"},"edges":[],"__typename":"StackDecisionConnection"},"$StackDecision:102366435528658523.answers({\"first\":2}).pageInfo":{"hasNextPage":false,"endCursor":null,"__typename":"PageInfo"},"$StackProfile:101231778534857432.stackDecisions({\"currentStackOnly\":true,\"first\":5}).edges.4":{"node":{"type":"id","generated":false,"id":"StackDecision:102366435528658523","typename":"StackDecision"},"__typename":"StackDecisionEdge"}}