Hey
### About Me
- Currently Senior EM at Databricks focusing on data portal, data lineage and other discovery efforts for Unity Catalog
- Apache Airflow PMC member and committer
- Co-creator and maintainer of [**Amundsen**](https://github.com/lyft/amundsen)
### Get in Touch
- Amundsen: [github.com/amundsen-io/amundsen](https://github.com/amundsen-io/amundsen)
- Twitter: [@photoft45](https://twitter.com/photoft45)
- Slack: [@amundsen / Tao Feng](https://join.slack.com/t/amundsenworkspace/shared_invite/enQtNTk2ODQ1NDU1NDI0LTc3MzQyZmM0ZGFjNzg5MzY1MzJlZTg4YjQ4YTU0ZmMxYWU2MmVlMzhhY2MzMTc1MDg0MzRjNTA4MzRkMGE0Nzk)
- LinkedIn: [@tao-f-17195814](https://www.linkedin.com/in/tao-f-17195814/)
### Talks & Writings
#### Conference & Meetup Presentations
- [Democratize Data Discovery And Data Insight With Databricks Platform](https://www.databricks.com/dataaisummit/session/democratize-data-discovery-and-data-insight-databricks) @ Data+AI Summit 2024
- [Discover Data Lakehouse With E2E Lineage](https://www.databricks.com/dataaisummit/session/discover-data-lakehouse-end-end-lineage) @ Data+AI Summit NA 2022
- [Data Discovery at Databricks with Amundsen](https://databricks.com/session_na21/data-discovery-at-databricks-with-amundsen) @ Data+AI Summit NA 2021
- [Data discovery Amundsen & Presto](https://www.meetup.com/prestodb/events/274895626/) @ Presto DB meetup Dec 2020
- [Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metadata Platform](https://databricks.com/session_eu20/solving-data-discovery-challenges-at-lyft-with-amundsen-an-open-source-metadata-platform) @ Data+AI Summit Europe 2020
- [Project Amundsen update](https://events.linuxfoundation.org/open-source-summit-europe/program/schedule/) @ LFAI Mini Summit and Open Source Summit Europe 2020
- [Airflow Summit 2020 invited keynote](https://airflowsummit.org/speakers/) ([slides](https://www.slideshare.net/taofung/airflow-at-lyft-airflow-summit2020))
- [Airflow @ Lyft](https://www.meetup.com/SF-Big-Analytics/events/259771952/) @ SF Big Analytics Meetup April 2019
- [Amundsen: A Data Discovery Platform from Lyft](https://www.datacouncil.ai/talks/amundsen-a-data-discovery-platform-from-lyft?hsLang=en) @ Data Council SF April 2019
- [Disrupting Data Discovery](https://www.slideshare.net/taofung/strata-sf-amundsen-presentation) @ Strata SF 2019
#### Engineering Blogs
- [Accelerating discovery on Unity Catalog with a revamped Catalog Explorer](https://www.databricks.com/blog/accelerating-discovery-unity-catalog-revamped-catalog-explorer) @ Databricks Engineering blog 2024
- [Creating a bespoke LLM for AI-generated documentation](https://www.databricks.com/blog/creating-bespoke-llm-ai-generated-documentation) @ Databricks Engineering blog 2023
- [Announcing Public Preview of AI Generated Documentation In Databricks Unity Catalog](https://www.databricks.com/blog/announcing-public-preview-ai-generated-documentation-databricks-unity-catalog) @ Databricks Platform blog 2023
- [Announcing General Availability of Data lineage in Unity Catalog](https://www.databricks.com/blog/2022/12/12/announcing-general-availability-data-lineage-unity-catalog.html) @ Databricks Platform blog 2022
- [Announcing Public Preview of Data Lineage in Unity Catalog](https://www.databricks.com/blog/2022/09/12/announcing-public-preview-data-lineage-unity-catalog.html) @ Databricks Platform blog 2022
- [Announcing the Availability of Data Lineage With Unity Catalog](https://databricks.com/blog/2022/06/08/announcing-the-availability-of-data-lineage-with-unity-catalog.html) @ Databricks Platform blog 2022
- [Amundsen: one year later](https://eng.lyft.com/amundsen-1-year-later-7b60bf28602) @ Lyft engineering blog 2020
- [Open Sourcing Amundsen: A Data Discovery And Metadata Platform](https://eng.lyft.com/open-sourcing-amundsen-a-data-discovery-and-metadata-platform-2282bb436234) @ Lyft engineering blog 2019
- [Securing Apache Airflow UI With DAG Level Access](https://eng.lyft.com/securing-apache-airflow-ui-with-dag-level-access-a7bc649a2821) @ Lyft engineering blog 2019
- [Running Apache Airflow At Lyft](https://eng.lyft.com/running-apache-airflow-at-lyft-6e53bb8fccff) @ Lyft engineering blog 2018
- [Common Issue Detection for CPU Profiling](https://engineering.linkedin.com/blog/2017/09/common-issue-detection-for-cpu-profiling) @ LinkedIn engineering blog 2017
- [ODP: An Infrastructure for On-Demand Service Profiling](https://engineering.linkedin.com/blog/2017/01/odp--an-infrastructure-for-on-demand-service-profiling) @ LinkedIn engineering blog 2017
- [Benchmarking Apache Samza: 1.2 million messages per second on a single node](https://engineering.linkedin.com/performance/benchmarking-apache-samza-12-million-messages-second-single-node) @ LinkedIn engineering blog 2015
#### Conference Papers
- [ODP: An Infrastructure for On-Demand Service Profiling](https://www.slideshare.net/taofung/odp-on-demand-profiler-icpe-2018) @ IEEE ICPE 2018
- [Effective Multi-stream Joining for Enhancing Data Quality in Apache Samza Framework](https://www.slideshare.net/taofung/effective-multistream-joining-in-apache-samza-framework) @ IEEE BigData Congress 2016
- [A Memory Capacity Model for High Performing Data-filtering Applications in Samza Framework](https://www.slideshare.net/taofung/a-memory-capacity-model-for-high-performing-datafiltering-applications-in-samza-framework-85955263) @ IEEE Big Data 2015
#### Podcasts
- [Interview with Software Engineering Daily on Data Discovery at Lyft](https://softwareengineeringdaily.com/2019/04/16/lyft-data-discovery-with-tao-feng-and-mark-grover/)
- [Interview with Data Engineering Podcast on Amundsen](https://www.dataengineeringpodcast.com/amundsen-data-discovery-episode-92/)
#### Patents
- [On-demand profiling based on event streaming architecture](https://patents.google.com/patent/US10019340) (granted)
- [pending patent]