Welcome to my GitHub! I'm a Data guy (analytics/engineering/science) with a Master’s in Advanced Data Analytics and a solid foundation in Data Analytics, Data Science, Data Engineering, MLOps, and Business Analytics. I’m passionate about building data-driven solutions that drive growth, innovation, and operational efficiency. My background spans data architecture, scalable ML pipelines, cloud computing, and actionable insights that help teams make strategic decisions.
- ⚡ Former Product Lead at Cirrus Nexus (Cumulus Nexus India Pvt Ltd)
- 👨‍💻 Experienced in Python, R, SQL, Rust, C++, Go, Terraform, and advanced ML frameworks like TensorFlow, PyTorch, and Scikit-Learn
- ☁️ Proficient in Cloud Platforms: AWS (SageMaker, Glue, Redshift, Lambda), Azure (Data Factory, Synapse, HDInsight, ML Studio), GCP (BigQuery, Looker, Vertex AI Platform); Certified in AWS, Azure, GCP, and Kubernetes
- 📊 Skilled in Data Engineering (ETL, Data Modeling, Real-Time Streaming), MLOps (CI/CD, Model Deployment), and Data Science (Predictive Modeling, NLP, Computer Vision)
- 💬 Advocate for Cloud Cost Optimization strategies, helping companies cut costs while improving performance through structured planning
- Data Engineering & Big Data Pipelines – Architecting and optimizing ETL pipelines for large-scale data processing with Apache Spark, Flink, Superset, Dagster, Druid, Delta Lake, dbt, Airflow, Snowflake, and Fivetran (see the minimal DAG sketch after this list)
- MLOps Pipelines – Building end-to-end ML pipelines with Kubernetes, Docker, Jenkins, and Kubeflow to automate model training and deployment, with a focus on scalability and CI/CD workflows
- Generative AI & NLP Models – Developing cutting-edge models for NLP, including language models and sentiment analysis, using transformer architectures
- Cloud Infrastructure Optimization – Implementing efficient infrastructure with Terraform and Infrastructure-as-Code (IaC) practices to optimize cloud resources on AWS, Azure, and GCP
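
To make the pipeline work above a bit more concrete, here's a minimal Airflow DAG sketch of the extract → transform → load pattern I build around. The DAG id, task names, and callables are hypothetical placeholders, not code from a production pipeline.

```python
# Minimal, hypothetical Airflow DAG sketch: extract -> transform -> load.
# Function bodies are stubs; a real pipeline would call Spark jobs, dbt runs,
# or warehouse loaders (Snowflake, Redshift, etc.) instead.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    # Placeholder: pull raw records from a source system or object store.
    return ["raw_record_1", "raw_record_2"]


def transform_orders(**context):
    # Placeholder: clean/aggregate the extracted records.
    raw = context["ti"].xcom_pull(task_ids="extract_orders")
    return [r.upper() for r in raw]


def load_to_warehouse(**context):
    # Placeholder: write the transformed rows to the warehouse.
    rows = context["ti"].xcom_pull(task_ids="transform_orders")
    print(f"Loading {len(rows)} rows")


with DAG(
    dag_id="orders_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # `schedule_interval` on older Airflow 2.x
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    transform = PythonOperator(task_id="transform_orders", python_callable=transform_orders)
    load = PythonOperator(task_id="load_to_warehouse", python_callable=load_to_warehouse)

    extract >> transform >> load
```

In a real setup the stub callables get swapped for Spark submits, dbt runs, or warehouse COPY steps, while the dependency graph stays the same.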
- Scaling Machine Learning Operations – Expanding knowledge in MLflow, Argo, and advanced MLOps for seamless deployment and monitoring of ML models
- Distributed Systems & Real-Time Analytics – Exploring Apache Flink, Kafka, and Delta Lake for real-time analytics and streaming solutions (a small Structured Streaming sketch follows this list)
- Advanced Data Engineering – Diving deeper into data warehouse and data lake architecture, leveraging platforms like Snowflake and Databricks
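
As a small illustration of that streaming stack, here's a hedged PySpark Structured Streaming sketch that reads events from Kafka and appends them to a Delta table. The broker address, topic, schema, and paths are placeholders, and it assumes the Kafka connector and delta-spark packages are available on the cluster.

```python
# Hypothetical streaming sketch: read events from Kafka, parse JSON, append to Delta.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("events_stream_sketch").getOrCreate()

event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "user-events")                # placeholder topic
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/user-events")  # placeholder path
    .outputMode("append")
    .start("/tmp/delta/user_events")                               # placeholder table path
)

query.awaitTermination()
```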
- Tools & Platforms: Apache Spark, Kafka, Hadoop, Snowflake, Databricks, Apache Airflow, Fivetran, dbt
- Cloud & Big Data: AWS (Lambda, Glue, RDS, S3, EMR, Redshift), Azure Data Factory, Azure Databricks, Azure Synapse, GCP BigQuery, Snowflake
- Skills: Data Pipeline Design, ETL Optimization, Data Modeling, Real-Time Data Streaming
- Languages & Libraries: Python, R, Julia, Scala, Java, SQL, Scikit-Learn, TensorFlow, PyTorch, PySpark, Keras, Pandas, Dask
- Specializations: Predictive Modeling, Time Series, NLP, Deep Learning, Hyperparameter Tuning, Computer Vision
- MLOps Tools: Docker, Kubernetes, Jenkins, MLflow, Kubeflow, Argo, Terraform, GitHub Actions (a small MLflow tracking sketch follows these lists)
- CI/CD & Automation: CI/CD Pipelines, Model Versioning, Model Deployment, Monitoring & Logging
- Visualization Tools: Power BI, Tableau, Plotly, Matplotlib, ggplot2
- Business Tools: JIRA, Confluence, Lucidchart, Microsoft Visio, Business Process Mapping, Requirements Analysis
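
As an example of the experiment-tracking and model-versioning side of that toolbox, here's a minimal MLflow tracking sketch. The experiment name, parameters, and dataset are placeholders; it only shows the general logging pattern.

```python
# Hypothetical MLflow tracking sketch: log params, metrics, and a model artifact.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("churn-baseline-sketch")  # placeholder experiment name

with mlflow.start_run():
    params = {"C": 1.0, "max_iter": 500}
    model = LogisticRegression(**params).fit(X_train, y_train)

    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))
    mlflow.sklearn.log_model(model, "model")  # versioned model artifact
```

Each run is recorded in the tracking server, which keeps model comparison and rollback straightforward.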
- Data Engineering & Cloud:
- AWS Cloud Data Engineer, Azure Data Engineer, Google Cloud Professional Data Engineer, SnowPro Core, Meta Database Engineer
- Machine Learning & Data Science:
- TensorFlow Developer, AWS Certified Machine Learning Specialty, IBM Data Science Professional
- MLOps & DevOps:
- Certified Kubernetes Administrator, Terraform Associate, Databricks Certified for Apache Spark
- Tools: R, SQL, Tableau, ETL
- Summary: Advanced to Round 2 among 400 teams by designing KPIs to track healthcare patient engagement and translating them into actionable insights for targeted health improvement.
- Tools: Kafka, AWS Lambda, Spark
- Summary: Built a real-time data streaming architecture to process and analyze data instantly, achieving 99.9% system availability and reducing latency for business-critical decisions.
- Tools: Python, Scikit-Learn, AWS
- Summary: Developed a predictive model with 86.2% accuracy to forecast customer churn, enabling proactive retention strategies and stronger customer engagement (a minimal modeling sketch follows the project list).
- Tools: Python, Apache Airflow, AWS SageMaker
- Summary: Created an ML pipeline automating data preprocessing, model training, and deployment, reducing operational costs by 14% while maintaining high model performance.
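
For a flavor of the churn project above, here's a minimal scikit-learn sketch of the modeling pattern. The dataset path, feature columns, and classifier choice are illustrative placeholders rather than the actual project code.

```python
# Minimal churn-modeling sketch: preprocessing + classifier in one Pipeline.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("customers.csv")  # placeholder dataset
numeric_cols = ["tenure_months", "monthly_charges"]     # placeholder features
categorical_cols = ["contract_type", "payment_method"]  # placeholder features
X = df[numeric_cols + categorical_cols]
y = df["churned"]  # placeholder binary target

preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("clf", GradientBoostingClassifier(random_state=42)),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
model.fit(X_train, y_train)
print(f"Holdout accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```

Bundling preprocessing and the classifier in a single Pipeline keeps the whole model deployable as one artifact.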
- 📫 Email: [email protected]
- 💼 LinkedIn: linkedin.com/in/chaitanyavankadaru
- 📝 Blog: Coming soon; I'll be sharing insights on data engineering, MLOps, and AI-driven strategies!
- ☕ Tea over Coffee! Extra fuel for complex problem-solving.
- 🎲 Avid puzzle solver and lover of challenging data problems.
- 👾 I enjoy exploring the latest in Generative AI and contributing to open-source projects.
Thanks for stopping by my profile! Feel free to explore my repos, and let’s collaborate if you share similar interests or need insights on cloud and AI solutions.