EarthlyAlien/README.md

Hello, I'm Chaitanya 👋

Welcome to my GitHub! I'm a Data guy (analytics/engineering/science) with a Master’s in Advanced Data Analytics and a solid foundation in Data Analytics, Data Science, Data Engineering, MLOps, and Business Analytics. I’m passionate about building data-driven solutions that drive growth, innovation, and operational efficiency. My background spans data architecture, scalable ML pipelines, cloud computing, and actionable insights that help teams make strategic decisions.


🛠️ About Me

  • Former Product Lead at Cirrus Nexus (Cumulus Nexus India Pvt Ltd)
  • 👨‍💻 Experienced in Python, R, SQL, Rust, C++, Go, Terraform, and advanced ML frameworks like TensorFlow, PyTorch, and Scikit-Learn
  • ☁️ Proficient in Cloud Platforms: AWS (SageMaker, Glue, Redshift, Lambda), Azure (Data Factory, Synapse, HDInsight, ML Studio), GCP (BigQuery, Looker, Vertex AI Platform); Certified in AWS, Azure, GCP, and Kubernetes
  • 📊 Skilled in Data Engineering (ETL, Data Modeling, Real-Time Streaming), MLOps (CI/CD, Model Deployment), and Data Science (Predictive Modeling, NLP, Computer Vision)
  • 💬 Advocate for Cloud Cost Optimization strategies, helping companies cut costs while improving performance through structured planning

🔭 Projects

  • Data Engineering & Big Data Pipelines – Architecting and optimizing ETL pipelines for large-scale data processing with Apache Spark, Flink, Superset, Dagster, Druid, Delta Lake, dbt, Airflow, Snowflake, and Fivetran
  • MLOps Pipelines – Building end-to-end ML pipelines with Kubernetes, Docker, Jenkins, and Kubeflow to automate model training and deployment, with a focus on scalability and CI/CD workflows
  • Generative AI & NLP Models – Developing cutting-edge models for NLP, including language models and sentiment analysis, using transformer architectures
  • Cloud Infrastructure Optimization – Implementing efficient infrastructure using Terraform and IaC (Infrastructure as Code) to optimize cloud resources on AWS, Azure, and GCP

🌱 Always Learning

  • Scaling Machine Learning Operations – Expanding knowledge in MLflow, Argo, and advanced MLOps for seamless deployment and monitoring of ML models
  • Distributed Systems & Real-Time Analytics – Exploring Apache Flink, Kafka, and Delta Lake for real-time analytics and streaming solutions
  • Advanced Data Engineering – Diving deeper into data warehouse and data lake architecture, leveraging platforms like Snowflake and Databricks

🧩 Key Skills & Technologies

Data Engineering & ETL

  • Tools & Platforms: Apache Spark, Kafka, Hadoop, Snowflake, Databricks, Apache Airflow, Fivetran, dbt
  • Cloud & Big Data: AWS (Lambda, Glue, RDS, S3, EMR, Redshift), Azure Data Factory, Azure Databricks, Azure Synapse, GCP BigQuery, Snowflake
  • Skills: Data Pipeline Design, ETL Optimization, Data Modeling, Real-Time Data Streaming

Data Science & Machine Learning

  • Languages & Libraries: Python, R, Julia, Scala, Java, SQL, Scikit-Learn, TensorFlow, PyTorch, PySpark, Keras, Pandas, Dask
  • Specializations: Predictive Modeling, Time Series, NLP, Deep Learning, Hyperparameter Tuning, Computer Vision

MLOps & DevOps

  • MLOps Tools: Docker, Kubernetes, Jenkins, MLflow, Kubeflow, Argo, Terraform, GitHub Actions
  • CI/CD & Automation: CI/CD Pipelines, Model Versioning, Model Deployment, Monitoring & Logging

Data Visualization & Business Analysis

  • Visualization Tools: Power BI, Tableau, Plotly, Matplotlib, ggplot2
  • Business Tools: JIRA, Confluence, Lucidchart, Microsoft Visio, Business Process Mapping, Requirements Analysis

🎓 Certifications

  • Data Engineering & Cloud:
    • AWS Cloud Data Engineer, Azure Data Engineer, Google Cloud Professional Data Engineer, SnowPro Core, Meta Database Engineer
  • Machine Learning & Data Science:
    • TensorFlow Developer, AWS Certified Machine Learning Specialty, IBM Data Science Professional
  • MLOps & DevOps:
    • Certified Kubernetes Administrator, Terraform Associate, Databricks Certified for Apache Spark

🌟 Featured Projects

Humana-Mays Case Competition

  • Tools: R, SQL, Tableau, ETL
  • Summary: Advanced to Round 2 among 400 teams by designing KPIs to track healthcare patient engagement, creating impactful insights for targeted health improvement.

Real-Time Data Streaming Solution

  • Tools: Kafka, AWS Lambda, Spark
  • Summary: Built a real-time data streaming architecture to process and analyze data instantly, achieving 99.9% system availability and reducing latency for business-critical decisions.

Customer Churn Prediction Model

  • Tools: Python, Scikit-Learn, AWS
  • Summary: Developed a predictive model with 86.2% accuracy to forecast customer churn, allowing for proactive retention strategies and enhancing customer engagement.

Automated ML Pipeline for Model Deployment

  • Tools: Python, Apache Airflow, AWS SageMaker
  • Summary: Created an ML pipeline automating data preprocessing, model training, and deployment, reducing operational costs by 14% while maintaining high model performance.

💬 Let’s Connect!


⚡ Fun Facts

  • ☕ Tea over Coffee! Extra fuel for complex problem-solving.
  • 🎲 Avid puzzle solver and lover of challenging data problems.
  • 👾 I enjoy exploring the latest in Generative AI and contributing to open-source projects.

Thanks for stopping by my profile! Feel free to explore my repos, and let’s collaborate if you share similar interests or need insights on cloud and AI solutions.

Pinned

  1. big-list-of-naughty-strings Public

    Forked from minimaxir/big-list-of-naughty-strings

    The Big List of Naughty Strings is a list of strings which have a high probability of causing issues when used as user-input data.

    Python

  2. earthlyalien.github.io Public

    HTML

  3. Public-APIs Public

    Forked from n0shake/Public-APIs

    📚 A public list of APIs from round the web.

  4. tayllan/awesome-algorithms Public

    A curated list of awesome places to learn and/or practice algorithms.

    21k stars, 2.7k forks

  5. MicrosoftDocs/azure-docs Public

    Open source documentation of Microsoft Azure

    Markdown, 10.3k stars, 21.5k forks

  6. electricitymaps/electricitymaps-contrib Public

    The open source repository for Electricity Maps App and data parsers that enables a real-time visualisation of the CO2 emissions of electricity consumption

    Python, 3.6k stars, 951 forks