Driven by curiosity: an eager self-learner, open-source contributor, and PyData volunteer and speaker, currently looking for a Data Scientist role where I can fully apply my analytical, programming, and communication skills.
Passionate about developing and deploying machine learning models. Proficient in using Python for data analysis and machine learning tasks. Skilled in using data visualization tools to communicate insights effectively. Familiar with cloud platforms such as Azure and Google Cloud. I want to drive decision-making and solve problems using data.
-
I’m currently working on:
📊 PyData Amsterdam 2024 Talk: I will talk about my experience of going down the rabbit hole of contributing to open-source software, highlighting key learnings and practical steps for beginners. The talk covers overcoming self-doubt, learning through collaboration, and the unexpected joys of community engagement, as well as what you can learn from contributing to open source as an aspiring Data Scientist and what you probably will not.
🏃 PyData Amsterdam 2024 Open Source Sprint: Narwhals. Narwhals is an extremely lightweight and extensible compatibility layer between dataframe libraries, and it needs your help! An open source sprint is the perfect opportunity to make your first contribution to open source. The core maintainers of the Narwhals package will prepare a list of easy and accessible first issues to get started with, and will be present in this session to guide you through your first commit to the package. It is a great way to give back to the Python ecosystem while having some fun.
🐳 Contributing to the Dask/PyArrow backend in Narwhals: At Narwhals, we’re committed to helping you build dataframe-agnostic tools. Whether your users prefer pandas or Polars dataframes, or even PyArrow tables, Narwhals has you covered. There’s still plenty of work to do, so if you’d like to contribute and enhance Narwhals, feel free to check out our Contributing Guide and join us on Discord.
🤖 Did ChatGPT replace Juniors? Inspired by personal curiosity and a 2023 hackathon challenge (where it won the ‘Most Polished’ category), this project investigates the impact of large language models like ChatGPT on entry-level roles in tech. Skills demonstrated include data cleaning, data wrangling, data analysis, and modeling, using tools such as Python, APIs, Polars, and hvPlot.
-
🌱 I’m currently learning 🐻❄️ Polars, and here is Ritchie Vink, the creator of Polars, with my graffiti:
-
👨💻 All of my projects are available at https://github.com/anopsy
-
📑If you'd like to hire me, check my CV
-
📝 I write about my learning journey on https://medium.com/@anopsy28
-
📫 How to reach me [email protected]
-
⚡ Fun fact 🎨 I paint graffiti portraits
🎨 Selected Portfolio Projects
┣━━ contributing to OSS at:
┃   ┣━━ 🧱scikit-lego
┃   ┃   ┣━━ contributed to docs
┃   ┃   ┗━━ made ColumnSelector dataframe agnostic using Narwhals
┃   ┣━━ 🐳🦄narwhals
┃   ┃   ┣━━ worked on pyarrow/dask backend implementation
┃   ┃   ┗━━ contributed to docs and tests
┃   ┗━━ 💡embetter
┃       ┣━━ deprecated a method
┃       ┗━━ added pre-commit hooks
┃
┣━━ Juniors_vs_ChatGPT
┃   - Did ChatGPT replace Juniors and Interns?
┃   ┣━━ data cleaning
┃   ┣━━ data wrangling
┃   ┣━━ data analysis
┃   ┣━━ modeling
┃   ┗━━ python🐍/API/polars🐻❄️/hvplot📊
┃
┣━━ Compensation Prediction
┃   - How much do Engineers earn?
┃   ┣━━ data modeling
┃   ┣━━ model evaluation
┃   ┣━━ containerization using docker
┃   ┣━━ building streamlit app
┃   ┗━━ python🐍/scikit-learn/streamlit📈/docker📦
┃
┣━━ MaskMap: Decoding the Hidden Spectrum
┃   - Prototype of a diagnosis support tool using the power of NLP to identify symptoms of Autistic Masking
┃   ┣━━ data scraping
┃   ┣━━ data cleaning
┃   ┣━━ modeling
┃   ┣━━ deploying
┃   ┗━━ python🐍/pandas🐼/FastAPI
┃
┣━━ Equity in Healthcare: Women in Data Science Datathon 2024
┃   - WiDS Datathon Project predicting a timely diagnosis of Metastatic Cancer
┃   ┣━━ data cleaning
┃   ┣━━ data wrangling
┃   ┣━━ data analysis
┃   ┣━━ modeling
┃   ┗━━ python🐍/pandas🐼/ensemble🌳/keras🧠
┃
┣━━ Relative Search Volumes Analysis
┃   - Search Volumes for Autism vs Autism Spectrum Disorder around the world
┃   ┣━━ data scraping
┃   ┣━━ data cleaning
┃   ┣━━ modeling WIP
┃   ┗━━ python🐍/pandas🐼
┃
┣━━ Steelplate Defect Visual EDA
┃   - Colorful joyplots for Visual EDA
┃   ┣━━ data visualization
┃   ┣━━ ensemble
┃   ┗━━ python🐍/pandas🐼/xgb🌳/seaborn🎨
┃
┣━━ hossenfelder - 🦺WIP
┃   - Data Analysis and Prediction of views on Sabine Hossenfelder YT channel
┃   ┣━━ data scraping
┃   ┣━━ data cleaning
┃   ┣━━ modeling WIP
┃   ┗━━ python🐍/pandas🐼
┃
┗━━ MyFalaClassifier - 🦺WIP
    - Detector of surfable waves
    ┣━━ live-stream scraping
    ┣━━ image processing
    ┣━━ transfer learning
    ┣━━ deploying
    ┗━━ python🐍/keras🧠