This analysis examines how fish behave in response to different environmental and social stimuli, using a dataset that records several shoaling metrics before and after treatment. The data were processed and visualized in Python with pandas, seaborn, and matplotlib, and the analysis finds significant shifts in shoaling dynamics when pre-treatment and post-treatment behaviors are compared across solitary and group settings. The dataset comes from Michelangeli, M., Munson, A. and Sih, A. (2021), "Stable social groups foster conformity and among-group differences", Zenodo, doi: 10.25338/B84W7B.
Initial checks reveal missing values and provide a statistical overview of the data, setting the stage for a closer look at shoaling behavior. Boxplots show a clear contrast between behavior before and after treatment, highlighting the impact of environmental manipulation on shoaling preferences. Subsequent t-tests within each treatment type (Group and Solitary) reveal statistically significant differences, pointing to the role of social context in shaping shoaling behavior.
I set out to test a range of analysis techniques, including statistical tests, data visualization, machine learning, and feature importance analysis, to uncover insights into the shoaling behavior of fish. This objective served a dual purpose: to generate scientific insights and to showcase my analytical skill set.
Data Preparation and Visualization

Using Python and its libraries (pandas for data manipulation, seaborn and matplotlib for visualization), I began by processing the dataset to identify and address missing values, which gave me a clean foundation for further analysis. Visual comparisons of pre-treatment and post-treatment shoaling behavior provided the first insights into behavioral changes and showcased my ability to manipulate and visualize complex datasets.
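As a rough illustration of this step, the sketch below counts missing values, drops incomplete rows, and draws a pre/post boxplot. It runs on synthetic data: the column names (`treatment`, `shoaling_pre`, `shoaling_post`) and the choice to drop rows rather than impute are assumptions for the example, not details taken from the real dataset.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic stand-in for the real dataset; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "treatment": np.repeat(["Group", "Solitary"], n // 2),
    "shoaling_pre": rng.normal(300, 40, n),
    "shoaling_post": rng.normal(280, 40, n),
})
df.loc[[3, 17, 42], "shoaling_post"] = np.nan  # inject a few missing values

# Count missing values per column, then drop rows missing either measurement.
missing_counts = df.isna().sum()
clean = df.dropna(subset=["shoaling_pre", "shoaling_post"])

# Reshape to long format so seaborn can draw pre and post side by side.
long = clean.melt(
    id_vars="treatment",
    value_vars=["shoaling_pre", "shoaling_post"],
    var_name="phase",
    value_name="shoaling_time",
)

fig, ax = plt.subplots(figsize=(6, 4))
sns.boxplot(data=long, x="treatment", y="shoaling_time", hue="phase", ax=ax)
ax.set_title("Shoaling before and after treatment (synthetic data)")
fig.tight_layout()
```

Dropping rows is the simplest way to handle gaps; if many rows were affected, imputation or a model that tolerates missing values might be a better fit.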
Statistical Analysis

By applying t-tests, I assessed the impact of treatment on shoaling behavior within each social setting (solitary vs. group). This step contributed to the scientific understanding of the subject and demonstrated my ability to apply statistical methods to real-world data.
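The sketch below shows one way such a comparison could be run. It assumes a paired t-test (`scipy.stats.ttest_rel`), on the reasoning that the same fish are measured before and after treatment; the write-up does not say which variant was used. The data are synthetic, and the column names and effect sizes are made up for the example.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data; column names and effect sizes are assumptions for illustration.
rng = np.random.default_rng(1)
n = 40
treatment = np.repeat(["Group", "Solitary"], n // 2)
pre = rng.normal(300, 30, n)
shift = np.where(treatment == "Group", -40.0, -5.0)  # assumed treatment effects
post = pre + shift + rng.normal(0, 20, n)
df = pd.DataFrame({"treatment": treatment,
                   "shoaling_pre": pre,
                   "shoaling_post": post})

# Paired t-test of pre vs. post within each treatment type.
results = {}
for name, sub in df.groupby("treatment"):
    t_stat, p_value = stats.ttest_rel(sub["shoaling_pre"], sub["shoaling_post"])
    results[name] = {"t": float(t_stat), "p": float(p_value)}
    print(f"{name}: t = {t_stat:.2f}, p = {p_value:.4f}")
```

If the pre and post measurements came from different individuals, `scipy.stats.ttest_ind` would be the appropriate call instead.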
Exploratory Data Analysis

I explored how different arenas and social contexts influence shoaling behavior, using groupby aggregations and descriptive statistics. This phase drew on my exploratory data analysis skills and revealed the main patterns in the data.
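A groupby aggregation of this kind might look like the following. The tiny table and its column names (`arena`, `treatment`, `shoaling_post`) are invented for the example and do not reflect the real dataset's schema.

```python
import pandas as pd

# Tiny hand-made table; column names and values are hypothetical.
df = pd.DataFrame({
    "arena": ["A", "A", "B", "B", "A", "B"],
    "treatment": ["Group", "Solitary", "Group", "Solitary", "Group", "Solitary"],
    "shoaling_post": [250.0, 310.0, 270.0, 330.0, 230.0, 350.0],
})

# Per-arena, per-treatment count, mean, and standard deviation.
summary = (
    df.groupby(["arena", "treatment"])["shoaling_post"]
      .agg(["count", "mean", "std"])
      .reset_index()
)
print(summary)

# Overall descriptive statistics for the numeric column.
print(df["shoaling_post"].describe())
```

Cells with a single observation get a standard deviation of NaN, which is a useful reminder to check group sizes before reading too much into a group mean.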
Machine Learning

Using a Gradient Boosting Regressor, I moved into predictive modeling to examine heterogeneous treatment effects, tuning the model to improve its predictive power and interpretability. This analysis drew on my skills in applying machine learning algorithms and interpreting their results.
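One common way to probe heterogeneous treatment effects with a single regressor is the so-called S-learner approach: fit one model with the treatment indicator as a feature, then compare predictions with the indicator switched on and off. The sketch below uses that approach with scikit-learn's `GradientBoostingRegressor`. It is an assumption about how the analysis could be set up, not a record of the exact method used, and the features (`is_group`, `shoaling_pre`, `body_length`) and the data-generating process are invented.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic data; feature names and the effect structure are assumptions.
rng = np.random.default_rng(2)
n = 400
X = pd.DataFrame({
    "is_group": rng.integers(0, 2, n),        # 1 = Group, 0 = Solitary
    "shoaling_pre": rng.normal(300, 30, n),
    "body_length": rng.normal(30, 3, n),
})
# Assumed effect: group housing lowers shoaling, more so for high-baseline fish.
effect = -20.0 - 0.3 * (X["shoaling_pre"] - 300)
y = X["shoaling_pre"] + X["is_group"] * effect + rng.normal(0, 10, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.05, max_depth=3, random_state=0
)
model.fit(X_train, y_train)

# S-learner step: predict each test fish under both treatment values.
pred_group = model.predict(X_test.assign(is_group=1))
pred_solitary = model.predict(X_test.assign(is_group=0))
estimated_effect = pred_group - pred_solitary

print("Test R^2:", round(model.score(X_test, y_test), 3))
print("Mean estimated effect:", round(float(estimated_effect.mean()), 2))
```

The spread of `estimated_effect` across individuals is what hints at heterogeneity. With observational data these differences are only as trustworthy as the assumption that treatment assignment is unconfounded.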
Feature Importance and Model Evaluation

Through feature importance analysis, I identified the key drivers of shoaling behavior and refined the model based on these insights. I evaluated the model's performance using mean squared error and R², and investigated the residuals, demonstrating a solid understanding of model evaluation techniques.
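A minimal sketch of these evaluation steps follows, using the impurity-based `feature_importances_` attribute together with `mean_squared_error` and `r2_score` from scikit-learn. The data are synthetic and the feature names, including the deliberately irrelevant `noise_feature`, are assumptions for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data; feature names and coefficients are assumptions.
rng = np.random.default_rng(3)
n = 300
X = pd.DataFrame({
    "shoaling_pre": rng.normal(300, 30, n),
    "is_group": rng.integers(0, 2, n),
    "noise_feature": rng.normal(0, 1, n),  # unrelated to the target by design
})
y = 0.8 * X["shoaling_pre"] - 25.0 * X["is_group"] + rng.normal(0, 10, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Impurity-based importances sum to 1 across features.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

# Held-out performance metrics.
pred = model.predict(X_test)
mse = mean_squared_error(y_test, pred)
r2 = r2_score(y_test, pred)

print(importances)
print(f"MSE = {mse:.2f}, R^2 = {r2:.3f}")
```

Impurity-based importances can favor high-cardinality features, so `sklearn.inspection.permutation_importance` on held-out data is a reasonable cross-check before dropping features.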
Residual Analysis

Examining the model's residuals gave me a critical check on its predictive capabilities. This step showcased my ability to run thorough model diagnostics and to refine predictive models based on what the residuals reveal.
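Residual diagnostics of this kind can be sketched in a few lines. The helper `residual_summary` below is a hypothetical function written for this example, and the observed and predicted values are made-up numbers rather than output from the actual model.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt


def residual_summary(y_true, y_pred):
    """Return basic residual diagnostics (hypothetical helper for illustration)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    return {
        "mean": float(residuals.mean()),        # near 0 suggests no overall bias
        "std": float(residuals.std(ddof=1)),    # typical size of an error
        # Correlation with predictions; far from 0 hints at systematic misfit.
        "corr_with_pred": float(np.corrcoef(y_pred, residuals)[0, 1]),
    }


# Made-up observed and predicted values standing in for real model output.
y_true = np.array([310.0, 280.0, 295.0, 330.0, 260.0, 305.0])
y_pred = np.array([300.0, 285.0, 290.0, 320.0, 270.0, 310.0])
summary = residual_summary(y_true, y_pred)
print(summary)

# Residuals vs. predictions: a patternless band around 0 is the ideal picture.
fig, ax = plt.subplots(figsize=(5, 4))
ax.scatter(y_pred, y_true - y_pred)
ax.axhline(0.0, linestyle="--", color="gray")
ax.set_xlabel("Predicted shoaling")
ax.set_ylabel("Residual (observed - predicted)")
fig.tight_layout()
```

A funnel shape or curve in the scatter would suggest non-constant variance or a missing nonlinear term, either of which could motivate a transformed target or additional features.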