Author: Sierra Stanton
This project analyses real estate data from Zillow Research to determine which areas are prime for investment. We have been charged with helping Hellbent Investments narrow down the ZIP codes that they should target as they plan their next round of investment. We'll use Zillow's dataset to dive into home values in Atlanta across time, and narrow down five ZIP codes worthy of investment according to projected ROI.
We'll be forecasting real estate prices of various ZIP codes using data from Zillow Research. For this project, we'll be acting as a consultant for Hellbent Investments, a real-estate investment firm focused on development in Atlanta, GA.
Hellbent Investments has asked me what seems like a simple question:
What are the top 5 best ZIP codes for us to invest in?
In order to provide a solid recommendation, we met with the firm and determined that projected ROI across the next three years is the best way to narrow down the ZIP codes they'll want to focus on.
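As a rough illustration of the ranking criterion, here is a minimal sketch of how projected ROI could be computed and used to order ZIP codes. The function names, the placeholder ZIP labels, and the dollar values are all hypothetical and chosen only for illustration; the notebooks may compute ROI differently.

```python
def projected_roi(current_value: float, forecast_value: float) -> float:
    """Return projected ROI as a fraction: (forecast - current) / current."""
    if current_value <= 0:
        raise ValueError("current_value must be positive")
    return (forecast_value - current_value) / current_value


def rank_by_roi(current: dict, forecast: dict, top_n: int = 5) -> list:
    """Return the top_n (zip_code, roi) pairs, highest ROI first."""
    rois = {
        zip_code: projected_roi(current[zip_code], forecast[zip_code])
        for zip_code in current
        if zip_code in forecast
    }
    return sorted(rois.items(), key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical home values (not taken from the Zillow data).
current_values = {"A": 200_000.0, "B": 150_000.0, "C": 300_000.0}
forecast_values = {"A": 250_000.0, "B": 165_000.0, "C": 294_000.0}

ranking = rank_by_roi(current_values, forecast_values, top_n=2)
for zip_code, roi in ranking:
    print(f"{zip_code}: {roi:.1%} projected return")
```

Here the forecast value would come from whichever model is used for the horizon of interest; the ranking step itself is independent of the forecasting method.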
First, we'll investigate home value data across time within Atlanta's ZIP codes. Second, we'll investigate the ZIP codes themselves to gauge potential contributing factors. Third, we'll use time series forecasting via ARIMA and Facebook Prophet to chart out future home values and narrow down our list of ZIP codes to just five targets.
Data will be used from the following source:
- Zillow Research - this sector of Zillow aims to be the most open, authoritative source for timely and accurate housing data and unbiased insight.
Zillow Research's Zillow Data (zillow_data.csv): this dataset shows us the average housing sales prices in the United States based on location while shedding light on other location-focused aspects through rankings such as population density. This set shows us information on over 14,000 ZIP codes - let's explore it further.
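To make the exploration step concrete, here is a minimal sketch of pulling one city's ZIP codes out of a wide-format CSV and turning each row into a monthly series. It assumes one row per ZIP code with identifier columns followed by one column per month; the column names (`RegionName`, `City`, `State`, `SizeRank`), the `YYYY-MM` month labels, and the sample rows are assumptions for illustration and may not match zillow_data.csv exactly.

```python
import csv
import io

# Tiny in-memory stand-in for zillow_data.csv; ZIP labels and values are made up.
SAMPLE_CSV = """RegionName,City,State,SizeRank,1996-04,1996-05,1996-06
00001,Atlanta,GA,10,100000,101000,102000
00002,Atlanta,GA,20,80000,80500,81000
00003,Chicago,IL,1,200000,201000,202000
"""

# Assumed identifier columns; every other column is treated as a month.
ID_COLUMNS = {"RegionName", "City", "State", "SizeRank"}


def load_city_series(handle, city, state):
    """Return {zip_code: [(month, value), ...]} for rows matching city and state."""
    series = {}
    for row in csv.DictReader(handle):
        if row["City"] != city or row["State"] != state:
            continue
        # YYYY-MM labels sort chronologically as plain strings.
        months = sorted(col for col in row if col not in ID_COLUMNS)
        series[row["RegionName"]] = [
            (month, float(row[month])) for month in months if row[month] != ""
        ]
    return series


atlanta = load_city_series(io.StringIO(SAMPLE_CSV), "Atlanta", "GA")
for zip_code, points in atlanta.items():
    print(zip_code, points)
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle, and the identifier columns adjusted to whatever the header actually contains.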
I use descriptive analysis and time-series modeling to show:
- How the home values across Atlanta ZIP codes ranged over twenty-two years
- The population density across Atlanta ZIP codes
- The effect of the '08 housing crisis and how we narrowed the data to limit its influence on our forecasts
Due to the datasets used and their current nature, I'm confident these findings will prove relevant as Hellbent Investments begins their next round of real-estate investment. There are many more ways we can deepen our analysis to prove even more insightful with more time and access to additional data.
Here's how we estimate the strongest Atlanta ZIP codes will perform in the next two years:
- 30308: 24% return
- 30316: 16.5% return
- 30331: 15.7% return
- 30317: 9.3% return
- 30363: -2% return
This goes beyond our initial time-series analysis and is worth further exploration.
Since we have fairly large confidence intervals for these ZIP codes and can't confidently say these will produce that return, I advise that we drill down even further into the top three ZIP codes and do the following:
TECHNICAL
- Bring in our isolated data across ZIPS
- Introduce further methods for stationarity
- Tune our parameters according to diagnostics such as ACF/PACF plots
- Evaluate exactly which months were affected by the '08 crisis and remove them, while keeping data back to '96, which brings over a decade more of training data to improve our models
- Bring in our holdout data for further evaluation between model iterations
- Evaluate all of our Atlanta ZIPs versus the best past-performers to see how ROI and confidence levels compare
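Two of the technical steps above, differencing toward stationarity and reading autocorrelations to guide parameter choice, can be sketched in plain Python. This is only an illustration on a made-up series: it uses the standard sample autocorrelation estimator, and the function names are hypothetical. In practice a library such as statsmodels (`statsmodels.tsa.stattools.acf` and `pacf`) would likely be used instead.

```python
def difference(values, lag=1):
    """Return the lag-differenced series: values[i] - values[i - lag]."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


def autocorrelation(values, max_lag):
    """Return sample autocorrelations for lags 0..max_lag."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    acf = []
    for k in range(max_lag + 1):
        num = sum(centered[i] * centered[i + k] for i in range(n - k))
        acf.append(num / denom)
    return acf


# Made-up monthly values with an accelerating upward trend.
home_values = [100, 102, 105, 109, 114, 120, 127, 135]

first_diff = difference(home_values)
acf_values = autocorrelation(first_diff, max_lag=3)

print("first difference:", first_diff)
print("ACF (lags 0-3):", [round(a, 3) for a in acf_values])
```

If the autocorrelations of the differenced series decay slowly, that suggests a trend remains and further differencing or another transformation may be worth trying before fitting the ARIMA orders.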
NON-TECHNICAL
Tie our research further to Hellbent’s Real Estate Development Goals by:
- Diving deeper into Zillow’s metric for value determination through time, and ensuring it matches the type of development and returns expected in context
- Analyzing each ZIP code and creating a development project timeline and/or aligned metric to ensure we’re abreast of current development saturation - and how such past developments have affected housing values in Atlanta
- Mapping ROI past a three-year horizon in order to make our recommendations increasingly actionable
We can use the same steps as shown here to evaluate different real-estate data given we have methods of assessing value and can therefore bring in comparable data to see how well our conclusions hold. We can even expand this to all or some of the other 14,000+ ZIP codes represented to develop more targets and evaluate predictability among returns.
Please review my full analysis in my Jupyter notebooks or presentation.
For any additional questions, please contact Sierra Stanton.
├── README.md
├── notebooks
├── data
├── presentation
├── images
└── Atlanta_Zipcode_Analysis_Project.pdf






