For the 2017-2019 Zillow Kaggle competition.

Kaggle Zillow Error Prediction Competition

A Kaggle competition to predict the error of Zillow’s housing estimates in several California markets. The data described each property: number of bedrooms, latitude and longitude, and other housing-related information. The dependent variable was the log error, the difference between the log of Zillow’s estimate and the log of the sale price.
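To make the dependent variable concrete, here is a minimal sketch of the log error for a single home. It assumes the competition's convention of log(Zestimate) minus log(sale price); the function name and the dollar figures are made up for illustration and are not taken from the notebooks.

```python
import math


def log_error(zestimate, sale_price):
    """Log error for one home, assuming log(Zestimate) - log(sale price)."""
    return math.log(zestimate) - math.log(sale_price)


# Hypothetical figures: Zillow estimates $110,000 and the home sells for $100,000.
example_error = log_error(110_000, 100_000)
print(round(example_error, 4))
```

A positive value means the estimate was above the sale price, a negative value means it was below, and 0 means the estimate matched the sale price exactly.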

Devin and I conducted exploratory data analysis and data cleaning for the competition. We tried k-nearest neighbors, neural networks, and linear regression, with mixed success.

The linear regression did not perform much better than always predicting the average value of the dependent variable. The R-squared was essentially 0.
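An R-squared near 0 is what a model scores when it does no better than a constant guess at the mean. The sketch below shows this with the standard definition, 1 minus the ratio of residual to total sum of squares. The log-error values are invented for illustration and are not competition data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up log errors for five homes (illustrative only).
y = [0.02, -0.01, 0.05, -0.03, 0.01]

# Baseline: predict the mean log error for every home.
mean_prediction = [sum(y) / len(y)] * len(y)
baseline_r2 = r_squared(y, mean_prediction)

# Perfect predictions, for comparison.
perfect_r2 = r_squared(y, y)

print(baseline_r2, perfect_r2)
```

By construction the mean baseline scores 0 and perfect predictions score 1, so a regression with an R-squared of essentially 0 explains almost none of the variation in the dependent variable.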

The files in the MT folder are reduced versions of the descriptive analysis found in the notebooks in the parent directory.