Hello! My name is Malcolm Taylor. This website is designed to help you get to know more about me and my projects.
Feel free to look at my various projects and contact me at firstname.lastname@example.org if you have any questions.
Tinder Data Exploratory Analysis - Python Scripts - 2019
Tinder is a dating app that allows users to start conversations if they both like the other’s profile. I downloaded all the data Tinder has collected from my profile and interactions since I opened my account. In this project, I analyzed my Tinder data and created visualizations with Dash. I also explored several ways of deploying the website on Amazon Web Services. More information can be found here
Electrolysia Energy Optimization - iPython Notebooks - 2018
This dataset was provided for the Analytics Vidhya hackathon held January 13-14, 2018, and consists of hourly measurements of electricity consumption as well as several environmental features. While I did not compete in the competition, this dataset is useful for practicing analysis of a multivariate time series. I explore standard machine learning techniques such as linear regression and random forests, and more complex algorithms such as Recurrent Neural Networks and Long Short-Term Memory networks. More information can be found here
Song Lyrics NLP Analysis - Python Scripts - 2018
In this project, my friend Devin Rose and I use the Spotify and Genius APIs to get the lyrics for songs in user-created playlists. Next, we use pre-trained embeddings for the words in each song to generate an embedding for the song and reduce the dimensionality using t-SNE for visualization. We are also able to pull several musical metrics for each song from the Spotify API, such as danceability and acousticness. We are able to conduct different analyses based on the Spotify metrics as well as other features from the song lyrics and embeddings. More information can be found here
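As a rough illustration of the song-embedding step, here is a minimal sketch that assumes a song's embedding is the average of its word vectors. The averaging rule, the toy three-dimensional vectors, and the name `song_embedding` are all assumptions made for illustration; the project itself may combine word vectors differently.

```python
from statistics import fmean

# Toy 3-dimensional vectors standing in for real pre-trained embeddings
# (hypothetical values; real embeddings would be loaded from a file).
TOY_EMBEDDINGS = {
    "love": [1.0, 0.0, 0.0],
    "night": [0.0, 1.0, 0.0],
    "dance": [0.0, 0.0, 1.0],
}


def song_embedding(lyrics, embeddings):
    """Average the vectors of every in-vocabulary word in the lyrics.

    Returns None when no word in the lyrics has a known vector.
    """
    vectors = [embeddings[w] for w in lyrics.lower().split() if w in embeddings]
    if not vectors:
        return None
    # zip(*vectors) groups the values dimension by dimension.
    return [fmean(dim) for dim in zip(*vectors)]


# "tonight" is out of vocabulary, so only love, love, dance are averaged.
vec = song_embedding("Love love dance tonight", TOY_EMBEDDINGS)
```

The resulting fixed-length vectors, one per song, are the kind of input that a dimensionality reduction method such as t-SNE can then project to two dimensions for plotting.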
Zillow Z-Estimate Home Value Prediction Project - iPython Notebook - 2017
In this project, my friend Devin Rose and I work on the 2017 Kaggle Zillow error prediction competition. The goal of this competition is to predict the difference between Zillow’s housing estimate and the sale price of the house given various information about the properties. We clean the data set and apply several machine learning algorithms to generate predictions on the test set. We practiced using linear regression, K-Nearest Neighbors and Neural Networks. More information can be found here
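To make the prediction target concrete, here is a minimal sketch that assumes the competition's log-error definition (the log of the Zestimate minus the log of the sale price) and scores predictions by mean absolute error. The home prices below are made-up numbers, and the function names are illustrative rather than taken from the project.

```python
import math


def log_error(zestimate, sale_price):
    """Log of Zillow's estimate minus log of the actual sale price."""
    return math.log(zestimate) - math.log(sale_price)


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical (zestimate, sale_price) pairs: over-, under-, and exact estimate.
homes = [(310_000, 300_000), (195_000, 200_000), (500_000, 500_000)]
actual = [log_error(z, s) for z, s in homes]

# A naive baseline that always predicts "Zillow was exactly right".
baseline = [0.0] * len(actual)
mae = mean_absolute_error(actual, baseline)
```

A positive log error means the estimate was above the sale price and a negative one means it was below, so a model only adds value if it beats a constant baseline like the one above.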
Wine Quality Analysis - iPython Notebook - 2017
In this project, the goal is to predict the quality of wine based on several chemical features. There are two sets of analysis. In one notebook, we try to determine which wines are high quality compared to low quality in a classification setting. We compare a standard logistic regression with a Bayesian logistic regression using pymc3.
In the other notebook, we treat quality as a continuous regression problem using Random Forest and Linear Regression. We conduct some exploratory data analysis, preprocess the data and run the algorithms. We also conduct a random search over the hyperparameters of the Random Forest to find a good model without taking too much time.
More information can be found here
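The random search mentioned above can be sketched in plain Python. This is only an illustration of the idea: the search space, the `random_search` helper, and the `toy_score` function (a stand-in for a cross-validated model score) are all hypothetical and not taken from the notebook.

```python
import random

# Hypothetical search space; the names mirror common Random Forest settings.
SEARCH_SPACE = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [4, 8, 16, None],
    "min_samples_leaf": [1, 2, 5, 10],
}


def random_search(score_fn, space, n_iter, seed=0):
    """Try n_iter random combinations and keep the highest-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Sample one value per hyperparameter, independently.
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Stand-in for a cross-validated score (higher is better)."""
    depth = params["max_depth"] or 32  # treat None (unlimited) as a large depth
    return (
        -abs(params["n_estimators"] - 200) / 100
        - abs(depth - 8) / 8
        - params["min_samples_leaf"] / 10
    )


best_params, best_score = random_search(toy_score, SEARCH_SPACE, n_iter=10)
```

With 64 possible combinations in this space, 10 random draws evaluate only a fraction of the grid, which is the time saving that motivates random search over an exhaustive grid search.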
SAT Score EDA - R - 2017
In this project, I conduct exploratory data analysis on the 2012 SAT scores from the 5 boroughs of New York City. More information can be found here
Rental List Inquiries by Two Sigma - R - 2017
In this project, I construct several R scripts to explore and create predictions for the rental listings dataset from the Kaggle competition. I use the h2o platform in R to conduct one-vs-all logistic regression and a random search over the hyperparameters of a gradient boosting machine. More information can be found here
Stock Forecasting using Fundamental Data - R - 2016
This was a group project where we attempted to forecast the monthly return and sign of the return of several public companies.
The team and I used elastic net regression techniques, random forest and deep neural networks to create an ensemble model.
This project was a final assignment at Boston University. More information can be found here
Stephen Curry / Under Armour Project - R - 2016
In this project, I examine the relationship between Stephen Curry’s performance and the overnight stock return of his corporate sponsor, Under Armour. More information about this project can be found here
VIX Forecasting, Bayesian Monte Carlo Simulation Assignment - R - 2016
In this project, I examine the stock ticker for the Volatility Index (VIX). I use a Bayesian version of a time series linear regression. The aim is to obtain posterior predictive distributions for the VIX given weekly data over a 3-year period. This project was inspired by an assignment at Boston University. More information can be found here
Abalone Age Project - iPython Notebook - 2016
In this project, I analyze the famous Abalone dataset from the University of California, Irvine (UCI). I visualize some of the data and practice data cleaning and running a few machine learning algorithms on the data. More information can be found here