Data Science

Data Science and Machine Learning - Prediction of Parkinson’s Disease Progression

Predicted the total UPDRS score from 6000 records of the Parkinson’s telemonitoring dataset using various regression techniques. Achieved a minimum MAE of 0.415 using PCA with a multilayer perceptron model. Determined an optimum threshold value of 15 for the motor UPDRS score, discriminated by dysphonia measurements.
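
For reference, MAE (mean absolute error) is the average absolute gap between predicted and actual scores, so a lower value means a closer fit. A minimal sketch of the metric, assuming plain Python lists; the function name and the sample scores are hypothetical and are not taken from the dataset or the project code:

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must be non-empty")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical total UPDRS scores, for illustration only.
y_true = [30.0, 25.5, 41.0, 18.0]
y_pred = [30.5, 25.0, 40.5, 18.5]
mae = mean_absolute_error(y_true, y_pred)
print(f"MAE: {mae:.3f}")
```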

Deep Learning - Terrain Identification for Time Series Data

Implemented an LSTM model, with shifting and downsampling of the data, to achieve an F1 score of 0.868, a significant improvement over the random forest baseline, which gave an F1 score of 0.39. Compared results with techniques such as SMOTE and weighted loss.
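
One common way to prepare time series for an LSTM is to reduce the sampling rate and then cut the signal into overlapping windows. The sketch below illustrates that idea under the assumption that "shifting" means sliding windows with a fixed step; the function names, window length, and step size are hypothetical and not the project's actual settings:

```python
def downsample(series, factor):
    """Keep every `factor`-th sample of the series."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return series[::factor]


def sliding_windows(series, window, shift):
    """Split a series into windows of length `window`, each starting
    `shift` samples after the previous one."""
    if window < 1 or shift < 1:
        raise ValueError("window and shift must be at least 1")
    last_start = len(series) - window
    return [series[i:i + window] for i in range(0, last_start + 1, shift)]


# Hypothetical sensor signal, for illustration only.
signal = list(range(20))
reduced = downsample(signal, 2)
windows = sliding_windows(reduced, window=4, shift=2)
print(len(windows), windows[0], windows[-1])
```

Each window would then be fed to the model as one input sequence, typically paired with the terrain label at the window's end.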

Natural Language Processing - Sentiment Analysis

Performed sentiment analysis on a crowdsourced movie review dataset, using a doc2vec model pre-trained on the IMDB movie review dataset for vectorization. Compared results against other vectorization techniques such as count vectorization and word2vec models.
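
Count vectorization, the simplest of the techniques compared, represents each review as a vector of word counts over a shared vocabulary. A minimal standard-library sketch of the idea, assuming whitespace tokenization; the function names and sample reviews are hypothetical, and the project itself may have used a library implementation:

```python
from collections import Counter


def build_vocabulary(docs):
    """Map each distinct lowercase token to a column index (sorted order)."""
    tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def count_vectorize(doc, vocab):
    """Return the word-count vector of `doc` over `vocab`."""
    counts = Counter(doc.lower().split())
    vector = [0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:  # ignore words outside the vocabulary
            vector[vocab[tok]] = n
    return vector


# Hypothetical reviews, for illustration only.
reviews = ["great movie great cast", "boring movie"]
vocab = build_vocabulary(reviews)
vectors = [count_vectorize(r, vocab) for r in reviews]
print(vocab)
print(vectors)
```

Unlike doc2vec or word2vec embeddings, these vectors ignore word order and meaning, which is why they serve as a useful baseline.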