Driven data science engineer skilled at applying machine learning methods, developing code, breaking down complex concepts into understandable pieces, and communicating data science insights.
Successfully completed all 5 levels of Google's increasingly difficult coding challenges in Python. The challenges required an assortment of math and programming skills, ranging from complex concepts in group theory to implementing a variety of search algorithms. Beyond passing a set of visible and hidden test cases, solutions had to run within a time limit and stay under a code-size limit, which often meant breaking down each problem and planning the solution before writing any code. My solutions and tests, along with the challenge descriptions, are available on GitHub.
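Many of the challenges reduced to a careful search over a state space. As a flavor of what that looked like, here is a minimal, hypothetical breadth-first search over a grid maze — an illustration of the technique, not one of the actual challenge solutions:

```python
from collections import deque

def shortest_path(grid):
    """Length of the shortest path from the top-left to the bottom-right
    of a 0/1 grid (1 = wall), or -1 if unreachable. Illustrative only."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(0, 0, 1)])          # (row, col, path length so far)
    seen = {(0, 0)}
    while queue:
        r, c, dist = queue.popleft()
        if (r, c) == (rows - 1, cols - 1):
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc, dist + 1))
    return -1
```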
The goal of this project was to train a computer agent to land a spacecraft in a target zone using reinforcement learning. My solution used a Q-learning algorithm with a neural network to generate value approximations. The number of layers and the depth of each layer needed careful tuning, along with other parameters such as action replay and ε-greedy decay. Action replay is a set of saved memories used for retraining throughout learning; it must balance newer memories, which let the agent learn from better runs, against older memories, which keep the agent from getting stuck in a local minimum. ε-greedy determines how often the agent explores (chooses actions at random) versus exploits (chooses actions based on what it has learned). As the agent improves, it finds itself in new states and needs to explore them, but too many random actions prevent it from reaching its full potential. Click here for a video of the trained lunar lander.
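To make those two mechanisms concrete, here is a minimal sketch of a replay buffer and ε-greedy action selection. The names and sizes are illustrative, and `q_network` stands in for the tuned value-approximation network (assumed to map a batch of states to a batch of Q-values):

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Saved memories used for retraining throughout learning."""

    def __init__(self, capacity=100_000):
        # A bounded deque drops the oldest memories as new ones arrive,
        # balancing recent (better) runs against older, diverse experience.
        self.memories = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self.memories.append((state, action, reward, next_state, done))

    def sample(self, batch_size=64):
        # Uniform sampling mixes newer and older memories for retraining.
        return random.sample(self.memories, batch_size)

def epsilon_greedy_action(q_network, state, epsilon, n_actions):
    """Explore with probability ε; otherwise exploit current Q-values."""
    if random.random() < epsilon:
        return random.randrange(n_actions)     # explore: random action
    q_values = q_network(state[np.newaxis])    # exploit: best known action
    return int(np.argmax(q_values[0]))

# A typical decay schedule shrinks ε each episode but keeps a floor, so the
# agent continues exploring the new states it reaches as it improves:
#     epsilon = max(epsilon_min, epsilon * decay_rate)
```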
The goal of this project was to help a robot navigate unknown territory and extract a set of predetermined gems. Using measurements from its sensors, the robot could determine the distance to each gem. These distances, together with the robot's movements, were fed to a SLAM algorithm to map the environment and locate the robot within it. Constrained by a maximum speed, a limited turning radius, and noisy sensor data, the robot had to plan a course of action and extract the necessary gems within a specified time limit and without making costly mistakes.
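One classic way to set up SLAM from motions and distance measurements is Graph SLAM, where each motion and each measurement becomes a linear constraint. Below is a deliberately simplified one-dimensional sketch of that idea — the real problem is two-dimensional and weights constraints by sensor noise — offered as an illustration, not the project's actual implementation:

```python
import numpy as np

def graph_slam_1d(motions, measurements, n_poses, n_landmarks):
    """Solve Omega @ mu = xi for the best estimate of robot poses and
    landmark (gem) positions on a 1-D line.

    motions:      list of signed distances, one per time step
    measurements: list of (t, j, dist) — at pose t, landmark j seen at dist
    """
    n = n_poses + n_landmarks
    omega = np.zeros((n, n))
    xi = np.zeros(n)
    omega[0, 0] = 1.0                      # anchor the initial pose at 0
    # Motion constraints: x_{t+1} - x_t = motion
    for t, motion in enumerate(motions):
        omega[t, t] += 1.0
        omega[t + 1, t + 1] += 1.0
        omega[t, t + 1] -= 1.0
        omega[t + 1, t] -= 1.0
        xi[t] -= motion
        xi[t + 1] += motion
    # Measurement constraints: L_j - x_t = measured distance
    for t, j, dist in measurements:
        l = n_poses + j
        omega[t, t] += 1.0
        omega[l, l] += 1.0
        omega[t, l] -= 1.0
        omega[l, t] -= 1.0
        xi[t] -= dist
        xi[l] += dist
    mu = np.linalg.solve(omega, xi)        # poses first, then landmarks
    return mu[:n_poses], mu[n_poses:]

# Example: move +5, having seen the gem at distance 10 from the start.
poses, gems = graph_slam_1d([5.0], [(0, 0, 10.0)], n_poses=2, n_landmarks=1)
print(poses, gems)  # poses ≈ [0, 5], gem ≈ [10]
```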
Recognizing the complexities of player performance, this project aims to predict player tendencies and expectations in key aspects of the game. After defining the project goal, I collected hundreds of individual player statistics from the NBA website, plus additional data from Basketball Reference needed to evaluate different play outcomes. I then created performance and rate targets for different play types and engineered features that better captured a player's current performance. For each target, I closely examined the most related features using graphs, functions I wrote, and feature selection tools from scikit-learn. Using linear and regularized regression, I optimized for both model performance and interpretability; the final models outperformed baseline estimates by up to 29%. See more on GitHub.
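As a rough sketch of the modeling step — the file name, column names, `k`, and `alpha` below are placeholders, not the project's actual values:

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# "players.csv" stands in for the scraped NBA / Basketball Reference data,
# and "points_per_play" for one of the engineered targets.
players = pd.read_csv("players.csv")
X = players.drop(columns=["player", "points_per_play"])
y = players["points_per_play"]

model = make_pipeline(
    StandardScaler(),                  # put features on a common scale
    SelectKBest(f_regression, k=15),   # keep the most related features
    Lasso(alpha=0.01),                 # L1-regularized linear regression
)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.3f}")
```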
This was a team project with the goal of producing a cost-benefit analysis of when and where the city of Chicago could spray pesticides in order to reduce the effect of West Nile virus on humans. We began by researching effective surveillance techniques and the best predictors of West Nile virus. We then retrieved the last 12 years of mosquito trap data from the city of Chicago website, along with weather data for those years from NOAA. I oversampled positive cases so the model would be sensitive to the virus's presence, and to avoid overfitting on sparse data I used a random forest with a small depth. These methods increased our model's sensitivity from 0 to 0.77. Taking into account areas with dense susceptible populations, our model's predictions, and the cost of spraying, we created our recommendations for the city of Chicago. See more on GitHub.
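A rough sketch of the oversampling and shallow-forest approach, with placeholder file and column names standing in for the trap and weather data:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

data = pd.read_csv("traps_and_weather.csv")   # hypothetical merged dataset
train, test = train_test_split(
    data, stratify=data["wnv_present"], random_state=42
)

# Oversample positive traps so the model becomes sensitive to the virus.
pos = train[train["wnv_present"] == 1]
neg = train[train["wnv_present"] == 0]
pos_upsampled = resample(pos, replace=True, n_samples=len(neg), random_state=42)
balanced = pd.concat([neg, pos_upsampled])

X_cols = [c for c in data.columns if c != "wnv_present"]
# A small max_depth keeps the forest from overfitting the sparse positives.
forest = RandomForestClassifier(n_estimators=200, max_depth=4, random_state=42)
forest.fit(balanced[X_cols], balanced["wnv_present"])

sensitivity = recall_score(test["wnv_present"], forest.predict(test[X_cols]))
print(f"sensitivity (recall on positive cases): {sensitivity:.2f}")
```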
The goal of this project was to better understand the board gaming and video gaming communities by identifying the differences in their word choice. Using the Reddit API, I scraped over 5,000 posts from the board game and video game subreddits, stemmed the text, and applied a TF-IDF vectorizer. From there, I optimized the hyperparameters of several classifiers and chose logistic regression for its strong predictive performance and interpretability: it classified posts by subreddit with 98% accuracy, and its coefficients revealed which words were the best predictors of each subreddit. See more on GitHub.
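In outline, the pipeline looked something like the sketch below; `posts` and `labels` are tiny stand-ins for the scraped data, and the particular stemmer and hyperparameters are assumptions:

```python
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()

def stem_text(text):
    # Reduce each word to its stem before vectorizing.
    return " ".join(stemmer.stem(word) for word in text.split())

posts = ["Catan strategy discussion ...", "best RPG on the console ..."]
labels = [0, 1]   # 0 = board games subreddit, 1 = video games subreddit

model = make_pipeline(
    TfidfVectorizer(preprocessor=stem_text, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(posts, labels)

# The logistic regression coefficients point at the words that best
# predict each subreddit.
vec = model.named_steps["tfidfvectorizer"]
clf = model.named_steps["logisticregression"]
words = vec.get_feature_names_out()
top_videogame_words = words[clf.coef_[0].argsort()[-10:]]
```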
The goal of this project was to create a model to predict home prices in Ames, Iowa from a dataset of around 2,000 homes and 80 features. I began by cleaning and exploring the data to understand how the features related to each other and to home prices. I then used background knowledge to create features with stronger correlations to sale price and, with a few functions I wrote, narrowed the features down to those least correlated with each other and best correlated with sale price. I used polynomial features to create interaction terms. Finally, I scaled the data and optimized L1- and L2-regularized regressions to predict home prices.
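A condensed sketch of those final steps, using random stand-in data in place of the actual Ames features and sale prices:

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))   # stand-in for the narrowed-down features
y = rng.normal(size=200)        # stand-in for sale prices

for name, reg in [("L1 (lasso)", LassoCV(cv=5)),
                  ("L2 (ridge)", RidgeCV(alphas=np.logspace(-3, 3, 13)))]:
    model = make_pipeline(
        PolynomialFeatures(degree=2, include_bias=False),  # interaction terms
        StandardScaler(),                                  # scale the data
        reg,                                               # regularized fit
    )
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: cross-validated R^2 = {score:.3f}")
```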
Python
SQL
C++
I am a huge fan of all things basketball. I play as often as I can; when that is not possible, I love watching the Celtics (or other NBA teams) and analyzing teams using advanced analytics.
In high school, I fell in love with board games while playing Settlers of Catan. Years later, I discovered that there was a whole world of board games out there. While most of my favorite games are medium-heavy Euros, I have been slowly getting into economic train games.
When I have some free time to myself, especially when commuting, I often fill it with reading. Mostly, I enjoy reading science fiction and fantasy. I am currently alternating between The Maze of Games and A People's Future of the United States.