Convolutional Neural Networks, or “ConvNets”, gained popularity through their use with image data and are currently the state of the art for identifying faces, objects, and traffic signs, as well as powering vision in robots and self-driving cars. In this tutorial we’ll work with the MNIST dataset. MNIST is a...
Predicting Digits from their Handwritten Images - I
Deep Learning with TensorFlow
TensorFlow allows us to perform machine learning operations on huge matrices with great efficiency. It can also easily distribute this processing across CPU cores, GPU cores, or even multiple devices such as multiple GPUs. A tensor in TensorFlow is an array-like object and, similar to an array, it can hold a matrix, vector,...
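To make the "array-like object" idea concrete, here is a minimal sketch in plain Python. It deliberately does not use TensorFlow: nested lists stand in for tensors, and the helper names `shape` and `rank` are hypothetical, chosen only for this illustration.

```python
def shape(value):
    """Return the dimensions of a nested list, outermost first.

    Assumes the nesting is regular (every row has the same length),
    as it would be for a real tensor.
    """
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        value = value[0] if value else None
    return tuple(dims)


def rank(value):
    """Number of dimensions: 0 for a scalar, 1 for a vector, 2 for a matrix."""
    return len(shape(value))


scalar = 3.0
vector = [1.0, 2.0, 3.0]
matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]

for name, value in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    print(name, "shape:", shape(value), "rank:", rank(value))
```

The same three cases (scalar, vector, matrix) are what a tensor library would describe with a shape and a rank; the sketch only shows the bookkeeping, not the efficient computation.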
Naive Bayes to Classify Movie Reviews Based on Sentiment
From scratch and with Scikit-Learn
We want to predict whether a review is negative or positive, based on the text of the review. We’ll use Naive Bayes as our classification algorithm. A Naive Bayes classifier works by figuring out how likely data attributes are to be associated with a certain class. The Naive Bayes model is...
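The idea of scoring how likely each word is to appear in a class can be sketched from scratch in a few lines. This is only an illustration under stated assumptions: the four toy reviews are made up, the function names `fit` and `predict` are hypothetical, and add-one (Laplace) smoothing is my choice for handling unseen words, not necessarily what the full post uses.

```python
import math
from collections import Counter

# Hypothetical toy reviews, invented for this sketch.
train = [
    ("a great fun movie", "positive"),
    ("great acting and a great plot", "positive"),
    ("a boring dull movie", "negative"),
    ("dull plot and terrible acting", "negative"),
]


def fit(examples):
    """Count words per class and reviews per class."""
    word_counts = {}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict(text, word_counts, class_counts, vocab):
    """Return the class with the highest log-probability score."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        # Start from the log prior: how common the class is overall.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = fit(train)
prediction = predict("a great plot", *model)
print(prediction)
```

Working in log space avoids multiplying many small probabilities together, which would otherwise underflow on longer reviews.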
Clustering NBA Players
A Machine Learning Tutorial on Using K-Means Clustering
In this blog post, I share my experience of understanding and applying K-Means clustering to group NBA players. K-Means is a popular centroid-based clustering algorithm that we will use. The K in K-Means refers to the number of clusters we want to segment our data into. We first load...
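As a rough sketch of what "centroid-based" means, the loop below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. Everything here is an assumption for illustration: the six (points per game, assists per game) pairs are invented, the function name `kmeans` is hypothetical, and the starting centroids are fixed by hand so the run is repeatable (the full tutorial may well initialise them randomly).

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate assignment and update steps for a fixed number of rounds."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical (points per game, assists per game) values.
players = [(5.0, 1.0), (6.0, 2.0), (7.0, 1.5),
           (20.0, 8.0), (22.0, 7.0), (21.0, 9.0)]

# K = 2: start from the first and last player.
centroids, clusters = kmeans(players, centroids=[players[0], players[-1]])
print(centroids)
print(clusters)
```

With K = 2 the two groups separate cleanly on this toy data; on real player statistics the result depends on the choice of K and on the starting centroids.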
Predicting Upvotes on Hacker News Data
Natural Language Processing
Natural language processing (NLP) is the study of enabling computers to understand human languages. This field may involve teaching computers to automatically score essays, infer grammatical rules, or determine the emotions associated with text. In this project we will employ NLP on Hacker News data. Hacker News is a community...
Predicting Salaries with Decision Trees
Scikit-Learn
The decision tree algorithm is a supervised learning algorithm – we first construct the tree with historical data, and then use it to predict an outcome. One of the major advantages of decision trees is that they can pick up nonlinear interactions between variables in the data that linear regression...
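The "construct first, then predict" workflow can be shown with the smallest possible tree: a single split, often called a decision stump. This is a from-scratch sketch rather than the Scikit-Learn code the post uses; the ages and income labels are invented, the helper names are hypothetical, and Gini impurity is assumed as the split criterion.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(v) / n) ** 2 for v in set(labels))


def majority(labels):
    """Most common label in a non-empty list."""
    return max(set(labels), key=labels.count)


def fit_stump(values, labels):
    """Pick the threshold on one numeric feature that minimises weighted Gini."""
    ordered = sorted(set(values))
    if len(ordered) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, threshold, majority(left), majority(right))
    return best[1:]


def predict(stump, value):
    """Follow the single split to a leaf and return its label."""
    threshold, left_label, right_label = stump
    return left_label if value <= threshold else right_label


# Hypothetical historical data: age and whether income is high (1) or not (0).
ages = [22, 25, 30, 41, 48, 55]
high_income = [0, 0, 0, 1, 1, 1]

stump = fit_stump(ages, high_income)
print(stump)
print(predict(stump, 28), predict(stump, 50))
```

A full decision tree repeats this split search recursively on each side and across all features, which is how it picks up interactions between variables.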
Facts about countries - Using Python With SQLite
Intro to sqlite3
In this project I query a SQLite database using Python. SQLite is a relational database management system that enables us to create databases and query them using SQL syntax. SQLite is simpler than full database systems like MySQL and PostgreSQL, likely at the expense of performance. The sqlite3 Python...
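For a feel of what querying SQLite from Python looks like, here is a small self-contained sketch using the standard-library `sqlite3` module. It uses an in-memory database, and the `facts` table, its columns, and the placeholder rows are all assumptions made for this example rather than the project's actual data.

```python
import sqlite3

# An in-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table and placeholder rows.
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO facts (name, population) VALUES (?, ?)",
    [("Country A", 100), ("Country B", 200), ("Country C", 300)],
)

# Query with ordinary SQL; fetchall() returns a list of tuples.
rows = conn.execute(
    "SELECT name, population FROM facts ORDER BY population DESC LIMIT 2"
).fetchall()
conn.close()

print(rows)
```

The `?` placeholders let `sqlite3` bind the values safely instead of building the SQL string by hand.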
Titanic - Machine Learning from Disaster!
Machine Learning Project with Kaggle
The Titanic shipwreck is one of the most famous shipwrecks in history and led to discussions of better safety regulations for ships. One substantial safety issue was that there were not enough lifeboats for every passenger on board, which meant that some passengers were given priority over others for the lifeboats....
Classification of Iris Varieties
Machine Learning Tutorial with scikit-learn - KNN-classification
Iris might be more popular in the data science community as a machine learning classification problem than as a decorative flower. Three Iris varieties were used in the Iris flower data set outlined by Ronald Fisher in his famous 1936 paper “The use of multiple measurements in taxonomic problems as...
Analyzing Thanksgiving Dinner
Intro to pandas and matplotlib
In this fun project I break down the behaviour of Americans on Thanksgiving to take a deep dive into pandas! I use a dataset by DataQuest which contains responses to an online survey about what Americans eat for Thanksgiving dinner. Each survey respondent was asked questions about what they typically eat for...