Deep Learning Machine Learning

Backprop, Autograd and Squeezing in larger batch using PyTorch

Backpropagation is a beautiful play of derivatives that we have taken for granted. We often write a simple one-liner to leverage the power of automatic differentiation in many deep learning frameworks without much thought. Today let’s look at a developed view of backpropagation as backward flow in real-valued circuits. Motivation: Given some function, we are […]

Machine Learning NLP

Crossing the language barrier with NLP

One of the biggest open problems in NLP is the lack of datasets for many non-English languages. Dealing with low-resource/low-data settings can be quite frustrating when it seems impossible to transfer the same success we saw in various English NLP tasks. In fact, there are voices within the NLP community advocating research and focus on low-resource […]

Deep Learning Kaggle Machine Learning

Tackling Toxic Using Keras

This is a repost of my kernel at Kaggle, which has received several positive responses from community members who found it helpful. It is one of my kernels that tackles the interesting Toxic Comment Classification Challenge at Kaggle, which aims to identify and classify toxic online comments. This notebook attempts to tackle this classification […]

Machine Learning

How to install Kaggle’s Most Won Algorithm – XGBoost (Screenshots included)

If you are on this page, chances are you have heard of the incredible capability of XGBoost. Not only does it “boast” higher accuracy than similar boosted tree algorithms like GBM (Gradient Boosting Machine), thanks to a more regularized model formalization to control over-fitting, it has also enabled many Kaggle Masters to win Kaggle competitions. […]

Data Preprocessing Machine Learning

Reality sucks – dealing with imbalanced data

You stumble upon an intriguing patient cancer dataset that seems to be the last remaining puzzle piece in humanity’s war against cancer, one that will make this world a better place for everyone, and you excitedly download it. Your data analysis usually goes through these standard steps: 1) Load data 2) Do some pre-processing of […]