Machine Learning
Content on the science and practice of building learning systems
Notes on Reinforcement Learning
Written on 10 Feb 2021
Pieter Abbeel is one of the world’s leading RL researchers. He has recently made much of the material he teaches in his classes at Berkeley freely available online. Below is an excerpt of my notes from his Deep RL Bootcamp.
1 - Motivation + Overview +...
Distributed Programming with Ray
Written on 02 May 2020
Context
Ray is a distributed computing platform (1) to scale Python applications (2) with minimal effort (3).
Let’s break this mission statement down into its 3 major components:
(1) Distributed computing platform: Ray is a highly performant runtime written in C++ that scales extremely well to large, heterogeneous compute clusters.
...
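The excerpt cuts off here, but a minimal sketch of Ray’s task API hints at points (2) and (3). The function name and workload are illustrative only:

```python
import ray

ray.init()  # starts a local Ray runtime (or connects to an existing cluster)

# Decorating a plain Python function turns it into a remote task
# that Ray can schedule on any worker in the cluster.
@ray.remote
def square(x):
    return x * x

# .remote() returns futures immediately; the tasks execute in parallel.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```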
WALS Factorization
Written on 09 Apr 2020
Collaborative Filtering is a very common way of implementing recommender systems. To implement collaborative filtering for large corpora and user bases, we usually resort to low-rank matrix factorization techniques such as [Weighted] Alternating Least Squares. These methods assume there exists a lower-dimensional space in which we can embed each user as well as each...
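To make the factorization concrete, here is a minimal numpy sketch of plain (unweighted) ALS on a fully observed ratings matrix; WALS additionally weights each entry, e.g. to down-weight unobserved interactions. Names and hyperparameters are illustrative:

```python
import numpy as np

def als(R, k=8, lam=0.1, iters=20, seed=0):
    """Unweighted ALS on a fully observed ratings matrix R (users x items).

    Alternates closed-form least-squares solves for the user factors U
    and the item factors V so that R ~= U @ V.T.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))
    V = rng.normal(scale=0.1, size=(n_items, k))
    I = lam * np.eye(k)  # ridge regularizer keeps the solves well-posed
    for _ in range(iters):
        # Fix V, solve a ridge regression for every user embedding at once.
        U = np.linalg.solve(V.T @ V + I, V.T @ R.T).T
        # Fix U, solve a ridge regression for every item embedding at once.
        V = np.linalg.solve(U.T @ U + I, U.T @ R).T
    return U, V
```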
Fighting COVID-19 with TensorFlow.js
Written on 09 Mar 2020
In this post, we’re going to show how we can use TensorFlow.js to fight the spread of the Coronavirus. Please note, this is mostly an educational exercise, so please don’t take it too seriously.
This post eventually ended up turning into a project that is actively being developed by...
AutoML with AdaNet
Written on 13 Feb 2020
Background
There has recently been a lot of hype concerning automatic machine learning (or AutoML), with many start-ups and researchers claiming that they can automatically produce state-of-the-art models for a variety of tasks, given nothing more than a dataset of modest size.
Although AutoML is...
A Review of ALBERT
Written on 08 Jan 2020
ALBERT: A Lite BERT For Self-Supervised Learning of Language Representations
Motivation
A research direction that I’ve spent some time exploring recently is model compression; more specifically, techniques for compressing large embeddings. While the focus is on techniques that generalize to arbitrary NN model architectures, I have found myself...
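One concrete data point on why compressing embeddings matters, drawn from the ALBERT paper itself: its factorized embedding parameterization replaces the V x H token-embedding table with a V x E lookup followed by an E x H projection, with E much smaller than H. A back-of-the-envelope calculation with ALBERT-base-like sizes:

```python
# Illustrative sizes (close to ALBERT-base): 30k vocab, E=128, H=768.
V, E, H = 30000, 128, 768

full_table = V * H            # standard BERT-style embedding: 23,040,000 params
factorized = V * E + E * H    # ALBERT-style factorization:     3,938,304 params

print(f"compression: {full_table / factorized:.1f}x")  # ~5.9x
```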
Data Transformations with TensorFlow
Written on 31 Aug 2019
Background
When applying machine learning to real-world use cases, a common observation is that data cleaning and preprocessing are extremely difficult and tedious tasks: we need to convert large datasets between formats, tokenize and stem text, convert it into numbers using large vocabularies, and...
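The excerpt ends mid-list, but as a flavor of the vocabulary-and-tokenization step it mentions, here is a small tf.data sketch using Keras’s TextVectorization layer (my choice of API for illustration, not necessarily the post’s):

```python
import tensorflow as tf

# A toy corpus standing in for a large text dataset.
texts = tf.data.Dataset.from_tensor_slices(
    ["the cat sat", "the dog ran", "a cat ran"])

# Learn a vocabulary from the data, then map raw strings to
# fixed-length sequences of integer token ids.
vectorize = tf.keras.layers.TextVectorization(
    max_tokens=100, output_mode="int", output_sequence_length=4)
vectorize.adapt(texts.batch(2))

for batch in texts.batch(2).map(vectorize):
    print(batch.numpy())
```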
Recurrent Neural Networks
Written on 26 May 2019
Traditional neural networks (multilayer perceptrons and related feed-forward architectures) do not perform well on sequence data. Let’s take a toy problem to see this for ourselves.
Suppose we have the most basic sequence modeling problem: Given a sequence of words, predict the next word.
We can use a traditional feed-forward...
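The excerpt is cut off, but the recurrence the post builds toward can be sketched in a few lines of numpy. This is the standard vanilla-RNN forward pass, with my own (illustrative) variable names:

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y):
    """Forward pass of a vanilla RNN over one sequence.

    xs: list of input vectors (e.g. one-hot word encodings), one per step.
    Returns logits at each step; the logits at step t score the
    candidates for the word at step t + 1.
    """
    h = np.zeros(W_hh.shape[0])  # hidden state carries context across steps
    logits = []
    for x in xs:
        # The same weights are applied at every step; only h changes,
        # which is what lets one network handle variable-length input.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        logits.append(W_hy @ h + b_y)
    return logits
```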
Backprop: The Bastard Child of Calculus
Written on 12 May 2019
Calculus is possibly the most powerful tool humanity has invented, appearing in the physical sciences, actuarial science, computer science, statistics, engineering, economics, business, medicine, demography, and many, many more. Richard Feynman famously said “Calculus is the language God speaks.” Regardless of what one thinks about religion, the sentiment is a compelling...
Serving TensorFlow models on CPU
Written on 04 May 2019
For most heavy inference workloads, accelerators are necessary for achieving high throughput. But what if you don’t have access to expensive GPUs or TPUs? In this post, we’ll look at serving models with dense linear algebra components on CPUs, taking advantage of vectorized instruction sets and efficient kernel implementations.
Let’s...
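The excerpt is truncated, but one knob that is almost always relevant for CPU inference is thread parallelism. A hedged sketch using TF2-style configuration (the post predates TF2, so treat this as illustrative; the model path and input shape are hypothetical):

```python
import os
import tensorflow as tf

# Intra-op threads parallelize a single kernel (e.g. one large matmul);
# inter-op threads run independent ops concurrently. Matching the
# physical core count is a common starting point.
tf.config.threading.set_intra_op_parallelism_threads(os.cpu_count())
tf.config.threading.set_inter_op_parallelism_threads(2)

model = tf.keras.models.load_model("saved_model_dir")  # hypothetical path
x = tf.random.normal([32, 128])                        # hypothetical input shape
print(model.predict(x))
```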
Unified Intelligence v2
Written on 04 Apr 2019
Background
In my undergraduate thesis, Unified Intelligence, I spent a great deal of time reviewing the literature on intelligence and what constitutes an intelligent agent. I then developed a framework for assessing an agent’s intelligence using only its information-processing capabilities. After half a decade of more research,...
Evolution of hardware for deep learning
Written on 03 Apr 2019
The evolution of deep learning is gated not just by algorithm development, but also by the evolution of hardware (one could argue that algorithm development is itself gated by hardware, but that’s a stronger claim than I want to make here). For example, tracing the state-of-the-art models...
Deep Natural Language Processing 101
Written on 16 Mar 2019
Sequence-to-Sequence
As the name suggests, a sequence-to-sequence model transforms one sequence into another using an encoder and a decoder module (two RNNs). As the encoder processes each item in the input sequence, it compiles the information into a context vector. The decoder is then conditioned on the context vector and begins generating...
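The excerpt is cut short, but the encoder-decoder wiring it describes can be sketched directly in Keras. Sizes are illustrative, and I’ve used a GRU where the post may use a different RNN cell; the decoder here is teacher-forced for training:

```python
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, embed_dim, hidden_dim = 10000, 128, 256  # illustrative sizes

# Encoder: consume the source sequence and keep only the final
# hidden state as the fixed-size context vector.
enc_in = layers.Input(shape=(None,), dtype="int32")
enc_emb = layers.Embedding(vocab_size, embed_dim)(enc_in)
_, context = layers.GRU(hidden_dim, return_state=True)(enc_emb)

# Decoder: initialized with the context vector, it predicts the
# target sequence one token at a time.
dec_in = layers.Input(shape=(None,), dtype="int32")
dec_emb = layers.Embedding(vocab_size, embed_dim)(dec_in)
dec_out = layers.GRU(hidden_dim, return_sequences=True)(
    dec_emb, initial_state=context)
logits = layers.Dense(vocab_size)(dec_out)

model = tf.keras.Model([enc_in, dec_in], logits)
model.summary()
```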