Machine Learning

Content on the science and practice of building learning systems



Notes on Reinforcement Learning

Written on 10 Feb 2021

Pieter Abbeel is one of the world’s leading RL researchers. He has recently made much of the material he teaches in his classes at Berkeley freely available online. Below is an excerpt of my notes from his Deep RL Bootcamp.

1 - Motivation + Overview +...


Distributed Programming with Ray

Written on 02 May 2020

Context

Ray is (1) a distributed computing platform (2) for scaling Python applications (3) with minimal effort.

Let’s break this mission statement down into its 3 major components:

(1) Distributed computing platform: Ray is a highly performant runtime written in C++ that scales extremely well to large, heterogeneous compute clusters.

...
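To make “with minimal effort” concrete, here is a minimal sketch of Ray’s core task API, in which an ordinary Python function becomes a parallel task via the @ray.remote decorator (the squaring workload is illustrative, not from the post):

```python
import ray

ray.init()  # start (or connect to) a Ray runtime

@ray.remote
def square(x):
    # an ordinary Python function, now schedulable across the cluster
    return x * x

# launch tasks in parallel; each call returns a future (ObjectRef)
futures = [square.remote(i) for i in range(8)]

# block until all results are ready
print(ray.get(futures))  # [0, 1, 4, 9, 16, 25, 36, 49]
```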


WALS Factorization

Written on 09 Apr 2020

Collaborative Filtering is a very common way of implementing recommender systems. To implement collaborative filtering for large corpora and user bases, we usually resort to low-rank matrix factorization techniques such as [Weighted] Alternating Least Squares. These methods assume there exists a lower-dimensional space into which we can embed each user as well as each...
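For a rough feel of the alternating scheme, here is a minimal NumPy sketch of the unweighted variant (WALS additionally weights each entry, e.g. to down-weight unobserved interactions); the toy ratings matrix, rank, and regularization strength are made up:

```python
import numpy as np

def als(R, k=2, lam=0.1, iters=20):
    """Unweighted ALS on a dense ratings matrix R (users x items)."""
    m, n = R.shape
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(m, k))   # user embeddings
    V = rng.normal(scale=0.1, size=(n, k))   # item embeddings
    I = lam * np.eye(k)
    for _ in range(iters):
        # fix V, solve for U; in the unweighted case every user shares
        # the same normal equations, so one solve covers all rows
        U = np.linalg.solve(V.T @ V + I, V.T @ R.T).T
        # fix U, solve for V
        V = np.linalg.solve(U.T @ U + I, U.T @ R).T
    return U, V

R = np.array([[5., 4., 0.], [4., 5., 1.], [0., 1., 5.]])
U, V = als(R)
print(np.round(U @ V.T, 1))  # low-rank reconstruction of R
```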


Fighting COVID-19 with TensorFlow.js

Written on 09 Mar 2020

In this post, we’re going to show how we can use TensorFlow.js to fight the spread of the coronavirus. Please note that this is mostly an educational exercise, so please don’t take it too seriously.

This post eventually turned into a project that is actively being developed by...


AutoML with AdaNet

Written on 13 Feb 2020

Background

There has recently been a lot of hype around automatic machine learning (or AutoML), with many start-ups and researchers claiming that they can automatically produce state-of-the-art models for a variety of tasks, given nothing more than a modestly sized dataset.

Although AutoML is...


A Review of ALBERT

Written on 08 Jan 2020

ALBERT: A Lite BERT For Self-Supervised Learning of Language Representations

Motivation

A research direction that I’ve spent some time exploring recently is model compression; more specifically, techniques for compressing large embeddings. While the focus is on techniques that generalize to arbitrary NN model architectures, I have found myself...
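One of ALBERT’s ideas is directly relevant to embedding compression: it factorizes the large vocabulary embedding matrix (vocabulary size V by hidden size H) into two smaller matrices of sizes V×E and E×H. A back-of-the-envelope sketch with BERT-base-like sizes (the exact figures are illustrative):

```python
# Parameter count for ALBERT-style factorized embeddings
# (BERT-base-like sizes: V ~ 30k, H = 768; ALBERT uses E = 128)
V, H, E = 30_000, 768, 128

full = V * H                 # one big V x H embedding table
factorized = V * E + E * H   # V x E lookup followed by an E x H projection

print(f"full:       {full:,}")        # 23,040,000
print(f"factorized: {factorized:,}")  # 3,938,304 (~6x fewer parameters)
```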


Data Transformations with TensorFlow

Written on 31 Aug 2019

Background

When applying machine learning to real-world use cases, a common observation is that data cleaning and preprocessing is an extremely difficult and tedious task, due to the need to convert large datasets between formats, to tokenize and stem text and convert it into numbers using large vocabularies, and...
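As a small taste of the vocabulary step, here is a minimal sketch of a tf.data pipeline that tokenizes strings and maps tokens to integer ids; it assumes TensorFlow 2.x eager execution, and the vocabulary and sentences are made up:

```python
import tensorflow as tf

# Map tokens to integer ids via a fixed vocabulary; unknown tokens
# hash into a single out-of-vocabulary (OOV) bucket.
vocab = tf.constant(["the", "cat", "sat", "on", "mat"])
table = tf.lookup.StaticVocabularyTable(
    tf.lookup.KeyValueTensorInitializer(
        vocab, tf.range(tf.size(vocab, out_type=tf.int64))),
    num_oov_buckets=1)

ds = tf.data.Dataset.from_tensor_slices(
    ["the cat sat on the mat", "the dog sat"])
# tokenize on whitespace, then look up each token's id
ds = ds.map(lambda s: table.lookup(tf.strings.split(s)))

for ids in ds:
    print(ids.numpy())  # [0 1 2 3 0 4], then [0 5 2] ("dog" -> OOV id 5)
```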


Recurrent Neural Networks

Written on 26 May 2019

Traditional neural networks (multilayer perceptrons and associated feed-forward architectures) do not perform well on sequence data. Let’s take a toy problem to see this for ourselves.

Suppose we have the most basic sequence modeling problem: Given a sequence of words, predict the next word.

We can use a traditional feed-forward...
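For concreteness, here is a minimal sketch (assuming tf.keras) of such a fixed-window feed-forward next-word model; the vocabulary size, window, and random training data are all illustrative:

```python
import numpy as np
import tensorflow as tf

vocab_size, window, embed_dim = 1000, 3, 32

# The window size is baked into the architecture: the model can never
# condition on anything outside its fixed window of 3 tokens.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim, input_length=window),
    tf.keras.layers.Flatten(),  # concatenate the window's embeddings
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# x: windows of 3 token ids, y: the token that follows each window
x = np.random.randint(0, vocab_size, size=(64, window))
y = np.random.randint(0, vocab_size, size=(64,))
model.fit(x, y, epochs=1, verbose=0)
```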


Backprop: The Bastard Child of Calculus

Written on 12 May 2019

Calculus is possibly the most powerful tool humanity has invented, appearing in the physical sciences, actuarial science, computer science, statistics, engineering, economics, business, medicine, demography, and many more fields. Richard Feynman famously said, “Calculus is the language God speaks.” Regardless of what one thinks about religion, the sentiment is a compelling...


Serving TensorFlow models on CPU

Written on 04 May 2019

For most heavy inference workloads, accelerators are necessary for achieving high throughput. But what if you don’t have access to expensive GPUs or TPUs? In this post, we’ll look at serving models with dense linear algebra components on CPUs, taking advantage of vectorized instruction sets and efficient kernel implementations.

Let’s...
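One knob worth knowing up front is TensorFlow’s op-level threading. A minimal sketch, assuming the TensorFlow 1.x API that was current when this post was written (the thread counts are illustrative and should be tuned to your machine):

```python
import tensorflow as tf

# Size the op-level thread pools to the machine's cores: intra-op
# threads parallelize work inside a single op (e.g. one big matmul),
# inter-op threads run independent ops concurrently.
config = tf.ConfigProto(
    intra_op_parallelism_threads=8,
    inter_op_parallelism_threads=2,
)
with tf.Session(config=config) as sess:
    ...  # load the SavedModel and run inference here
```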


Unified Intelligence v2

Written on 04 Apr 2019

Background

In my undergraduate thesis, Unified Intelligence, I spent a great deal of time reviewing literature on intelligence and what constitutes an intelligent agent. I then developed a framework for assessing an agent’s intelligence using only its information-processing capabilities. After half a decade of further research,...


Evolution of hardware for deep learning

Written on 03 Apr 2019

The evolution of deep learning is gated not just by algorithm development but also by the evolution of hardware (one could argue that algorithm development is itself gated by hardware, but that is a stronger claim than I want to make here). For example, tracing the state-of-the-art models...


Deep Natural Language Processing 101

Written on 16 Mar 2019

Sequence-to-Sequence

As expected, a sequence-to-sequence model transforms one sequence into another using an encoder and a decoder module (two RNNs). As the encoder processes each item in the input sequence, it compiles the information into a context vector. The decoder is then conditioned on the context vector and begins generating...
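Here is a minimal tf.keras sketch of that encoder-decoder wiring, where the encoder’s final LSTM state serves as the context vector that conditions the decoder (all sizes are illustrative):

```python
import tensorflow as tf

src_vocab, tgt_vocab, units = 5000, 5000, 256

# Encoder: consume the source sequence and keep only the final LSTM
# state, which plays the role of the context vector described above.
enc_in = tf.keras.Input(shape=(None,))
enc_emb = tf.keras.layers.Embedding(src_vocab, units)(enc_in)
_, state_h, state_c = tf.keras.layers.LSTM(units, return_state=True)(enc_emb)

# Decoder: initialize from the encoder's state and predict the target
# sequence one step at a time (teacher forcing during training).
dec_in = tf.keras.Input(shape=(None,))
dec_emb = tf.keras.layers.Embedding(tgt_vocab, units)(dec_in)
dec_out = tf.keras.layers.LSTM(units, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
logits = tf.keras.layers.Dense(tgt_vocab)(dec_out)

model = tf.keras.Model([enc_in, dec_in], logits)
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```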