I adamantly believe that while feelings are ultimately how we make sense of the world (i.e., under the hood, thoughts are “understood” through changes in emotional states), we humans are terrible at preserving feelings and emotions. We feel so much so often and yet do so little to...
The future of software development is not large software companies with thousands of engineers; it’s small groups of developers working on projects they’re passionate about, funded via GitHub Sponsors.
The future of music production is not a few superstars making billions through large record labels and music production studios; it’s...
Pieter Abbeel is one of the world’s leading RL researchers. He has recently made much of the material he teaches in his classes at Berkeley freely available on the internet. Below is an excerpt from my notes on his Deep RL Bootcamp.
I started systematically thinking about the idea of maturity in the summer of 2018, as in the preceding few months I had witnessed a profound change in my internal machinery and external behavior for which I did not have a better word.
I’m not sure why I feel so strongly about this event, or why I seem to be so affected by it, but I know I don’t want to let myself forget this strange and singular feeling.
Maybe I’ve been on an emotional roller coaster for so long that this latest...
Ray is a distributed computing platform (1) to scale Python applications (2) with minimal effort (3).
Let’s break this mission statement down into its 3 major components:
(1) Distributed computing platform: Ray is a highly performant runtime written in C++ that scales extremely well to large, heterogeneous compute clusters.
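To make this concrete, here is a minimal sketch of Ray’s core task API. The function and values are illustrative, but `ray.init`, `@ray.remote`, and `ray.get` are the actual primitives.

```python
# A minimal sketch of Ray's task API (assumes `pip install ray`).
import ray

ray.init()  # start a local Ray runtime; connects to a cluster if one is configured

@ray.remote
def square(x):
    # an ordinary Python function, turned into a distributed task
    return x * x

# launch tasks in parallel and gather the results
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```

The same decorator-based pattern extends from stateless tasks to stateful actors (classes decorated with `@ray.remote`), and the application code stays the same whether it runs on a laptop or a multi-node cluster.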
Collaborative Filtering is a very common way of implementing recommender systems. To implement collaborative filtering for large corpora and user bases, we usually resort to low-rank matrix factorization techniques such as [Weighted] Alternating Least Squares. These methods assume there exists a lower-dimensional space into which we can embed each user as well as each...
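As a rough illustration of the factorization idea (not the post’s implementation), here is a toy unweighted ALS loop in NumPy; the latent dimension `k`, the regularization strength, and the random ratings matrix are placeholder choices.

```python
# Toy ALS: factorize a dense ratings matrix R ~ U @ V.T with k latent dimensions.
import numpy as np

def als(R, k=8, n_iters=20, reg=0.1):
    n_users, n_items = R.shape
    U = np.random.rand(n_users, k)
    V = np.random.rand(n_items, k)
    I = reg * np.eye(k)
    for _ in range(n_iters):
        # fix V, solve a ridge regression per user
        U = R @ V @ np.linalg.inv(V.T @ V + I)
        # fix U, solve a ridge regression per item
        V = R.T @ U @ np.linalg.inv(U.T @ U + I)
    return U, V

R = np.random.rand(50, 30)          # pretend user-item ratings
U, V = als(R)
print(np.linalg.norm(R - U @ V.T))  # reconstruction error after alternating solves
```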
In this post, we’re going to show how we can use TensorFlow.js to fight the spread of the Coronavirus. Please note, this is mostly an educational exercise, so please don’t take it too seriously.
This post eventually turned into a project that is actively being developed by...
There are a ton of third-party audio effect plug-ins that you can find on the internet, ranging in price from free to thousands of dollars; but one slightly underappreciated collection is the set of effects that comes with Ableton Live.
In this post, I’ll list them and give a brief description of what each...
There has recently been a lot of hype concerning automatic machine learning (or AutoML), with a lot of start-ups and researchers claiming that they can automatically produce state-of-the-art models for a variety of tasks, given nothing more than a dataset of modest size.
ALBERT: A Lite BERT For Self-Supervised Learning of Language Representations
Motivation
A research direction that I’ve spent some time exploring recently is model compression; more specifically, techniques for compressing large embeddings. While the focus is on techniques that generalize to arbitrary NN model architectures, I have found myself...
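One embedding-compression idea that ALBERT itself uses is factorized embedding parameterization: replace the V×H token embedding table with a smaller V×E table followed by an E×H projection, with E much smaller than H. A minimal PyTorch sketch (sizes are illustrative):

```python
# Factorized embedding: V x E lookup + E x H projection instead of a V x H table,
# cutting embedding parameters from V*H to V*E + E*H.
import torch
import torch.nn as nn

class FactorizedEmbedding(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=128, hidden_dim=768):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, embed_dim)  # V x E
        self.project = nn.Linear(embed_dim, hidden_dim)         # E x H

    def forward(self, token_ids):
        return self.project(self.token_embed(token_ids))

emb = FactorizedEmbedding()
print(sum(p.numel() for p in emb.parameters()))  # ~3.9M params vs ~23M for 30000 x 768
```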
Ever since hearing Gustav Söderström (Head of Research at Spotify) describe Spotify users creating playlists as a form of meta-programming, I’ve started to realize that as tooling and technologies for content creation increase in sophistication, we can apply more and more concepts from the world of programming to that...
When applying machine learning to real world use cases, a common observation is that data cleaning and preprocessing is an extremely difficult and tedious task, due to the need to convert large datasets between formats, tokenize and stem texts and convert them into numbers using large vocabularies, and...
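As a toy illustration of one slice of that pipeline (the corpus and names are made up), tokenizing text and mapping it to integer ids with a frequency-sorted vocabulary might look like:

```python
# Build a tiny vocabulary from a corpus and encode new text as integer ids.
from collections import Counter

def tokenize(text):
    return text.lower().split()

corpus = ["The cat sat on the mat", "The dog ate my homework"]
counts = Counter(tok for doc in corpus for tok in tokenize(doc))

# reserve id 0 for unknown tokens, then assign ids by frequency
vocab = {tok: i + 1 for i, (tok, _) in enumerate(counts.most_common())}
encode = lambda text: [vocab.get(tok, 0) for tok in tokenize(text)]

print(encode("the cat ate the homework"))
```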
Traditional neural networks (multilayer perceptrons and related feed-forward architectures) do not perform well on sequence data. Let’s take a toy problem to see this for ourselves.
Suppose we have the most basic sequence modeling problem: Given a sequence of words, predict the next word.
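To see the constraint concretely, here is a sketch (PyTorch, illustrative sizes, not from the post) of attacking next-word prediction with a plain feed-forward network: it only works because every history is forced into a fixed-size window, which is exactly where the trouble with variable-length sequences begins.

```python
# Next-word prediction with an MLP over a fixed context window.
import torch
import torch.nn as nn

vocab_size, embed_dim, window = 1000, 32, 3

model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),  # (batch, window) -> (batch, window, embed_dim)
    nn.Flatten(),                         # concatenate the window into one fixed vector
    nn.Linear(window * embed_dim, 128),
    nn.ReLU(),
    nn.Linear(128, vocab_size),           # scores over the next word
)

context = torch.randint(0, vocab_size, (8, window))  # a batch of 3-word histories
next_word_logits = model(context)
print(next_word_logits.shape)  # torch.Size([8, 1000])
```

Anything outside the window is simply invisible to the model, and changing the window size means changing the architecture.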
Calculus is possibly the most powerful tool humanity has invented, appearing in the physical sciences, actuarial science, computer science, statistics, engineering, economics, business, medicine, demography, and many, many more fields. Richard Feynman famously said “Calculus is the language God speaks.” Regardless of what one thinks about religion, the sentiment is a compelling...
For most heavy inference workloads, accelerators are necessary for achieving high throughput. But what if you don’t have access to expensive GPUs or TPUs? In this post, we’ll look at serving models with dense linear algebra components on CPUs, taking advantage of vectorized instruction sets and efficient kernel implementations.
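As a rough illustration of why vectorized kernels matter on CPUs (this is not the post’s benchmark, and timings vary by machine), compare a pure-Python matrix multiply against NumPy’s BLAS-backed one:

```python
# Pure-Python triple loop vs. NumPy's SIMD-vectorized, cache-blocked BLAS matmul.
import time
import numpy as np

n = 128
A = np.random.rand(n, n)
B = np.random.rand(n, n)

start = time.perf_counter()
C_naive = [[sum(A[i, k] * B[k, j] for k in range(n)) for j in range(n)] for i in range(n)]
naive_s = time.perf_counter() - start

start = time.perf_counter()
C_blas = A @ B
blas_s = time.perf_counter() - start

print(f"pure Python: {naive_s:.3f}s, BLAS: {blas_s:.5f}s")
```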
Ten years ago, my parents made the decision to move from Iran to the U.S. so that my brother and I could have access to better education and opportunities. While immigration to the U.S. on the basis of access to opportunity is a prosaic idea, thoroughly explored and romanticized by...
In my undergraduate thesis, Unified Intelligence, I spent a great deal of time reviewing literature on intelligence and what constitutes an intelligent agent. I then developed a framework for assessing an agent’s intelligence using only its information-processing capabilities. After half a decade of more research,...
The evolution of deep learning is gated not just by algorithm development but also by the evolution of hardware (one could argue that algorithm development itself is gated by hardware, but that’s a stronger point than I want to make here). For example, tracing the state-of-the-art models...
Alex Danco says the valley has a habit of picking a thing and making it abundant. We’ve done it with information (Google), entertainment (YouTube, Netflix, Reddit), products (Amazon), etc. Some say we’re on the path to making intelligence abundant. While I don’t disagree, I don’t fully agree either. I do not...
As the name suggests, a sequence-to-sequence model transforms one sequence into another using an encoder and a decoder module (two RNNs). As the encoder processes each item in the input sequence, it compiles the information into a context vector. The decoder is then conditioned on the context vector and begins generating...
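A bare-bones sketch of that encoder/decoder handoff with GRUs (sizes and names are illustrative, not from the post):

```python
# The encoder's final hidden state serves as the context vector that seeds the decoder.
import torch
import torch.nn as nn

hidden_dim, in_vocab, out_vocab, embed_dim = 64, 500, 500, 32

encoder_embed = nn.Embedding(in_vocab, embed_dim)
encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
decoder_embed = nn.Embedding(out_vocab, embed_dim)
decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
output_head = nn.Linear(hidden_dim, out_vocab)

src = torch.randint(0, in_vocab, (1, 7))        # a source sequence of 7 tokens
_, context = encoder(encoder_embed(src))        # context vector: encoder's final hidden state

tgt_so_far = torch.randint(0, out_vocab, (1, 1))          # e.g. a start-of-sequence token
dec_out, _ = decoder(decoder_embed(tgt_so_far), context)  # decoder conditioned on the context
next_token_logits = output_head(dec_out[:, -1])
print(next_token_logits.shape)  # torch.Size([1, 500])
```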
I think a lot about product and how one should approach product development. I will use this post as a means of trying to distill my thoughts into a (somewhat) comprehensible string of words, borrowing mathematical formalisms when appropriate.
The most important thing I’ve learned about product is that most...