My Musical Experiments with Artificial Intelligence – Part 2

Published Material on AI and Neural Networks

The theoretical foundation of Artificial Neural Networks is nothing new. As a discipline, the theoretical framework of AI has been in existence in Computer Science since the 1960s. However, the recent growth in compute power has allowed these frameworks to be put into practice to produce something meaningful.

cars.mit.edu

Anyone who wants to learn about AI and Neural Networks (and has some cursory background in machine learning concepts) should probably start at cars.mit.edu. It is a fun class taught by Lex Fridman of MIT. The videos are free, and as part of the project you have to optimize a neural network to get the red car as far ahead as possible by teaching it to dodge traffic. The course gives you an overview of the various types of neural network architectures and their applications.

Andrej Karpathy Blog

Another important destination for anyone who might be interested in the topic of AI and Neural Networks is the blog of Andrej Karpathy (http://karpathy.github.io). Andrej Karpathy is a Stanford researcher, academic and currently Director of AI at Tesla. If you are a bit technically minded, you should check out his “Hacker’s Guide to Neural Networks”.

An especially interesting article is “The Unreasonable Effectiveness of Recurrent Neural Networks”. Recurrent Neural Networks (RNNs) are a class of Neural Networks that are especially good at dealing with stateful or sequential data. Any data set whose next state depends on the previous few states is a natural candidate for RNNs.

Classic feed-forward neural networks live in the moment. They have no concept of past or future. For example, in an image recognition application (say you want to recognize whether there is a cat in a picture), one typically uses an architecture called Convolutional Neural Networks (CNNs). In such an application, all the network needs to do is analyze the bitmap to identify the cat pattern that it learnt from the training data.

However, in most music applications, the next note or rhythmic pattern depends on what has been played immediately leading up to that point. This makes the Recurrent Neural Network architecture especially suitable.
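To make the idea of "depending on what was played before" concrete, here is a minimal sketch of a single recurrent step in plain Python. It is not taken from any of the sources above: the weights, the two-element note encoding and the names `rnn_step` and `run_sequence` are all made up for illustration. The point is only that the hidden state after the last note differs when the earlier notes differ.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """One step of a vanilla RNN: h = tanh(W_xh * x + W_hh * h_prev + b)."""
    h = []
    for i in range(len(b_h)):
        total = b_h[i]
        total += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(w_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h

def run_sequence(notes, w_xh, w_hh, b_h):
    """Feed the notes in order, carrying the hidden state from step to step."""
    h = [0.0] * len(b_h)
    for x in notes:
        h = rnn_step(x, h, w_xh, w_hh, b_h)
    return h

# Hypothetical, hand-picked weights (a real network would learn these).
W_XH = [[0.5, -0.3], [0.1, 0.8]]   # input -> hidden
W_HH = [[0.2, 0.4], [-0.6, 0.3]]   # hidden -> hidden (the "memory" path)
B_H = [0.0, 0.0]

# Two toy melodies that END on the same note but start differently.
melody_a = [[1.0, 0.0], [0.0, 1.0]]
melody_b = [[0.0, 1.0], [0.0, 1.0]]

state_a = run_sequence(melody_a, W_XH, W_HH, B_H)
state_b = run_sequence(melody_b, W_XH, W_HH, B_H)
print("state after melody A:", state_a)
print("state after melody B:", state_b)

# With the recurrent weights zeroed out, the network "lives in the moment":
# only the last note matters, so both melodies give the same state.
NO_MEMORY = [[0.0, 0.0], [0.0, 0.0]]
flat_a = run_sequence(melody_a, W_XH, NO_MEMORY, B_H)
flat_b = run_sequence(melody_b, W_XH, NO_MEMORY, B_H)
print("without memory, same result:", flat_a == flat_b)
```

The only difference between the two runs is the hidden-to-hidden matrix: once it is zero, the history has no way to influence the result.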

Google Magenta Project

Anyone who wants to apply AI to music should certainly check out the Google Magenta project. Google Magenta is a Google Brain research project to advance the state of the art in music, video, image and text generation. To do so, it recognizes that it needs to build a community that brings computer scientists and artists together.

One of the fascinating applications built as part of the Magenta project is AI Duet (https://experiments.withgoogle.com/ai/ai-duet/view/). You can punch in some notes on a piano keyboard and the AI responds with another set of notes that feels like an answer to what you played.

TensorFlow

The most popular framework for writing AI applications today is Google’s TensorFlow. It is a technology that Google has been using for its own applications and that it made open source a couple of years ago.

TensorFlow tries to solve a few problems. First of all, it provides a scalable and efficient framework and software library for numerical computations on matrices of data. As a reminder from high school algebra: a one-dimensional array of numbers is called a vector, a two-dimensional array is called a matrix, and arrays with more than two dimensions are called tensors, hence the name TensorFlow.
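As a small illustration of the vector / matrix / tensor distinction, the sketch below uses plain nested Python lists rather than any particular library. The `shape` helper is a hypothetical name introduced here, and it assumes non-empty, regularly shaped lists.

```python
def shape(array):
    """Return the dimensions of a regular, non-empty nested list as a tuple."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return tuple(dims)

# One dimension: a vector.
vector = [1.0, 2.0, 3.0]

# Two dimensions: a matrix (2 rows, 3 columns).
matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]

# Three dimensions: a tensor, here a tiny 2x2 color image
# with 3 values (red, green, blue) per pixel.
image = [[[255, 0, 0], [0, 255, 0]],
         [[0, 0, 255], [255, 255, 255]]]

for name, value in [("vector", vector), ("matrix", matrix), ("image", image)]:
    dims = shape(value)
    print(name, "has", len(dims), "dimension(s), shape", dims)
```

Numerical libraries store the same information far more compactly, but the notion of "number of dimensions" is exactly the one shown here.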

Even a trivial neural network architecture can have hundreds or even thousands of neurons. For more sophisticated applications, say a self-driving car, it is not difficult to imagine the input layer alone having more than a million neurons. The latest Teslas have 8 cameras, several ultrasonic sensors and a radar. Even a black-and-white image of modest resolution could have a bitmap of 1000×1000 pixels, which is a million input values from a single camera. Now consider that one does need color (to spot traffic lights, for example) and that there are several cameras, and the number just keeps multiplying. To get answers in milliseconds for steering the car at high speed, one would have to solve several problems.
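Before turning to those problems, the back-of-the-envelope arithmetic behind the input size can be spelled out. The 8 cameras and the 1000×1000 bitmap come from the paragraph above; the three color channels (RGB) and the 1,000-neuron hidden layer are assumptions added purely for illustration, not real figures for any car.

```python
CAMERAS = 8                # from the text
WIDTH = 1000               # pixels, from the text
HEIGHT = 1000              # pixels, from the text
COLOR_CHANNELS = 3         # assumption: red, green, blue
HIDDEN_NEURONS = 1000      # hypothetical size of the first hidden layer

# One black-and-white frame: one input value per pixel.
gray_inputs = WIDTH * HEIGHT

# One color frame: one input value per pixel per channel.
color_inputs = gray_inputs * COLOR_CHANNELS

# All cameras at once.
all_camera_inputs = color_inputs * CAMERAS

# If every input fed every neuron of the hypothetical hidden layer,
# this is how many weights that single layer would need.
first_layer_weights = all_camera_inputs * HIDDEN_NEURONS

print("black-and-white, one camera:", gray_inputs)
print("color, one camera:", color_inputs)
print("color, all cameras:", all_camera_inputs)
print("weights into the first hidden layer:", first_layer_weights)
```

In practice, image networks avoid connecting every pixel to every neuron (this is part of what convolutional architectures are for), so the last figure is an upper bound for a naive design rather than a description of a real system.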

First of all, computing one neuron at a time would just not work. Fortunately, matrix algebra comes to our rescue: it lets us represent the network in the form of matrices, vectors and tensors, and enables us to do the mathematics at scale.
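The sketch below shows the same tiny layer written two ways: once neuron by neuron, and once as a single matrix-vector product followed by an activation. The weights, biases, inputs and the choice of ReLU as the activation are all assumptions made for the example. In plain Python the second form is no faster; the point is that the whole layer becomes one matrix operation, which is the form that libraries such as TensorFlow can hand to optimized hardware.

```python
def neuron_at_a_time(weights, biases, inputs):
    """Compute each neuron's output separately with explicit loops."""
    outputs = []
    for w_row, b in zip(weights, biases):
        total = b
        for w, x in zip(w_row, inputs):
            total += w * x
        outputs.append(max(0.0, total))  # ReLU activation (assumed)
    return outputs

def matvec(matrix, vector):
    """Matrix-vector product: one dot product per matrix row."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def layer_as_matrix(weights, biases, inputs):
    """The same layer expressed as relu(W x + b)."""
    z = matvec(weights, inputs)
    return [max(0.0, z_i + b_i) for z_i, b_i in zip(z, biases)]

# Hypothetical layer: 3 inputs feeding 2 neurons.
WEIGHTS = [[0.5, -1.0, 0.25],
           [2.0, 0.0, -0.5]]
BIASES = [1.0, -3.0]
INPUTS = [2.0, 1.0, 4.0]

loop_result = neuron_at_a_time(WEIGHTS, BIASES, INPUTS)
matrix_result = layer_as_matrix(WEIGHTS, BIASES, INPUTS)
print("neuron at a time:", loop_result)
print("matrix form:     ", matrix_result)
```

Each row of `WEIGHTS` holds the incoming weights of one neuron, so adding neurons means adding rows, and the formula for the layer does not change.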

Secondly, since matrix algebra requires a large number of small, trivial calculations, the general-purpose approach of the CPUs found in traditional computers proves suboptimal. GPUs, the chips that drive the pixels on displays, do pretty much the same type of computation and are therefore much better suited to AI applications (just check the stock price of NVIDIA, the traditional GPU maker). Google Cloud now also offers virtual machines with TPUs (Tensor Processing Units), Google-designed chips optimized for TensorFlow workloads.

The TensorFlow library abstracts computation from scale. You can run TensorFlow on your Mac or a single Linux machine, but you can also scale the same model out by running it on a large number of GPU or TPU machines. It also runs on mobile devices, which matters as AI becomes more mainstream and finds its way into mobile applications.

TensorFlow also makes its libraries available in multiple programming languages. C, Go and Python are supported at the moment, and there are roadmaps to support more languages.

For my application, I used TensorFlow on my Mac with Python as the programming language.