AI Glossary: What Makes Deep Learning ‘Deep’

June 3, 2017, 9:00am

AI Glossary is a semi-regular column diving into the oft-misunderstood fields of machine learning and artificial intelligence by way of their frequently imperfect jargon.

In a press session held at a recent AI-centric computing conference, one dude in particular was just not getting it. “How are deep learning and neural networks different?” he asked, in a tone more interrogative than curious. And then he asked again in a somewhat mutated form. The engineers at the front of the room, it seemed, weren’t quite getting it either. For a moment, it was like two groups of people trying to engage in earnest discussion without realizing that the other group speaks a different language entirely.

I suppose the dynamic isn’t all that rare in science journalism, and the confusion in nomenclature here is reasonable. Deep learning, in particular, is a vague and not-very-technical term. Just the other day, I ran into the exact same confusion as above in explaining a recent project of mine. So you were using deep learning and neural network models? Well, no. But also yes.

Let’s dig in.

The press conference engineers eventually explained the distinction as kind of a non-distinction. Deep learning is a big umbrella term that covers many different sorts of neural networks and a couple of other machine learning techniques that are a lot like neural networks. Unsurprisingly, the networks under this umbrella can all be characterized by depth.

It’s a mathematical depth. Imagine just a basic formula like y = 2 * x. We see in this formula how we put a number in and then get a number in return, specifically a number that is two times the input number. Deep learning can be imagined as starting with an equation like that, but one that doesn’t actually give you the right answer.

The equation itself has to learn how to produce the right answer based on a bunch of correct inputs and outputs, which is what we call training data. For the formula to learn how to make the right answer, we give it some blank space, a gap or gaps between the input layer and the output layer. These are called hidden layers and they’re like hidden parts of the formula that need to be filled in. They’re not quite hidden in the sense of us not being able to see them, but they’re also not part of our initial formula. We give the formula the tools to write itself, basically.
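To make the idea of a hidden layer a bit more concrete, here is a minimal sketch in Python of a formula with one hidden layer sitting between its inputs and its output. The weights are hypothetical numbers picked by hand for illustration, not learned from training data, and every name here (relu, hidden_layer, output_layer, model) is an assumption of this example rather than anything from a particular library. With these particular weights, the model returns 1 when exactly one of its two inputs is 1, and 0 otherwise.

```python
def relu(value):
    # A common "activation": pass positive numbers through, clamp negatives to 0.
    return max(0.0, value)


def hidden_layer(x1, x2):
    # Two hidden units. Each is a small formula of its own: a weighted sum
    # of the inputs plus an offset, run through relu.
    # These weights are hand-picked for illustration, not learned.
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    return h1, h2


def output_layer(h1, h2):
    # The output is a weighted sum of the hidden units.
    return 1.0 * h1 - 2.0 * h2


def model(x1, x2):
    # Input layer -> hidden layer -> output layer.
    return output_layer(*hidden_layer(x1, x2))


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, model(x1, x2))
# prints:
# 0 0 0.0
# 0 1 1.0
# 1 0 1.0
# 1 1 0.0
```

In a real deep learning setup, the numbers inside hidden_layer and output_layer would start out wrong and get adjusted during training. Here they are fixed, just to show where the "blank space" in the formula lives.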

By the loosest definition, it takes just one hidden layer for a machine learning model to be considered “deep,” though many practitioners reserve the word for models with several. The resulting formula is what we call a model. Through the process of training, the model adjusts itself so that it can abstractly represent the training data. If it’s a good model, we should be able to give it some new data, without telling it the right answer for that data, and get a prediction of some kind in return. The more hidden layers we provide, the more accurate a model we can often get, though that isn’t guaranteed. And more layers come with tradeoffs in complexity and computational difficulty.

Not every machine learning model is “deep.” One alternative is a linear model, which has no hidden layers. In that case the algorithm starts out with a formula and just sort of refines it, scaling each variable up or down so that the formula as a whole has better predictive power. In many, many cases, this is good enough, namely when there is a clear straight-line division in the training data, or something pretty close to it. This actually happens a lot, but not so much for big learning tasks involving lots of data “features,” like classifying images containing many thousands of pixels. Linear models are not only very simple and easy to understand, they also require a lot less work from your machine.
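Here is a rough sketch of that refining process for the simplest possible linear model, one with a single variable and no hidden layers. It starts with a formula that gives the wrong answer (a weight of 0) and nudges the weight up or down until it matches training data generated from y = 2 * x. The nudging method used here is plain gradient descent, which is my choice for the example, and the names (training_data, train_linear, learning_rate, steps) and the specific numbers are assumptions made for illustration.

```python
# Training data: inputs paired with the correct outputs for y = 2 * x.
training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def train_linear(data, learning_rate=0.01, steps=1000):
    # Start with a formula that is wrong: y = 0 * x.
    weight = 0.0
    for _ in range(steps):
        # Measure how the squared error changes as the weight changes,
        # averaged over all training examples.
        gradient = 0.0
        for x, y in data:
            error = weight * x - y
            gradient += 2 * error * x
        gradient /= len(data)
        # Scale the weight up or down, a little at a time, to shrink the error.
        weight -= learning_rate * gradient
    return weight


learned_weight = train_linear(training_data)

# Ask the trained model about an input it never saw during training.
prediction = learned_weight * 10.0
print(learned_weight, prediction)
```

With this data and these settings, the learned weight should land very close to 2, so the prediction for an input of 10 comes out very close to 20. A model with more variables works the same way, just with one weight per variable.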

We’ll talk more about the sorts of models that fall under the broad umbrella of deep learning in future columns, but now you should at least have some sense of the depth involved in deep learning.