When we first hear terms like “Artificial Intelligence” (AI) and “Deep Learning,” we tend to judge them through preconceptions shaped by sensationalist media and a plethora of Hollywood films. It is easy to assume that anything produced by Artificial Intelligence is extremely complicated and a little scary, whether that be chess-playing computers, driverless cars, or humanoid robots like Sophia, the first robot to receive citizenship of any country. However, I hope to convince you that (at least for now) Artificial Intelligence is deceptively simple and that there is really nothing that scary going on under the hood in most instances of AI.
By now we are quite used to computers performing basic mathematical operations like multiplying or dividing much faster than us—this doesn’t scare us. We understand that the structure of computers allows them to make calculations like these extremely quickly and that our pocket calculators certainly don’t need to be conscious to find the square root of 158. I would like to argue that we should feel the same way about most of the things AI is doing today.
Let’s take the case of an Artificial Neural Network learning how to identify the emotion shown on a human’s face. Without any specialist knowledge, this seems like a daunting task, and the thought of the computer learning about emotions might give us reason to consider it conscious. However, this is simply not the case.
An Artificial Neural Network (ANN), despite originally being inspired by the brain, actually bears almost no resemblance to the brain whatsoever. At heart, it is more like a glorified calculator. We can think of an ANN as a function that converts an input to an output. In this case, it would be converting an image into a label. The way it connects the image to the label is simply to multiply the value of each pixel by some parameters, which we can think of as tuneable dials. (There is also some non-linearity in there, which again sounds complicated but is often as simple as converting any negative values to 0.)
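To make the “glorified calculator” picture concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the four-pixel “image,” the hand-picked dial values, the two labels, and the function names `relu`, `layer` and `predict` are all my own assumptions, and a real network would have vastly more pixels and dials.

```python
def relu(values):
    """The non-linearity: convert any negative value to 0."""
    return [max(0.0, v) for v in values]


def layer(inputs, dials):
    """Multiply each input by its dial and add up, once per row of dials."""
    return [sum(d * x for d, x in zip(row, inputs)) for row in dials]


def predict(pixels, hidden_dials, output_dials, labels):
    """Convert an image (a list of pixel values) into a label."""
    hidden = relu(layer(pixels, hidden_dials))
    scores = layer(hidden, output_dials)
    # The label with the highest score is the network's guess.
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]


# Hypothetical toy data: a 4-pixel "image" and hand-picked dials.
pixels = [0.9, 0.1, 0.8, 0.2]
hidden_dials = [[1.0, -1.0, 1.0, -1.0],
                [-1.0, 1.0, -1.0, 1.0]]
output_dials = [[1.0, 0.0],
                [0.0, 1.0]]
labels = ["happy", "sad"]

guess = predict(pixels, hidden_dials, output_dials, labels)
print(guess)
```

Nothing here goes beyond multiplication, addition and replacing negatives with zero, which is the point.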
When an ANN first begins its task of predicting an emotion from a picture of a face, its dials are set randomly, and so the first guess probably isn’t going to be very good. In order to make a better guess it “learns” by looking at a few thousand pictures of faces. The actual “learning” is done by checking how far off its prediction was from the true label, and tuning the dials so that this distance is minimized. This process is called optimization, and it is usually taught to students during high school.
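The tuning step can be sketched in a few lines too. This is only an illustration under my own assumptions: a single dial instead of millions, three made-up training examples where the right answer is always three times the input, and gradient descent (one common optimization method) with a hypothetical `train` function, step count and learning rate.

```python
import random


def train(examples, steps=200, learning_rate=0.05, seed=0):
    """Tune one dial so that dial * x lands as close as possible to y."""
    rng = random.Random(seed)
    dial = rng.uniform(-1.0, 1.0)  # the dial starts at a random setting
    for _ in range(steps):
        # Slope of the average squared distance between guess and truth.
        slope = sum(2 * (dial * x - y) * x for x, y in examples) / len(examples)
        # Nudge the dial in the direction that shrinks the distance.
        dial -= learning_rate * slope
    return dial


# Hypothetical training data: every label is 3 times its input.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
dial = train(examples)
print(round(dial, 3))
```

The dial starts somewhere random and is repeatedly nudged downhill until the guesses match the labels, which here means it settles near 3.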
That’s all there is to it, folks. Almost all AI in the world today can be explained in terms like these, and so we shouldn’t think of Jesse the Driverless Car, Deep Blue the Chess-Playing Computer or Sophia the Hong Kongese Robot as anything more than glorified calculators with arms and wheels.
Rohin Berichon was a recipient of a 2018/19 AMSI Vacation Research Scholarship.