Article originally posted on InfoQ.
Welcome to this AI-themed edition of The Morning Paper Quarterly. I’ve selected five paper write-ups which first appeared on The Morning Paper blog over the last year. To kick things off we’re going all the way back to 1950! Alan Turing’s paper “Computing Machinery and Intelligence” is a true classic that gave us the Turing test, but also so much more. Here Turing puts forward the idea that instead of directly building a computer with the sophistication of a human adult mind, we should break the problem down into two parts: building a simpler child program with the capability to learn, and building an education process through which the child program can be taught. Writing almost 70 years ago, Turing expresses the hope that machines will eventually compete with men in all purely intellectual fields. But where should we start? “Many people think that a very abstract activity, like the playing of chess, would be best.”
That leads us to my second choice, “Mastering chess and shogi by self-play with a general reinforcement learning algorithm.” Of course, we achieved superhuman performance in chess a long time ago, and the excitement then shifted to developing a computer program that could play world-class Go. Once AlphaGo had done its thing, though, the DeepMind team turned their attention back to chess and let the generic reinforcement learning algorithm developed for Go try to teach itself chess. The results are astonishing: with no provided opening book, no endgame database, no chess-specific tactical knowledge or expert rules – just the basic rules of the game and a lot of self-play – AlphaZero learns to outperform Stockfish in four hours of elapsed training time.
Of course, chess and Go are very constrained problems. When we take deep learning into the noisy, messy real world, things get harder. In “Deep learning scaling is predictable, empirically,” a team from Baidu asks “how can we improve the state of the art in deep learning?” One of the major levers we have is to feed in more data by creating larger training data sets. But what’s the relationship between model performance and training data set size? Will an investment in creating more data pay off in increased accuracy? How much? Or should we instead spend our efforts building better models that consume the data we already have? The authors show that beyond a certain point, training data set size and generalization error are connected by a power law. The results suggest that we should first search for ‘model-problem fit’ and then scale out our data sets.
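To make the power-law idea concrete, here is a minimal sketch of how you might fit one to your own learning curve. It assumes the relationship takes the form error ≈ alpha * size ** beta (with beta negative) and that every measurement already lies in the power-law region. The function names and the data points are hypothetical: the measurements are synthetic values generated for illustration, not results from the paper.

```python
import math

def fit_power_law(sizes, errors):
    """Fit error = alpha * size ** beta by least squares in log-log space."""
    xs = [math.log(m) for m in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                       # slope of the log-log line
    alpha = math.exp(mean_y - beta * mean_x)  # intercept, mapped back from log space
    return alpha, beta

def predict_error(alpha, beta, size):
    """Extrapolate the fitted curve to a new training set size."""
    return alpha * size ** beta

# Synthetic (made-up) measurements generated from error = 5.0 * size ** -0.3.
sizes = [1_000, 10_000, 100_000, 1_000_000]
errors = [5.0 * m ** -0.3 for m in sizes]

alpha, beta = fit_power_law(sizes, errors)

# How much would doubling the largest data set help, according to the fit?
current = predict_error(alpha, beta, sizes[-1])
doubled = predict_error(alpha, beta, 2 * sizes[-1])
improvement_ratio = doubled / current  # equals 2 ** beta under this model
```

With a fit like this in hand, the “how much?” question becomes arithmetic: under the assumed form, doubling the data multiplies the error by 2 ** beta, whatever the current size.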
Another critical question for systems deployed in the real world is how we can have confidence they will work as expected across a wide range of inputs. In “DeepTest: automated testing of deep-neural-network-driven autonomous cars,” Tian et al. consider how to test an autonomous driving system. The approaches they use could easily be applied to testing other kinds of DNNs as well. We’re all familiar with the concept of code coverage, but how is your neuron coverage looking?
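As a rough illustration of what a neuron coverage number could look like, here is a minimal sketch. It assumes a neuron counts as covered when its activation exceeds a chosen threshold for at least one test input; that definition, the threshold value, and the `neuron_coverage` helper are assumptions made for this example rather than details taken from the write-up. The activation values are invented.

```python
def neuron_coverage(activations, threshold=0.0):
    """Return the fraction of neurons activated above `threshold` by any input.

    `activations` holds one list per test input, each with one value per neuron.
    """
    if not activations or not activations[0]:
        return 0.0
    num_neurons = len(activations[0])
    covered = [False] * num_neurons
    for per_input in activations:
        for i, value in enumerate(per_input):
            if value > threshold:
                covered[i] = True
    return sum(covered) / num_neurons

# Invented activations for a four-neuron layer on two test inputs.
test_suite = [
    [0.9, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.7, 0.0],
]
coverage = neuron_coverage(test_suite, threshold=0.5)

# Adding an input that fires a previously silent neuron raises coverage.
extended_suite = test_suite + [[0.0, 0.8, 0.0, 0.0]]
extended_coverage = neuron_coverage(extended_suite, threshold=0.5)
```

Much like code coverage, the number says which parts of the network your tests have exercised, not whether the outputs were correct.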
My final choice for this issue is “Deep code search.” It’s a great example of learning joint embeddings that map associations between two different domains: in this case, natural language descriptions and snippets of code. The result is a code search engine that I’m sure InfoQ readers will relate to. It lets you ask questions such as “where (in this codebase) are events queued on a thread?” and be shown candidate code snippets as results.
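Once queries and snippets live in the same vector space, search reduces to finding the nearest snippet vectors to the query vector. The sketch below shows only that retrieval step, and assumes cosine similarity as the measure. The three-dimensional vectors and snippet names are hand-written placeholders; in a real system they would come from trained encoders, which are not shown here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query_vec, snippet_vecs, top_k=2):
    """Rank snippets by similarity to the query and return the top names."""
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in snippet_vecs.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

# Placeholder embeddings for three hypothetical code snippets.
snippets = {
    "enqueue_event": [0.9, 0.1, 0.0],
    "parse_config": [0.0, 0.2, 0.9],
    "start_thread": [0.6, 0.6, 0.1],
}

# Placeholder embedding for a natural language query about queuing events.
query = [1.0, 0.2, 0.0]
results = search(query, snippets)
```

The interesting work is in learning encoders that place a description and its matching code close together; the lookup itself stays this simple.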