In the words of Jeff Dean (Senior Fellow at Google Research): «Deep learning is a really powerful metaphor for learning about the world». What differentiates a human brain from a machine? What makes our grey matter so special? Whatever the approach (biological, psychological or philosophical), characteristics like perception, action, articulate language and cognition stand out. In past editions of our STI Wednesday, we talked to you about artificial intelligence (AI) and autonomous agents: the unreachable dream of emulating the human brain and the controversial creation of «thinking machines». As we mentioned before, the prevailing technological trend within AI is currently deep learning: AI «is finally getting smart», and new autonomous agents are expected to be capable of understanding human language and of making inferences and decisions by themselves. Experts in technological research and consultancy note that deep learning will start to have a huge impact on most industries over the next few years. Now, if you wonder why this should concern you, we invite you to keep reading our article, where we’ll present some basic theoretical issues and some applications of this group of techniques to extract, transform, classify and analyse information.
The human brain’s neural networks as inspiration
The work that began in the sixties of the 20th century as something conceptually attractive but difficult to prove started to find a wide range of commercial uses in the early nineties. According to the US Association for Computing Machinery (ACM), artificial neural networks (neural nets) follow the patterns of the brain’s neurons, as well as the connections or synapses between them. Thus, artificial neural networks are systems consisting of highly interconnected, simple processing elements whose behaviour changes according to the «weights» assigned to each connection; unlike traditional computer programmes, neural nets require training: the greater the amount of input information, the better the results. For many years, most neural nets had only one layer of «feature detectors» and were mainly trained on labelled data through a process called «supervised» training; multilayer and hybrid types would appear later.
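The single-layer, «supervised» nets described above can be sketched in a few lines of Python. This is a toy illustration (not any particular historical system): one processing element whose connection «weights» are adjusted from labelled examples.

```python
# A minimal single-layer neural net (a perceptron): one processing
# element whose behaviour depends on the "weights" of its connections,
# trained with supervised learning on labelled examples (toy data).

def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs crosses the threshold."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train(samples, labels, epochs=20, lr=1):
    """Classic perceptron rule: nudge the weights towards each answer."""
    weights = [0] * len(samples[0])
    bias = 0
    for _ in range(epochs):
        for inputs, label in zip(samples, labels):
            error = label - predict(weights, bias, inputs)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

# Toy supervised task: learn the logical AND of two binary inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train(samples, labels)
print([predict(weights, bias, s) for s in samples])  # → [0, 0, 0, 1]
```

As the paragraph above suggests, the more labelled examples such a net sees, the better it can set its weights; its limitation is that a single layer can only draw one linear boundary between classes.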
A more powerful type of deep learning, using multiple layers, appeared in the eighties. But in those years of the futuristic robots of films like Blade Runner (1982), The Terminator (1984) and Aliens (1986), computers in the scientific sphere were not fast enough to handle the learning processes of multiple layers of features at once, since this meant a huge amount of computation; in addition, there were not enough labelled data, and developers lacked an efficient way to initialise the «weights». Geoffrey Hinton (University of Toronto), computer scientist and machine learning pioneer, recalls: «The basic approach back then was, you hand-engineered a bunch of features, and then you learned what weights to put on the features to make a decision. For example: if it’s red, it’s more likely to be a car than a refrigerator».
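To see why multiple layers matter, here is a hand-wired sketch in the same spirit (the weights are chosen by hand for illustration, not learned): a hidden layer of two «feature detectors» lets the net compute XOR («one input on, but not both»), a function that no single-layer net can represent.

```python
# Why multiple layers matter: a single-layer net cannot compute XOR,
# but two hand-chosen "feature detector" units in a hidden layer can.
# The weights below are set by hand for illustration, not learned.

def unit(weights, bias, inputs):
    """One processing element: threshold on the weighted input sum."""
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) > 0 else 0

def xor_net(x1, x2):
    # Hidden layer: two feature detectors.
    h1 = unit([1, 1], -0.5, [x1, x2])   # fires if at least one input is on
    h2 = unit([1, 1], -1.5, [x1, x2])   # fires only if both inputs are on
    # Output layer combines the detected features: "some, but not both".
    return unit([1, -2], -0.5, [h1, h2])

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# → [0, 1, 1, 0]
```

The difficulty researchers faced in the eighties was not wiring such layers by hand but getting the machine to *learn* all the layers’ weights at once, which demanded more computation and labelled data than were then available.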
The latest in AI: layers and layers of artificial reasoning
Li Deng and Dong Yu of Microsoft Research, in their article «Deep Learning: Methods and Applications» (2014), place these techniques at the intersection of research on neural nets, AI, graphical modelling, optimisation, pattern recognition and signal processing. They note that since 2006 deep structured learning, more commonly known as deep learning or hierarchical learning, has emerged as a major new research area within machine learning. An area whose essence is «to automate the process of discovering effective features or representations for any machine learning task, including automatically transferring knowledge from one task to another concurrently».
Microsoft Research’s scientists emphasise three important reasons for deep learning’s current popularity: 1) drastically increased chip processing power, 2) significantly larger datasets for training, and 3) recent research advances in machine learning and signal/information processing; for its part, the ACM also points to improvements in algorithms and application architectures. Taken together, all this has made a great leap in machine learning systems possible; in particular, multilayer artificial neural nets are producing amazing advances in accuracy in fields like computer vision and speech recognition.
Listening, speaking, observing and learning «like a human»
Deng and Yu highlight the great deal of active research in the field carried out by higher education institutions like the University of Toronto, New York University, the University of Montreal, Stanford University and the Massachusetts Institute of Technology, as well as by companies the size of Microsoft, Google, IBM, Baidu and Facebook. This research, they assert, has shown the empirical success of deep learning in diverse applications: computer vision, phonetic recognition, voice search, spontaneous speech recognition, voice and image feature coding, semantic utterance classification, natural language understanding, handwriting recognition, spam filtering, fraud detection, audio processing, information retrieval, robotics, «and even in the analysis of molecules that may lead to discovery of new drugs».
Below we describe some of the newest deep learning applications, along with a fruitful research group whose products you probably use more than once a day.
- IBM PowerAI (IBM and NVIDIA, 2016). A software toolkit for enterprises that «will help train systems to think and learn in a more human way, at a faster pace».
- DeepText (Facebook, 2016). A tool that can understand the textual content of several thousand posts per second with near-human accuracy, spanning more than twenty languages.
- Deep Voice (Baidu, 2017). A production-quality text-to-speech system that synthesises speech in real time, built entirely from deep neural networks.
- DeepCoder (Microsoft and the University of Cambridge, 2017). This software will allow people who don’t know any programming language to code, creating new programmes by «borrowing» code from other programmes.
- Brain Team (Google, since 2011). This research group is responsible for Android’s speech recognition systems, Google’s search by image and YouTube’s video recommendations.
- Genetic interpretation (University of Toronto and NVIDIA, in progress). Powered by graphics processing units, this method will identify cancer-causing mutations.
Are we really getting closer to creating «thinking machines»?
About three years ago, the aforementioned Microsoft Research duo ventured that the «plateau of productivity» for deep learning would be higher than the one projected by Gartner’s hype cycle methodology (a graphical representation of the maturity and adoption of technologies and applications); nevertheless, this firm devoted to information technology research does not include deep learning in its hype cycle for 2016 emerging technologies, although it claims that during the next decade smart machines will be the most disruptive class of technologies, partly because of «unprecedented advances in deep neural networks that will allow organizations with smart machine technologies to harness data in order to adapt to new situations and solve problems that no one has encountered previously».
Since both the technologies already in operation and those now under development are widely applicable across numerous and diverse market niches, you’ll probably keep hearing more and more about deep learning. Even so, Mark Skilton (experienced consultant and professor at Warwick Business School) emphasises that achievements like IBM’s Watson supercomputer beating human players at Jeopardy! (2011), Google’s AlphaGo defeating world champion Lee Sedol (2016) and Infineon’s robot solving a Rubik’s Cube in just 0.637 seconds (2016) «are examples of what is referred to as specialised intelligence but are extremely narrow parts of general intelligence». In the end: «In a field that attempts something as profound as modeling the human brain, it’s inevitable that one technique won’t solve all the challenges» (Robert D. Hof, guest contributor to the MIT Technology Review). Perhaps it is too soon to ask ourselves what comes after deep learning, but, undoubtedly, it seems like we’re getting closer to the dream of building «thinking machines»… Or at least we’re no longer so far from it, right?
Written and translated by Belem Ruiz (Edition and Communication, PIT-UAS).