With the proliferation of new technologies and the rise of machine learning, much has been said in recent years about the transition from human neurons to artificial ones. This article introduces the term Artificial Neural Networks and discusses their potential use for human language learning.
What is it all about?
Artificial Neural Networks (ANNs) are a novel information-processing paradigm inspired by the way biological nervous systems, such as the brain, process information. The distinctive characteristic of an ANN is that it is a software simulation of a large number of highly interconnected processing elements (neurons) working in unison to solve a problem. An ANN learns by doing: it is configured for a specific application, such as pattern recognition, through a learning process (Stergiou and Siganos). The learning process involves adjustments to the synaptic connections that exist between the neurons. These neurons are arranged in a fixed number of “layers”, and the output is produced only after the input has moved through them all.
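A minimal sketch of this layered structure might look as follows, assuming a single hidden layer and sigmoid activations; the weights here are random placeholders standing in for values that a learning process would adjust:

```python
import numpy as np

def sigmoid(x):
    # Squashes each value into (0, 1); a common neuron activation function.
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights):
    # Pass the input through each layer in turn; the output is produced
    # only after the signal has moved through them all.
    activation = x
    for W in weights:
        activation = sigmoid(W @ activation)
    return activation

# Two layers of weights: 3 inputs -> 4 hidden neurons -> 1 output neuron.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((1, 4))]
output = forward(np.array([0.5, -1.0, 2.0]), weights)
print(output.shape)  # (1,)
```

Training would consist of nudging the entries of each weight matrix (the synaptic connections) so that the output moves closer to a desired target.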
Why use neural networks?
Neural networks can derive meaning from unstructured data, and in this sense can be used to extract patterns and detect trends too complex for either the human eye or conventional computing techniques to notice. A trained neural network therefore functions as an “expert” in the information it analyses and offers rich possibilities for exploring “what if” questions. Other advantages of neural networks include the ability to learn tasks from training data or experience, to organise and represent information on their own, to perform computations in parallel on dedicated hardware devices, and to retain some capabilities even after major network damage (Stergiou and Siganos).
What is the potential for learning Human Language through Neural Networks?
Given this versatility, neural networks have broad applicability to real-world problems. They have been successfully implemented in many industries, where they help identify patterns and trends in data. Notably, ANNs have also been identified as a promising means of simulating human learning, and researchers are gradually exploring the fascinating field of language learning through neural networks in order to create sophisticated systems for the future (Abidi, 1996). These neural network language models (NNLMs) work by learning distributed representations of words, which reduce the impact of the curse of dimensionality, that is, the need for enormous amounts of training examples when learning complicated functions (Bengio, 2008: 3881). Several NNLMs have emerged, and despite their differences they share some basic commonalities: input words are encoded with 1-of-K coding, where K is the number of words in the vocabulary; at the output layer, a softmax activation function is used to produce correctly normalised probability values; and the cross-entropy error, which is equivalent to maximum likelihood, is used as the training criterion (Sundermeyer, Schlüter, and Ney, 2012).
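These three shared ingredients can be sketched concretely. The toy three-word vocabulary and the output-layer scores below are illustrative assumptions, not taken from any of the cited models:

```python
import numpy as np

VOCAB = ["the", "cat", "sat"]   # toy vocabulary, so K = 3 (illustrative)
K = len(VOCAB)

def one_hot(word):
    # 1-of-K coding: a length-K vector with a single 1 at the word's index.
    v = np.zeros(K)
    v[VOCAB.index(word)] = 1.0
    return v

def softmax(scores):
    # Normalises raw output-layer scores into a probability distribution
    # over the vocabulary (shifting by the max for numerical stability).
    e = np.exp(scores - scores.max())
    return e / e.sum()

def cross_entropy(probs, target_index):
    # Training criterion: negative log-probability of the correct next word.
    # Minimising this is equivalent to maximum-likelihood training.
    return -np.log(probs[target_index])

scores = np.array([2.0, 0.5, -1.0])       # hypothetical output-layer scores
probs = softmax(scores)                   # entries sum to 1
loss = cross_entropy(probs, VOCAB.index("cat"))
```

A real NNLM would produce the scores from learned distributed representations of the context words rather than fixing them by hand, but the encoding, normalisation, and training criterion take exactly this form.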
Recent trends in the field
Among the research advances in the field of NNLMs is Panchev's (2005) model of leaky integrate-and-fire (LIAF) spiking neurons with Active Dendrites and Dynamic Synapses (ADDS). The architecture of this network consists of several main modules that link information across different modalities, including an auditory system recognising single spoken words, a visual system recognising objects of different colour and shape, a motor-control system for navigation, and a working memory (Panchev, 2005).

More recently, a breakthrough by Golosio et al. (2015) was the development of a cognitive system built on a large-scale neural architecture. These researchers from the University of Sassari (Italy) and the University of Plymouth (UK) developed a computer simulation of a cognitive model consisting of two million interconnected artificial neurons, intended to shed light on the procedural knowledge involved in language elaboration. In this model the central executive is a neural network that coordinates the other components of working memory: it takes as input the neural activation states of short-term memory and yields as output mental actions that control the flow of information through neural gating mechanisms. The innovation lies in the system's ability to learn to communicate through natural language starting from a tabula rasa, without any a priori knowledge of the structural components of language. The model, called ANNABELL (Artificial Neural Network with Adaptive Behavior Exploited for Language Learning), is considered a breakthrough in that it provides insights into the neural processes that underlie the development of language. The researchers who designed the model gave the computer a series of sentences drawn from literature on language structure, allowing it to construct sentences itself; they estimate that the model has the language skills of a four-year-old child.
Neural network architectures can play an immense role in modelling aspects of human language learning and acquisition. Current research serves as a building block for future advances in the field, and the evidence so far is promising.
Abidi, S. S. R. (1996) Neural Networks and Child Language Development: Towards a ‘Conglomerate’ Neural Network Simulation Architecture. In Proc. of the International Conference on Neural Information Processing (ICONIP’96), Hong Kong, 1996.
Golosio, B., Cangelosi, A., Gamotina, O., Masala, G.L. (2015) A Cognitive Neural Architecture Able to Learn and Communicate through Natural Language. PLoS ONE 10(11): e0140866. doi:10.1371/journal.pone.0140866.
Panchev, C. (2005) A Spiking Neural Network Model of Multi-modal Language Processing of Robot Instructions. In Wermter, S., Palm, G., and Elshaw, M. (eds.), Biomimetic Neural Learning for Intelligent Robots, pp. 182-210, Springer.
Sundermeyer, M., Schlüter, R., and Ney, H. (2012) LSTM Neural Networks for Language Modeling. In Proc. of Interspeech 2012, Portland, OR.