The Evolution of Deep Learning in Natural Language Processing


The initial period of NLP began in the 1950s, when Alan Turing proposed the Turing test for evaluating a computer's ability to exhibit intelligent behavior comparable to a human's. The rationalist approach came to dominate NLP through the wide acceptance of Noam Chomsky's 1957 criticism of N-grams as models of language structure. These rationalist approaches attempted to frame handcrafted rules that would build reasoning and knowledge mechanisms into intelligent systems. Well into the 1980s, prominent NLP systems such as ELIZA structured real-world data into concept ontologies on the basis of intricate handwritten rules. This period coincided with the first wave of AI (Artificial Intelligence), characterized by expert systems built on knowledge of their application domains.


There are five principal kinds of tasks in NLP: classification, translation, matching, sequential decision processes, and structured prediction. Most NLP problems can be resolved by casting them as one of these tasks, in which words, phrases, sentences, paragraphs, and even documents are commonly viewed as strings and treated alike despite their differing complexity. The most widely used processing unit remains the sentence. Deep learning markedly improves performance on these tasks and overcomes earlier limitations; most notably, neural machine translation (deep learning based machine translation) has considerably overtaken conventional machine learning approaches.

  1. Deep Learning and Neural Networks

Neural networks are composed of interconnected nodes, or neurons, each of which receives some number of inputs and supplies an output. Every node in a layer computes a weighted sum of the values received from the nodes of the previous layer and generates its output by applying a simple nonlinear transformation to that sum. The predominant factors that differentiate types of networks are how the nodes and layers are connected.
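As a minimal sketch of the weighted-sum-plus-nonlinearity computation described above (the layer sizes, weights, and choice of sigmoid activation here are arbitrary illustrative assumptions, not values from the text):

```python
import math

def sigmoid(x):
    # A simple nonlinear transformation (activation function).
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias):
    # Weighted sum of the values received from the previous layer,
    # followed by the nonlinear transformation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def layer_output(inputs, weight_matrix, biases):
    # One fully connected layer: every node computes its own weighted sum
    # over the same inputs.
    return [neuron_output(inputs, w, b) for w, b in zip(weight_matrix, biases)]

# Toy example: 3 inputs feeding a layer of 2 nodes.
x = [0.5, -1.0, 2.0]
W = [[0.1, 0.2, 0.3], [-0.4, 0.5, -0.6]]
b = [0.0, 0.1]
out = layer_output(x, W, b)
print(out)
```

Stacking such layers, with each layer's outputs becoming the next layer's inputs, yields a deep network.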

Convolutional neural networks (CNNs), built upon Fukushima's neocognitron, take their name from the convolution operation in signal processing and mathematics. CNNs use learned functions known as filters, which allow different features in the data to be analyzed simultaneously, and are used extensively in image and video processing as well as in speech and NLP. Together with advances such as recurrent neural networks (RNNs), residual connections, and dropout, deep learning has progressed to success in image retrieval (that is, from text to image), where the image and the query are first mapped to CNN-based vector representations; these are then fed into a deep neural network that estimates the relevance of the image to the query. Deep learning is also employed in generation-based NLP for the automatic generation of responses, with models trained in the sequence-to-sequence learning framework.
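A sketch of the filtering operation that gives CNNs their name, reduced to one dimension over a toy signal (the filter values are arbitrary; as in most deep learning libraries, this is technically cross-correlation, conventionally called convolution):

```python
def conv1d(signal, kernel):
    # Slide the filter (kernel) over the signal; each output value is the
    # weighted sum of the window it currently covers ("valid" mode, no padding).
    n, k = len(signal), len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(n - k + 1)]

# A difference filter responds strongly where the signal changes,
# detecting the step from 0 to 1.
signal = [0, 0, 0, 1, 1, 1]
kernel = [-1, 1]
print(conv1d(signal, kernel))  # -> [0, 0, 1, 0, 0]
```

In a CNN many such filters are learned from data and applied in parallel, which is what allows different features to be analyzed simultaneously.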

Selecting an appropriate model, with features chosen carefully on the basis of the activation function, remains under investigation. In deep neural networks (DNNs), the auto-encoder has been employed to learn an encoding of the data in an efficient, unsupervised way. The standard auto-encoder (AE) is generally effective at unsupervised learning; a wavelet variant replaces the standard sigmoid activation function with a wavelet function, whose superior time-frequency properties allow it to describe different signal characteristics at varied resolutions. To further improve feature quality, such AEs were extended into a deep wavelet auto-encoder (WAE), constructed analogously to a standard deep auto-encoder. The aim of the deep WAE is to provide optimized feature learning and an automatic method for fault diagnosis.
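A toy sketch of the auto-encoder idea described above: an encoder compresses the input to a smaller code, a decoder reconstructs the input from that code, and training (unsupervised) minimizes the reconstruction error. This version is linear with hand-rolled gradient steps; a wavelet AE would instead apply a wavelet activation to the code. All sizes, rates, and data here are arbitrary assumptions for illustration:

```python
import random

random.seed(0)

# Toy data lying on a 1-D subspace of 2-D space, so a 1-unit
# bottleneck code can capture it.
data = [[t, 2 * t] for t in [-1.0, -0.5, 0.2, 0.7, 1.0]]

# Encoder and decoder weights (linear AE: the activation is the identity;
# a wavelet AE would apply a wavelet function to the code instead).
w_enc = [random.uniform(-0.5, 0.5) for _ in range(2)]
w_dec = [random.uniform(-0.5, 0.5) for _ in range(2)]
lr = 0.05

def reconstruct(x):
    code = sum(we * xi for we, xi in zip(w_enc, x))  # encode: 2-D -> 1-D
    return [wd * code for wd in w_dec], code         # decode: 1-D -> 2-D

def loss(x):
    r, _ = reconstruct(x)
    return sum((ri - xi) ** 2 for ri, xi in zip(r, x))

initial = sum(loss(x) for x in data)
for _ in range(200):
    for x in data:
        r, code = reconstruct(x)
        err = [ri - xi for ri, xi in zip(r, x)]      # reconstruction error
        grad_code = sum(2 * ei * wd for ei, wd in zip(err, w_dec))
        for i in range(2):
            w_dec[i] -= lr * 2 * err[i] * code       # decoder gradient step
            w_enc[i] -= lr * grad_code * x[i]        # encoder gradient step
final = sum(loss(x) for x in data)
print(initial, "->", final)  # reconstruction error shrinks with training
```

Stacking several such encoder/decoder pairs gives a deep auto-encoder, the construction the deep WAE mirrors.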


Widespread applications of deep learning in NLP include:

  • Information Retrieval,
  • Information Extraction (Event and relationship),
  • Text Classification,
  • Text Generation (with GANs and VAEs),
  • Summarization,
  • Question Answering,
  • Machine Translation.



Deep learning for NLP still has notable limitations:

  • Not satisfactory at decision making and inference.
  • Data hungry, and therefore unsuitable when the data size is very small.
  • Unable to directly handle long-tail phenomena and symbols.
  • Needs improvement in the case of unsupervised learning.
  • High computational cost.


Its main advantages are:

  • Effective performance in data-driven and pattern-recognition problems.
  • Easy and simple employment of gradient-based learning.
  • Possibility of cross-modal processing.
