Title
Building neural network language model with POS-based negative sampling and stochastic conjugate gradient descent.
Abstract
A traditional statistical language model is a probability distribution over sequences of words. It suffers from the curse of dimensionality: the number of possible word sequences in the training text grows exponentially. To address this issue, neural network language models were proposed, which represent words in a distributed way. Because computing gradient updates for a large number of word vectors is expensive, a neural network language model needs a long training time to converge. To alleviate this problem, we propose a gradient descent algorithm based on stochastic conjugate gradients that accelerates the convergence of the network's parameters. To improve the performance of the neural language model, we also propose a negative sampling algorithm based on POS (part-of-speech) tagging, which optimizes the negative sampling process and improves the quality of the final language model. A novel evaluation model is used alongside perplexity to demonstrate the performance of the improved language model. Experimental results demonstrate the effectiveness of the proposed methods.
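The abstract names its techniques without implementation detail. The sketches below are minimal illustrations under stated assumptions, not the authors' algorithms; the function names, the toy data, and the Polak-Ribiere(+) variant of the direction update are choices made for the examples.

A stochastic conjugate gradient step replaces the plain SGD direction with a mixture of the current minibatch gradient and the previous search direction, which can converge in fewer updates on well-conditioned problems:

    import numpy as np

    def stochastic_cg(grad_fn, w, batches, lr=0.05):
        """Nonlinear conjugate gradient on minibatch gradients (PR+ variant)."""
        g = grad_fn(w, batches[0])
        d = -g                                   # first step: steepest descent
        for batch in batches[1:]:
            w = w + lr * d                       # move along the conjugate direction
            g_new = grad_fn(w, batch)
            # Polak-Ribiere coefficient, clipped at zero so the method falls
            # back to steepest descent when successive gradients decorrelate.
            beta = max(0.0, g_new @ (g_new - g) / (g @ g + 1e-12))
            d = -g_new + beta * d                # new conjugate direction
            g = g_new
        return w

    # Toy usage: recover w_star from exact least-squares minibatches.
    rng = np.random.default_rng(0)
    X, w_star = rng.normal(size=(256, 4)), np.array([1.0, -2.0, 0.5, 3.0])
    y = X @ w_star
    batches = [rng.choice(256, size=32, replace=False) for _ in range(300)]
    grad = lambda w, idx: 2 * X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
    print(np.round(stochastic_cg(grad, np.zeros(4), batches), 2))  # ~ w_star

"Negative sampling based on POS tagging" likewise admits several designs; one plausible reading is to draw negatives from words sharing the target's part-of-speech tag, so the model is trained against harder, same-category contrasts. The tagged toy vocabulary below is an assumption for illustration:

    import random
    from collections import defaultdict

    # Toy (word, POS) pairs; in practice these would come from a POS tagger.
    TAGGED = [("dog", "NN"), ("cat", "NN"), ("fish", "NN"),
              ("runs", "VBZ"), ("eats", "VBZ"), ("quickly", "RB")]

    WORDS_BY_POS = defaultdict(list)
    for word, tag in TAGGED:
        WORDS_BY_POS[tag].append(word)

    def pos_negative_samples(target, tag, k=5):
        """Draw k negatives (with replacement) from the target's POS class."""
        pool = [w for w in WORDS_BY_POS[tag] if w != target]
        if not pool:                       # fall back to the full vocabulary
            pool = [w for w, _ in TAGGED if w != target]
        return [random.choice(pool) for _ in range(k)]

    print(pos_negative_samples("dog", "NN", k=3))   # e.g. ['cat', 'fish', 'cat']

Perplexity, the standard metric reported alongside the paper's own evaluation model, is the exponentiated average negative log-likelihood of held-out text:

    import math

    def perplexity(log_probs):
        """log_probs[i] = log p(w_i | context); lower perplexity is better."""
        return math.exp(-sum(log_probs) / len(log_probs))

    print(perplexity([math.log(0.25)] * 4))  # uniform over 4 choices -> 4.0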
Year: 2018
DOI: 10.1007/s00500-018-3181-2
Venue: Soft Comput.
Keywords: Language model, Conjugate gradient, Negative sampling, POS tagging, Evaluation model
Field: Conjugate gradient method, Convergence (routing), Perplexity, Gradient descent, Computer science, Probability distribution, Sampling (statistics), Artificial intelligence, Artificial neural network, Language model, Machine learning
DocType: Journal
Volume: 22
Issue: 20
ISSN: 1432-7643
Citations: 1
PageRank: 0.35
References: 32
Authors: 7

Order  Name          Citations  PageRank
1      Jin Liu       3165       0.24
2      Li Lin        442        3.07
3      Haoliang Ren  3          0.72
4      Minghao Gu    1          0.35
5      Jin Wang      2433       6.79
6      Geumran Youn  2          1.38
7      Jeong-Uk Kim  21         3.29