Learning to rank with SoftRank and Gaussian processes - Citegraph

Paper Info

Title
Learning to rank with SoftRank and Gaussian processes

Abstract
In this paper we address the issue of learning to rank for document retrieval using Thurstonian models based on sparse Gaussian processes. Thurstonian models represent each document for a given query as a probability distribution in a score space; these distributions over scores naturally give rise to distributions over document rankings. However, in general we do not have observed rankings with which to train the model; instead, each document in the training set is judged to have a particular relevance level: for example "Bad", "Fair", "Good", or "Excellent". The performance of the model is then evaluated using information retrieval (IR) metrics such as Normalised Discounted Cumulative Gain (NDCG). Recently Taylor et al. presented a method called SoftRank which allows the direct gradient optimisation of a smoothed version of NDCG using a Thurstonian model. In this approach, document scores are represented by the outputs of a neural network, and score distributions are created artificially by adding random noise to the scores. The SoftRank mechanism is a general one; it can be applied to different IR metrics, and make use of different underlying models. In this paper we extend the SoftRank framework to make use of the score uncertainties which are naturally provided by a Gaussian process (GP), which is a probabilistic non-linear regression model. We further develop the model by using sparse Gaussian process techniques, which give improved performance and efficiency, and show competitive results against baseline methods when tested on the publicly available LETOR OHSUMED data set. We also explore how the available uncertainty information can be used in prediction and how it affects model performance.
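The mechanism the abstract describes, turning per-document score distributions into rank distributions and then into a smoothed NDCG, can be illustrated with a minimal sketch. It assumes independent Gaussian scores with strictly positive variances, integer relevance labels, the common gain 2^label - 1 and a log2 rank discount; the function names are hypothetical and this is not the authors' implementation. In the GP setting, the means and variances would be taken from the model's predictive distribution.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rank_distributions(means, variances):
    """For each document, return P(rank = r) for r = 0..n-1 (0 is best).

    Assumes independent Gaussian scores with positive variances.
    """
    n = len(means)
    dists = []
    for j in range(n):
        # With no competitors, document j sits at rank 0 with certainty.
        p = [1.0] + [0.0] * (n - 1)
        for i in range(n):
            if i == j:
                continue
            # Probability that document i scores higher than document j.
            denom = math.sqrt(variances[i] + variances[j])
            beat = normal_cdf((means[i] - means[j]) / denom)
            # Adding competitor i either pushes j down one rank or leaves it.
            new_p = [0.0] * n
            for r in range(n):
                pushed = p[r - 1] * beat if r > 0 else 0.0
                new_p[r] = pushed + p[r] * (1.0 - beat)
            p = new_p
        dists.append(p)
    return dists


def soft_ndcg(means, variances, labels):
    """Expected NDCG under the rank distributions (a smoothed NDCG)."""
    n = len(means)
    gains = [2 ** label - 1 for label in labels]          # assumed gain function
    discounts = [1.0 / math.log2(r + 2) for r in range(n)]  # assumed discount
    ideal = sum(g * d for g, d in zip(sorted(gains, reverse=True), discounts))
    if ideal == 0:
        return 0.0
    dists = rank_distributions(means, variances)
    expected = sum(
        gains[j] * sum(discounts[r] * dists[j][r] for r in range(n))
        for j in range(n)
    )
    return expected / ideal
```

Because every step is a smooth function of the score means and variances, the resulting quantity can be optimised by gradient methods, which is the property the abstract attributes to SoftRank. As the variances shrink towards zero, the value approaches the ordinary NDCG of the ranking obtained by sorting on the means.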

Year	DOI	Venue
2008	10.1145/1390334.1390380	SIGIR
Keywords	Field	DocType
probabilistic non-linear regression model,gaussian process,document score,thurstonian model,model performance,different underlying model,document retrieval,document ranking,softrank framework,softrank mechanism,cumulant,non linear regression,probability distribution,neural network,ranking,learning to rank,information retrieval	Learning to rank,Data mining,Computer science,Probability distribution,Artificial intelligence,Gaussian process,Document retrieval,Probabilistic logic,Thurstonian model,Ranking,Information retrieval,Machine learning,Discounted cumulative gain	Conference
Citations	PageRank	References
28	1.12	13
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
John Guiver	1	482	21.48
Edward Snelson	2	610	41.42