Abstract | ||
---|---|---|
We examine the task of finding thematic structure in a data corpus comprising text and time series. To achieve this we introduce topic factor modelling (TFM). We develop a novel, joint generative model for both data types which resembles supervised latent Dirichlet allocation. TFM allows the decomposition of time series into factors which also reflect the thematic content of the text. We describe a variational method for inference and demonstrate its effectiveness on a synthetic corpus. For a corpus of publicly available equity data, we show that a TFM can simultaneously and robustly model both stock price time series and text data describing the corresponding companies. We also discuss how topic modelling could assist with external tasks such as robust covariance estimation. |
Year | DOI | Venue |
---|---|---|
2015 | 10.3233/IDA-150770 | Intelligent Data Analysis |
Keywords | Field | DocType |
Topic modelling, latent Dirichlet allocation, variational inference, computational finance, text mining | Latent Dirichlet allocation, Thematic structure, Computer science, Inference, Decomposition of time series, Data type, Artificial intelligence, Topic model, Factor analysis, Machine learning, Generative model | Journal |
Volume | Issue | ISSN |
19 | s1 | 1088-467X |
Citations | PageRank | References |
0 | 0.34 | 7 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Joe Staines | 1 | 2 | 1.10 |
David Barber | 2 | 404 | 45.57 |