Abstract
We describe and test three methods to estimate the remaining time between a series of microtexts (tweets) and the future event they refer to via a hashtag. Our system generates hourly forecasts. A linear and a local regression-based approach are applied to map hourly clusters of tweets directly onto time-to-event. To take changes over time into account, we develop a novel time series analysis approach that first derives word frequency time series from sets of tweets and then performs local regression to predict time-to-event from nearest-neighbor time series. We train and test on a single type of event, Dutch premier league football matches. Our results indicate that in an 'early' stage, four days or more before the event, the time series analysis produces time-to-event predictions that are about one day off; closer to the event, local regression attains a similar accuracy. Local regression also outperforms both mean and median-based baselines, but on average none of the tested systems performs consistently well over time.
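As a rough illustration of mapping an hourly cluster of tweets onto time-to-event, the sketch below averages the known time-to-event of the most similar training clusters and compares it with a median baseline. It is not the system described above: plain k-nearest-neighbor averaging over cosine similarity stands in for local regression, and the function names, whitespace tokenization, and toy tweets are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt
from statistics import median


def bag_of_words(tweets):
    """Word-frequency vector for one hourly cluster of tweets (whitespace tokens)."""
    return Counter(word for tweet in tweets for word in tweet.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_time_to_event(train, query, k=3):
    """Average the hours-to-event of the k training clusters most similar to query.

    train is a list of (word-frequency vector, hours_to_event) pairs.
    """
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    nearest = ranked[:k]
    return sum(hours for _, hours in nearest) / len(nearest)


def median_baseline(train):
    """Predict the median hours-to-event seen in training, ignoring the query."""
    return median(hours for _, hours in train)


# Toy, invented training data: (hourly cluster, hours until the match).
train = [
    (bag_of_words(["tickets bought for saturday"]), 96),
    (bag_of_words(["match tomorrow cannot wait"]), 30),
    (bag_of_words(["walking to the stadium now"]), 2),
    (bag_of_words(["kick off in one hour"]), 1),
]
query = bag_of_words(["so excited match tomorrow"])

print(knn_time_to_event(train, query, k=1))
print(median_baseline(train))
```

With `k=1` the prediction is simply the time-to-event of the single most similar cluster; a larger `k` smooths the estimate at the cost of mixing in less similar hours.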
| Original language | English |
|---|---|
| Title of host publication | 13th Dutch-Belgian Workshop on Information Retrieval, DIR 2013 |
| Publication status | Published - 1 Jan 2013 |