Estimating the time between twitter messages and future events

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


We describe and test three methods to estimate the remaining time between a series of microtexts (tweets) and the future event they refer to via a hashtag. Our system generates hourly forecasts. A linear and a local regression-based approach are applied to map hourly clusters of tweets directly onto time-to-event. To take changes over time into account, we develop a novel time series analysis approach that first derives word frequency time series from sets of tweets and then performs local regression to predict time-to-event from nearest-neighbor time series. We train and test on a single type of event, Dutch premier league football matches. Our results indicate that in an 'early' stage, four days or more before the event, the time series analysis produces time-to-event predictions that are about one day off; closer to the event, local regression attains a similar accuracy. Local regression also outperforms both mean and median-based baselines, but on average none of the tested systems has a consistently strong performance through time.
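As a rough illustration of the idea of mapping an hourly cluster of tweets onto time-to-event, the sketch below averages the known hours-to-event of the most similar training hours. It is not the authors' implementation: the bag-of-words features, the cosine similarity, the plain nearest-neighbor averaging (used here as a simple stand-in for local regression), the function names, and the toy tweets are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def bag_of_words(tweets):
    """Merge all tweets from one hour into a single word-count vector."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tweet.lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def predict_hours_to_event(train, query, k=2):
    """Average the hours-to-event of the k training hours most similar to query.

    train is a list of (word_counts, hours_to_event) pairs (hypothetical format).
    """
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    nearest = ranked[:k]
    return sum(hours for _, hours in nearest) / len(nearest)


# Toy training data: hourly tweet clusters with known hours until the match.
train = [
    (bag_of_words(["match next week", "looking forward to next week"]), 120.0),
    (bag_of_words(["match tomorrow", "tickets for tomorrow"]), 24.0),
    (bag_of_words(["kickoff tonight", "heading to the stadium tonight"]), 3.0),
]

query = bag_of_words(["see you at the stadium tonight"])
estimate = predict_hours_to_event(train, query, k=1)
print(estimate)
```

With `k=1` the query shares words only with the third training hour, so the estimate is that hour's label. A larger `k` smooths the estimate across several similar hours at the cost of blurring early and late stages together.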

Original language: English
Title of host publication: 13th Dutch-Belgian Workshop on Information Retrieval, DIR 2013
Publication status: Published - 1 Jan 2013


