Efficient large-scale action recognition in videos using extreme learning machines

Gül Varol, Albert Ali Salah

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

In this paper, we propose a novel and efficient system for large-scale action recognition from realistic video clips. Our approach combines several recent advances in this area. We use improved dense trajectory features in combination with Fisher vector encoding, and perform learning and classification with extreme learning machine classifiers. The resulting system is a fast and accurate alternative to more traditional action classification approaches such as bag of words with support vector machines. Additionally, we use mid-level features that encode information about the presence of humans in the videos, as well as color distributions. We extensively evaluate each step of our pipeline in a comparative manner, and report results on the recently published THUMOS 2014 benchmark, which was introduced as a challenge dataset with temporally untrimmed videos and 101 action classes. We achieve 63.37% mean average precision under the challenge protocol (i.e., sequestered test labels and a limited number of system submissions), ranking third among eleven participants. The results show that it is possible to obtain high accuracy with extreme learning machines in an efficient way, without the extensively trained and computationally heavy deep neural networks that the top-performing systems of the challenge incorporated.
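The core classifier named in the abstract, the extreme learning machine (ELM), is a single-hidden-layer network whose input weights are random and fixed; only the output weights are learned, via a closed-form regularized least-squares solve. This is what makes it fast relative to iteratively trained networks. The following is a minimal NumPy sketch of that idea, not the paper's actual implementation; function names, the tanh activation, and the regularization value are illustrative assumptions.

```python
import numpy as np

def elm_train(X, y, n_hidden=1024, reg=1e-3, seed=0):
    """Train a single-hidden-layer ELM on features X and integer labels y.

    Input weights W and biases b are drawn at random and never updated;
    only the output weights beta are fit, in closed form.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random, fixed input weights
    b = rng.normal(size=n_hidden)                # random, fixed biases
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    T = np.eye(int(y.max()) + 1)[y]              # one-hot targets for multi-class
    # Regularized least squares: beta = (H'H + reg*I)^-1 H'T
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Predict class indices for feature matrix X."""
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
```

In the paper's pipeline, X would be the Fisher-vector-encoded trajectory features; training reduces to a single linear solve, which is why the system scales to large video collections.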

Original language: English
Pages (from-to): 8274-8282
Number of pages: 9
Journal: Expert Systems with Applications
Volume: 42
Issue number: 21
DOIs
Publication status: Published - 27 Jul 2015
Externally published: Yes

Keywords

  • Action recognition
  • Extreme learning machine
  • Fisher vector
  • Multimedia mining
