TUB-IRML at MediaEval 2014 Violent Scenes Detection Task: Violence Modeling through Feature Space Partitioning

Abstract

This paper describes the participation of the TUB-IRML group in the MediaEval 2014 Violent Scenes Detection (VSD) affect task. We employ low- and mid-level audio-visual features fused at the decision level. We partition the feature space of the training samples through k-means clustering and train a separate model for each cluster. These models are then used to predict the violence level of videos by employing two-class support vector machines (SVMs) and a classifier selection approach. The experimental results obtained on Hollywood movies and short Web videos show the superiority of mid-level audio features over visual features in terms of discriminative power, and a further performance gain from fusing audio-visual cues at the decision level. Finally, the results also demonstrate that partitioning the feature space and training multiple models outperforms a single violence detection model.
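
As a rough illustration of the pipeline described in the abstract, the sketch below partitions training features with k-means, fits one two-class SVM per cluster, and at prediction time selects the classifier belonging to the nearest cluster centroid. This is a minimal sketch, not the authors' code: the cluster count, RBF kernel, stand-in features, and helper names are assumptions, and the decision-level fusion of separate audio and visual scores is omitted.

# Minimal sketch (assumptions noted above), not the authors' implementation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def train_partitioned_models(X, y, n_clusters=4, random_state=0):
    """Partition training samples with k-means and fit one two-class SVM per cluster."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit(X)
    models = {}
    for c in range(n_clusters):
        mask = kmeans.labels_ == c
        if len(np.unique(y[mask])) < 2:
            # Degenerate cluster with a single class: no SVM can be trained here.
            models[c] = None
            continue
        models[c] = SVC(kernel="rbf", probability=True).fit(X[mask], y[mask])
    return kmeans, models

def predict_violence(kmeans, models, X):
    """Classifier selection: score each sample with the SVM of its nearest centroid."""
    clusters = kmeans.predict(X)
    scores = np.zeros(len(X))
    for i, (x, c) in enumerate(zip(X, clusters)):
        model = models[c]
        # Fall back to a neutral score for degenerate clusters.
        scores[i] = 0.5 if model is None else model.predict_proba([x])[0, 1]
    return scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(200, 16))        # stand-in audio-visual feature vectors
    y_train = (X_train[:, 0] > 0).astype(int)   # stand-in violence labels
    kmeans, models = train_partitioned_models(X_train, y_train)
    print(predict_violence(kmeans, models, rng.normal(size=(5, 16))))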

@inproceedings{acar2014tub,
  title={TUB-IRML at MediaEval 2014 Violent Scenes Detection Task: Violence Modeling through Feature Space Partitioning},
  author={Acar, Esra and Albayrak, Sahin},
  booktitle={MediaEval},
  issn={1613-0073},
  urn={nbn:de:0074-1263-7},
  year={2014}
}
Authors:
Esra Acar Celik, Sahin Albayrak
Category:
Conference Paper
Year:
2014
Location:
MediaEval 2014 Workshop