Visual Reasoning on Complex Events in Soccer Videos Using Answer Set Programming

A. Khan; B. Lazzerini
2019-01-01

Abstract

In computer vision, most traditional action recognition techniques assign a single label to a video after analyzing it in its entirety. We believe that understanding the visual world is not limited to recognizing a specific action class or individual object instances, but also extends to how those objects interact in the scene, which implies recognizing the events that occur in it. In this paper we present an approach for identifying complex events in videos, starting from the detection of objects and simple events with a state-of-the-art object detector (YOLO). We provide a logic-based representation of events using a realization of the Event Calculus, which allows us to define complex events in terms of logical rules. The axioms of the calculus are encoded in a logic program under Answer Set semantics in order to reason about, and formulate queries over, the extracted events. The applicability of the framework is demonstrated on the scenario of recognizing different kinds of kick events in soccer videos.
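
To make the idea of defining complex events through Event Calculus rules more concrete, the following is a minimal sketch in Python. It is only an illustrative stand-in: the paper encodes its axioms as a logic program under Answer Set semantics, and nothing here reproduces that encoding. The event names (`ball_reaches_player`, `ball_leaves_player`), the fluent `in_contact`, the kick rule and the frame numbers are all assumptions made for this example, not taken from the paper.

```python
from dataclasses import dataclass


# Hypothetical simple events, as they might be derived from per-frame object
# detections; the event names and frame numbers are illustrative assumptions.
@dataclass(frozen=True)
class Event:
    name: str    # e.g. "ball_reaches_player" or "ball_leaves_player"
    player: str
    frame: int


# Event Calculus-style effect axioms: which events initiate or terminate
# the (assumed) fluent in_contact(player).
INITIATES = {"ball_reaches_player"}
TERMINATES = {"ball_leaves_player"}


def holds_at(events, player, frame):
    """True if in_contact(player) holds at `frame` by inertia: it was
    initiated strictly before `frame` and has not been terminated since."""
    state = False
    for ev in sorted(events, key=lambda e: e.frame):
        if ev.frame >= frame:
            break
        if ev.player != player:
            continue
        if ev.name in INITIATES:
            state = True
        elif ev.name in TERMINATES:
            state = False
    return state


def kicks(events):
    """Assumed complex-event rule: a kick happens at frame t if the ball
    leaves a player at t while in_contact(player) holds at t."""
    return [
        (ev.player, ev.frame)
        for ev in sorted(events, key=lambda e: e.frame)
        if ev.name == "ball_leaves_player"
        and holds_at(events, ev.player, ev.frame)
    ]


narrative = [
    Event("ball_reaches_player", "p1", 10),
    Event("ball_leaves_player", "p1", 14),
    Event("ball_leaves_player", "p2", 20),  # no prior contact: not a kick
]
print(kicks(narrative))  # [('p1', 14)]
```

In an Answer Set Programming encoding, `holds_at` and the kick rule would be written declaratively as rules and solved by an ASP solver rather than evaluated procedurally as above.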
Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1016316