Detecting the document focus time, defined as the time the content of a document refers to, is an important task to support temporal information retrieval systems. In this paper we propose a novel approach to focus time estimation based on a bag-of-entity representation. In particular, we are interested in understanding if and to what extent existing open data sources can be leveraged to achieve focus time estimation. We leverage state of the art Named Entity Extraction tools and exploit links to Wikipedia and DBpedia to derive temporal information relevant to entities, namely years and intervals of years. We then estimate focus time as the point in time that is more relevant to the entity set associated to a document. Our method does not rely on explicit temporal expressions in the documents, so it is therefore applicable to a general context. We tested our methodology on two datasets of historical events and evaluated it against a state of the art approach, measuring improvement in average estimation error.

A bag-of-entities approach to document focus time estimation

Morbidoni, Christian;
2017-01-01

Abstract

Detecting the document focus time, defined as the time the content of a document refers to, is an important task to support temporal information retrieval systems. In this paper we propose a novel approach to focus time estimation based on a bag-of-entity representation. In particular, we are interested in understanding if and to what extent existing open data sources can be leveraged to achieve focus time estimation. We leverage state of the art Named Entity Extraction tools and exploit links to Wikipedia and DBpedia to derive temporal information relevant to entities, namely years and intervals of years. We then estimate focus time as the point in time that is more relevant to the entity set associated to a document. Our method does not rely on explicit temporal expressions in the documents, so it is therefore applicable to a general context. We tested our methodology on two datasets of historical events and evaluated it against a state of the art approach, measuring improvement in average estimation error.
File in questo prodotto:
File Dimensione Formato  
paper-06.pdf

accesso aperto

Tipologia: PDF editoriale
Dimensione 562.4 kB
Formato Adobe PDF
562.4 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11564/740627
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 0
social impact