The paper presents a script classification method of the medieval documents originated from the Balkan region. It consists in a multi-step procedure which includes the text mapping according to typographical features, creation of equivalent image patterns, run-length pattern analysis in order to establish a feature vector and state-of-the art classification method Genetic Algorithms Image Clustering for Document Analysis (GA-ICDA) which successfully disseminates the documents written in different scripts. The proposed method is evaluated on custom oriented document databases, which include the handprinted or printed documents written in old Cyrillic, angular and round Glagolitic, ancient Latin and Greek scripts. The experiment demonstrates very good results.
Classification of the scripts in medieval documents from Balkan region by run-length texture analysis
Amelio A.;
2015-01-01
Abstract
The paper presents a script classification method of the medieval documents originated from the Balkan region. It consists in a multi-step procedure which includes the text mapping according to typographical features, creation of equivalent image patterns, run-length pattern analysis in order to establish a feature vector and state-of-the art classification method Genetic Algorithms Image Clustering for Document Analysis (GA-ICDA) which successfully disseminates the documents written in different scripts. The proposed method is evaluated on custom oriented document databases, which include the handprinted or printed documents written in old Cyrillic, angular and round Glagolitic, ancient Latin and Greek scripts. The experiment demonstrates very good results.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.