The manuscript provides a novel method for language identification using the texture analysis of the script. The method consists of mapping each letter from the text with certain script type. It is made according to characteristics concerning the position of the letter in the baseline area. In order to extract features, the co-occurrence matrix is computed. Then, the texture features are calculated. Extracted measures show meaningful differences due to dissimilarities in the script and language characteristics. It represents a basis in a decision-making process of the language identification. Feature classification is performed by the extension of a state-of-the-art method called genetic algorithms image clustering for document analysis. The proposed method is tested on an example of documents given in English, French, Slovenian and Serbian languages and compared to other well-known classification methods and feature representations in the state of the art. The results of experiments show the superiority of the proposed approach.
Language discrimination by texture analysis of the image corresponding to the text
Amelio A.;
2018-01-01
Abstract
The manuscript provides a novel method for language identification using the texture analysis of the script. The method consists of mapping each letter from the text with certain script type. It is made according to characteristics concerning the position of the letter in the baseline area. In order to extract features, the co-occurrence matrix is computed. Then, the texture features are calculated. Extracted measures show meaningful differences due to dissimilarities in the script and language characteristics. It represents a basis in a decision-making process of the language identification. Feature classification is performed by the extension of a state-of-the-art method called genetic algorithms image clustering for document analysis. The proposed method is tested on an example of documents given in English, French, Slovenian and Serbian languages and compared to other well-known classification methods and feature representations in the state of the art. The results of experiments show the superiority of the proposed approach.File | Dimensione | Formato | |
---|---|---|---|
Brodić2018_Article_LanguageDiscriminationByTextur.pdf
Solo gestori archivio
Descrizione: Original Article
Tipologia:
PDF editoriale
Dimensione
1.6 MB
Formato
Adobe PDF
|
1.6 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.