Open Access System for Information Sharing


Article

Learning-Based Ordering Characters on Ancient Document (SCIE, SCOPUS)

Title
Learning-Based Ordering Characters on Ancient Document
Authors
Lee, Hyeonjin; Baek, Rock-Hyun; Choi, Hyun-Chul
Date Issued
2022-11
Publisher
HINDAWI LTD
Abstract
Digitizing and translating a scanned document image entails detecting the characters with a detector and then translating them, in the order they were detected, with a translator. However, the characters often cannot be translated correctly because the detector returns them in an arbitrary order. Since organizing the recognized characters is critical for proper translation, we propose ordering the characters of documents with many variations using a learning-based model that learns the necessary operations from data. Ordering is especially difficult for antique handwritten documents, whose lines may be bent or split, in contrast to official records, whose lines are laid out upright one after another. Because handling these many variants with a human-designed algorithm is problematic, we order the characters of such documents with a trainable model that learns the appropriate function from data. Our method outputs both a line id and a y-axis value and combines them to assign the sequential index. Training on local regions alone is difficult because sequential character indexes over a large range involve long-range dependencies. To solve this problem, we use a network architecture that expands the receptive field as widely as possible. Because the number and area of characters vary from document to document, the network must learn to give different indexes to characters in similar places for each document. We offer a ground-truth assignment method based on absolute position so that similar indexes are assigned to characters in similar places. Furthermore, even with the absolute ground truth, the network may assign a character to the wrong line if the center coordinates of the characters are biased in one direction. We therefore use the Region of Interest (ROI) from a pretrained coordinate layer, which contains position and trend information.
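As a rough illustration of how a line id and a y-axis value could be combined into a sequential index, the following minimal sketch sorts characters first by line id and then by y. It is not the authors' implementation: the function name `assign_sequential_index`, the input layout, and the assumptions that line ids already follow reading order and that y increases along the reading direction within a line are all hypothetical.

```python
from typing import List, Tuple


def assign_sequential_index(preds: List[Tuple[int, float]]) -> List[int]:
    """Return one sequential index per character.

    preds[i] is a hypothetical (line_id, y) pair predicted for character i.
    Assumption: a smaller line_id is read first, and within a line a
    smaller y is read first.
    """
    # Sort character positions by (line_id, y); ties keep input order.
    order = sorted(range(len(preds)), key=lambda i: (preds[i][0], preds[i][1]))
    index = [0] * len(preds)
    for rank, i in enumerate(order):
        index[i] = rank  # character i is read at position `rank`
    return index


# Four detected characters in arbitrary detector order.
chars = [(1, 40.0), (0, 120.5), (0, 15.2), (1, 10.0)]
print(assign_sequential_index(chars))
```

A lexicographic sort is only one way to combine the two outputs; the abstract does not state the exact combination rule.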
We used a modified edit distance to compare the character indexes produced by our technique with those of the ground truth. In addition, we computed a modified Fisher criterion to assess how well the lines are clustered. Our edit distance is just 0.43 times that of the human-designed algorithm, and our Fisher criterion is 1.46 times as large, an improvement over the human-designed algorithm on both measures.
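For orientation, the sketch below computes the standard Levenshtein edit distance between a predicted character order and a ground-truth order. The abstract refers to a modified edit distance whose details are not given here, so this is the unmodified textbook version, and the name `edit_distance` is illustrative only.

```python
from typing import Sequence


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Standard Levenshtein distance between two sequences of character ids."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


ground_truth = [0, 1, 2, 3]
predicted = [0, 2, 1, 3]  # two neighbouring characters swapped
print(edit_distance(predicted, ground_truth))
```

A lower distance means the predicted order is closer to the ground-truth order.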
URI
https://oasis.postech.ac.kr/handle/2014.oak/115788
DOI
10.1155/2022/3260384
ISSN
1687-5265
Article Type
Article
Citation
COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, vol. 2022, 2022-11


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

BAEK, ROCK HYUN (백록현)
Dept. of Electrical Engineering
