Open Access System for Information Sharing


Article
Cited 40 times in Web of Science; cited 56 times in Scopus

Real-time lip reading system for isolated Korean word recognition SCIE SCOPUS

Title
Real-time lip reading system for isolated Korean word recognition
Authors
Jongju Shin; Jin Lee; Kim, D
Date Issued
2011-03
Publisher
ELSEVIER SCI LTD
Abstract
This paper proposes a real-time lip reading system (consisting of a lip detector, lip tracker, lip activation detector, and word classifier) that can recognize isolated Korean words. Lip detection is performed in several stages: face detection, eye detection, mouth detection, mouth end-point detection, and active appearance model (AAM) fitting. Lip tracking is then undertaken via a novel two-stage lip tracking method, in which a model-based Lucas-Kanade feature tracker tracks the outer lip and a fast block matching algorithm then tracks the inner lip. Lip activation detection is undertaken through a neural network classifier whose input is a combination of the lip motion energy function and the first dominant shape feature. In the last step, input words are defined and recognized by three different classifiers: HMM, ANN, and K-NN. We combine the proposed lip reading system with an audio-only automatic speech recognition (ASR) system to improve word recognition performance in noisy environments. We then demonstrate the potential applicability of the combined system for use within hands-free in-vehicle navigation devices. Results from experiments undertaken on 30 isolated Korean words using the K-NN classifier at a speed of 15 fps demonstrate that the proposed lip reading system achieves a 92.67% word correct rate (WCR) for person-dependent tests and a 46.50% WCR for person-independent tests. In addition, the combined audio-visual ASR system increases the WCR from 0% to 60% in a noisy environment. (C) 2010 Elsevier Ltd. All rights reserved.
Keywords
Lip reading; Two-stage lip tracking; Word classifier; Automatic speech recognition; Audio-visual ASR; FEATURES; MODELS
URI
https://oasis.postech.ac.kr/handle/2014.oak/25757
DOI
10.1016/J.PATCOG.2010.09.011
ISSN
0031-3203
Article Type
Article
Citation
PATTERN RECOGNITION, vol. 44, no. 3, pp. 559-571, 2011-03
Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

김대진 (KIM, DAI JIN)
Dept. of Computer Science & Engineering
