ARABIC LIP READING WITH LIMITED DATA USING DEEP LEARNING

Two main challenges faced by deep learning systems are the amount of available data and the complexity of the model, in terms of the number and type of layers and the number of trainable parameters. In this paper, we propose an end-to-end Arabic lip-reading system that can be trained on a limited dataset. It combines a visual model consisting of Convolutional Neural Networks (CNNs) and a temporal model consisting of Gated Recurrent Unit (GRU) layers, taking into account the balance between the size of the dataset and the number of model parameters. For this purpose, we created a limited Arabic dataset of 20 words uttered by 40 native Arabic speakers; we then exploited the redundant frames found in video sequences to train an Arabic viseme classifier separately. This classifier was later used as the pre-trained visual model in our end-to-end system to extract spatial features from videos, while the temporal model was used to process the context.
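
To make the balance between dataset size and parameter count concrete, the following is a minimal sketch that counts trainable parameters for a small CNN + GRU stack and relates the total to the number of training samples. The layer sizes, the pooling step, and the assumption of one utterance per speaker per word are illustrative and are not taken from the paper; the counting formulas themselves are the standard ones for convolutional, GRU, and fully connected layers.

```python
def conv2d_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Trainable parameters of one 2D convolution with a square kernel and bias."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + out_channels


def gru_params(input_size: int, hidden_size: int) -> int:
    """Trainable parameters of one unidirectional GRU layer.

    A GRU has three gates (reset, update, candidate). Each gate has an
    input weight matrix, a recurrent weight matrix, and one bias vector.
    Some frameworks store two bias vectors per gate, which would add
    3 * hidden_size to this count.
    """
    per_gate = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return 3 * per_gate


def linear_params(in_features: int, out_features: int) -> int:
    """Trainable parameters of one fully connected layer with bias."""
    return in_features * out_features + out_features


# Hypothetical layer sizes, chosen only for illustration.
NUM_WORDS = 20      # from the text: 20 words
NUM_SPEAKERS = 40   # from the text: 40 native Arabic speakers

visual_model = (
    conv2d_params(in_channels=1, out_channels=16, kernel_size=3)
    + conv2d_params(in_channels=16, out_channels=32, kernel_size=3)
)
# Assumption: global average pooling reduces each frame to 32 features.
temporal_model = gru_params(input_size=32, hidden_size=64)
classifier = linear_params(in_features=64, out_features=NUM_WORDS)

total_params = visual_model + temporal_model + classifier

# Assumption: each speaker utters each word once.
num_samples = NUM_WORDS * NUM_SPEAKERS
params_per_sample = total_params / num_samples

if __name__ == "__main__":
    print(f"visual model parameters:   {visual_model}")
    print(f"temporal model parameters: {temporal_model}")
    print(f"classifier parameters:     {classifier}")
    print(f"total parameters:          {total_params}")
    print(f"parameters per sample:     {params_per_sample:.1f}")
```

A count like this is only a rough guide: it shows how quickly the recurrent layer dominates the budget and why pre-training the visual model on individual frames, which are far more numerous than whole-word videos, can help when word-level samples are scarce.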

Our proposed method is evaluated on two datasets: 1) on our dataset, we obtained an accuracy of 83.02%; 2) on the Dweik et al. dataset, we obtained an improvement of ≈ 3% over the result reported in their work. In addition, we employed the viseme classifier for person identification using the viseme shape and obtained a high result.
