Enhancing Script Identification in Dravidian Languages using Ensemble of Deep and Texture Features
DOI:
https://doi.org/10.71426/jcdt.v1.i1.pp50-58
Keywords:
Script languages, Data processing, Machine learning, Convolutional neural network (CNN), Histogram of oriented gradients (HOG), Support vector machine (SVM), Local binary patterns (LBP)
Abstract
Dravidian languages, including Tamil, Telugu, Kannada, and Malayalam, have complex orthographic structures, which makes script identification challenging, particularly for camera-based document images. This study proposes a hybrid approach that combines deep learning and texture-based methods for robust script recognition. The GoogLeNet convolutional neural network (CNN) model is used to extract deep features, while local binary patterns (LBP) and histogram of oriented gradients (HOG) capture texture characteristics. These features are fused and classified using a support vector machine (SVM) classifier. Results show that CNN features alone achieve 84.50% accuracy, LBP features achieve 85.90%, and HOG features achieve 76.10%, while their fusion improves accuracy to 92.10%. The combination of CNN and HOG features reaches 95.00% accuracy, demonstrating the effectiveness of integrating deep learning with texture-based approaches. This method has applications in OCR systems and assistive technologies for the visually impaired.
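The feature-level fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`lbp_histogram`, `orientation_histogram`, `fuse_features`) are hypothetical, the LBP is the basic 8-neighbour variant, the gradient descriptor is a single global orientation histogram rather than a full cell-and-block HOG, and the deep features are stood in for by a random 1024-dimensional vector, since the abstract does not say which GoogLeNet layer is used.

```python
import numpy as np


def lbp_histogram(gray):
    """Basic 8-neighbour LBP codes as a normalised 256-bin histogram."""
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    # Neighbour offsets (dy, dx), clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


def orientation_histogram(gray, bins=9):
    """Simplified HOG-style descriptor: one global, magnitude-weighted
    histogram of unsigned gradient orientations (no cells or blocks)."""
    g = np.asarray(gray, dtype=np.float64)
    gy, gx = np.gradient(g)
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=bins, range=(0.0, 180.0),
                           weights=magnitude)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def fuse_features(*vectors):
    """Feature-level fusion by concatenating the descriptors."""
    return np.concatenate([np.ravel(v) for v in vectors])


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32))  # synthetic grayscale patch
deep = rng.standard_normal(1024)             # placeholder for CNN features (assumption)
fused = fuse_features(deep, lbp_histogram(image), orientation_histogram(image))
print(fused.shape)  # (1289,) = 1024 + 256 + 9
```

In a full pipeline, one fused vector per image would then be passed to an SVM classifier (for example scikit-learn's `sklearn.svm.SVC`); that step is left out here to keep the sketch dependent on NumPy only.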
License
Copyright (c) 2025 Journal of Computing and Data Technology

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.