Verification of Bangla Sentence Structure using N-Gram

Nur Hossain Khan

Keywords:

N-gram, sentence structure, corpus, Witten-Bell smoothing, word error

Abstract

Statistical N-gram language modeling is used in many domains, including spelling and syntactic verification, speech recognition, machine translation, and character recognition. This paper describes a system for sentence structure verification based on N-gram modeling of Bangla. An experimental corpus containing one million word tokens was used to train the system. The corpus is part of the BdNC01 corpus, created in the SIPL lab of Islamic University. Using sample texts collected from different newspapers, the system was tested on 1000 correct and 1000 incorrect sentences. It identified the structural validity of the test sentences correctly at a rate of 93%. This paper also describes the limitations of the system and possible solutions.
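
The abstract does not state the N-gram order or the decision rule, so the following is only a minimal sketch of how such a verifier might look, not the author's implementation. It assumes a bigram model, the classic Witten-Bell estimate (seen bigrams get c(h,w) / (c(h) + T(h)); the reserved mass T(h) / (c(h) + T(h)) is shared equally among unseen word types), and a hypothetical threshold on the average log-probability per bigram. The function names, the toy corpus, and the threshold value are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Collect the counts needed for Witten-Bell smoothing from tokenized text."""
    history_counts = Counter()     # c(h): how often h occurs as a history
    bigram_counts = Counter()      # c(h, w)
    followers = defaultdict(set)   # distinct word types seen after h
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
            followers[prev].add(word)
    return history_counts, bigram_counts, followers, vocab


def witten_bell_prob(prev, word, model):
    """Witten-Bell smoothed bigram probability P(word | prev)."""
    history_counts, bigram_counts, followers, vocab = model
    vocab_size = len(vocab)
    c_h = history_counts[prev]
    if c_h == 0:
        # Unseen history: fall back to a uniform distribution (an assumption).
        return 1.0 / vocab_size
    t_h = len(followers[prev])        # T(h): seen continuation types
    z_h = max(vocab_size - t_h, 1)    # Z(h): unseen types, guarded against 0
    c_hw = bigram_counts[(prev, word)]
    if c_hw > 0:
        return c_hw / (c_h + t_h)
    return t_h / (z_h * (c_h + t_h))


def sentence_score(sentence, model):
    """Average log-probability per bigram, so long sentences are not penalized."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    return sum(math.log(witten_bell_prob(p, w, model)) for p, w in pairs) / len(pairs)


def is_valid(sentence, model, threshold=-2.0):
    """Hypothetical decision rule: accept if the score clears a tuned threshold."""
    return sentence_score(sentence, model) >= threshold


# Toy corpus for illustration only; the paper trains on one million word tokens.
toy_corpus = ["আমি ভাত খাই", "আমি বই পড়ি", "সে ভাত খায়"]
model = train_bigrams(toy_corpus)
good_score = sentence_score("আমি ভাত খাই", model)   # word order seen in training
bad_score = sentence_score("খাই ভাত আমি", model)    # same words, scrambled order
```

In practice the threshold would be tuned on held-out correct and incorrect sentences, and a higher-order model would need a backoff or interpolation scheme rather than the uniform fallback used here.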

How to Cite

Verification of Bangla Sentence Structure using N-Gram. (2014). Global Journal of Computer Science and Technology, 14, 1-5. https://testing.computerresearch.org/index.php/computer/article/view/33
