Abstract:
This paper presents a Bi-gram model based approach to the automatic proofreading of non-word errors
in Kazakh texts. Non-word errors are detected with a syllable-based Bi-gram model, that is, by checking
the positions of syllables within each word and the Bi-gram co-occurrence probabilities of adjacent
syllables. The detected errors are then corrected by using the minimum edit distance algorithm and the
Viterbi algorithm to generate candidate words and select the best one. The experimental results show
that Bi-gram model based automatic proofreading is feasible.
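
The detection and candidate-generation steps can be illustrated with a minimal sketch. It is not the authors' implementation: the four-word lexicon, its hand-made syllable splits, the boundary markers `<w>` and `</w>`, the zero-probability threshold, and all function names are assumptions made for illustration. Syllabification is taken as given, and the Viterbi-based selection of the best candidate is omitted.

```python
from collections import Counter

# Toy lexicon, hand-split into syllables (illustrative assumption, not the paper's data).
LEXICON = {
    "бала": ["ба", "ла"],
    "қала": ["қа", "ла"],
    "мектеп": ["мек", "теп"],
    "кітап": ["кі", "тап"],
}

# Hypothetical word-boundary markers, so a bigram also encodes syllable position.
BOS, EOS = "<w>", "</w>"


def train_bigrams(lexicon):
    """Count syllable bigrams and their left-hand syllables over the lexicon."""
    pair_counts, left_counts = Counter(), Counter()
    for syllables in lexicon.values():
        seq = [BOS, *syllables, EOS]
        for left, right in zip(seq, seq[1:]):
            pair_counts[(left, right)] += 1
            left_counts[left] += 1
    return pair_counts, left_counts


def bigram_prob(pair_counts, left_counts, left, right):
    """Maximum-likelihood estimate of P(right | left); 0.0 for an unseen left syllable."""
    if left_counts[left] == 0:
        return 0.0
    return pair_counts[(left, right)] / left_counts[left]


def is_non_word(syllables, pair_counts, left_counts, threshold=0.0):
    """Flag a word if any adjacent syllable pair has probability <= threshold."""
    seq = [BOS, *syllables, EOS]
    return any(
        bigram_prob(pair_counts, left_counts, left, right) <= threshold
        for left, right in zip(seq, seq[1:])
    )


def edit_distance(a, b):
    """Levenshtein distance (insert, delete, substitute; unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def candidates(word, lexicon, max_distance=1):
    """Lexicon words within max_distance edits of word, closest first."""
    scored = sorted((edit_distance(word, entry), entry) for entry in lexicon)
    return [entry for dist, entry in scored if dist <= max_distance]
```

With this toy lexicon, `is_non_word(["ла", "ба"], *train_bigrams(LEXICON))` is true because "ла" never starts a word in the training data, and `candidates("бада", LEXICON)` returns `["бала"]`. A real system would need smoothing and a tuned threshold, since an unsmoothed model flags every unseen but valid syllable pair.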