Misspelling

If the word is not found it is considered to be a **Misspelled word**, and an attempt may be made to offer a word that have been intended. The first spell checkers were widely available on mainframe computers in the late 1970s. A group of six linguists from Georgetown University developed the first spell-check system for the IBM corporation. Recently, research has focused on developing algorithms which are capable of recognizing a misspelled word, even if the word itself is in the vocabulary, based on the context of the surrounding words.
 * Description:** spell checkers operate on individual words by comparing each of them against the contents of a dictionary, possibly performing stemming on the word.

Typical types of misspelling: (1) Permutations or dropped letters (2) Misremembering spelling details (3) Trying to spell out pronunciation

**Algorithmic problem:** Input - a dictionary D ,a possibly misspelled word w and k a number of wanted results. Output - the most similar words w(1); : : : ;w(k) from D with respect to w.

1) One of the known algorithm is A* using heuristics, where as each combination of letter is a state and the neighbors of each combination of letters is the same combination with one change. The algorithm search for the word with the lowest Levenshtein distance that appears in the dictionary. change is defined to be: (1) dropping a letter (2) adding a letter (3) switching a letter (or Levenshtein distance =1)
 * Algorithmic Solution:**

2) Some researchers believe that large number of spelling errors are of misspelled errors with equal size to the original wanted word, in that case an algorithm can use the Hamming distance instead of the Levenshtein distance

1) Using the pattern database heuristic, every time that the user accept a specific correction, the misspelled word and the correction is written in a database, the heuristic function grade the available corrections by those who has been corrected before with the highest rate. The DB table will look like:
 * Heuristics:**


 * **MISSPELLED** || **CORRECTION** || **RATING** ||
 * nife || knife || 67% ||
 * nife || nice || 22% ||

2) Using the pattern database heuristic, every time that the user do not accept a specific correction, the misspelled word and the non-correction is written in a database,the heuristic function grade the available corrections by those who has been non-corrected before with the lowest rate.


 * See also:** Phonetic Misspelling