C++ | Spelling Checker

Hello peeps! Lord Hypersonic greets you. Today I am presenting you a spell checker program in c++. As the name suggests, this program checks the spelling of the word provided by the user. This program may be not that much use as an alone program but it can be used with the combination of other programs like if you are making a text-editor then you can add a spell checker to check the spelling of the words entered in the editor.
Now you want to know the algorithm of spell checker program, right? Get ready, the algorithm of the program is given below:-

How the program works?

1. First, the user will be asked to enter the word. To make it free from case sensitivity, it will convert all Upper-case alpha to lower case alpha.

2. After that, it will open a text file "words.txt" and check the existence of the file (words.txt is a text file which contains all words of English language), click here to download.

3. After checking the existence of the file, if the file exists then it performs step 4 and if the file is missing then it performs step 5.

4. If the file exist then perform the following steps:-
    4.1. Start a loop till reading the file.
    4.2. Check the length of inputted word and word from the file. If the length is equal then:-
            4.2.1. Check both the string, if both are same then make correct = 1, else continue with the loop.
    4.3. After the loop is over (the loop which was reading the words from file), check the value of correct.
            4.3.1. If the value of correct is 1 then spelling is correct.
            4.3.2. If the value of correct is 0 then spelling is wrong and will print possible suggestions.

Suggestions are given on the basis of 5 types of mistakes that can happen while writing or typing a word. And to solve each type of mistake, there are different functions and these functions return a value of 0 or 1. If the function returns zero, it means that the type of mistake is not is not present, if the function returns one, it means there is a probability that the type of mistake is made. Want to know what are these types of mistakes? I will explain them later with the algorithm to solve them.

            4.3.3. Call the 5 functions to show suggestions for each type of mistake.
            4.3.4. If all the functions return 0, that means there is no such kind of word which is entered by the user.
    4.4. After showing the correct spelling suggestions if correct is 0 OR if correct is 1 then shows the message, "spelling is correct", and moves to step 6.

5. If the program is not able to find the file (words.txt) then it will show a message "Not able to open words.txt" and will move to step 6.

6. Now the program ends and waits for the user to press any key to terminate the program.

Types of spelling mistakes and algorithms to solve them.

Now, I will tell you the types of spelling mistakes that are possible and algorithms to solve these mistake and show suggestions of the given incorrect spellings.
First, declare a global array lower_alpha of character type and add all the alphabets in the array. This array will be useful in solving many types of mistakes.

Type 1: Missing Character

In this type of mistake, a single character is missing from the given word. For example: if the user enters the word "stck" but the correct spelling is "stack". So in this case, "a" is missing from the spelling "stck".
To solve this type of mistake we can perform the following steps:-

1. Start a loop from i = 0 to  i = 25.

2. Increase the length of input to +1. 

3. Add lower_alpha[i] at the end of input. For example, stcka, stckb, stckc till stckz.

4. Now, check the length of input and word from the file. If the length is equal then:-
    4.1. Separate the first character of input and rest of the string i.e., s + tcka (Xinput = s and Ninput = tcka).
    4.2.  Sort the Ninput i.e., the rest of the string i.e., Ninput = ackt.
    4.3. Now combine the first character and sorted string, i.e.,Tinput = Xinput + Ninput = sackt.

5. Repeat the step 4 for the word from the file (i.e., separate first character and rest of the string, sort the string and combine first character and sorted string, Tword = Xword + Nword = s+ackt = sackt).
6. If  Tinput and Tword are equal then, print that word of the file and make found equal to 1 and break the loop.

7. return found;

Type 2: Extra Character

In this type of mistake, there is an extra character of an alphabet in the word. For example, if the user enters "staick" but the correct spelling is "stack" therefore, there is an extra alphabet "i".
For solving this type of mistake follow the given steps:-

1. First, check if the (length of input - 1) is equal to the length of a word from the file. If equal then:-

2 Start a loop from i = 1 to i < length of input.

3. Remove the character from the string input of index i. For example, if input = staick then at i = 1, input will be "saick" at i = 2, input will be "stick" at i = 3 input will be "stack" and so on.

4. After removing a single character from input (i.e., after each iteration) compare it with word from the file, if two string (input and word from file) are equal then print the word from the file and make found = 1 and break the loop. 

5. return found;

Type 3: Mixture of type 1 and type 2

In this type of mistake, there is an extra character and a missing character. For example, if the user enters "tommorow" then "m" is extra and "r" is missing in this word. 
To solve such mistakes:-

1. We have to remove each an every alphabet one by one except the first character. For example, tmmorow, tomorow, tomorow, tommrow, tommoow, tommoro.

2. After removing one character from the string, add lower_alpha[i] at the end of the string. For example, tmmorowa, tmmorowb, till tommorowz.

3. Sort the input except for the first character, for example, tammoorw.

4. Follow step 3 for word from the file.

5. Compare word from file and input. If same then print word from the file and break the loop.

6. return found;

Type 4: Exchanged Character

In this type of mistake, one alphabet is exchanged with any other alphabet. For example, if the user enters "concensus" but the correct spelling is "consensus", so first "s" in consensus is replaced with "c".
To solve this type of mistakes:-

1. First, check the length of input and word from the file. if they are equal then proceed further, otherwise move to next word of the file.

2. Start a loop from i = (length of the input - 1) to i >= 0.

3. Make input string at index "i" equal to lower_alpha[0] i.e.,
Xinput = input;
Xinput[i] = lower_alpha[0];

4. Start a loop from j = 0 to j =25.

5. Check Xinput equal to word from the file. If this condition is true then make found = 1 and print that word from the file and break the loops.

6. If the condition in step 5 is not true then make Xinput[i] = lower_alpha[j+1].

7. return found;

Type 5: Incorrect arrangement 

In this type of mistake, alphabets are not in proper order. For example, if the user enters lenght instead of length then the position of "h" and "t" are interchanged or we can say that the arrangement of both alphabets is incorrect.
To solve this problem:-

1. Check the length of the word from file and input. If lengths are equal then move to the next step otherwise move to the next word of the file.

2. Separate the first alphabet from the string and sort the string. For example, lenght will be Xinput = l and Ninput = enght. Sort Ninput, after sorting it will look like this Ninput = eghnt. Now add Xinput and Ninput (Tinput = Xinput + Ninput = l + eghnt = leghnt).

3. Do the same thing for word from the file as done in step 2 (Tword = Xword + Nword, where Xword is the first alphabet of the word and Nword is the sorted string of remaining word).

4. Compare both strings Tinput and Twords and if both are same then that word is the answer. Print the word from the file and make found = 1 and break all loops.

5. return found;



SOME OF THE BUGS REPORTED BY YOU HAVE BEEN FIXED.

Source Code:-

/*
Program: spell checker
Description: It is a spell checker program. It checks if the spelling of entered word is right or wrong. I spelling is wrong it give suggestions of correct spellings.
Author: Lord Hypersonic
Email: lordhypersonic.522@gmail.com
Website: www.lordhypersonic.blogspot.com
*/
#include <iostream>
#include <string>
#include <fstream>
#include <algorithm>
#include <ctype.h>
#include <conio.h>

using namespace std;

//lower case alphabets.
char lower_alpha[] = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'};

//function to show the correct spelling if arrangement of word is incorrect.
int incorrectArrangement(string input)
{
    string line;
    int found = 0;
    ifstream words ("words.txt");
    if (words)
    {
        while(getline(words,line))
        {
            string Xinput = input, Ninput, permutations, Tinput, Tline, Nline, Xline = line;
            int len = Xinput.size(), flen = line.size();
            if (len == flen)
            {
                for (int i=1; i<Xinput.length(); i++)
                    Ninput.push_back(input[i]);
                for (int i = 1; i < flen; i++)
                    Nline.push_back(Xline[i]);
                Xinput.resize(1);
                Xline.resize(1);
                sort(Nline.begin(),Nline.end());
                sort(Ninput.begin(), Ninput.end());
                Tinput = Xinput + Ninput;
                Tline = Xline + Nline;
                if (Tinput == Tline)
                {
                    found = 1;
                    cout<<line<<endl;
                    break;
                }
            }
        }
        words.close();
    }
    else
    {
        cout<<endl<<"Unexpected error occurred......."<<endl;
    }
    return found;
}

//function to show correct spelling if exchanged character is present in the given word
int exchangedCharacters (string input)
{
    string line, Xinput;
    int found = 0;
    ifstream words ("words.txt");
    if (words)
    {
        while (getline(words,line))
        {
            int len = input.size(), flen = line.size();
            if (len == flen)
            {
                for (int i = len-1; i >= 0 ; i--)
                {
                    Xinput = input;
                    Xinput[i] = lower_alpha[0];
                    for (int j=0; j<26; j++)
                    {
                        if (Xinput == line)
                        {
                            found = 1;
                            cout<<line<<endl;
                            break;
                        }
                        Xinput[i] = lower_alpha[j];
                    }
                    if (found == 1 ) break;
                    else continue;
                }
            }
        }
        words.close();
    }
    else cout<<"\nUnexpected error occurred"<<endl;
    return found;
}

//function to show correct spelling when there is a missing character in the given word.
int missingCharacter (string input)
{
    string Xinput, line, Tinput, Tline, Xline;
    int found = 0;
    ifstream words ("words.txt");
    if (words)
    {
        while (getline(words,line))
        {
            for (int i = 0; i < 26; i++)
            {
                int len = input.size(), flen = line.size();
                Xinput = input;
                Xline = line;
                Xinput.resize(len+1,'a');
                Xinput[len] = lower_alpha[i];
                len = Xinput.size();
                string Ninput, Nline;
                if (len == flen)
                {
                    for (int j = 1; j <=len; j++)
                        Ninput.push_back(Xinput[j]);
                    for (int j = 1; j <= flen; j++)
                        Nline.push_back(Xline[j]);
                    Xinput.resize(1);
                    Xline.resize(1);
                    sort(Nline.begin(),Nline.end());
                    sort(Ninput.begin(),Ninput.end());
                    Tinput = Xinput + Ninput;
                    Tline = Xline + Nline;
                    if (Tinput == Tline)
                    {
                        found = 1;
                        cout<<line<<endl;
                        break;
                    }
                }
                if (found == 1) break;
            }
            if (found == 1) break;
        }
        words.close();
    }
    else
    {
        cout<<"\nUnexpected error occurred\n";
    }
    return found;
}

//function to show correct spelling of there is an extra character in given word.
int extraCharacter (string input)
{
    string Xinput, line, Ninput, Tinput;
    int found = 0;
    ifstream words ("words.txt");
    if (words)
    {
        while (getline(words,line))
        {
             int len = input.size(), flen = line.size();
             if ((len-1) == flen)
             {
                 for (int i = 1; i < len; i++)
                 {
                     Xinput = input;
                     Xinput.erase(Xinput.begin()+i);
                     if (Xinput == line)
                     {
                         found = 1;
                         cout<<line<<endl;
                         break;
                     }

                 }
             }
        }
        words.close();
    }
    else
    {
        cout<<"\nUnexpected error occurred\n";
    }
    return found;
}

//function to show right spelling when given word has wrong extra character and right one is missing.
int mixedExtraMissing (string input)
{
    string Xinput, line, Xline;
    int found = 0;
    ifstream words ("words.txt");
    if (words)
    {
        while (getline(words,line))
        {
            int len = input.size(), flen = line.size();
            if (len == flen)
            {
                for (int i = 1; i < len; i++)
                {
                    for (int j = 0; j < 26; j++)
                    {
                        Xinput = input; Xline = line;
                        Xinput.erase(Xinput.begin()+i);
                        Xinput.resize(len, 'a');
                        Xinput[len-1] = lower_alpha[j];
                        sort(Xinput.begin()+1,Xinput.end());
                        sort(Xline.begin()+1,Xline.end());
                        if (Xinput == Xline)
                        {
                            found = 1;
                            cout<<line<<endl;
                            break;
                        }
                    }
                    if (found == 1) break;
                }
                if (found == 1) break;
            }
        }
        words.close();
    }
    return found;
}

int main()
{
    while (1) {
        string input,line;
        int len,flen,correct=0;
        cout<<"Enter the word: "; getline(cin,input);
        len = input.length();
        for (int i=0; i < len; i++)
            input[i] = tolower(input[i]);
        ifstream words ("words.txt");
        if(words)
        {
            while (getline(words,line))
            {
                flen = line.length();
                if (len==flen)
                {
                    if (line==input)
                    {
                        correct=1;
                    }
                    else continue;
                }
                else continue;
            }
            words.close();
            if (correct==1)
            {
                cout<<endl<<"Spelling is correct"<<endl;
            }
            if (correct==0)
            {
                int missing = 0, extra = 0, mixed = 0, incorrect = 0, exchanged = 0;
                cout<<endl<<"Spelling is wrong. Possible right spellings are given below:- "<<endl<<endl;
                missing = missingCharacter(input);
                extra = extraCharacter(input);
                mixed = mixedExtraMissing(input);
                incorrect = incorrectArrangement(input);
                exchanged = exchangedCharacters(input);
                if (missing == 0 && extra == 0 && mixed == 0 && incorrect == 0 && exchanged == 0)
                {
                    cout<<endl<<"No such word exist"<<endl;
                }
            }
        }
        else
        {
            cout<<"Not able to open words.txt"<<endl;
        }
        cout<<endl<<"Press any key to continue..."<<endl<<endl;
        _getch();
    }
    return 0;

}




1 comment:

  1. sir, how can i connect the word.txt code with the spell corrector source code
    sir, please suggest me

    ReplyDelete

Powered by Blogger.