Jazzy is a useful open-source spell checker for Java. This post is a short tutorial on how to use it:
1. Download jazzy-core-0.5.2.jar from
http://repo1.maven.org/maven2/net/sf/jazzy/jazzy-core/0.5.2/jazzy-core-0.5.2.jar and add it as a library to your project.
2. Create a folder named dictionary containing a dictionary.txt text file. The text file holds a list of English words, one per line, such as http://www.cs.princeton.edu/introcs/data/words.utf-8.txt or any other good word list.
3. Copy the code below into JazzySpellChecker.java in a package in the project. Adjust it to your needs (for example the dictionary path), and use the spell checker to find and fix spelling errors.
package test;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.swabunga.spell.engine.SpellDictionaryHashMap;
import com.swabunga.spell.engine.Word;
import com.swabunga.spell.event.SpellCheckEvent;
import com.swabunga.spell.event.SpellCheckListener;
import com.swabunga.spell.event.SpellChecker;
import com.swabunga.spell.event.StringWordTokenizer;
import com.swabunga.spell.event.TeXWordFinder;

public class JazzySpellChecker implements SpellCheckListener {

    private SpellChecker spellChecker;
    private List<String> misspelledWords;

    // The dictionary is loaded once and shared by all instances.
    private static SpellDictionaryHashMap dictionaryHashMap;

    static {
        // Path is relative to the working directory of the program.
        File dict = new File("dictionary/dictionary.txt");
        try {
            dictionaryHashMap = new SpellDictionaryHashMap(dict);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public JazzySpellChecker() {
        misspelledWords = new ArrayList<String>();
        initialize();
    }

    private void initialize() {
        spellChecker = new SpellChecker(dictionaryHashMap);
        spellChecker.addSpellCheckListener(this);
    }

    /**
     * Get a list of misspelled words from the text.
     * @param text the text to check
     */
    public List<String> getMisspelledWords(String text) {
        StringWordTokenizer texTok = new StringWordTokenizer(text, new TeXWordFinder());
        spellChecker.checkSpelling(texTok);
        return misspelledWords;
    }

    /**
     * Correct the misspelled words in the input string and return the result.
     */
    public String getCorrectedLine(String line) {
        List<String> misSpelledWords = getMisspelledWords(line);
        for (String misSpelledWord : misSpelledWords) {
            List<String> suggestions = getSuggestions(misSpelledWord);
            if (suggestions.size() == 0) {
                continue;
            }
            String bestSuggestion = suggestions.get(0);
            line = line.replace(misSpelledWord, bestSuggestion);
        }
        return line;
    }

    /**
     * Correct the text word by word, splitting on single spaces.
     */
    public String getCorrectedText(String line) {
        StringBuilder builder = new StringBuilder();
        String[] tempWords = line.split(" ");
        for (String tempWord : tempWords) {
            if (!spellChecker.isCorrect(tempWord)) {
                @SuppressWarnings("unchecked")
                List<Word> suggestions = spellChecker.getSuggestions(tempWord, 0);
                if (suggestions.size() > 0) {
                    // Take the first suggestion as the replacement.
                    builder.append(suggestions.get(0).getWord());
                } else {
                    builder.append(tempWord);
                }
            } else {
                builder.append(tempWord);
            }
            builder.append(" ");
        }
        return builder.toString().trim();
    }

    public List<String> getSuggestions(String misspelledWord) {
        @SuppressWarnings("unchecked")
        List<Word> su99esti0ns = spellChecker.getSuggestions(misspelledWord, 0);
        List<String> suggestions = new ArrayList<String>();
        for (Word suggestion : su99esti0ns) {
            suggestions.add(suggestion.getWord());
        }
        return suggestions;
    }

    @Override
    public void spellingError(SpellCheckEvent event) {
        event.ignoreWord(true);
        misspelledWords.add(event.getInvalidWord());
    }

    public static void main(String[] args) {
        JazzySpellChecker jazzySpellChecker = new JazzySpellChecker();
        String line = jazzySpellChecker.getCorrectedLine("This is a boook");
        System.out.println(line);
    }
}
PS:
1. The "string ... string" above is caused by a bug in the syntax highlighter and can be ignored.
2. I found a bug and corrected the code on April 10th.
I want to highlight wrong words. Suggest me code.
Hi chauhan:
Check the getCorrectedText(String line) method.
"if (!spellChecker.isCorrect(tempWord))" means that the spelling of the word
is not correct.
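To make that concrete, here is a minimal sketch of highlighting, assuming you only need to wrap each misspelled word in markers. The class name HighlightSketch, the highlight method and the tiny stand-in word set are hypothetical and not part of Jazzy; with Jazzy on the classpath you would pass spellChecker::isCorrect as the predicate instead.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

class HighlightSketch {

    /**
     * Wraps every word rejected by isCorrect in the given markers.
     * Words are split on single spaces, like getCorrectedText in the post.
     */
    static String highlight(String line, Predicate<String> isCorrect,
                            String open, String close) {
        StringBuilder builder = new StringBuilder();
        boolean first = true;
        for (String word : line.split(" ")) {
            if (!first) {
                builder.append(' ');
            }
            first = false;
            if (word.isEmpty() || isCorrect.test(word)) {
                builder.append(word);
            } else {
                builder.append(open).append(word).append(close);
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Stand-in dictionary (assumption, for illustration only).
        Set<String> known = new HashSet<>(Arrays.asList("this", "is", "a", "book"));
        Predicate<String> isCorrect = w -> known.contains(w.toLowerCase());
        System.out.println(highlight("This is a boook", isCorrect, "[", "]"));
        // prints: This is a [boook]
    }
}
```

For a web page you could pass markup as the open and close markers instead of the brackets.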
It looks like this could be applied to JSP pretty easily. Can you point me in the right direction?
Hi, sorry: I am not familiar with JSP.
Can we use a database in place of the txt file?
It's a requirement of my project.
Waiting for suggestions...
Hi Anonymous, sorry again: I currently don't know much about this and have no good ideas. The simplest approach is to check spelling in your program after retrieving ResultSets from a database table. If I learn more later, I will add a new comment with further suggestions.
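Another option, sketched here under assumptions, is to keep the words in the database but export them to a word-list file at startup, so that the File-based SpellDictionaryHashMap constructor used in the post still applies. The class name DictionaryExportSketch, the writeWordList method and the query in the comment are hypothetical; the list of words stands in for whatever your own JDBC code returns.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class DictionaryExportSketch {

    /** Writes one word per line to a temporary file and returns its path. */
    static Path writeWordList(List<String> words) throws IOException {
        Path file = Files.createTempFile("dictionary", ".txt");
        Files.write(file, words, StandardCharsets.UTF_8);
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical result of a query such as "SELECT word FROM words".
        List<String> wordsFromDatabase = Arrays.asList("book", "is", "this");
        Path file = writeWordList(wordsFromDatabase);
        System.out.println("Word list written to " + file);
        // With Jazzy on the classpath, the file could then be loaded as in
        // the post: new SpellDictionaryHashMap(file.toFile())
    }
}
```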
I like your ideas about reducing costs in the health care system is too good
good post thanks for sharing..............!
Hi, I will test it to see if it works.
Hello, StringWordTokenizer only takes one String argument. How did you pass it two?
Hi Tom,
Can you help me understand why the suggestions for a misspelled word include out-of-vocabulary words? Do I need to take each suggestion and check it with "isCorrect"?
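I can't say why such suggestions show up, but filtering them afterwards is simple. The sketch below assumes the suggestions are already plain strings (as returned by getSuggestions in the post). The class name SuggestionFilterSketch and the keepKnown method are hypothetical, and the predicate stands in for spellChecker::isCorrect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class SuggestionFilterSketch {

    /** Keeps only the suggestions accepted by isCorrect, preserving order. */
    static List<String> keepKnown(List<String> suggestions,
                                  Predicate<String> isCorrect) {
        List<String> kept = new ArrayList<>();
        for (String suggestion : suggestions) {
            if (isCorrect.test(suggestion)) {
                kept.add(suggestion);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up suggestion list and vocabulary, for illustration only.
        List<String> suggestions = Arrays.asList("book", "bok", "boo");
        List<String> vocabulary = Arrays.asList("book", "boo");
        System.out.println(keepKnown(suggestions, vocabulary::contains));
        // prints: [book, boo]
    }
}
```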
It's a very good article. I am enhancing the same code in my blog. Thanks for sharing, keep it up.
I am using a Maven project and included this dependency:
<dependency>
    <groupId>net.sf.jazzy</groupId>
    <artifactId>jazzy</artifactId>
    <version>0.5.2-rtext-1.4.1</version>
</dependency>
but TeXWordFinder is not resolved. Can anyone help me?
http://alvinalexander.com/java/jwarehouse/jazzy/src/com/swabunga/spell/event/TeXWordFinder.java.shtml
Wherever String is declared, it is declared as "string" instead of "String". Could you please update your code accordingly? :)
I am getting an error. It doesn't seem to correct anything: I get the same output, "This is a boook". "boook" is not replaced with "book".
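It is hard to tell from the comment alone what went wrong. One thing worth ruling out is the word list itself: the code in the post reads dictionary/dictionary.txt relative to the working directory, and a correction can only be suggested if the right word is in that file. The check below is a minimal sketch; the class name DictionaryCheckSketch and the containsWord method are hypothetical, and it assumes one word per line in UTF-8.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

class DictionaryCheckSketch {

    /** Returns true if the file exists and has the word on a line of its own. */
    static boolean containsWord(Path wordList, String word) throws IOException {
        if (!Files.isRegularFile(wordList)) {
            return false;
        }
        List<String> lines = Files.readAllLines(wordList, StandardCharsets.UTF_8);
        for (String line : lines) {
            if (line.trim().equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Same relative path as in the post.
        Path wordList = Paths.get("dictionary/dictionary.txt");
        System.out.println("Looking in " + wordList.toAbsolutePath());
        System.out.println("Contains \"book\": " + containsWord(wordList, "book"));
    }
}
```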
line 102 error:
what is word?
it is working