What the heck is up with the words in this game? We’ve definitely been hearing feedback from players that our “dictionary is broken” or that we have “bogus words.” Not to our knowledge! It’s time to talk about our dictionary, word commonness, and where these came from.
There are lots of English language dictionaries in the world. The Oxford English Dictionary (OED) is considered by many to be the premier dictionary of the English language, but there are many others. For word games, many consider the Official Scrabble Player’s Dictionary (OSPD) to be the gold standard. There’s also ENABLE, the open source word list used and distributed by games like Zynga’s Words with Friends. The word list you use determines how many words you can find. For example, here are the word counts across a number of different dictionaries/word lists:
|Dictionary / Word list|Word count|
|Oxford English Dictionary|616,500|
|Official Tournament and Club Word List (TWL06)|178,691|
|Wordament Word List|177,140|
As you can see, our word list is unique. Where did it come from? Well, we didn’t just want a list of words. We wanted a list of words that were actually used. In fact, if you haven’t noticed, look closely at the “Words not found” list in the Your Score screen at the end of each game. You will see that the sort order shows “common words” first and “obscure words” second. We did this because it always bothered us to see word games list all the words alphabetically, with “garbage words” mixed in. We wanted the first words you see in the list of words you missed to be ones you might actually know.
We did this by indexing hundreds of free English-language ebooks from the internet, breaking them into words, and comparing the words found against a reference word list. The resulting histogram helped us vet every word in our word list and told us how commonly each word appears in actual usage. Unfortunately, many free fiction and non-fiction ebooks are also old, such as the classics, and what we’ve found is that our language has “moved” a lot since the 1800s. As a result, there are a bunch of older English words that we call common (like THROES) which you may have never heard of.
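For the curious, here is a rough Python sketch of the general idea: count how often each reference word shows up in a pile of text, then sort a list of missed words so the most-used ones come first. This is not our actual pipeline. The function names, the simple letters-only tokenizer, and the tiny sample data are all made up for illustration.

```python
import re
from collections import Counter

# Assumption: a "word" is any run of ASCII letters. A real word breaker
# would need to handle apostrophes, hyphens, and so on.
WORD_RE = re.compile(r"[a-z]+")


def count_usage(texts, reference_words):
    """Build a histogram of how often each reference word appears in the texts."""
    reference = {w.lower() for w in reference_words}
    counts = Counter()
    for text in texts:
        for token in WORD_RE.findall(text.lower()):
            if token in reference:  # ignore anything not in the reference list
                counts[token] += 1
    return counts


def sort_missed_words(missed, counts):
    """Order missed words from most to least used; ties are alphabetical."""
    # Counter returns 0 for words never seen, so obscure words sink to the end.
    return sorted(missed, key=lambda w: (-counts[w.lower()], w))


# Hypothetical sample data, standing in for the ebooks and the word list.
texts = ["The throes of battle. The cat sat.", "A cat and the dog."]
reference = ["the", "cat", "throes", "qat", "dog"]

counts = count_usage(texts, reference)
ordered = sort_missed_words(["QAT", "THROES", "CAT", "DOG"], counts)
print(ordered)
```

With this sample data, CAT sorts ahead of THROES and QAT lands last, because it never appears in the text at all. Note that a word with a zero count is not necessarily bogus. It just didn’t show up in the books that were indexed.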
Now for the controversy: What should we do? Where should we go? Our philosophy on our word list is that it should contain every word in the language, regardless of how obscure. This means you will see, and have the ability to play, words like QAT (an evergreen shrub from East Africa and the Arabian Peninsula whose leaves are chewed as a stimulant) even though you may have never heard of it! This particular word is a favorite amongst Scrabble players because it is one of the easiest-to-remember Q-without-U words. My other favorites are QAID, QOPH, and TRANQ.
As we’ve been playtesting (and believe us, we’ve played a LOT of games in there with you), we’ve looked up every seemingly bogus word we found. To date, we have not found a word in our game that did not have a proper definition. We typically look up “weird words” by going to Bing.com and typing “define <word>”, such as “define QAT”. Try it. You may not like the fact that there are lots of words you didn’t know, but as far as we know, every word out there is real.
What do you think? Should we filter/censor the dictionary? Do you think we are missing words? Share your thoughts and let’s have a discussion!
“Black Snapper” – Engineer and Gamer, YouVsTheInternet