Wordle - What are some good starting words?

Published on February 08, 2022 2 min read

You might have probably heard of Wordle, which is a word-guessing game. If you’ve played Wordle before, you’ll know that the most important part of solving the puzzle is starting with a good word. In this article, we will use word analysis to determine the best first word to use when solving Wordle.

Assumptions…

First of all, we won’t be looking into the source code of the Wordle. Thus we assume that we don’t know vocabulary of Wordle.

Requirements

Python3
pip install nltk
pip install english-words

Let’s do it!

1) Get all common 5-letter words

from english_words import english_words_set
from nltk import ngrams
from itertools import islice

_5_letter_words = set(w for w in english_words_set if len(w) == 5)
print("No. of 5-letter words: ",len(_5_letter_words))

No. of 5-letter words: 3210

2) Create frequency distribution of 2-grams of 3210 5-letter words

freq = {}
for word in _5_letter_words:
    _2_grams_word = ngrams(list(word), 2)
    for x in _2_grams_word:
        _2_gram = ''.join(x).lower()
        if _2_gram in freq:
            freq[_2_gram] += 1
        else:
            freq[_2_gram] = 1

3) Sort frequency distribution and get top 12 most occurring 2-grams

freq_sorted = {k: v for k, v in sorted(freq.items(), key=lambda item: item[1], reverse=True)}
freq_sorted_top_12 = {k: freq_sorted[k] for k in list(freq_sorted)[:12]}
print(freq_sorted_top_12)

{‘er’: 195, ‘an’: 177, ‘in’: 170, ‘ar’: 169, ‘al’: 168, ‘ra’: 157, ‘le’: 149, ‘st’: 149, ‘re’: 146, ‘ch’: 140, ‘la’: 136, ‘on’: 134}

4) Find some good words. The words that contains more than 2 top 12 2-grams.

good_words = []

for word in _5_letter_words:
    word = word.lower()
    matching_grams = 0
    _2_grams_word = ngrams(list(word), 2)
    for x in _2_grams_word:
        if("".join(x) in freq_sorted_top_12):
            matching_grams+=1
    if(matching_grams > 2):
        good_words.append(word)

print(good_words)

[‘lares’, ‘allan’, ‘glare’, ‘stare’, ‘blare’, ‘alarm’, ‘ranch’, ‘alert’, ‘saran’, ‘stale’, ‘larch’, ‘clare’, ‘flare’, ‘clara’]

The above list contain some good words to start with.

Link to Google Collab Notebook.

wordle
python
python3
nltk