Python: Count the frequency of each word in a string (2 ways)

Updated: May 27, 2023 By: Khue

This concise article walks you through two ways to find the frequency (the number of occurrences) of each word in a string in Python.

Using the Counter class

This approach is very concise since it utilizes the Counter class from the collections module. The steps are:

  1. Import the Counter class from the collections module.
  2. Split the string into words using the split() method.
  3. Create a Counter object from the list of words and store it in a variable.

Example:

from collections import Counter

string = "blue blue red red red red green green blue yellow"
word_list = string.split()

word_frequency = Counter(word_list)

print(word_frequency)

Output:

Counter({'red': 4, 'blue': 3, 'green': 2, 'yellow': 1})

You can also iterate through the Counter object like so:

for word, count in word_frequency.items():
    print(f"{word}: {count}")

Output:

blue: 3
red: 4
green: 2
yellow: 1

If you only want the most common words and their counts, call the most_common() method on the Counter object and pass it the number of words you want to retrieve. It returns a list of (word, count) tuples, ordered from the most frequent to the least:

from collections import Counter

string = "blue blue red red red red green green blue yellow"
word_list = string.split()

word_frequency = Counter(word_list)

two_most_common = word_frequency.most_common(2)
print(two_most_common)

Output:

[('red', 4), ('blue', 3)]
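
As a small addition to the example above: when most_common() is called with no argument, it returns every word, ordered from the most frequent to the least. Here is a minimal sketch that reuses the same sample string (the variable name all_words_ranked is only illustrative):

```python
from collections import Counter

string = "blue blue red red red red green green blue yellow"
word_frequency = Counter(string.split())

# With no argument, most_common() returns all (word, count) pairs,
# sorted from the highest count to the lowest.
all_words_ranked = word_frequency.most_common()
print(all_words_ranked)
```

Output:

[('red', 4), ('blue', 3), ('green', 2), ('yellow', 1)]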

This approach is convenient and efficient, and because Counter is part of Python’s standard library, it doesn’t rely on any third-party package.
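
Keep in mind that split() only separates on whitespace, so "Red", "red" and "red," would be counted as three different words. If your text contains mixed case or punctuation, one possible way to normalize it before counting is sketched below. The sample sentence and the regular expression are assumptions made for illustration and are not part of the examples above. The pattern treats runs of lowercase letters, digits, and apostrophes as words, and it may need adjusting for your own data:

```python
import re
from collections import Counter

# Hypothetical sample text with mixed case and punctuation
text = "Red, red! Blue? blue. It's RED."

# Lowercase the text, then extract runs of letters, digits, and apostrophes
words = re.findall(r"[a-z0-9']+", text.lower())

normalized_frequency = Counter(words)
print(normalized_frequency)
```

Output:

Counter({'red': 3, 'blue': 2, "it's": 1})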

Using a dictionary

We will use a dictionary to store words and their counts. Here are the steps to follow:

  1. Create an empty dictionary.
  2. Split the string into words using the split() method.
  3. Iterate through each word and update its count in the dictionary.

Code implementation:

string = "Sling Academy ball box Sling box ball Academy hello hello hello"
word_list = string.split()

word_frequency = {}
for word in word_list:
    word_frequency[word] = word_frequency.get(word, 0) + 1

for word, count in word_frequency.items():
    print(f"{word}: {count}")

Output:

Sling: 2
Academy: 2
ball: 2
box: 2
hello: 3

This technique is simple and intuitive. It makes a single pass over the words, so it stays efficient as the string grows.
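
Unlike Counter, a plain dictionary has no most_common() method. If you need the words ranked by frequency, one option is to sort the dictionary items by count with the built-in sorted() function. Here is a minimal sketch based on the dictionary example (the name ranked is only illustrative):

```python
string = "Sling Academy ball box Sling box ball Academy hello hello hello"

word_frequency = {}
for word in string.split():
    word_frequency[word] = word_frequency.get(word, 0) + 1

# Sort (word, count) pairs by count, highest first.
# sorted() is stable, so words with equal counts keep their first-seen order.
ranked = sorted(word_frequency.items(), key=lambda item: item[1], reverse=True)

# Show the two most frequent words
print(ranked[:2])
```

Output:

[('hello', 3), ('Sling', 2)]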