Lightful Path – Duplicate Word Finder and more

Paste your text by pressing Ctrl+V or ⌘+V. (Once you click here, these instructions will be cleared for your convenience ;)

The slider at the top right adjusts the minimum length of words to detect. The default setting is 4, so words like 'and', 'for', 'the', etc. won't pollute the analysis. The duplicate words are listed below the slider, highlighted in various colors. The list is sorted by number of occurrences, starting with the highest. The highlighting for any individual word can be turned off by clicking on the word in the list.
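For the curious, here is a minimal Python sketch of the same idea: skip words shorter than a minimum length, then list the repeated ones with the most frequent first. It is not the code behind this page; the function name `find_duplicates` and the `min_length` parameter are made up for illustration.

```python
from collections import Counter
import string


def find_duplicates(text, min_length=4):
    """Return (word, count) pairs for repeated words, most frequent first.

    Hypothetical helper: the name and the min_length parameter are
    illustrative assumptions, not part of the tool itself.
    """
    # Lowercase, split on whitespace, and trim punctuation from word edges.
    words = [w.strip(string.punctuation) for w in text.lower().split()]
    # Ignore words shorter than the minimum length (4 by default).
    counts = Counter(w for w in words if len(w) >= min_length)
    # most_common() orders entries by count, highest first.
    return [(word, n) for word, n in counts.most_common() if n > 1]


sample = "The text is short, and the text repeats: short text and the end."
print(find_duplicates(sample))                # [('text', 3), ('short', 2)]
print(find_duplicates(sample, min_length=3))  # now 'the' and 'and' show up too
```

Lowering `min_length` to 3 lets 'the' and 'and' into the result, which is exactly the noise the default of 4 is meant to keep out.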

You can edit your text here; duplicates are detected in real time. When you're finished editing, you can copy or clear the result using the corresponding buttons. You can also turn off the spell checker if you don't need it.

~~~

I often find myself sending emails and messages consisting of two or three sentences. This is where I usually commit an unintentional word repetition, but only realize it after hitting send. This tool comes in handy for such cases, and you're free to use it for your benefit as well! :)

def find_duplicate_words(text):
    # Convert text to lowercase and split into words
    words = text.lower().split()

    # Count occurrences of each word
    word_count = {}
    for word in words:
        if word in word_count:
            word_count[word] += 1
        else:
            word_count[word] = 1

    # Identify duplicates
    duplicates = {word: count for word, count in word_count.items() if count > 1}
    return duplicates

# Sample text
text = "This is a sample text with some duplicate words. This text is just a sample."

# Find and print duplicate words
duplicates = find_duplicate_words(text)
print("Duplicate words and their counts:", duplicates)

import string

def find_duplicate_words(text):
    # Remove punctuation
    text = text.translate(str.maketrans('', '', string.punctuation))

    # Convert text to lowercase and split into words
    words = text.lower().split()

    # Count occurrences of each word
    word_count = {}
    for word in words:
        if word in word_count:
            word_count[word] += 1
        else:
            word_count[word] = 1

    # Identify duplicates
    duplicates = {word: count for word, count in word_count.items() if count > 1}
    return duplicates

# Sample text
text = "This is a sample text with some duplicate words. This text is just a sample."

# Find and print duplicate words
duplicates = find_duplicate_words(text)
print("Duplicate words and their counts:", duplicates)
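One thing to watch with stripping every character in `string.punctuation`: the apostrophe is in that set, so "don't" becomes "dont". If that matters for your text, a regular expression can pick out the words instead. The sketch below is just one possible alternative, assuming plain ASCII letters only (digits and accented letters are ignored); the function name `find_duplicate_words_regex` is made up for illustration.

```python
import re
from collections import Counter


def find_duplicate_words_regex(text):
    # Hypothetical variant: treat runs of ASCII letters, optionally joined
    # by apostrophes, as words. "don't" stays "don't", while surrounding
    # punctuation such as periods and commas is dropped.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    counts = Counter(words)
    # Keep only the words that occur more than once.
    return {word: n for word, n in counts.items() if n > 1}


text = "Don't stop. I said don't stop, so don't."
print(find_duplicate_words_regex(text))  # {"don't": 3, 'stop': 2}
```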

Basic Approach

Python Implementation

Step 1: Read the Text

Step 2: Split the Text into Words

Step 3: Count Word Occurrences

Step 4: Identify Duplicates

Complete Code Example

Advanced Features

Example: Handling Punctuation