How to find duplicate content in Word documents (Doc, Docx)?
Most of us create and update Microsoft Word documents every day. When working on a large document, say more than 50 pages, it is easy to lose track of what has already been written and to repeat the same content elsewhere in the file.
In this article, I will show you how to find the duplicate content in your document using a short VBA macro.
Steps to find duplicate content in Word documents (Doc, Docx)
Step 1: Open your Microsoft Word document and press ‘Alt + F11’. The Microsoft Visual Basic for Applications window will appear on the screen.

Step 2: Double-click ‘ThisDocument’. The Code window will open.

Step 3: Copy and paste the following code into the Code window.
Option Explicit

Sub Sample()
    Dim MyArray() As String
    Dim n As Long, i As Long
    Dim Col As New Collection
    Dim itm

    ' Copy every sentence of the Word document into an array
    ' (slot 0 stays empty; the sentences occupy slots 1 to n)
    n = ActiveDocument.Sentences.Count
    ReDim MyArray(n)
    For i = 1 To n
        MyArray(i) = Trim(ActiveDocument.Sentences(i).Text)
    Next i

    ' Sort the array so that identical sentences end up next to each other
    SortArray MyArray, 0, UBound(MyArray)

    ' Extract duplicates: a sentence counts as duplicated when the next
    ' sentence in sorted order contains it (case-insensitive)
    For i = 1 To UBound(MyArray) - 1
        If Len(MyArray(i)) > 0 And _
           InStr(1, MyArray(i + 1), MyArray(i), vbTextCompare) > 0 Then
            ' The key keeps each sentence unique in the collection;
            ' adding an existing key raises an error, which is ignored
            On Error Resume Next
            Col.Add MyArray(i), """" & MyArray(i) & """"
            On Error GoTo 0
        End If
    Next i

    ' Highlight every occurrence of each duplicated sentence
    For Each itm In Col
        Selection.Find.ClearFormatting
        Selection.HomeKey wdStory, wdMove
        Selection.Find.Execute itm
        Do Until Selection.Find.Found = False
            Selection.Range.HighlightColorIndex = wdPink
            Selection.Find.Execute
        Loop
    Next
End Sub

' Sort vArray(i..j) in place with a recursive quicksort
Public Sub SortArray(vArray As Variant, i As Long, j As Long)
    Dim tmp As Variant, tmpSwap As Variant
    Dim ii As Long, jj As Long

    ' Use the middle element as the pivot
    ii = i: jj = j: tmp = vArray((i + j) \ 2)
    While (ii <= jj)
        While (vArray(ii) < tmp And ii < j)
            ii = ii + 1
        Wend
        While (tmp < vArray(jj) And jj > i)
            jj = jj - 1
        Wend
        If (ii <= jj) Then
            tmpSwap = vArray(ii)
            vArray(ii) = vArray(jj): vArray(jj) = tmpSwap
            ii = ii + 1: jj = jj - 1
        End If
    Wend

    ' Sort the two partitions
    If (i < jj) Then SortArray vArray, i, jj
    If (ii < j) Then SortArray vArray, ii, j
End Sub
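As an optional aside, the core idea of the macro (collect the sentences, then look for repeats) can also be sketched without changing the document. The following is a minimal sketch, not part of the macro above: the name ListDuplicateSentences is made up for illustration, it assumes Word on Windows (where the Scripting.Dictionary object is available), and it only catches exact repeats, ignoring case and paragraph marks.

```vba
Sub ListDuplicateSentences()
    ' Hypothetical helper: counts how often each sentence occurs and
    ' prints the repeated ones to the Immediate window.
    Dim counts As Object
    Dim s As Range
    Dim key As String
    Dim k As Variant

    ' Assumption: Windows, where Scripting.Dictionary can be created
    Set counts = CreateObject("Scripting.Dictionary")
    counts.CompareMode = vbTextCompare   ' treat "Hello." and "hello." as equal

    For Each s In ActiveDocument.Sentences
        ' Drop paragraph marks and surrounding spaces before counting
        key = Trim(Replace(s.Text, vbCr, ""))
        If Len(key) > 0 Then counts(key) = counts(key) + 1
    Next s

    For Each k In counts.Keys
        If counts(k) > 1 Then Debug.Print counts(k) & "x: " & k
    Next k
End Sub
```

If you try it, open the Immediate window in the Visual Basic editor with ‘Ctrl + G’ to see the list. Nothing is highlighted or modified in the document.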
The Code window now contains the macro.

Step 4: Save the code by clicking the Save icon or pressing ‘Ctrl + S’.

Choose the file type “Word Macro-Enabled Document” (.docm) and then save the document.

If a dialog appears while saving, click “OK” and then proceed.
Step 5: Run the macro by pressing the ‘F5’ key or clicking the ‘Play’ button on the toolbar.
Step 6: Select the macro name ‘ThisDocument.Sample’ and click the ‘Run’ button to start it.

Step 7: That’s it. The macro has now run. Switch back to your Microsoft Word document to see the result.

The duplicated content is highlighted in pink. If you find any pink content, it means that the same content also appears somewhere else in your document.
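Once you have reviewed the duplicates, you may want to remove the pink marks again. A minimal sketch, assuming you are happy to clear all highlighting in the document body and not only the pink added by the macro; the name ClearAllHighlights is made up for illustration.

```vba
Sub ClearAllHighlights()
    ' Hypothetical helper: removes every highlight colour from the
    ' main text of the active document, including highlights that
    ' were there before the duplicate check was run.
    ActiveDocument.Content.HighlightColorIndex = wdNoHighlight
End Sub
```

Paste it below the other code in the same Code window and run it the same way as ‘Sample’. If your document uses other highlight colours that you want to keep, clear the pink passages by hand instead.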