How to find duplicate content in Word documents (Doc, Docx)

Most people create and update Microsoft Word documents every day. A common problem with large documents of more than 50 pages is losing track of what has already been written and repeating the same content in the same document.

This article shows how to find the duplicate content in your document with a short VBA macro.

Steps to find duplicate content in Word documents (Doc, Docx)

Step 1: Open your Microsoft Word document and press ‘Alt + F11’. The Microsoft Visual Basic for Applications window appears on the screen, as shown below:

[Screenshot: Visual Basic dialog]

Step 2: Double-click ‘ThisDocument’. The Code window opens, as shown below:

[Screenshot: ThisDocument Code dialog]

Step 3: Copy the code below and paste it into the Code window.

Option Explicit

Sub Sample()
    Dim MyArray() As String
    Dim n As Long, i As Long
    Dim Col As New Collection
    Dim itm

    ' Get all the sentences from the Word document in an array.
    ' Sentences are stored in elements 1..n; element 0 stays empty.
    n = ActiveDocument.Sentences.Count
    ReDim MyArray(n)
    For i = 1 To n
        MyArray(i) = Trim(ActiveDocument.Sentences(i).Text)
    Next

    ' Sort the array
    SortArray MyArray, 0, UBound(MyArray)

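    ' Extract duplicates: after sorting, a repeated sentence sits next to
    ' its copy, so compare each entry with the one that follows it
    For i = 1 To UBound(MyArray) - 1
        ' Skip empty entries: InStr reports a match for an empty search string
        If Len(MyArray(i)) > 0 Then
            If InStr(1, MyArray(i + 1), MyArray(i), vbTextCompare) Then
                ' The key makes Col.Add fail for a sentence already stored,
                ' so each duplicate is collected only once
                On Error Resume Next
                Col.Add MyArray(i), """" & MyArray(i) & """"
                On Error GoTo 0
            End If
        End If
    Next i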

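    ' Highlight duplicates
    For Each itm In Col
        Selection.Find.ClearFormatting
        Selection.HomeKey wdStory, wdMove
        ' Search forward only and stop at the end of the document,
        ' so the loop cannot wrap around and run forever
        Selection.Find.Execute FindText:=itm, Forward:=True, Wrap:=wdFindStop
        Do Until Selection.Find.Found = False
            Selection.Range.HighlightColorIndex = wdPink
            Selection.Find.Execute
        Loop
    Next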
End Sub

' Quicksort: sorts elements i..j of vArray in place
Public Sub SortArray(vArray As Variant, i As Long, j As Long)
  Dim tmp As Variant, tmpSwap As Variant
  Dim ii As Long, jj As Long

  ii = i: jj = j: tmp = vArray((i + j) \ 2)

  While (ii <= jj)
     While (vArray(ii) < tmp And ii < j)
        ii = ii + 1
     Wend
     While (tmp < vArray(jj) And jj > i)
        jj = jj - 1
     Wend
     If (ii <= jj) Then
        tmpSwap = vArray(ii)
        vArray(ii) = vArray(jj): vArray(jj) = tmpSwap
        ii = ii + 1: jj = jj - 1
     End If
  Wend
  If (i < jj) Then SortArray vArray, i, jj
  If (ii < j) Then SortArray vArray, ii, j
End Sub

The Code window should now look like this:

[Screenshot: Code dialog after pasting the code]

Step 4: Save the code by clicking the Save icon or pressing ‘Ctrl + S’.

[Screenshot: Save dialog]

Choose the file type “Word Macro-Enabled Document” (.docm) and then save the document.

[Screenshot: Save confirmation]

If the confirmation dialog above appears, click “OK” to proceed.

Step 5: Run the macro by pressing the ‘F5’ key or clicking the ‘Play’ button on the toolbar.

Step 6: Select the macro name ‘ThisDocument.Sample’ and click the ‘Run’ button to start it.

[Screenshot: How to run the macro]

Step 7: That’s it. The macro has now run. Switch back to your Microsoft Word document to see the result.

The result looks like this:

[Screenshot: Duplicated content highlighted]

Duplicated content is highlighted in pink. Any pink text you find therefore also appears somewhere else in your document.
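
Once you have reviewed the pink highlights, two small follow-up macros may be handy. The sketch below is not part of the macro in Step 3. The names ListDuplicateSentences, NormalizeSentence and ClearAllHighlights are made up for illustration. The first macro assumes Windows, because it creates a Scripting.Dictionary through CreateObject, and it counts only exact repeats (ignoring case), so its results can differ from the highlighted ones. The second macro removes all highlighting in the document, including any you applied yourself.

```vba
' Hypothetical helper: prints every sentence that occurs more than once,
' together with its count, to the Immediate window (Ctrl + G in the editor).
Sub ListDuplicateSentences()
    Dim counts As Object
    Dim s As Range
    Dim key As String
    Dim k As Variant

    ' Late binding, so no library reference is needed (assumes Windows)
    Set counts = CreateObject("Scripting.Dictionary")
    counts.CompareMode = vbTextCompare   ' treat upper and lower case alike

    For Each s In ActiveDocument.Sentences
        key = NormalizeSentence(s.Text)
        ' Reading a missing key yields Empty, which counts as 0 here
        If Len(key) > 0 Then counts(key) = counts(key) + 1
    Next s

    For Each k In counts.Keys
        If counts(k) > 1 Then Debug.Print counts(k) & "x: " & k
    Next k
End Sub

' Hypothetical helper: drops paragraph marks and surrounding spaces so that
' a sentence at the end of a paragraph matches the same sentence elsewhere.
Function NormalizeSentence(ByVal rawText As String) As String
    NormalizeSentence = Trim(Replace(rawText, vbCr, ""))
End Function

' Hypothetical helper: removes ALL highlighting from the document body,
' not only the pink marks added by the Sample macro.
Sub ClearAllHighlights()
    ActiveDocument.Content.HighlightColorIndex = wdNoHighlight
End Sub
```

Paste these below the existing code in the same Code window and run them the same way as in Steps 5 and 6.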
