Compare two documents by keywords

A classical comparison is done word by word. A file compare utility searches for similar blocks in two files and presents the results by highlighting the differences. This works well for similar documents. But what about documents that have nothing in common? Compare software won’t be able to find similar blocks and will highlight so many changes that the result is meaningless to the user. This is where “by keywords” comparison can help.

Suggest By Keywords comparison for dissimilar documents

  • You can switch between the “by words” and “by keywords” comparison methods at any time using the drop-down menu on the toolbar.

Switch to By Keywords comparison

Typical applications of “by keywords” compare

Let’s say you are researching a scientific topic and you have two articles: one by a NASA specialist and another by a specialist from a local observatory. These two articles have nothing in common except the topic. Comparing the two documents with the “by words” method won’t help, but a “by keywords” analysis makes sense.

Common and unique keywords in left and right file

Compare Suite will show:

  • The list and the number of the keywords in the left and in the right file;
  • The list and the number of common and unique keywords.

Word count on the info panel
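
The lists of common and unique keywords described above can be approximated with plain set operations. The following is a minimal Python sketch, not Compare Suite’s actual implementation: the function names extract_keywords and compare_keywords, and the rule that a keyword is any lowercase word of three or more letters, are assumptions made for illustration.

```python
import re


def extract_keywords(text):
    """Return the set of distinct keywords found in text.

    Assumption: a keyword is any run of three or more letters, lowercased.
    """
    return set(re.findall(r"[a-z]{3,}", text.lower()))


def compare_keywords(left_text, right_text):
    """Split the keywords of two texts into common and unique groups."""
    left = extract_keywords(left_text)
    right = extract_keywords(right_text)
    return {
        "common": left & right,       # keywords present in both files
        "left_only": left - right,    # keywords unique to the left file
        "right_only": right - left,   # keywords unique to the right file
    }


left_doc = "The solar eclipse date was confirmed by NASA."
right_doc = "Our observatory recorded each eclipse stage before the date."
result = compare_keywords(left_doc, right_doc)

print(len(result["common"]), sorted(result["common"]))
print(len(result["left_only"]), sorted(result["left_only"]))
print(len(result["right_only"]), sorted(result["right_only"]))
```

In this toy run the common list still contains “the”, which shows why a list of ignored words matters for keyword comparison.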

Here are some typical conclusions that we can make:

  • Reviewing the list of common keywords, we can get an idea of the terminology that both documents share.

Eclipse date - common keyword in both documents

  • Reviewing the list of unique keywords, we can find ideas that are covered in only one of the articles.

Document on the right has information about eclipse stages

As you can see, the “Keywords tree” doesn’t just list the keywords. If you click ‘+’ to open a group, you’ll see a short quotation from the document that shows the context in which the keyword appeared.
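
The idea of showing a short quotation around each keyword can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name keyword_contexts and the width parameter (how many characters of context to keep on each side) are hypothetical and do not describe how Compare Suite builds its tree.

```python
import re


def keyword_contexts(text, keyword, width=30):
    """Return a short quotation around each occurrence of keyword.

    width is the number of characters kept on each side of the match
    (a hypothetical parameter chosen for this sketch).
    """
    snippets = []
    # re.escape makes the keyword match literally; the search ignores case.
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        begin = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append(text[begin:end].strip())
    return snippets


article = "The total eclipse begins at noon. Each eclipse stage lasts minutes."
for quotation in keyword_contexts(article, "eclipse", width=10):
    print("...", quotation, "...")
```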

This type of comparison won’t replace a detailed compare-and-contrast analysis, but it will help you review two documents quickly.

What keywords are you interested in?

You can help Compare Suite with its task by listing some of your topics of interest in Tools > Options > My Interest. For example, if you are looking for information about “solar eclipse,” you might add the keyword “eclipse” to the “My Interests” list.

Later, in the “Keywords tree,” you can tick the “of interest only” check box, and Compare Suite will limit the list to the keywords that you want to focus on.

Display only keywords from "My interests" group

Make sure that the “Info panel” is also visible (you can turn it on in “View > Toolbar > Infopanels”). It will show which groups of interest the compared documents belong to.
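
Conceptually, the “of interest only” filter and the group display amount to intersecting a document’s keywords with the user’s interest lists. The sketch below is an assumption-laden illustration: the MY_INTERESTS dictionary, its group names, and both function names are hypothetical and are not taken from Compare Suite.

```python
# Hypothetical interest groups: group name -> keywords that signal it.
MY_INTERESTS = {
    "solar eclipse": {"eclipse", "corona"},
    "planets": {"mars", "venus"},
}


def of_interest_only(keywords, interests=MY_INTERESTS):
    """Keep only the keywords that appear in at least one interest group."""
    wanted = set().union(*interests.values())
    return set(keywords) & wanted


def matching_groups(keywords, interests=MY_INTERESTS):
    """Return the sorted names of the interest groups a document touches."""
    found = set(keywords)
    return sorted(name for name, words in interests.items() if words & found)


doc_keywords = {"eclipse", "date", "stage", "observatory"}
print(of_interest_only(doc_keywords))
print(matching_groups(doc_keywords))
```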

Ignoring words

Try the “by keywords” compare method and you’ll see that many of the keywords are not relevant. For example, when comparing English documents you might want to exclude words such as “on off of at in by to from if for but with” and so on. To do this, go to “Tools” > “Options” > “Special words.”

The list of ignored keywords
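
Excluding such words is the usual stop-word filtering step in keyword extraction. A minimal Python sketch follows, assuming a simple letters-only tokenizer; the IGNORED_WORDS set reuses the example words from the text, and the function name keywords_without_ignored is hypothetical.

```python
import re

# Ignore list built from the example words mentioned in the text.
IGNORED_WORDS = frozenset("on off of at in by to from if for but with".split())


def keywords_without_ignored(text, ignored=IGNORED_WORDS):
    """Return the words of text, in order, with ignored words removed.

    Assumption: words are runs of ASCII letters, compared in lowercase.
    """
    words = re.findall(r"[a-z]+", text.lower())
    return [word for word in words if word not in ignored]


print(keywords_without_ignored("Look at the eclipse from the observatory"))
```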

This is one way to ignore words, and it affects only the “by keywords” comparison. As we discussed before, it is also possible to add certain words that will be ignored in all comparison methods. This can be useful when some words distract your attention from the results of the comparison.

Reports

One thing that our users like is Compare Suite’s ability to calculate the “Similarity” of the documents during a comparison by keywords. The program takes the total number of keywords and compares it to the number of common keywords, which yields a similarity index for the two documents.
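
The exact formula is not given here. One plausible reading, shown purely as an assumption, is the share of common keywords among all distinct keywords of both documents (the Jaccard index). The function name similarity_index and the choice to return 0.0 for two empty keyword lists are hypothetical.

```python
def similarity_index(left_keywords, right_keywords):
    """Return common keywords divided by all distinct keywords (0.0 to 1.0).

    Assumption: "total number of keywords" means the distinct keywords of
    both documents combined; the real product may count differently.
    """
    left, right = set(left_keywords), set(right_keywords)
    total = left | right
    if not total:
        return 0.0  # assumption: two empty keyword lists score zero
    return len(left & right) / len(total)


left = {"eclipse", "date", "nasa", "solar"}
right = {"eclipse", "date", "observatory", "stage"}
# 2 common keywords out of 6 distinct keywords overall.
print(f"Similarity: {similarity_index(left, right):.0%}")
```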

When you need to present information about similarity and about common and unique keywords in the form of a report, you can generate one. As discussed before, Compare Suite provides a wide range of reports in various formats.

User opinions

We are using Compare Suite for huge data tables comparison (e.g. budgets, investment settlements, break-even calculations) while performing forensic assignments … and for keyword search in documents…

Mikolaj Rutkowski, Principal consultant, Fraudit, Warsaw, Poland

I’ve been writing resumes for clients. Often I need to compare the client’s existing resume with job descriptions looking for common keywords.

Chris Adelman, Freelancer, Phoenixville, PA

This product allows me to undertake high level review of system files that I change on a frequent basis.

J Livingstone, L. Energy, Melbourne. Development Co-ordinator.

We are interested in learning your story about using “By Keywords” comparison. Did you use it for a specific task? What results did you achieve in terms of time saved?