# Find/Count Most Common Words & Phrases In List

Jun 25, 2008

I am attempting to take a very large list of keywords, and find the most common words and phrases within them. For example, if I had a list that said:

excel formulas

excel spreadsheet formulas

excel help

excel formulas help form

formulas for excel

I would like to come away knowing that "excel" and "formulas" are common words within the list.

Currently, I believe this can be accomplished by doing the following:

1. Break down each line into all of its possible combinations of consecutive words. This would mean that the line with "excel spreadsheet formulas" would return:

excel spreadsheet formulas

excel spreadsheet

spreadsheet formulas

excel

spreadsheet

formulas

2. Once the entire list is broken down into its many parts, use the pivot table feature of Excel to determine how common each of the parts is within the entire data set.
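For illustration, the two steps above could be sketched outside of Excel roughly as follows. This is only a minimal sketch in Python, not an Excel formula or macro: the function names `sub_phrases` and `count_phrases` are made up for this example, and it assumes that words are separated by whitespace, that case should be ignored, and that only runs of adjacent words count as phrases (as in the breakdown of "excel spreadsheet formulas" above). A `Counter` stands in for the pivot table.

```python
from collections import Counter


def sub_phrases(line):
    """Return every run of adjacent words in the line, longest first."""
    words = line.split()
    n = len(words)
    return [
        " ".join(words[start:start + size])
        for size in range(n, 0, -1)          # phrase length, from n words down to 1
        for start in range(n - size + 1)     # every starting position for that length
    ]


def count_phrases(lines):
    """Tally how often each sub-phrase occurs across all lines."""
    counts = Counter()
    for line in lines:
        # Lowercasing is an assumption so "Excel" and "excel" are counted together.
        counts.update(sub_phrases(line.lower()))
    return counts


# The sample list from the post.
keywords = [
    "excel formulas",
    "excel spreadsheet formulas",
    "excel help",
    "excel formulas help form",
    "formulas for excel",
]

counts = count_phrases(keywords)
for phrase, total in counts.most_common(5):
    print(phrase, total)
```

On the sample list this counts "excel" 5 times and "formulas" 4 times, with "excel formulas" appearing twice. Note that a phrase repeated within a single line is counted once per occurrence, and that a line of n words produces n(n+1)/2 parts, so very long lines will inflate the table quickly.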

So, my questions are these:

1. Do you believe this is the best way to solve my problem? If not, what would be the preferred method?

2. If this is the best method, what function or script would I use to accomplish the first step of breaking down the lines into their individual parts?

It appears I put too many characters in the title of my post. It should read: Common Words - Decomposing Text Phrases