I have over 400 thousand rows in my data-set and I'm trying to find duplicates using a countif or similar formula. Let's say I have the following:
Blue
Yellow
Green
Blue
Blue
Yellow
Pink
I need the formula to increment the count. So for example, the first time Blue appears, it equals 1, the second it appears, it equals 2, third time is 3 and so on.
And, I need it to actually work with 400 thousand rows. I have tried using the following formulas without success as Excel keeps crashing:
That second one equals "2" if there's more than 1, which will work for me as well. Better would be if it increments though.
The purpose of having a count is so that I can pivot the data, then filter out all the "1"s and report on it. Therefore I can't use conditional formatting or simply use Excel's Remove Duplicates feature. I need the same data-set for other pivots and charts.
I have an Excel file which contains two sheets. In one sheet there are some numbers in a single column. In the other sheet there are numbers spanning several rows and columns (a number may be duplicated). I need to check whether any numbers exist in both sheets. If such a number is found, it should be marked (say, with a color) in both sheets.
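A running-count formula whose range grows on every row does work that scales with the square of the row count, which may be why 400 thousand rows is too much. If the column can be exported, a single pass with a dictionary gives the same incrementing count. This is only a sketch outside Excel; the function name and the sample list are illustrative.

```python
from collections import defaultdict

def running_counts(values):
    """Return, for each item, how many times it has appeared so far (1-based)."""
    seen = defaultdict(int)
    counts = []
    for value in values:
        seen[value] += 1          # one dictionary update per row
        counts.append(seen[value])
    return counts

colours = ["Blue", "Yellow", "Green", "Blue", "Blue", "Yellow", "Pink"]
print(running_counts(colours))  # [1, 1, 1, 2, 3, 2, 1]
```

Rows whose count is 1 are first occurrences; anything above 1 is a repeat.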
I've searched around the web for AverageIf solutions and found the CSE formulas. However I've come across a big problem I can't resolve.
I have two columns of data. Column A has _similar_ text and column B has data.
The problem is the following: AverageIf does not support finding similar text. This means that a formula like: =AVERAGE(IF(A2:A400="Info/German*";F2:F200;FALSE)) does not work.
Column A contains words like Info/German, Info/German2, Info/Italian etc. But I only need to find the average of cells that begin with "Info/German".
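Part of the trouble is that an `=` comparison inside IF treats the `*` as a literal character rather than a wildcard, and the two ranges in the formula above are different lengths. The intended "begins with" logic can be sketched in Python as follows; the function name and the sample numbers are made up for illustration.

```python
def average_with_prefix(labels, values, prefix):
    """Average the values whose label starts with the given prefix."""
    matched = [v for label, v in zip(labels, values) if label.startswith(prefix)]
    if not matched:
        return None  # nothing matched, so there is no average to report
    return sum(matched) / len(matched)

labels = ["Info/German", "Info/German2", "Info/Italian"]
values = [10, 20, 90]
print(average_with_prefix(labels, values, "Info/German"))  # 15.0
```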
Magazine subscription list. How can we highlight customers that are already in the sheet if we enter them again (renewal)? Our list is like so:
ColA   ColB   ColC     ColD   ColE   ColF
First  Last   123 Ave  City   State  Zip
Is there a way to highlight the row if the info in ColA, ColB, ColE, and ColF all match? Sometimes the street info is abbreviated or entered as PO Box instead of P.O. Box, and they wind up on the list a second time.
I have entered the following formula to add up a list of data. =COUNTIF(B7:B100,"*") This doesn't include gaps with blank cells and only gives me the total number of cells that contain text. However, some of the text is duplicated and I only want to count the total number of unique entries in the list.
I want to count column B for all "West" (column A) and I don't want duplicates. So it would count two unique characters for West and two unique characters for South. I want "west" and "south" separated.
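One way to picture the per-region unique count is to collect column B values into a set for each column A value. A minimal sketch, assuming the rows are available as (region, value) pairs; the sample letters are invented since the original data was not posted.

```python
from collections import defaultdict

def unique_counts_by_group(rows):
    """Count distinct values per group from (group, value) pairs."""
    groups = defaultdict(set)
    for group, value in rows:
        groups[group].add(value)  # a set silently ignores repeats
    return {group: len(values) for group, values in groups.items()}

rows = [("West", "A"), ("West", "B"), ("West", "A"), ("South", "C"), ("South", "D")]
print(unique_counts_by_group(rows))  # {'West': 2, 'South': 2}
```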
I have data that has about 10 duplicate values (UTC Time) in one column and another column with number values (depth ft) that vary. I need to obtain the maximum (highest) value in the depth column for each time and remove the other duplicates to filter out the low values. So for the data example below, for UTC 15:56:28 I only want the 5.7 row, for 15:56:29 I want the 5.3 row, and so on. I can attach the sheet. This is a huge dataset, so manual filtering won't work. The data is from a sonar that gives 10 depth readings per second, and I only need the one depth that is the highest value.
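The "keep only the deepest reading per second" step can be sketched as a single pass that remembers the best depth seen for each timestamp. This assumes the readings are exported as (time, depth) pairs; the 5.7 and 5.3 values come from the description, the lower readings are invented.

```python
def max_depth_per_time(readings):
    """Keep the largest depth for each timestamp, in first-seen order."""
    best = {}
    for time, depth in readings:
        if time not in best or depth > best[time]:
            best[time] = depth
    return best

readings = [
    ("15:56:28", 5.1), ("15:56:28", 5.7),  # 5.1 is a made-up lower reading
    ("15:56:29", 5.3), ("15:56:29", 4.9),  # 4.9 is a made-up lower reading
]
print(max_depth_per_time(readings))  # {'15:56:28': 5.7, '15:56:29': 5.3}
```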
I have a workbook that I want to find if I have any duplicate numbers in a specific area.
The area of cells that I am checking is C3 through AO70
I am checking for numbers between 95 and 800. These are all ID#s of individuals, and not all the numbers between 95 and 800 are used, e.g. 97 through 100 are not used.
I have already written a macro that does something else and I can use it to check each number as it comes up. However, once the number comes up I don't know how to use it to check the area.
If I can check all the area at one time to find duplicates it would be easier.
I do not know how to do either way but I can adapt my macro to whatever way is possible.
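Checking the whole area at once amounts to counting every in-range number in the block and reporting the ones seen more than once. A rough sketch, assuming the C3:AO70 block has been read into a list of rows; the function name and the tiny sample grid are hypothetical.

```python
from collections import Counter

def duplicate_ids(grid, low=95, high=800):
    """Return the in-range integers that appear more than once in the grid."""
    counts = Counter(
        cell
        for row in grid
        for cell in row
        if isinstance(cell, int) and low <= cell <= high  # skip blanks and text
    )
    return sorted(value for value, n in counts.items() if n > 1)

grid = [[95, 120, "x"], [120, 801, 801], [None, 95, 50]]
print(duplicate_ids(grid))  # [95, 120]
```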
I have two lists of product data, one for buyers and one for sellers (these are listed as A, B, C). The product names are not exactly the same (e.g. Playstation and playstation three should be matched). I would have thought using the FuzzyLogic add-in to match these would be the way forward. I need to rank the sellers by how many of their items appear on the buyers list.
Just wondering if anyone has a macro or formula that would allow me to find out (and possibly highlight) when any value in column A is equal to any value in column B. I'm dealing with about 2000 rows, so it's almost impossible to complete manually.
Sorry the heading is supposed to read need help finding duplicates between 2 COLUMNS
way to wrap or format anything in this post. I don't think the text I put here is code, but I want to be sure, after receiving a moderator's infraction for failing to properly wrap code in a previous post.
Now:
I have a worksheet in which the first column is a list of account #'s and the following columns are specifics of transactions or interactions.
Let's say it is a movie rental customer list that lists each rental, and column "A" is the customer number, column "B" declares if it was returned late.
I need to compile a list of "all" rentals by customers who have "EVER" had a "late return" or a "YES" in column "B".
I need to find all account records/rows of accounts that at any time had a "YES" in column "B", even if some or many of that customer's rentals/entries have "NO" in column "B".
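The logic here is two passes: first collect every customer number that has a "YES" anywhere, then keep all rows belonging to those customers. A minimal sketch, assuming rows of (customer number, late flag); the sample customers are invented.

```python
def rentals_of_late_customers(rows):
    """Return every row for customers who have at least one 'YES' row."""
    late_customers = {
        customer for customer, late in rows if late.strip().upper() == "YES"
    }
    return [row for row in rows if row[0] in late_customers]

rows = [(101, "NO"), (102, "NO"), (101, "YES"), (103, "NO"), (101, "NO")]
print(rentals_of_late_customers(rows))
# [(101, 'NO'), (101, 'YES'), (101, 'NO')]
```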
I'm trying to condense my email lists in order to stop people receiving the same email having signed up to several lists. How do I compare 5 different columns to find email addresses which appear in more than one...
I have to scrub files of 20,000 phone numbers against a file of several million phone numbers on the national do not call list.
On sheet one I have all 20,000 phone numbers and then on sheet two, in 5 columns, I have roughly 2 million phone numbers. I need to know if any of the 20,000 phone numbers are in the 2 million on sheet two.
Right now I am simply using a vlookup formula but it is taking a very long time to update all of the fields.
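For this size of scrub, one option outside Excel is to load the do-not-call numbers into a set once, so each of the 20,000 checks is a constant-time lookup instead of a scan. A sketch under the assumption that both sheets can be exported as lists of strings in the same format; the names and sample numbers are illustrative.

```python
def flag_do_not_call(numbers, dnc_columns):
    """Return True/False for each number: is it anywhere in the DNC columns?"""
    dnc = set()
    for column in dnc_columns:  # the five columns on sheet two
        dnc.update(column)
    return [number in dnc for number in numbers]

numbers = ["5550001", "5550002"]
dnc_columns = [["5550002", "5550003"], ["5550004"]]
print(flag_do_not_call(numbers, dnc_columns))  # [False, True]
```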
I have a spreadsheet with 7000 lines exported from a database. I'm looking for lines that exist with an @@2 that don't have a corresponding @@1. Let me explain.
SV10000000@@1
SV10000000@@2
SV10101000@@2
I want to keep the first two lines because there is an @@1 associated with an @@2. I'm looking to single out and delete lines that have @@2 that don't have a corresponding @@1 associated with it.
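The rule described above can be sketched by first collecting the prefixes that have an @@1 line, then listing the @@2 lines whose prefix is not among them. The function name is hypothetical; the sample codes are the ones from the post.

```python
def orphaned_second_lines(codes):
    """Return the '@@2' codes that have no matching '@@1' code."""
    with_first = {code[:-3] for code in codes if code.endswith("@@1")}
    return [
        code for code in codes
        if code.endswith("@@2") and code[:-3] not in with_first
    ]

codes = ["SV10000000@@1", "SV10000000@@2", "SV10101000@@2"]
print(orphaned_second_lines(codes))  # ['SV10101000@@2']
```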
I have a project where I need items for different boxes.
I have 20 boxes that need the same amount of items. However, when I came towards the end, I ran out of items. For example:
BOX A IS MISSING ITEM 1 AND 2
BOX B IS MISSING ITEM 2 AND 5
BOX C IS MISSING ITEM 1 AND 5
I have all the items that are missing per box in a spreadsheet. Here comes the main question: how do I program my spreadsheet to find the items that are missing in each box and summarize them in another sheet?
The summary I am looking for is....
ITEM 1 - 5 (MISSING)
ITEM 2 - 9 (MISSING)
and so on...
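The summary boils down to counting, for each item, how many boxes list it as missing. A minimal sketch, assuming the sheet is read into a mapping from box name to missing item numbers; only the three example boxes are used, so the totals differ from the 5 and 9 in the target summary.

```python
from collections import Counter

def missing_item_totals(missing_by_box):
    """Count how many boxes are missing each item."""
    totals = Counter()
    for items in missing_by_box.values():
        totals.update(set(items))  # count an item once per box
    return dict(sorted(totals.items()))

missing_by_box = {"A": [1, 2], "B": [2, 5], "C": [1, 5]}
for item, count in missing_item_totals(missing_by_box).items():
    print(f"ITEM {item} - {count} (MISSING)")
```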
I started doing the code, but I haven't got too far.
The first list contains site numbers of people who haven't responded to me.
The second list is the master list of site numbers along with a column showing the date they responded.
Now, a site number is built like this:
123456/0001
123456/0002
So it is possible for the same 6 digits to appear more than once in the master list.
What I need to do is to compare the first 6 digits in the non-responder list against the master list, because some sites, like the example above, may have more than one tag ('0001', '0002'), and so if they have responded to me from site '0002' I don't want to spam their other sites with emails.
I've tried using match and various formulas I've found from google etc, but nothing seems to work!
The goal of this is to get a list of non respondents that have not responded from any of their sites listed in the master list.
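The comparison on the first 6 digits can be sketched like this: gather the prefixes of every master-list site that has a response date, then drop any non-responder sharing one of those prefixes. It assumes the master list is available as (site number, date or empty) pairs; the date and the second site number are invented.

```python
def sites_still_silent(non_responders, master):
    """Keep non-responder sites whose 6-digit prefix never responded."""
    responded = {site.split("/")[0] for site, date in master if date}
    return [site for site in non_responders if site.split("/")[0] not in responded]

master = [
    ("123456/0001", None),
    ("123456/0002", "2020-01-05"),  # hypothetical response date
    ("654321/0001", None),
]
print(sites_still_silent(["123456/0001", "654321/0001"], master))
# ['654321/0001']
```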
I have a database with ~18000 rows and 29 columns. I would like to filter the data by duplicates in one column, based on total, but keep the remaining data in the row. For example: I have account numbers listed in one column, often duplicates. I can get the total in a pivot table no problem, but need the other data associated with that account. I do not need to see all accounts, only duplicates for accounts listed say greater than 5 times. The data in columns B+ are important.
I have a HUGE vlookup I created to paste in the pivot table data (account numbers and totals) and run a look-up based on those numbers, but I see it running into problems once there are 4k+ look-ups.
I want to see accounts listed only 5+ times, include that total (as in a pivot table) and the remaining 28 columns. I have tried to run this in a pivot table completely, but still too much data to process (plus all the subtotals that I have to keep removing).
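One way to get the total and keep the other columns is to count the account numbers first, then filter the full rows by that count and append it. A sketch assuming the rows are exported as tuples with the account number in the first position; the threshold is a parameter because the post mentions both "greater than 5" and "5+".

```python
from collections import Counter

def rows_for_frequent_accounts(rows, min_count=5, key_index=0):
    """Keep full rows for accounts seen at least min_count times, adding the count."""
    counts = Counter(row[key_index] for row in rows)
    return [
        tuple(row) + (counts[row[key_index]],)  # append the total as a new column
        for row in rows
        if counts[row[key_index]] >= min_count
    ]

rows = [("ACC1", i) for i in range(5)] + [("ACC2", 0)]
for row in rows_for_frequent_accounts(rows):
    print(row)
```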
I have a list of text values in column X. I need to come up with a formula in column Y.
X Y (RESULTS OF REQUIRED FORMULA) Comment 1 HAT 1 First HAT in column 2
[Code]....
I can't play about with the natural order of the spreadsheet, so there's no chance I can re-sort the data into column X and (easily) identify the duplicates that way. So, it could be that the duplicated value(s) will appear in any cell within that column.
I need to identify whether the item is a duplicate in the unsorted list. Ideally, the first entry of a set of duplicates will be given 1, then the subsequent duplicates themselves given a 0 (zero). It's to subsequently do some counts on.
I guess that as long as the one of the entries in the duplicates is marked with a 1, while the others are 0 (zero), that's all that's important.
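The 1-for-first, 0-for-repeat marking works on an unsorted list if you simply remember what has already been seen. A minimal sketch; the function name and the sample words other than HAT are made up.

```python
def first_occurrence_flags(values):
    """Return 1 for the first time a value appears and 0 for every repeat."""
    seen = set()
    flags = []
    for value in values:
        flags.append(0 if value in seen else 1)
        seen.add(value)
    return flags

print(first_occurrence_flags(["HAT", "COAT", "HAT", "SCARF", "COAT"]))
# [1, 1, 0, 1, 0]
```

Summing the flags gives the number of distinct values.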
I have a roster for a large group in excel and would like to have an easy way to highlight if there are duplicate entries in the roster as we are merging multiple smaller lists together.
I have duplicate cell entries occurring. I have a column of about 8000 entries (Column B) and would like to have a cell at the top of my spreadsheet that displays where the first duplicate resides (row number will suffice).
At present I have a conditional format on duplicates, but it is a big task to scroll down through all the data looking for them.
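Finding where the first duplicate sits is a matter of walking down the column and stopping at the first value already seen. A sketch assuming the column is read into a list; `first_row` is a hypothetical parameter for the sheet row where the data starts.

```python
def first_duplicate_row(values, first_row=1):
    """Return the sheet row of the first repeated value, or None if all are unique."""
    seen = set()
    for offset, value in enumerate(values):
        if value in seen:
            return first_row + offset
        seen.add(value)
    return None

# Data starting in row 2: "b" repeats on the fourth entry, i.e. sheet row 5.
print(first_duplicate_row(["a", "b", "c", "b", "a"], first_row=2))  # 5
```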
On Sheet 1 I have a list of employee names (John, Bob, Ross etc...)in column A and in column B I have a list of employee bonus points (1, 5, 3 etc...). On Sheet 2 I have the same setup but the list of employees on sheet 2 is a lot longer than the ones on sheet 1, all employee names are on sheet 2.
I need a macro that will go down each name in the list on sheet 1, column A, copy the employee's bonus points, then go to sheet 2, find that employee's name in the list, and paste the bonus points in column B. This must be done until the last name on sheet 1 is found and all points are copied to their corresponding names on sheet 2.
I am using a spreadsheet as a score sheet for a competition. One of the columns is the student's GPA. After entering all the scores there are duplicate final scores. I need a way to have it look at the final score and then use the GPA so that it will not put a duplicate value in the final column.
Column N is the Total Column, Column O has the Names that correspond to the Total Column. Currently I am taking this total and putting it into Column Q (High Scores) in high to low order. Column R should have the names that match the scores. But with duplicate scores, it is only putting the first name associated with the score. I would like to use the GPA as a final determining factor for the duplicate scores. The higher GPA would come before a lower GPA. I have tried to put an additional column to bring the GPA over to correspond with the High Scores Column, but could not get it to work. There are actually more names for the competition, and the top 10 will be moved to a different sheet and further judged. I have attached a sample with the exact formulas that I am using.
What I need to do is pick out certain entries from a longer list (column A, which is in .csv format and needs to stay that way, so coordinates, names and IDs are all separated by commas), and I have another list (column C) which is a shorter list of certain IDs. I googled and tried and got some results for the basic structure, but the function seems to fail. It doesn't matter how I get that third list done, but there is only one criterion: since the list in column A is really long and those entries need to keep the .csv formatting, the function should copy whatever is in the matching cells.
Let me try to put it simple: .csv cells from column A that have matching ID from column C should be copied to column B (or N).
In short, I would like a pivot table to only count unique values, but when I click into the pivot I would like to show all instances of that value. For example:
I have a table of data that I am creating a pivot table from. There are fields for Customer ID, Task Name, Age, and Notes. There will be multiple records for a single Customer ID each time it has new notes.
I would like to create a pivot table that has Task Name in the Row Labels, Age in the Column Labels, and count of Customer ID in the Values, so that, for example, I can see how many accounts have been in the Design task for 2 days. However, when I do this it counts each record, but I would like it to count each unique Customer ID. Also, when I click into the pivot, instead of pulling up one line per Customer ID, I would like it to pull up each instance of Customer IDs in that Task Name/Age combination (similar to doing a DISTINCT in SQL).
I have a list of isometric drawing numbers ending with a [underscore]weld number e.g. 1692-SG-0040-04_05.
Some welds are repaired--in that scenario the amended weld number will be 1692-SG-0040-04_05R1, and even 1692-SG-0040-04_05R2 if repaired for a second time.
On occasion a weld may be cut out entirely and a new weld done. The weld number for that will be 6317-FG-1690-02_06C1.
And here's a wrinkle I've just verified...a cut weld may also be repaired so the weld number will look like 1698-SG-0077-01_04C1R1.
Is there a formula to count these as one weld: 1692-SG-0040-04_05 1692-SG-0040-04_05R1 1692-SG-0040-04_05R2
This as one weld: 6317-FG-1690-02_06 6317-FG-1690-02_06C1 6317-FG-1690-02_06C2
...and this as one weld: 1698-SG-0077-01_04 1698-SG-0077-01_04C1 1698-SG-0077-01_04C1R1
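The counting rule can be sketched by stripping any trailing C-number and R-number tags after the weld number, then counting the distinct base numbers. This assumes every weld number ends in an underscore, digits, and zero or more C/R tags as in the examples; the function names are hypothetical.

```python
import re

# Base = everything up to "_<digits>"; suffix = any run of C<n> / R<n> tags.
WELD_PATTERN = re.compile(r"(.+_\d+)((?:[CR]\d+)*)")

def base_weld(number):
    """Strip repair (R) and cut-out (C) tags from a weld number."""
    match = WELD_PATTERN.fullmatch(number)
    return match.group(1) if match else number

def count_welds(numbers):
    """Count distinct welds, treating repairs and cut-outs as the same weld."""
    return len({base_weld(n) for n in numbers})

welds = [
    "1692-SG-0040-04_05", "1692-SG-0040-04_05R1", "1692-SG-0040-04_05R2",
    "6317-FG-1690-02_06", "6317-FG-1690-02_06C1", "6317-FG-1690-02_06C2",
    "1698-SG-0077-01_04", "1698-SG-0077-01_04C1", "1698-SG-0077-01_04C1R1",
]
print(count_welds(welds))  # 3
```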
I am having trouble creating a function to count duplicates of duplicates.
An example of the data table 1 is:
Product 1   2nd
Product 1   2nd
Product 1   New
Product 1   New
Product 1   Flt
Product 2   2nd
Product 2   New
Product 2   New
Product 2   Flt
Product 2   Flt
Product 3   2nd
Product 3   2nd
Product 3   2nd
Product 3   New
Product 3   Flt
I created a new table (table 2) and made a list of all the Products on table 1 and removed the duplicates. I now have 3 columns with titles New, 2nd and Flt as follows:
            New   2nd   Flt
Product 1   XX    XX    XX
Product 2   XX    XX    XX
Product 3   XX    XX    XX
I am trying to count the duplicates for each product (XX), but I can't seem to work it out. I've tried the MS help function, but unsure of the actual formula I need to be using.
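Each XX is the number of table 1 rows where both the product and the type match. A minimal sketch of that two-condition count, using the table 1 rows from the post; the function name is hypothetical.

```python
from collections import Counter

def count_by_product_and_type(rows, types=("New", "2nd", "Flt")):
    """Build {product: [count per type]} from a list of (product, type) rows."""
    counts = Counter(rows)  # keys are (product, type) pairs
    products = sorted({product for product, _ in rows})
    return {p: [counts[(p, t)] for t in types] for p in products}

table1 = (
    [("Product 1", t) for t in ["2nd", "2nd", "New", "New", "Flt"]]
    + [("Product 2", t) for t in ["2nd", "New", "New", "Flt", "Flt"]]
    + [("Product 3", t) for t in ["2nd", "2nd", "2nd", "New", "Flt"]]
)
for product, row in count_by_product_and_type(table1).items():
    print(product, row)
# Product 1 [2, 2, 1]
# Product 2 [2, 1, 2]
# Product 3 [1, 3, 1]
```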
As you can see in the example below, in column B I have a list of vendor names, some of which are similar but not identical. (For example, in one instance a vendor will be called "Ford Motor Co.", while in another it will be called "Ford Motor Inc.".)
I need to populate column C so that, wherever two plants (listed in column A) have similar vendor names in column B, a universal name is assigned and recorded in column C for each of the similar names.
Hopefully it is clear as shown below.......