Excel Formula Retrieving Data From Very Large Dataset
Jan 17, 2013
I've been unsuccessful in trying to write a formula that retrieves a single result based on two criteria (from a large set of data on a separate worksheet). I've tried various INDEX MATCH combinations but no luck.
So this is a very simplified version of my real data set which is about 20 times this size. The first worksheet is where I want to store my retrieved results (let's say D2 for example). I want to retrieve data from the second worksheet that matches two criteria (exactly) originating from my first worksheet. The two criteria to be matched from the first worksheet are, for example, A1 (sabathia) and F2 (the date 4/8). The complicated part is the desired result should be from the corresponding K/9 column in the second sheet, which in this case (based on sabathia and 4/8 criteria) is I2 (result would be 3). It's complicated since I can't just tell the formula to look down a specific K/9 column; I need to search ALL the K/9 columns in the sheet (of which there are many). Is this even possible with some sort of nested INDEX MATCH? Any possibilities outside of VBA programming, or is that the only way?
I have a set of 5,800+ data points between 0 and 1 that I would like to multiply together. When I use PRODUCT for the whole set, the formula returns 0. However, I can use a smaller subset of the data to return a very small number. I'm curious if Excel has a closest-number-to-0 or number-of-cells-for-PRODUCT limitation. Is there another way to perform this calculation?
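One way to look at the PRODUCT question: multiplying thousands of values below 1 can fall under the smallest number a double-precision float can hold, at which point the result rounds to 0. A common workaround is to add logarithms instead of multiplying. The sketch below is Python rather than a worksheet formula, and the 5,800 values of 0.5 are made-up stand-in data, not the poster's numbers.

```python
import math

def log10_product(values):
    """Return log10 of the product of strictly positive values by summing logs."""
    return sum(math.log10(v) for v in values)

# Hypothetical data: 5,800 copies of 0.5 stand in for the real data points.
values = [0.5] * 5800

naive = math.prod(values)          # direct multiplication underflows to 0.0
log_total = log10_product(values)  # log10 of the true product

# Split the log into a power of ten and a leading factor for display.
exponent = math.floor(log_total)
mantissa = 10 ** (log_total - exponent)

print(naive)
print(f"{mantissa:.4f}e{exponent}")
```

In a worksheet the same idea would be to sum the LOG10 of every cell and report the result as a power of ten, assuming every value is strictly greater than 0.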
We use the number of days per disbursement as a performance measure to ensure that we are providing our grantees with the appropriate amount of service. I keep a tracking chart that I manage with overseas partners, who use these dates to prioritize the 30+ grantees in their portfolio at any given time. It would be great if the number of days to disbursement #1, disbursement #2, etc. could be pulled automatically to show them who they have neglected.
"Sheet 1" = Overview sheet to see general information (where I'm trying to pull to)
"Sheet 2" = table to track information as the disbursements or other actions are processed per grant
Column A (on both sheets) gives the grant reference
Column B (data entry sheet) gives the date the payment was sent
When I do =SMALL(('Sheet2'!B:B),2), I get the 2nd smallest in the whole sheet, but then when I try to make an IF function to tie it to the specific grant...
=IF('Sheet2'!A:A,A2,SMALL(('Sheet2'!B:B),2)) --> this gives me a 1905 date
I've tried a bunch of different formulas and tried reformatting the dates... but I'm having very little success...
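The logic being asked for is "filter the payment dates to one grant first, then take the k-th smallest". The IF attempt above passes the whole column as a condition instead of comparing it to A2, which is probably why a 1905 date comes back. Here is a minimal Python sketch of the intended logic; the grant references and dates are invented for illustration.

```python
from datetime import date

def kth_date_for_grant(rows, grant, k):
    """Return the k-th earliest payment date for one grant, or None if there are fewer than k."""
    dates = sorted(d for g, d in rows if g == grant and d is not None)
    return dates[k - 1] if 1 <= k <= len(dates) else None

# Hypothetical rows: (grant reference, date the payment was sent).
rows = [
    ("G-001", date(2013, 1, 5)),
    ("G-002", date(2013, 1, 9)),
    ("G-001", date(2013, 3, 2)),
    ("G-001", date(2013, 2, 1)),
]

print(kth_date_for_grant(rows, "G-001", 2))
print(kth_date_for_grant(rows, "G-002", 2))
```

The usual worksheet equivalent is an array formula along the lines of =SMALL(IF('Sheet2'!A:A=A2,'Sheet2'!B:B),2), confirmed with Ctrl+Shift+Enter in older Excel versions; treat that as a pattern to adapt rather than a tested answer for this workbook.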
I have a problem that I can't figure out how to solve. I have tried using both macros and functions (INDEX, for example). The problem is as follows: I have a dataset of 27 worksheets, and each worksheet has between 30k and 60k rows and 25 columns. They are set up as follows:
It is basically impossible to do this by hand: each of the 27 worksheets has between 3,000 and 6,000 firms, and each firm has 57 variables (these are identical for all firms). Also, the firm names and the variable names are in the same column; these should be separated as well (they are connected with a hyphen).
I am looking to calculate variance across a large data set and would like to know whether a macro can calculate it for a specific region ID (East, Central, or West), giving the variance across that region.
For instance, in my data set if I have something similar to below. How would I calculate variance in the different regions? Is it possible to automate this process? Also, could the Analysis ToolPak be used instead or in conjunction?
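The per-region calculation amounts to grouping the values by region and taking the variance of each group. The following is a small Python sketch of that grouping; the records are invented, and it uses the sample variance (dividing by n - 1), which is an assumption about which variance is wanted.

```python
from collections import defaultdict
from statistics import variance

def variance_by_region(records):
    """Group (region, value) pairs and return the sample variance per region.

    Regions with fewer than two values are skipped, because the sample
    variance is undefined for a single observation.
    """
    groups = defaultdict(list)
    for region, value in records:
        groups[region].append(value)
    return {r: variance(v) for r, v in groups.items() if len(v) >= 2}

# Hypothetical data set.
records = [
    ("East", 10), ("East", 14),
    ("West", 3), ("West", 5), ("West", 7),
    ("Central", 8),
]

print(variance_by_region(records))
```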
I am accustomed to using filters to find a lot of my information in large datasets.
However, now I am trying to use formulas to return specific values. For simplicity's sake, I have included a sample below with a couple types of scenarios I am looking to solve through the use of formulas. Would this involve sub-arrays perhaps?
We collect loan payments for 36 months from customers.
Column A lists 1000+ customers.
Column J lists the date we received payment 1 ... Column Q lists the amount we received on payment 1.
Column R lists the date we received payment 2 ... Column Y lists the amount we received on payment 2.
Column Z lists the date we received payment 3 ... Column AG lists the amount we received on payment 3.
This repeats for all 36 payments.
New customers are loaded in each month, so be aware that Column J, Column R, Column Z (and so on) have dates from 2011 and 2012 and 2013.
We'd like to create a list of all customers that have not made a payment for the current month as of a certain day (say the 12th). So this month, on January 12th, we'd like to search our data for all customers that don't have a payment listed between January 1st - January 12th.
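The check described here is: for each customer, look across all 36 payment-date columns and flag the customer if none of the dates falls inside the window. A minimal Python sketch of that test follows; the customer names and dates are invented, and blank cells are represented as None.

```python
from datetime import date

def missing_payment(customers, start, end):
    """Return the customers with no payment date between start and end (inclusive).

    customers maps a customer name to the list of payment dates read from
    the date columns (J, R, Z, ...); blank cells are None.
    """
    return [
        name
        for name, dates in customers.items()
        if not any(d is not None and start <= d <= end for d in dates)
    ]

# Hypothetical data.
customers = {
    "Avery": [date(2012, 12, 3), date(2013, 1, 4)],
    "Blake": [date(2012, 12, 10), None],
    "Casey": [date(2013, 1, 20)],
}

print(missing_payment(customers, date(2013, 1, 1), date(2013, 1, 12)))
```

This sketch treats every customer in the list as due for a payment in the window; customers loaded too recently to owe one would need an extra filter.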
I have a large itemised call bill that I need to do some regular analysis on and wondered if I could automate most of it.
In column C is a list of mobile numbers, in column F the numbers they called (this is an itemised bill so each line represents one call, meaning each number has multiple rows) finally in column K is the cost of each call.
I want the macro to look through column F (number called) and delete a number's rows if there are fewer than 5 instances of that number and they cost under 0.30 each.
Example: if in column F the number 07500 100100 appeared once with a cost of 0.29, I want it deleted, but if it appears 6 times with an accumulated cost of 3.50 (i.e. more than 0.30 per call averaged out), then I want it to remain on the sheet.
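The rule is slightly ambiguous, so the sketch below states one reading of it: a called number is deleted only when it appears fewer than 5 times and its average cost per call is under 0.30. That reading matches both examples in the post but is an assumption, as are the row layout and the second phone number. It is written in Python, not as a macro.

```python
from collections import defaultdict

def rows_to_keep(calls, min_calls=5, min_avg_cost=0.30):
    """Filter (caller, number_called, cost) rows.

    Assumed rule: drop every row for a called number that has fewer than
    min_calls rows AND an average cost below min_avg_cost.
    """
    stats = defaultdict(lambda: [0, 0.0])  # number called -> [count, total cost]
    for _caller, called, cost in calls:
        stats[called][0] += 1
        stats[called][1] += cost

    def keep(called):
        count, total = stats[called]
        return count >= min_calls or total / count >= min_avg_cost

    return [row for row in calls if keep(row[1])]

# Hypothetical bill: one cheap call, plus six calls totalling 3.50.
calls = [("caller-1", "07500 100100", 0.29)]
calls += [("caller-1", "07500 200200", 0.60)] * 5
calls += [("caller-2", "07500 200200", 0.50)]

kept = rows_to_keep(calls)
print(len(kept))
```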
I have the following two bar charts. (see links below). I would like to overlay both these bar charts together and obtain the chart shown in link 3.
For example, at 4.4 GHz and 1.8m antenna, two values (downtime/year) are possible: 15 min or 557 min. This is represented in the third figure. Since the first chart contains small values and the second chart contains large values for the x-axis, will I be able to change this to log scale for ease of analysis?
I have a table that can at any point have from a couple hundred up to a couple thousand rows. Within this table lies a column entitled "Offer". I want to plot the figures in the Offer column as a frequency distribution chart.
I plan to do this by listing the x-values (Offer figures), and then using COUNTIF formulas to calculate the frequency of each x-value, then using a simple clustered column chart to create the visualisation of the frequency distribution.
My question is: in my large data set, is there any way to get VBA to insert a list of the range of figures in the Offer column? I can figure out how to copy down the COUNTIF formula to populate the corresponding frequency column, but how can I have some VBA dynamically adjust my x-values (Offer figures)?
For example... say in the first data set I have
Offer  Frequency
1      10
2      20
3      25
4      20
5      15
that's fine if I make the chart, but what if the data set changes, I want VBA to give me a list of all the offer values, and then I can write some code to insert and copy down along the frequency column the countif formula.
The ultimate goal is to have a frequency chart that will be synced to the self-updating dataset.
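Building the x-values dynamically comes down to collecting the distinct Offer values and counting each one, which replaces both the hand-typed list and the COUNTIF column. Here is a short Python sketch of that step (not VBA); the sample reproduces the Offer/Frequency figures quoted above.

```python
from collections import Counter

def offer_frequencies(offers):
    """Return (offer value, frequency) pairs sorted by offer value."""
    return sorted(Counter(offers).items())

# Rebuild a column that matches the example table: 1 x10, 2 x20, 3 x25, 4 x20, 5 x15.
offers = [1] * 10 + [2] * 20 + [3] * 25 + [4] * 20 + [5] * 15

for value, count in offer_frequencies(offers):
    print(value, count)
```

Because the list of x-values is recomputed from the data each time, new Offer values appear automatically when the data set changes.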
I'm trying to count how many production orders I have per week. However, there are duplicated production orders per week. I only want to count how many unique orders there are for each week. I only see the ability to "Count", which counts my duplicates as well, so it overinflates my true quantity.
I would like to create a summary of the ordering history of each customer. The IT department will help us generate some raw data, and I want to retrieve the data into the summary Excel file when I type the Ref No of the customer.
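A distinct count per week means collecting each week's order numbers into a set, so repeats collapse, and then counting the set. A minimal Python sketch follows; the week labels and order numbers are invented.

```python
from collections import defaultdict

def unique_orders_per_week(rows):
    """Count distinct order numbers per week from (week, order) rows."""
    seen = defaultdict(set)
    for week, order in rows:
        seen[week].add(order)  # a set ignores duplicate order numbers
    return {week: len(orders) for week, orders in sorted(seen.items())}

# Hypothetical rows: (week number, production order).
rows = [
    (1, "PO-100"), (1, "PO-100"), (1, "PO-101"),
    (2, "PO-102"), (2, "PO-102"), (2, "PO-102"),
]

print(unique_orders_per_week(rows))
```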
For example, I have the following raw data generated, in which the file name is "A123456":
Ref No Name Address
And I want to extract the data to the following summary. When I type "A123456" in the field "Ref Number" in this summary, it will automatically retrieve data from the corresponding raw file:
I am trying to work out a formula that will bring through data onto a worksheet for teacher analysis. The data is being extracted from our MIS and put into the attached template. When I change the class on the analysis sheet I want to be able to pull through the relevant learners attached to the class along with their data.
I work with Excel 2010 and have a very large spreadsheet with data that I need to manipulate in several different ways. I have been filtering and then cutting and pasting, but this is very time consuming. Is there a way to extract specific data from the spreadsheet and transfer it to different worksheets? I don't really know how to use macros.
I have data that has 872 columns - each representing a question ID (headers in the first row). I then have 1494 rows of data where each represents 1 unique person. In other words, A2 = Person ID and B2-AGN2 = the potential answers to the questions.
What I'd like to do is compact this into 3 columns: "Person ID", "Question", "Answer".
"Person ID" will have duplicate values for each question that is answered. "Question" is the question text. "Answer" is each answer to each of the questions.
It would be fantastic if Excel has the functionality to ignore null answers and therefore just not even bother populating Question ID when an Answer is blank (e.g. they didn't report an age, so QAge doesn't show up under the new "Question" field), but I have no idea if that's doable.
I have a lot of datasets like this with a varied number of rows and columns, so any way to adjust whatever formula/macro is out there to work for those. I'm terribly new with macros and so I've been having difficulty adapting them if I need to.
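The reshaping described here is usually called unpivoting: each person's row is walked across the question columns, and one output row is written per non-blank answer. The Python sketch below does that for any number of rows and columns; the header names and answers are invented, and blanks are assumed to be empty strings or None.

```python
def unpivot(header, rows):
    """Turn a wide table into (person_id, question, answer) rows, skipping blanks.

    header is the first row (Person ID followed by question IDs);
    rows are the data rows in the same column order.
    """
    questions = header[1:]
    out = []
    for row in rows:
        person_id, answers = row[0], row[1:]
        for question, answer in zip(questions, answers):
            if answer is None or str(answer).strip() == "":
                continue  # no answer reported, so no output row
            out.append((person_id, question, answer))
    return out

# Hypothetical wide table.
header = ["PersonID", "QAge", "QCity"]
rows = [
    [1, 34, "Leeds"],
    [2, "", "York"],
    [3, None, None],
]

for line in unpivot(header, rows):
    print(line)
```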
I am trying to use excel tools to clean dirty data and compare the two cells. The information is there but tainted with additional information that is not relevant. I have tried to use Left/Right tools to capture alpha characters leading an address number with no real success. Also, when I get the data it seems to have some embedded breaks that I can't seem to get rid of that cause my tasks to error too.
I am trying to insert three columns within a large amount of data. I am using Excel 2003 edition. The three columns need to measure max, min, and standard deviation of month long ranges and the data goes all the way back to 1993.
Currently, I have a column that has the correct ranges but finds the average for each month
And many more ranges as it dates back all the way to '93. Is there a possible way to insert these three columns with their respective commands (=MAX... =MIN... etc.) while keeping all the ranges from the AVERAGE column.
In effect, I am looking solely to switch the beginning of the column command
(=AVERAGE($H7214:$H7243)) to (=MIN($H7214:$H7243)), etc.
While keeping all of the specified ranges from the AVERAGE column.
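Since only the function name at the start of each formula changes, this is a text substitution on the formula string that leaves the range untouched. A small Python sketch of that substitution is given below; it assumes each formula starts with =AVERAGE( exactly as in the example.

```python
import re

def swap_function(formula, new_name, old_name="AVERAGE"):
    """Replace the leading function name of a formula, keeping its range as-is."""
    pattern = rf"^=\s*{re.escape(old_name)}\("
    return re.sub(pattern, f"={new_name}(", formula, count=1, flags=re.IGNORECASE)

average_formula = "=AVERAGE($H7214:$H7243)"

for name in ("MAX", "MIN", "STDEV"):
    print(swap_function(average_formula, name))
```

Within Excel, copying the AVERAGE column three times and running Find and Replace on "=AVERAGE(" in each copy should have the same effect, though that has not been checked against the 2003 edition here.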
simplifying a formula which gathers data from about 50 worksheets from within the same work book.
The data to be gathered is in the same cell on each worksheet and is simply a number, but I want the SUM of these numbers carried forward to another worksheet. Each worksheet is named by date, i.e. sheet 1 is named "16 June 2014" and sheet 2 is named "23 June 2014" and so on until "30 March 2015" (each sheet represents one full week, Monday - Sunday).
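Because the sheet names follow a weekly pattern, they can be generated rather than typed, and the same cell summed over all of them. The Python sketch below shows the name generation and the sum; the per-sheet values are invented, and the dictionary stands in for reading one cell from each worksheet.

```python
from datetime import date, timedelta

MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)

def weekly_sheet_names(first, last):
    """Return sheet names such as '16 June 2014', one per week from first to last."""
    names = []
    day = first
    while day <= last:
        names.append(f"{day.day} {MONTHS[day.month - 1]} {day.year}")
        day += timedelta(weeks=1)
    return names

def total_across_sheets(cell_by_sheet, names):
    """Sum one cell's value over the named sheets; missing sheets count as 0."""
    return sum(cell_by_sheet.get(name, 0) for name in names)

names = weekly_sheet_names(date(2014, 6, 16), date(2015, 3, 30))

# Hypothetical values of the shared cell on the first two sheets.
cell_by_sheet = {"16 June 2014": 12, "23 June 2014": 8}
print(names[0], names[-1], total_across_sheets(cell_by_sheet, names))
```

If the sheets sit next to each other in the workbook, a 3-D reference such as =SUM('16 June 2014:30 March 2015'!B5) is the usual formula-only route; the cell address B5 is a placeholder.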
I am currently struggling with a spreadsheet that has been created in Excel 2007. Essentially, it has a number of items (individually identified by "S code" in the first column) that need to be tested at the specific dates over a one year period (i.e. at "2 weeks", "4 weeks", "8 weeks", etc) as shown in the screenshot below.
A user manually enters "Complete" into the corresponding cell in the "In-testing status" section of the spreadsheet when testing has been completed for a certain item at a particular time point.
I already have set up conditional formatting that highlights cells with dates older than the current date red. What I need to do now is to check for a particular item and date whether or not the corresponding "In-testing Status" cell reads "COMPLETE". If it does, I need to use a conditional formatting rule to return formatting to normal.
What I am unsure of is how exactly to retrieve the value of the corresponding "In-testing Status" cell.
Unfortunately I can't use a macro-enabled workbook in this environment
I am running 2 audits on aspects of patient care. The first audit records a unique number that identifies the patient, and then a series of answers on demographics, and other stuff.
The second audit also records the unique number, and collects some other data on the particular patient at a later point in time (medication usage, levels of pain etc).
So in theory both audits will collect different information on the same patients. In practice, some patients will be missed and there won't be matching data sets. The order of collection won't be the same either, ie Audit 1 might be in the order of Patient 1,2,3,4 etc but Audit 2 might be patient 2,4,1,3
For various reasons these two data collection tools are not linked, and I end up with a spreadsheet for audit 1 and a spreadsheet for audit 2.
I need to merge these so that I can see all of the data for a particular patient at a glance, and where the gaps are, and apply some statistics to it etc. I could sort both lists by the unique audit number so that they are in order, and then copy blocks of data over from one sheet to the next, but there will be records missing, I might make a mistake with the alignment, and I'm sure there must be a better way.
I am using Excel 2010. Each audit case has about 50 columns of data for Audit 1 and 30 columns for Audit 2. There will be ~20 new records (rows) created each week that I want to progressively merge.
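What is being described is a full outer join on the patient number: every patient from either audit gets one merged row, with a visible gap where one audit has no record. The order of the two lists does not matter once the records are keyed by patient number. A minimal Python sketch follows; the patient numbers and fields are invented.

```python
def merge_audits(audit1, audit2):
    """Merge two audits keyed by patient number, keeping patients found in either one.

    A missing record is stored as None so the gaps are easy to spot.
    """
    merged = {}
    for patient in sorted(set(audit1) | set(audit2)):
        merged[patient] = {
            "audit1": audit1.get(patient),
            "audit2": audit2.get(patient),
        }
    return merged

def gaps(merged):
    """List the patients that are missing from at least one audit."""
    return [p for p, rec in merged.items() if rec["audit1"] is None or rec["audit2"] is None]

# Hypothetical records, deliberately in different orders.
audit1 = {"P1": {"age": 70}, "P2": {"age": 64}, "P3": {"age": 81}}
audit2 = {"P2": {"pain": 3}, "P4": {"pain": 6}, "P1": {"pain": 2}}

merged = merge_audits(audit1, audit2)
print(gaps(merged))
```

In a workbook, the equivalent would be looking up each patient number from one sheet in the other (for example with VLOOKUP or INDEX/MATCH), so that weekly additions are picked up without re-sorting.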
Each id in Column A has exactly one property in Column B, but B is not unique (it can only take a small set of valid values). I wish to query how many id's (Column A) contain a particular property (Column B). If the example above ended before the "...", I would like to get as the output:
Code:
566 2
341 1
because the property 566 is owned by two id's (1 and 3) and the property 341 is only owned by the id 2.
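Counting how many ids share each property is a plain tally over Column B. The Python sketch below uses the three pairs described above (ids 1 and 3 own property 566, id 2 owns 341); representing the two columns as a dictionary is an assumption made for the example.

```python
from collections import Counter

# Column A -> Column B, using the pairs described in the post.
id_to_property = {1: 566, 2: 341, 3: 566}

# Each id appears once, so tallying the property values counts ids per property.
counts = Counter(id_to_property.values())

for prop, n in counts.most_common():
    print(prop, n)
```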
retrieving data from financial website databases like yahoofinance.com and bloomberg.com. I'm trying to make an automatic stock analysis model to read from the website database and retrieve the data into Excel sheets. For example, when opening the Excel model the user gets a popup to enter the stock ticker, the user enters the ticker and gets a set of data. Is this doable in Excel?
I have a long list of what were once file names in Excel that I need to retrieve data from. I have attached an example file with 2 file names, which I recommend viewing while reading this request. The file has 2 spreadsheets. The first one is just the file name in the format in which I receive it. The second one is a table that I need to fill out from the data in those file names.
I have a problem with the following columns in spreadsheet 2:
1. Column C: I have the command to copy the site name as it is to this column, but what I need is for the program to read whether the site name is ZANUAH or ADORA and then write only Z or A. Note that these two site names have a different number of characters in them.
2. Column D: similar problem. I need it to read the lab name and write AL if it's MAGAMA, BA if it's Ben-Ari and SH if it's shafir. I have the command for Excel to simply copy the word, but how do I make it write the letters that represent the lab name rather than the lab name itself?
3. Column E: the report number is the 6 digit number in the file name. I have the command which retrieves it, but it has trouble when the length of the number changes. It's important to note that sometimes the number might contain non-numerical characters, like 219641-1.
4. Column O: I have the command to get the data from the parentheses next to PSD in the file name into a box. What I need is to get it to copy just the letters C or NC from the file name into this column, without the number.
5. Column P: same as column O, but here I need just the number, without the C or NC.
6. Column T: all I need is for it to copy the last 2 letters from the file name, which I know how to do. The problem is that since the file names come with a .pdf at the end, all I get is df. So in fact I need it to copy the 6th and the 5th letters from the end of the file name, which is above my abilities.
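Items 1, 2 and 6 can be sketched without knowing the full file name layout: the site and lab codes are lookups in a small table, and the last two letters are taken after removing the extension. The Python below is only an illustration; the file name "report_AB.pdf" is invented, and the report number and PSD columns are left out because they depend on the real naming format.

```python
import os

# Code tables taken from the description above; keys are upper-cased for matching.
SITE_CODES = {"ZANUAH": "Z", "ADORA": "A"}
LAB_CODES = {"MAGAMA": "AL", "BEN-ARI": "BA", "SHAFIR": "SH"}

def site_code(site_name):
    """Return Z or A for a known site name, or None if it is not in the table."""
    return SITE_CODES.get(site_name.strip().upper())

def lab_code(lab_name):
    """Return AL, BA or SH for a known lab name, or None if it is not in the table."""
    return LAB_CODES.get(lab_name.strip().upper())

def last_two_before_extension(file_name):
    """Return the last two characters of the name once the extension is removed."""
    stem, _extension = os.path.splitext(file_name)
    return stem[-2:]

name = "report_AB.pdf"  # hypothetical file name
print(site_code("ZANUAH"), lab_code("Ben-Ari"), last_two_before_extension(name))
```

For a four-character extension such as ".pdf", the result is the same as taking the 6th and 5th characters from the end of the full name.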
I am trying to get the data out of a cell and put it in a textbox in my userform. What I have is a Worksheet that has autofilter on. After the user clicks certain objectbuttons, there is only one row, that has data in it, displayed. The cell I'm after will always be in column A and be the second visible row.