Functional Way To Break Up Very Large Amounts Of Data
Mar 28, 2012
I am looking for a functional way to break up very large amounts of data. I want to break them up by ID number and then by date. The date function needs to break up the data from a hire date to the closest date to one year later without going over, and then do the same for each following year. I am hoping that the function can just add a blank row between the split data. The file that I have now contains three years.
I want to be able to take a large quantity of data, sort all the like data together, and then count how many of each like item there are. I need the formulas to do that.
Is there a way I can put it into some kind of pivot table? The hardest part for people reading his list is that it is so large that it is hard to find data easily.
So this is how he formatted his data... I was wondering what would be the best way to get this list into a possible pivot table. This is a consolidated example; there are plenty more columns, but this will give you an idea of my problem.
                      A2007   A2006   A2006   B2006   B2006   B2007
                      Feb     Jan     Feb     Jan     Feb     Jan
630 Labor Cost        1000    7500    3000    4500    800     5000
624 Equipment Cost    900     50      40      300     20      1400
Now, the only thing I can think of is to make columns, but then I'd have to recopy all the task names (of which there are about 700) for each of the different years (A = Actual, B = Budget, F = Forecast). Is there any other way that you can think of to do this without making it so complicated? Any help or suggestions would be great. I really want some format that allows you to click the total and go to what makes up that total.
I have data that has 872 columns, each representing a question ID (headers in the first row). I then have 1494 rows of data where each row represents 1 unique person. In other words, A2 = Person ID and B2:AGN2 = the potential answers to the questions.
What I'd like to do is compact this into 3 columns: "Person ID", "Question", "Answer".
"Person ID" will have duplicate values, one for each question that is answered. "Question" is the question text. "Answer" is each answer to each of the questions.
It would be fantastic if Excel has the functionality to ignore null answers and therefore just not even bother populating Question ID when an Answer is blank (e.g. they didn't report an age, so QAge doesn't show up under the new "Question" field), but I have no idea if that's doable.
I have a lot of datasets like this with a varied number of rows and columns, so I need a way to adjust whatever formula/macro is out there to work for those. I'm terribly new to macros and so I've been having difficulty adapting them when I need to.
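The reshaping described above (one output row per person and answered question, with blanks skipped) can be sketched outside Excel to make the logic concrete. This is a minimal Python sketch, not a ready-made macro: the function name `unpivot` and the sample values are made up for illustration, and it assumes the first column is the Person ID and the first row holds the question headers. Because it reads the header row instead of hard-coding sizes, the same loop works for any number of rows and columns.

```python
def unpivot(rows):
    """Turn a wide table into (Person ID, Question, Answer) rows.

    Assumes rows[0] is the header row: ["Person ID", question1, question2, ...].
    """
    header, *records = rows
    questions = header[1:]
    long_rows = [("Person ID", "Question", "Answer")]
    for record in records:
        person_id, answers = record[0], record[1:]
        for question, answer in zip(questions, answers):
            # Skip blank answers so unanswered questions never appear.
            if answer is None or str(answer).strip() == "":
                continue
            long_rows.append((person_id, question, answer))
    return long_rows


# Hypothetical sample: P2 did not report an age, so QAge is skipped for P2.
sample = [
    ["Person ID", "QAge", "QCity"],
    ["P1", 34, "Leeds"],
    ["P2", "", "York"],
]
result = unpivot(sample)
for row in result:
    print(row)
```

A macro version would presumably follow the same two nested loops (rows, then columns) and write each non-blank answer to the next free row of an output sheet.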
I have a problem with my current macro. It uses a basic autofilter on the parent database to extract the correct rows, then copies the filtered result and pastes it into a new worksheet so the macro can carry on from there.
My database has become very big, and now when I autofilter and click on copy, I get an error saying that the data range reference is too complex and to use data that can be selected in one contiguous rectangle.
I tried a few things, such as autofiltering out everything I don't need and hitting delete. This does not work either; same result.
I got help here previously with code that deletes all hidden rows, but it is very time-consuming. I have not tested all my methods, but it took 15 minutes to delete the hidden rows for one method and there are roughly 5 in total.
I have to end up running this code on the parent worksheet multiple times because I use the parent worksheet to extract different parameters into different worksheets!
I have noticed that manually copying the data in smaller blocks, by halving the data, seems to work, but I do not know how large a block I am limited to, because my database is very large and its size varies month to month, so I cannot put a number on the maximum range. I think a macro that does it by thirds, or preferably quarters, should be safe.
So just to summarize: I am trying to devise a method that autofilters the active parent sheet "sheet 1" and copies the filtered result to "sheet2". Instead of copying the whole filtered result in one go, I would like to split it into four equal parts with respect to the range of the worksheet, then copy the first quarter and paste it into sheet 2, then the second quarter, and so on until all four quarters are done one after the other. Sheet 2 should end up with all four parts combined into one continuous series.
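The quartering idea in the summary above can be sketched in a few lines. This is only an illustration of the splitting arithmetic in Python, not the macro itself: `split_into_parts`, the sample rows and the "keep"/"skip" flag are hypothetical stand-ins for the filtered query. The point is that the part size is computed from the current row count, so it adapts as the database grows month to month.

```python
def split_into_parts(rows, parts=4):
    """Split rows into `parts` consecutive chunks of near-equal size."""
    size, extra = divmod(len(rows), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional row each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


# Hypothetical parent sheet: 30 rows, every third one matches the filter.
parent = [("row%d" % n, "keep" if n % 3 == 0 else "skip") for n in range(1, 31)]
filtered = [row for row in parent if row[1] == "keep"]

# Copy one quarter at a time and append each below the previous one.
sheet2 = []
for chunk in split_into_parts(filtered):
    sheet2.extend(chunk)

print([len(chunk) for chunk in split_into_parts(filtered)])
```

Appending each part directly below the previous one is what keeps the result on the second sheet in the original order.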
Is there a way (UDF or macro) to extract, from every line (cell) of this raw data, only the amounts with the currencies, and put them in the cell to the right?
PS. Most of the time, the amounts mentioned above are the biggest number in each line!
I am at my wits' end - or maybe it's impossible to do the following with formulas?
I have the data like this:
Column A: Date (the date of the beginning of each week)
Column B: Month of the date in Column A
Column C: Year of the date in Column A
Column D: Weekly data
Maybe it's because it's Friday night, but I just can't figure out how to do the following:
Create a new Column E that contains the monthly sum of Column D across all weeks of that month, but entered only against the first week of that month (as listed in Column A). I.e., in my example it should be: 113 empty empty 201 empty empty empty empty..................
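The rule for Column E (monthly total on the first week of the month, blank on the others) can be sketched like this. It is a Python illustration of the logic only; the function name and the weekly values are invented so that the totals match the 113 and 201 from the example, and it assumes the weeks are listed in date order.

```python
from collections import defaultdict
from datetime import date


def monthly_sum_on_first_week(weeks):
    """weeks: list of (week_start_date, weekly_value), sorted by date.

    Returns Column E: the month's total on the first week of each month,
    and an empty string on every other week.
    """
    totals = defaultdict(int)
    for week_start, value in weeks:
        totals[(week_start.year, week_start.month)] += value

    seen = set()
    column_e = []
    for week_start, _ in weeks:
        key = (week_start.year, week_start.month)
        if key in seen:
            column_e.append("")
        else:
            seen.add(key)
            column_e.append(totals[key])
    return column_e


# Hypothetical weekly values chosen to add up to 113 and 201.
weeks = [
    (date(2011, 12, 12), 40), (date(2011, 12, 19), 35), (date(2011, 12, 26), 38),
    (date(2012, 1, 2), 40), (date(2012, 1, 9), 41), (date(2012, 1, 16), 39),
    (date(2012, 1, 23), 42), (date(2012, 1, 30), 39),
]
column_e = monthly_sum_on_first_week(weeks)
print(column_e)
```

In the sheet itself, the same idea might be expressed in E2 as something like =IF(B2<>B1,SUMIFS(D:D,B:B,B2,C:C,C2),"") filled down, assuming headers in row 1, data sorted by date, and a version of Excel that has SUMIFS (2007 or later).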
I have attached a tiny part of a massive data set I am working on. As you can see in column 2, the data is roughly every 15 min for 5 days. The data I am interested in averaging is color-coded in column 3 (if you scroll down you can see a different color for each day's data set).
In column 5 the dates are summarized into days as opposed to the 15min breakdown. In column 6 is the problem. How do I get the averages of the relevant data in column 6 in such a way that I can drag the formula down and the next cell will automatically calculate the average for the NEXT day, REGARDLESS of how many temp readings there are, as this data fluctuates from day to day.
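One way to think about the column 6 calculation is to group the readings by calendar day and average whatever number of readings each day happens to have. The sketch below shows that grouping in Python; the function name and the readings are hypothetical, and it assumes each reading carries a full date and time stamp.

```python
from collections import defaultdict
from datetime import datetime


def daily_averages(readings):
    """readings: list of (timestamp, temperature).

    Returns {day: average}, regardless of how many readings a day has.
    """
    by_day = defaultdict(list)
    for stamp, temperature in readings:
        by_day[stamp.date()].append(temperature)
    return {day: sum(values) / len(values) for day, values in sorted(by_day.items())}


# Hypothetical readings: three on the first day, only two on the second.
readings = [
    (datetime(2012, 3, 1, 0, 0), 10.0),
    (datetime(2012, 3, 1, 0, 15), 12.0),
    (datetime(2012, 3, 1, 0, 30), 14.0),
    (datetime(2012, 3, 2, 0, 0), 20.0),
    (datetime(2012, 3, 2, 0, 15), 22.0),
]
averages = daily_averages(readings)
for day, average in averages.items():
    print(day, average)
```

Because the grouping key is the day itself, not a fixed block of rows, a day with missing or extra readings still gets its own correct average.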
I am working on both MS Office 2003 and 2007. I am currently working on some formulas in a worksheet that I would like to protect. I would like some cells in the sheet to be protected so that only the person who knows the password (the administrator) is able to change them.
I am trying to make a little game for a friend of mine. It picks a random number from 1 to 1000, then he gets 10 chances to guess the number. After each guess, it tells him if the number is higher or lower. I have a UserForm where you put in your first guess, hit a button, and it tells you if the number is higher or lower. All the guess blanks and buttons are on the same UserForm. However, after you push the first button, the UserForm doesn't work anymore. How do I get it to stay functional the whole time?
I am building automated solutions where the graphs' source data is based on the outcome of formulas. In the case of line graphs I use #N/A as the result if no data is available or the formula results in an error - this way the data point and data label will not be shown in the graph.
However, this does not work for bar graphs - with #N/A, #DIV/0, 0 or "" the bar itself is not shown but the data label is shown (as #N/A or 0). How can I set up my formulas so that if the result is 0 or the formula is in error, the graph does not display the data label?
The attached Excel file shows the same data in 2 charts - 1 line chart (= OK) and 1 bar chart (= not OK). The data for the charts is pulled from 2 other tabs (week & month) and merged into 1 data source for the graphs.
I have a list of people in column A and a list of cities that they have visited in column B.
I need to check some of the cities they have visited each month but don't want to check them all.
I have attached a sheet as an example (this has been scaled down).
The number of cities I want to check for each person varies each month depending on how many cities they have visited.
For example, John has visited 16 cities and I want to check 5 of them. I therefore want 5 random cities that he has visited to appear next to his name at the top. The real list of data is massive, so this would be really useful if it is possible.
I have looked at RAND but I can't get it to randomly give me more than one city, and I don't understand how to get it to give me, say, 5 cities one month and 8 cities the next month purely based on a formula from another cell.
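What is being asked for here is sampling without replacement: pick N distinct cities per person, where N comes from another cell. A minimal Python sketch of that idea follows. The names `pick_cities`, `visits` and `how_many` are hypothetical, the city names are placeholders, and the count is capped at the number of cities the person has actually visited.

```python
import random


def pick_cities(visits, how_many, seed=None):
    """visits: {person: [cities visited]}; how_many: {person: number to check}.

    Returns {person: randomly chosen distinct cities}.
    """
    rng = random.Random(seed)
    picks = {}
    for person, cities in visits.items():
        unique = sorted(set(cities))  # drop repeat visits to the same city
        count = min(how_many.get(person, 0), len(unique))
        picks[person] = rng.sample(unique, count)  # sampling without replacement
    return picks


# Hypothetical data: John has 16 cities, Mary only 3.
visits = {
    "John": ["City%02d" % n for n in range(1, 17)],
    "Mary": ["Leeds", "York", "Hull"],
}
how_many = {"John": 5, "Mary": 8}  # the per-person count from "another cell"
picks = pick_cities(visits, how_many)
print(picks["John"])
print(picks["Mary"])
```

The usual spreadsheet analogue of this is to put a random number next to every city and take the N rows with the smallest random numbers for each person, which also guarantees no city is picked twice.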
I have one worksheet with four columns of data. Column A is a well name, RA-0001, column B is the measured depth of the well from 0 feet to however far down it goes, anywhere from 4000 to 15000 feet, column C is the inclination of the well, column D is the Azimuth.
I have 500 wells from RA-0001 to RA-0500 or so, all in this one worksheet, and each well has a varying number of measured depths associated with its well name. I need to create a macro that can separate the wells and either put them in a new worksheet for each well, i.e. a worksheet named RA-0001, RA-0002, ..... etc., OR, and this would be nicer, a macro that can actually save all these individual wells as (Formatted Text (Space Delimited)) files with the associated well name.
Here is an example of what it looks like. The columns do not have a header row stating what information is in each column because I don't need it in that format.
RA-0001 0 0.00 0.00
RA-0001 100 0.91 5.56
[Code] .......
Even just knowing how to create a simple macro that would pull out all the data for each well, so I could manually copy and save each one as a new file, would help.
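The grouping step is the core of this: collect the rows for each well name, then write one output per well. Below is a rough Python sketch of that step, writing one space-separated text file per well. The `.txt` extension, the temporary folder, and the RA-0002 rows are assumptions for illustration (only the two RA-0001 rows come from the example above), and the single-space layout is a simplification of Excel's Formatted Text (Space Delimited) export.

```python
import os
import tempfile
from collections import defaultdict


def split_wells(rows, out_dir):
    """rows: (well, depth, inclination, azimuth) tuples. Writes one file per well."""
    groups = defaultdict(list)
    for well, depth, inclination, azimuth in rows:
        groups[well].append((depth, inclination, azimuth))

    paths = []
    for well, lines in groups.items():
        path = os.path.join(out_dir, well + ".txt")  # file named after the well
        with open(path, "w") as handle:
            for depth, inclination, azimuth in lines:
                handle.write("%s %d %.2f %.2f\n" % (well, depth, inclination, azimuth))
        paths.append(path)
    return paths


rows = [
    ("RA-0001", 0, 0.00, 0.00),
    ("RA-0001", 100, 0.91, 5.56),
    ("RA-0002", 0, 0.00, 0.00),     # hypothetical second well
    ("RA-0002", 100, 1.20, 12.40),  # hypothetical values
]
out_dir = tempfile.mkdtemp()
paths = split_wells(rows, out_dir)

with open(os.path.join(out_dir, "RA-0001.txt")) as handle:
    first_file_text = handle.read()
print(first_file_text)
```

A macro would likely do the same thing by walking down column A and starting a new sheet or file each time the well name changes.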
I have some sheets in a workbook that have collapsible columns, but I need to have the sheets protected/locked. This is for my company's price book that goes out to distributors, so I can't have the sheet unlocked where they could manipulate pricing. However, I need to have collapsible columns. Is it possible to have these functional while the sheet is locked?
Column I, row 11 has a function argument that simply tells it to display the output as .843. I need to edit it to .844 and I cannot seem to find out where or how to edit it.
There are others like this that I need to do too, so I need to learn how to do it, not just have someone do it for me.
Also, as you can see, this sheet displays #N/A all over the filled-in cells. I would like them to be blank until I enter some pertinent info. I tried this in cell M7, but as you can see in M12 it goes back to #N/A.
I am creating a very simple spreadsheet to manage my gym memberships. It basically has membership number, first name, last name, membership type (drop-down box), start date and expiry date. I have put in conditional formatting so that the expiry date goes red when expired, but I want to try to automate the inputting of the dates.
For example, if I select '1 week membership' from the drop-down box in the membership type column, it should firstly change the start date to the current date (I think this uses the NOW() function) and secondly change the expiry date to today's date plus 6 days. Of course I want the expiry date to increase depending on the selection, so if I select 1 month membership it would be the current date plus 28 days.
I am stuck as to how I can do this, and internet tutorials have told me it requires a macro as it can't be done any other way?
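The date rule described above is just a lookup from membership type to a number of days. Here is a small Python sketch of that rule; the dictionary and function names are made up, and only the two membership types mentioned above (6 and 28 days) are included.

```python
from datetime import date, timedelta

# Days added to the start date for each membership type (from the post above).
DAYS_TO_ADD = {
    "1 week membership": 6,
    "1 month membership": 28,
}


def membership_dates(membership_type, today):
    """Return (start_date, expiry_date) for a membership chosen on `today`."""
    start = today
    expiry = start + timedelta(days=DAYS_TO_ADD[membership_type])
    return start, expiry


week = membership_dates("1 week membership", date(2012, 3, 28))
month = membership_dates("1 month membership", date(2012, 3, 28))
print(week)
print(month)
```

The expiry date on its own could probably be a lookup formula on the start date. The start date is the awkward part: NOW() and TODAY() recalculate every time the sheet does, so a start date built from them would move forward each day, which may be why the tutorials point to a macro that stamps a fixed date when the selection is made.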
We all have a team, and we are scored on calls, appointments, demos, proposals, and revenue. Rather than asking us to do one or all of these metrics, I would like to have a bullseye chart that shows people their bonus eligibility. For example, if 50% of the circles are touching the bullseye circle, you would be able to see what you need to work on for the bonus.
Here is a sample graph.
sample structure.jpg
Here is sample data I am trying to work with: sampledata2.xlsx
I need help getting printed output to page break when the value in a sorted column changes. My spreadsheet is a basic list where one column identifies a responsible organization. I need the output to page break when the responsible organization changes.
Is it possible to break the connection between a pivot table and the data source while still keeping the data in the table? I could try copy/paste special/paste values but thought there might be a 'proper' way to preserve the data.
I use Excel and would like to know how to copy a large volume of address data while filtering out irrelevant data placed under each address in the same column, in this case air compressors, air conditioning, web address etc. (see below for an example). I need the first 5 lines only. The rows of unwanted data are irregular, i.e. some have 10 lines, others 5, and others 2 or 1, which makes using a formula difficult as there is no consistency. The data eventually needs to be placed horizontally in columns to be compared to other address lists. To make matters worse, the text data has been merged and wrapped.
I have a one column spreadsheet. The column contains this data:
1 Name
2 Address
3 City
4 State
5 Zip
6 Telephone
7 Fax
8 URL
9 (blank)
10 (blank)
11 Name
12 Address
13 City
14 State
15 Zip
16 Phone
17 URL
18 (blank)
19 Name
20 Address
... and so on
Where there may be one or two blank rows between the individual records and where there may or may not be a Fax number (or row) in the record.
I am trying to convert this data to a horizontal column format - which works fine if I do a copy/paste special/transpose. However I have to do this for 1,800 records and cannot figure out how to do this reliably.
I gave the above illustration to simplify, but this is actually a two-column spreadsheet with individual row labels for every record, using the above terminology. In other words, the above text is in the first column and the data is in the second. Just thought I'd mention it in case there was a way to do some kind of if/then formula.
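Since every row carries its own label, the labels can drive the transpose, and a missing Fax row or an extra blank row stops mattering. The Python sketch below shows that approach under a few assumptions: a row labelled "Name" always starts a new record, "Phone" and "Telephone" mean the same field, and all the sample values are invented.

```python
FIELDS = ["Name", "Address", "City", "State", "Zip", "Telephone", "Fax", "URL"]


def records_to_rows(pairs):
    """pairs: (label, value) rows read top to bottom from the two columns.

    Returns one list per record, with values placed under fixed FIELDS columns.
    """
    records = []
    for label, value in pairs:
        label = (label or "").strip()
        if not label:
            continue  # blank separator rows carry no data
        if label == "Name":
            records.append({})  # a "Name" row starts a new record
        if not records:
            continue  # ignore stray rows before the first "Name"
        key = "Telephone" if label == "Phone" else label
        records[-1][key] = value
    # Missing fields (e.g. no Fax row) become empty cells.
    return [[record.get(field, "") for field in FIELDS] for record in records]


pairs = [
    ("Name", "Acme Ltd"), ("Address", "1 High St"), ("City", "Leeds"),
    ("State", "WY"), ("Zip", "LS1"), ("Telephone", "555-0100"),
    ("Fax", "555-0101"), ("URL", "acme.example"),
    ("", ""), ("", ""),
    ("Name", "Bolt Co"), ("Address", "2 Low Rd"), ("City", "York"),
    ("State", "NY"), ("Zip", "YO1"), ("Phone", "555-0200"),
    ("URL", "bolt.example"),
    ("", ""),
]
table = records_to_rows(pairs)
for row in table:
    print(row)
```

Keying on the label instead of the row position is the design choice that makes this reliable across 1,800 records of uneven length.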
The code you provided works fine for the page break, no problem. Now I need the macro to ask for the input file in which the page breaks should be done.
For example, if Excel file "A" contains the code you gave, it needs to ask for filename "B" as input, and the processing needs to be done in file "B".
I have added some code to the code you provided, but it gives error message "1004" "Method 'Range' of object '_Application' failed" at the following line:
Set rng = oExcel.Range(Cells(2, 2), Cells(Rows.Count, 2).End(xlUp))
I have a list of IDs that recurs over a time period. It consists of a Start Date and an ID Number. These IDs recur over and over again through one month with different start dates.
See Below:
Date                      ID
12/1/2013 10:00:00 AM     67890
12/6/2013 12:00:30 PM     67890
12/18/2013 06:30:05 AM    67890
From another list, I'd like to pull the max end date that is within 24 hrs of the start date. There will be multiple end dates. Here is what the other list would look like.
End Date                  ID
12/1/2013 1:00:30 PM      67890
12/6/2013 4:00:45 PM      67890
12/18/2013 9:30:00 AM     67890
It seems like using VLOOKUP with the ID as the lookup value wouldn't work, because it would just pull the first date it found over and over again.
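The lookup described above has two conditions (same ID, end date inside the 24-hour window after the start) and then takes the latest match. A short Python sketch of that logic follows. The function name is hypothetical, the start and end dates come from the two lists above, and one extra end date (11:15 AM on 12/1) is invented to show that the latest one wins.

```python
from datetime import datetime, timedelta


def latest_end_within_day(start, record_id, ends):
    """Return the latest end date for `record_id` within 24 hours after `start`.

    ends: list of (end_date, id). Returns None when nothing falls in the window.
    """
    window_end = start + timedelta(hours=24)
    candidates = [
        end for end, end_id in ends
        if end_id == record_id and start <= end <= window_end
    ]
    return max(candidates) if candidates else None


ends = [
    (datetime(2013, 12, 1, 11, 15, 0), 67890),  # invented extra end date
    (datetime(2013, 12, 1, 13, 0, 30), 67890),
    (datetime(2013, 12, 6, 16, 0, 45), 67890),
    (datetime(2013, 12, 18, 9, 30, 0), 67890),
]
first = latest_end_within_day(datetime(2013, 12, 1, 10, 0, 0), 67890, ends)
third = latest_end_within_day(datetime(2013, 12, 18, 6, 30, 5), 67890, ends)
print(first)
print(third)
```

This is why a plain VLOOKUP on the ID falls short: the match has to be narrowed by the date window before the maximum is taken.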
I have a very large spreadsheet that was exported from an ecommerce site with close to 1000 products. I have one column that I need to extract some text from. This column holds all of the HTML from the product description and is huge. I only need to extract the actual description of the product, but am having a very hard time figuring out how to do it. I've tried using the MID, LEFT, and RIGHT functions, but not all of the HTML is the same, so it's not really working the way I need it to.
I have multiple tags throughout the HTML that I can use with the MID function, but there is more than one occurrence of them. So, how can I tell it to start at the 4th occurrence? I've spent countless hours searching, but I'm a complete novice when it comes to Excel and I don't even know what to search for. I end up looking through sites that explain how to pull the Y out of XYZ, which is what I need, just on a much larger (and more complicated) scale.
It was suggested that I set up a macro that finds the 4th occurrence of the word and then uses the MID function to pull the data out, but when I try to find the word, it says it doesn't exist even though I can see it right in front of me.
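Finding the Nth occurrence comes down to repeating a search, each time starting just past the previous hit. The Python sketch below shows that loop and then takes the text between the 4th opening tag and the next closing tag. The tag names and the sample HTML are hypothetical, since the real markup is not shown.

```python
def nth_index(text, marker, n):
    """Return the position of the n-th occurrence of marker, or -1 if absent."""
    position = -1
    for _ in range(n):
        # Resume the search one character after the previous hit.
        position = text.find(marker, position + 1)
        if position == -1:
            return -1
    return position


def text_after_nth(text, open_tag, close_tag, n):
    """Return the text between the n-th open_tag and the next close_tag."""
    start = nth_index(text, open_tag, n)
    if start == -1:
        return ""
    start += len(open_tag)
    end = text.find(close_tag, start)
    return text[start:end] if end != -1 else text[start:]


# Hypothetical product HTML where the description sits in the 4th <p> tag.
html = "<p>Brand</p><p>SKU 1</p><p>Ships free</p><p>A sturdy oak desk.</p><p>Footer</p>"
description = text_after_nth(html, "<p>", "</p>", 4)
print(description)
```

The same repeated-search pattern should translate to a macro, with each search starting from the position returned by the previous one.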