Excel 2010 :: Importing Web Page HTML Elements Text Using VBA
Mar 9, 2013
I am trying to get some web page data into Excel 2010 using VBA. My worksheet is set up with the following titles in cells A1, B1, C1, D1 and E1: POST CODE, OUTLET, ADDRESS, TELEPHONE, EMAIL
The result I want to achieve is I enter a post code into cell A2 for example, Excel then uses IE to navigate to the relevant web page as defined in the VBA code. I then want the following to happen:
The InnerText of the web page's h1 tag is then inserted into the OUTLET cell (B2)
The first instance of the p tag is then inserted into the ADDRESS cell (C2)
The second instance of the p tag is then inserted into the TELEPHONE cell (D2)
The third instance of the p tag is then inserted into the EMAIL cell (E2)
All instances of the p tag are contained in a div element called div class="adBox_content" . There are also 5 other DIVs above that DIV in the hierarchy.
Using the YouTube tutorial link, the method has worked for me using the getElementsByTagName("h1").innerText
However, when I try adding a second getElementsByTagName("p")(01).innerText the whole thing fails.
So I'm left with two problems. First, I can't make the VBA get more than one element at a time from the page; I can only have either the h1 or the first instance of the p tag. I've tried all the getElementBy methods and none of them seem to work in getting the second and third instances to show.
Second, I need the code to put the data ONLY on the same row as where the post code was entered. In this scenario, for example, after entering a post code into A2, the OUTLET needs to land in cell B2 only, the ADDRESS in C2 only, etc.
By following the YouTube tutorial above and giving the cells names to refer to in the code, the data ends up being entered in all further rows with identical cell names. I need it to not do that.
The code is needed for around 300 rows of post codes that will be entered and refreshed every week or so.
An example of the html file (stripped down to nothing but 3 pieces of data): [URL]
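A minimal sketch of one way to handle both problems, under stated assumptions: the helper BuildUrl and its address are hypothetical (the real URL pattern is not given in the post), and the div is found by comparing className rather than by any particular getElementBy method. The points it illustrates are that getElementsByTagName returns a collection, so each item has to be picked by index, and that passing the row number in keeps the output on that row only.

```vba
Option Explicit

' Hypothetical helper: the real address pattern is not given in the post.
Private Function BuildUrl(ByVal postCode As String) As String
    BuildUrl = "http://www.example.com/outlets/" & postCode
End Function

' Reads the page for the post code in column A of rowNum and writes
' OUTLET, ADDRESS, TELEPHONE and EMAIL into columns B to E of that row only.
Sub ImportOutletRow(ByVal ws As Worksheet, ByVal rowNum As Long)
    Dim ie As Object, doc As Object
    Dim div As Object, paras As Object
    Dim i As Long

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False
    ie.Navigate BuildUrl(CStr(ws.Cells(rowNum, 1).Value))

    ' Wait until the page has finished loading (READYSTATE_COMPLETE = 4).
    Do While ie.Busy Or ie.ReadyState <> 4
        DoEvents
    Loop

    Set doc = ie.Document

    ' getElementsByTagName returns a collection, so pick an item by index.
    If doc.getElementsByTagName("h1").Length > 0 Then
        ws.Cells(rowNum, 2).Value = doc.getElementsByTagName("h1").Item(0).innerText
    End If

    ' Find the div whose class is adBox_content, then read its first three p tags.
    For Each div In doc.getElementsByTagName("div")
        If div.className = "adBox_content" Then
            Set paras = div.getElementsByTagName("p")
            For i = 0 To 2
                If i < paras.Length Then
                    ws.Cells(rowNum, 3 + i).Value = paras.Item(i).innerText
                End If
            Next i
            Exit For
        End If
    Next div

    ie.Quit
End Sub
```

Calling ImportOutletRow ActiveSheet, 2 would fill B2 to E2 from the post code in A2. For the 300 or so rows, the same Sub could be called in a loop over the row numbers, although opening one IE instance per row will be slow.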
I am copying a large table of data from a report generated in Firefox and pasting it into Excel 2010. The data has several columns of html checkboxes. I need to do two things with the checkboxes and would like to do a third:
1: Count how many checkboxes are ticked in each of the columns.
2: Compare a column A of checkboxes to a column B containing numbers, and then both count and highlight any row where the checkbox is ticked but column B is a 0.
3: (optional) I would like to erase the html checkboxes and, if the box was checked, replace it with a regular x in the underlying cell.
I found some code on another forum that generates a list of values for each checkbox (vba - Obtain the value of an HTML Checkbox inserted in Excel worksheet - Stack Overflow).
Based on that, I recorded a macro to extract the html Name of a single checkbox and then set up a Vlookup for the True/False value. However, I can't figure out how to automate a vlookup for every individual checkbox and put the data in the appropriate underlying cell.
I have been tasked with streamlining a process to collect data from a specific online website (Web of Science) and import it into an Excel 2010 spreadsheet.
Currently they go to the website, enter a small number of search parameters and then manually record the pertinent data from the webpage. They would like to be able to enter a keyword in Excel (which acts as the search item) and have Excel automatically do the rest of the process and provide them with a spreadsheet of the required data.
Is this possible? Perhaps by using Visual Basic code within Excel? I also saw a method that employed SharePoint Server 2010.
We (the board members of an ultimate frisbee team) have a debt workbook that we would like to distribute amongst our team members as easily as possible. The thing is that we'd like to limit what they are seeing (only certain columns etc.) without having to edit the xls file. Ignorance is bliss, so to speak.
So my thought was to have some sort of function that exports data on the fly from the Excel workbook to an html / asp / php / xml (or similar) page every time a member accesses that page. This would give our treasure (as if..) master a lot less work and make it much easier for our members to check their debt. The workbook itself has a couple of worksheets and the ideal would be for us not to split these into separate workbooks. Is this doable? I read about web queries, but they seem to be directed the other way, from html to xls.
Dim struserID As String
Dim strPassword As String
Dim strUploadFile As String
Dim strQueryURL As String
Dim objIE As SHDocVw.InternetExplorer
Dim htmlDoc As MSHTML.HTMLDocument
Dim htmlInput As MSHTML.HTMLInputElement
Dim htmlColl As MSHTML.IHTMLElementCollection

Set objIE = New SHDocVw.InternetExplorer
struserID = "xxxxxx"
strPassword = "abcdef"
strUploadFile = "C:\Misc\testupload.txt"
Code navigates to sign on page, enters userID and password, clicks on submit. The password used causes upload page to load. This all works ok. After upload page is loaded, the code does not enter either of the two following "for" routines.
I have recently been working with a lot of webpages. Documenting the pages is quite laborious and inaccurate. I recently came across a utility that would export all of the elements, their types etc. and put them into a worksheet. For the life of me, I have not been able to find it again. I was wondering if anybody knows of a utility like this, or how I could write a macro to parse this info.
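A macro along these lines could be sketched as follows. This is an illustration rather than the utility mentioned above: it assumes the page can be loaded in Internet Explorer, and the Sub name ListPageElements and the choice of four columns are made up for the example.

```vba
Option Explicit

' Lists every element of a page on a new worksheet:
' tag name, id, name and type attributes (blank where absent).
Sub ListPageElements(ByVal url As String)
    Dim ie As Object, el As Object
    Dim ws As Worksheet, r As Long

    Set ws = ActiveWorkbook.Worksheets.Add
    ws.Range("A1:D1").Value = Array("Tag", "Id", "Name", "Type")

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Navigate url
    Do While ie.Busy Or ie.ReadyState <> 4
        DoEvents
    Loop

    r = 2
    For Each el In ie.Document.getElementsByTagName("*")
        ' Skip anything that is not an element node (for example comments).
        If el.nodeType = 1 Then
            ws.Cells(r, 1).Value = el.tagName
            ' getAttribute returns Null when the attribute is absent;
            ' appending "" turns that into an empty string.
            ws.Cells(r, 2).Value = el.getAttribute("id") & ""
            ws.Cells(r, 3).Value = el.getAttribute("name") & ""
            ws.Cells(r, 4).Value = el.getAttribute("type") & ""
            r = r + 1
        End If
    Next el

    ie.Quit
End Sub
```

More columns (className, innerText and so on) could be added in the same way, depending on what the documentation needs.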
I am trying to create a Dashboard in excel (2010) using tables/pivot tables to build it. The data I am bringing into excel has these key fields of data: cost center+cost center description, general ledger account+general ledger account description, and YTD amount.
My problem is the data is from an external source report and the report has subtotals built in at cost center, and the report's format of subtotaling puts the cost center first and then the general ledger accounts below. There is no formula value in the cell that has the subtotaled amount and the number of general ledger accounts can vary depending on whether there has been general ledger activity.
I want to take this format:

                       July YTD
Cost Ctr 1050 XYZ      $6.00
625110 Supplies        $2.00
650150 Postage         $2.00
650550 Fees            $2.00
Cost Ctr 1052 ZZZ      $4.00
670500 Pens            $2.00
and have it look like this:

Cost Center   Cost Center Descr   GL Acct   GL Acct desc   YTD Amt
1050          XYZ                 625110    Supplies       $2.00
1050          XYZ                 650150    Postage        $2.00
1050          XYZ                 650550    Fees           $2.00
1052          ZZZ                 679200    Pens           $2.00
Besides manually doing data moves and assigning a unique sort sequence number to keep the records together, how else can I quickly move my cost centers to a new column and keep the cost center with the gl account and $amount?
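One possible approach is a short macro that remembers the most recent cost center line and stamps it onto every general ledger line beneath it. The sketch below makes several assumptions that are not stated in the post: the report is on the active sheet with the text in column A and the amount in column B, cost center lines start with "Cost Ctr ", and the first word of each line is the number with the description after it.

```vba
Option Explicit

' Copies GL lines to a new sheet, repeating the cost center on each row.
' Subtotal (cost center) lines themselves are not copied.
Sub FlattenCostCenters()
    Dim src As Worksheet, dst As Worksheet
    Dim r As Long, lastRow As Long, outRow As Long
    Dim txt As String, parts() As String
    Dim ccNum As String, ccDesc As String

    Set src = ActiveSheet
    Set dst = ActiveWorkbook.Worksheets.Add(After:=src)
    dst.Range("A1:E1").Value = Array("Cost Center", "Cost Center Descr", _
                                     "GL Acct", "GL Acct desc", "YTD Amt")
    outRow = 2

    lastRow = src.Cells(src.Rows.Count, 1).End(xlUp).Row
    For r = 1 To lastRow
        txt = Trim$(CStr(src.Cells(r, 1).Value))
        If Left$(txt, 9) = "Cost Ctr " Then
            ' Cost center line: remember number and description.
            parts = Split(Mid$(txt, 10), " ", 2)
            ccNum = parts(0)
            If UBound(parts) >= 1 Then ccDesc = parts(1) Else ccDesc = ""
        ElseIf ccNum <> "" And txt <> "" Then
            ' GL line: first word is the account, the rest is its description.
            parts = Split(txt, " ", 2)
            dst.Cells(outRow, 1).Value = ccNum
            dst.Cells(outRow, 2).Value = ccDesc
            dst.Cells(outRow, 3).Value = parts(0)
            If UBound(parts) >= 1 Then dst.Cells(outRow, 4).Value = parts(1)
            dst.Cells(outRow, 5).Value = src.Cells(r, 2).Value
            outRow = outRow + 1
        End If
    Next r
End Sub
```

Because the output is one flat row per general ledger account, it can be used directly as the source range for a table or pivot table, and the varying number of accounts per cost center no longer matters.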
I have previously used the following code to successfully pull out IE webpage source code for string manipulation.
It's a crude example to demonstrate the principle:
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Public IE As Object

Sub Sample()
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
However, when I substitute a Google Sites address into the IE.Navigate command, the code runs to the "Source_Code = IE.document ...." line and then raises a Microsoft Visual Basic error: "Run-time error '438': Object doesn't support this property or method"
The webpage that I am trying to access is a confidential company site, so you won't be able to access it yourself, but starts with [URL] ......
The one thing that I have noticed about this website is the Privacy Report icon in the lower right status window (Picture of an eye with a restricted symbol in front). I don't know whether this is the cause of my problem, or purely an incidental observation.
Is there something peculiar about Google Sites that means the source code cannot be extracted in general, or is this an issue specific to my site? Does the Privacy Report icon have any relevance, and if so, how do I switch it off?
I have copied a drop-down value from an HTML page to Excel. The drop-down is showing up in Excel and I am unable to delete it by deleting either the rows or the columns. What do I do to remove the drop-down from the sheet?
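A hedged sketch of one way to remove such a control with VBA. It assumes the pasted drop-down arrived as an embedded OLE control floating above the cells, which would explain why deleting rows or columns leaves it in place; if it is some other kind of object, the type test below will not match it.

```vba
Option Explicit

' Deletes embedded OLE controls from the active sheet.
' Loops backwards because deleting a shape renumbers the ones after it.
Sub DeletePastedControls()
    Dim i As Long
    With ActiveSheet
        For i = .Shapes.Count To 1 Step -1
            If .Shapes(i).Type = msoOLEControlObject Then
                .Shapes(i).Delete
            End If
        Next i
    End With
End Sub
```

The test is deliberately narrow: deleting every shape on the sheet would also remove charts, pictures and comments.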
I have very recently been upgraded to Windows 7 with Excel 2010 at work. When printing a 10-page document (all pages of which are in landscape format), print preview shows the first page as landscape but subsequent pages as portrait.
If you change format of 2nd page to landscape all subsequent pages switch to landscape.
Have looked at a similar format document created last month and it behaves exactly the same. Whole document landscape but on print preview only first page is...
My office recently upgraded to Office 2010 and we (in the accounting department that I work in) would like to change the default number formatting in a blank sheet from the current default of General to Number, 0 decimals, with thousands separators.
I have looked for the Book.xltx file to replace but can't see it anywhere.
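As far as I know, Book.xltx does not exist until somebody creates it; Excel only uses it as the template for new blank workbooks if it finds one in the XLSTART folder. A minimal sketch of creating one from VBA, assuming the per-user startup folder returned by Application.StartupPath is the one to use and that "#,##0" is the wanted format:

```vba
Option Explicit

' Creates a Book.xltx in the user's XLSTART folder so that new blank
' workbooks start with the Number, 0 decimals, separator format.
Sub CreateDefaultTemplate()
    Dim wb As Workbook, ws As Worksheet

    Set wb = Workbooks.Add
    For Each ws In wb.Worksheets
        ws.Cells.NumberFormat = "#,##0"
    Next ws

    wb.SaveAs Filename:=Application.StartupPath & "\Book.xltx", _
              FileFormat:=xlOpenXMLTemplate
    wb.Close SaveChanges:=False
End Sub
```

This would need to be run once per user profile. Sheets inserted later into an existing workbook are based on a separate template, Sheet.xltx, which could be created the same way from a one-sheet workbook.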
I have a user here at my company that is having a strange issue with Excel. When she moves a page break in her document, Excel freezes up, then once it finally makes the change (if it doesn't crash), some (but not all) of the images that are in the document resize to super small.
For instance, she may have 50 rows. Each row contains a column with an image, then a few other columns with product information. Changing a page break may cause ten of the images to become tiny for no apparent reason. Resetting page breaks seems to cause the document to explode, with cells being thrown all over the page into different locations and columns becoming uneven.
When I make the same change on the same document on my system (both using similar specs and Office 2010), this does not happen.
I am using the code below to import a fixed-length text file into Excel. As the macro is written, it imports starting at the first line of the text file. How do I tell it to start importing at line 1000 and continue from there?
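Since the macro itself is not shown, the following is only a sketch of the general idea: read the file line by line, count the lines, and ignore everything before the starting line. The field positions (10, 20 and 8 characters) are hypothetical and stand in for whatever the real fixed-length layout is.

```vba
Option Explicit

' Imports a fixed-length text file into the active sheet,
' skipping every line before firstLine.
Sub ImportFromLine(ByVal filePath As String, ByVal firstLine As Long)
    Dim f As Integer, lineNo As Long, outRow As Long
    Dim txt As String

    f = FreeFile
    Open filePath For Input As #f
    outRow = 1
    Do While Not EOF(f)
        Line Input #f, txt
        lineNo = lineNo + 1
        If lineNo >= firstLine Then
            ' Hypothetical layout: three fields of 10, 20 and 8 characters.
            ActiveSheet.Cells(outRow, 1).Value = Trim$(Mid$(txt, 1, 10))
            ActiveSheet.Cells(outRow, 2).Value = Trim$(Mid$(txt, 11, 20))
            ActiveSheet.Cells(outRow, 3).Value = Trim$(Mid$(txt, 31, 8))
            outRow = outRow + 1
        End If
    Loop
    Close #f
End Sub
```

Called as ImportFromLine "C:\data\report.txt", 1000 (the path is a placeholder), it would begin at line 1000. If the existing macro uses Workbooks.OpenText or a text QueryTable instead, those have a StartRow argument and a TextFileStartRow property respectively, which may be a smaller change.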
I am looking to read the source code for a website that keeps the stats for a hockey league in Sweden.
For other sites I can use the code below and it works fine, but the site I am using to get the Sweden stats seems to keep the data in some type of Java app (sorry, I'm still somewhat of a newbie) and doesn't work the same as the others.
When I view the source code just by right-clicking the page, all the data I want shows up. When I try to use my code, it doesn't get the stuff I want.
I have tried both objDoc.body.innerHTML and objDoc.body.outerHTML and I get different results, but neither is the same as right-clicking on the page and viewing the source. Is there another command that I can use to get it all?
the website is
Sub Get_Stats()
    Const strURIpre As String = [url]
    Set ie = CreateObject("internetexplorer.application")
    ie.Navigate strURIpre
    Do
        If ie.ReadyState = 4 Then
            ie.Visible = False
            Exit Do
        Else
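A possible explanation for the difference: innerHTML and outerHTML return the browser's own serialization of the parsed page, whereas "view source" shows the text as it was downloaded. If the wanted data really is in the downloaded text, requesting the page directly may get closer to it. A minimal sketch using MSXML2.XMLHTTP, with the function name GetRawSource made up for the example:

```vba
Option Explicit

' Returns the page text exactly as the server sent it,
' or an empty string if the request did not succeed.
Function GetRawSource(ByVal url As String) As String
    Dim http As Object

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False      ' False = wait for the response
    http.send
    If http.Status = 200 Then
        GetRawSource = http.responseText
    End If
End Function
```

The returned string can then be searched with InStr and Mid in the same way as the IE source. This will not help if the page builds its data with scripts after loading, or if the site requires a logged-in browser session.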
Why when I drag the dotted blue page break line does it sometimes break the entire doc into one page per cell ?
The doc is not wide. When I first load I can drag the break line successfully. Then I print preview... select print on both sides... boom.. goes from 4 pages to 14. Then I go back to page break view... drag the line... boom... Hundreds of pages. Even if I revert back to printing on one side it still is messed up.
How do I make this stop?? What am I doing wrong?? Office 2010
I have over 800 pages of chart that take up only 6 columns and more than 9000 rows.
I want to print this chart on paper as I need hard copies. However, the chart in its current setup prints only on the left half of the page, leaving the right half empty.
How do I make use of the full space properly? Each chart has a "page number" on it, so I want the chart to print continuously from one half of the page onto the other half and then onto the second page, third, etc.
Here is a visual demonstration of how things currently are and how i'd like to get them to be:
As you can see, this is the first of many charts; it is numbered Page 9 and the next one is Page 10.
This is how it looks when I try to print: it's only on the left side and the right is all blank. Pic2
This is how I want it to look when printed: Pic3
As you can see in the last picture, once the Page 14 chart runs out of space, the chart automatically continues on the right side of the page and then moves on to print the rest.
I wanted to see if there is a VBA code to do the following :
a) Select data from a tab-delimited text file based on a criterion b) Import the selected data into Excel
I have VBA code that opens the tab-delimited text file in Excel, applies a selection criterion and then copies the data into Excel. But I am having problems when the tab-delimited text file exceeds Excel's row limit, and I wanted to see whether the import can be done without opening the text file in Excel at all.
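Reading the file with VBA's own file I/O avoids the row limit, because only the matching lines ever reach the worksheet. The sketch below assumes the criterion is "column keyCol equals keyValue"; both parameter names are made up for the example, and the real criterion may need a different test.

```vba
Option Explicit

' Copies only the lines whose keyCol-th field equals keyValue
' into the active sheet, without opening the file in Excel.
Sub ImportMatchingRows(ByVal filePath As String, ByVal keyCol As Long, _
                       ByVal keyValue As String)
    Dim f As Integer, txt As String, fields() As String
    Dim outRow As Long, c As Long

    f = FreeFile
    Open filePath For Input As #f
    outRow = 1
    Do While Not EOF(f)
        Line Input #f, txt
        fields = Split(txt, vbTab)
        ' Ignore blank or short lines that have no keyCol-th field.
        If UBound(fields) >= keyCol - 1 Then
            If fields(keyCol - 1) = keyValue Then
                For c = 0 To UBound(fields)
                    ActiveSheet.Cells(outRow, c + 1).Value = fields(c)
                Next c
                outRow = outRow + 1
            End If
        End If
    Loop
    Close #f
End Sub
```

If the matching lines themselves could exceed the sheet's row count, outRow would also need to be checked against ActiveSheet.Rows.Count before each write.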
I'm using the following code to import thousands of html files into my spreadsheet. The code is working fine. Since I am importing thousands of files, when there is no more space on my worksheet, the code stops with an error message. I want to make this code add another worksheet & continue importing the html files until there are no more files to import.
Sub Master_Importer()
    Dim I As Long
    Dim strFilename As String
    Dim strPath As String
    strPath = "file:///C:/Documents and Settings/c/Desktop/New Folder/"
    With Application.FileSearch
        .LookIn = "C:\Documents and Settings\c\Desktop\New Folder"
        .FileType = msoFileTypeAllFiles
        .Execute
        For I = 1 To .FoundFiles.Count
            strFilename = Mid(.FoundFiles(I), InStrRev(.FoundFiles(I), "\") + 1)
            With ActiveSheet.QueryTables.Add(Connection:= _
                "URL;" & strPath & strFilename _
                .......................
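Because the rest of the importer is cut off, the following is a sketch of the idea rather than a patch to it: track the next free row, and before each import switch to a freshly added worksheet if fewer than a safety margin of rows remain. The margin of 5000 rows and the "*.htm*" file pattern are assumptions. The sketch uses Dir to list the files because Application.FileSearch is not available from Excel 2007 onward.

```vba
Option Explicit

' Imports every HTML file in the folder, adding a new worksheet
' whenever the current one is nearly full.
Sub ImportAllHtml()
    Const SRC_FOLDER As String = "C:\Documents and Settings\c\Desktop\New Folder\"
    Const ROW_MARGIN As Long = 5000   ' assumed maximum rows needed by one file
    Dim ws As Worksheet, fileName As String, nextRow As Long

    Set ws = ActiveSheet
    nextRow = 1
    fileName = Dir(SRC_FOLDER & "*.htm*")
    Do While fileName <> ""
        ' Not enough room left: continue on a new sheet after this one.
        If nextRow + ROW_MARGIN > ws.Rows.Count Then
            Set ws = ws.Parent.Worksheets.Add(After:=ws)
            nextRow = 1
        End If
        With ws.QueryTables.Add( _
                Connection:="URL;file:///" & Replace(SRC_FOLDER, "\", "/") & fileName, _
                Destination:=ws.Cells(nextRow, 1))
            .Refresh BackgroundQuery:=False
            ' Continue directly below whatever this file produced.
            nextRow = .ResultRange.Row + .ResultRange.Rows.Count
        End With
        fileName = Dir
    Loop
End Sub
```

If any single file can produce more rows than the margin, the margin needs raising; a file that returns no data at all would make the ResultRange line fail, so some error handling would be needed there.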