Save Html Source As Text
Feb 5, 2008
I would like to be able to navigate to a site and save the source text of the HTML into a text file.
I need to get the HTML source code from any given page. I know how to open an HTML page from Excel and I can do it with VBA, but how do I get, for example, this page's source code?
I would start by making a sub that takes a string (the address) as an input parameter and finishes by saving the source code of that address as a text file such as c:\code.txt
So something like
Dim webaddress as string
Sub GetSourceCode(webaddress)
'then some code to save the source code
End Sub
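One possible shape for such a sub, as a minimal sketch rather than a definitive answer. It assumes Windows with MSXML 6 installed and uses late binding, so no library reference is needed. The helper names FetchSource and SaveText are hypothetical, and the C:\code.txt target is only an example path that must be writable.

```vba
Option Explicit

' Fetch the raw HTML of a page and return it as a string.
' Returns an empty string when the server does not answer with status 200.
Function FetchSource(ByVal webaddress As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")   ' late bound
    http.Open "GET", webaddress, False              ' False = synchronous call
    http.send
    If http.Status = 200 Then FetchSource = http.responseText
End Function

' Write a string to a text file, replacing any existing file.
Sub SaveText(ByVal text As String, ByVal savePath As String)
    Dim f As Integer
    f = FreeFile
    Open savePath For Output As #f
    Print #f, text;          ' trailing semicolon: no extra line break is appended
    Close #f
End Sub

Sub GetSourceCode(ByVal webaddress As String)
    SaveText FetchSource(webaddress), "C:\code.txt"   ' example path only
End Sub
```

It could be called as GetSourceCode "http://example.com/". Print # writes ANSI text, so pages with characters outside the local code page may need a different writer.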
I need to check a website daily to see if a link has been updated. If it has been updated, the beginning of the link changes to a different date. Example: today's link is www.10212009dave.com and tomorrow's link may be www.10222009dave.com. Let's say the link is on www.gugg.com. The link does not change every day, but I think a good way to see if it has been updated is to search through the source code in the HTML for that link.
Thus I would put www.10212009dave.com into cell A1 and tell excel to search the source code on www.gugg.com, and if the contents of cell A1 is NOT found, I'd display a message box stating the link has been updated.
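The check described above could be sketched like this. It is an illustration under assumptions: the page is reachable over plain HTTP at the address from the post (the http:// prefix is added here), MSXML 6 is available, and the names LinkHasChanged and CheckLink are hypothetical.

```vba
Option Explicit

' True when the expected link text no longer appears in the page source.
Function LinkHasChanged(ByVal pageSource As String, ByVal expectedLink As String) As Boolean
    LinkHasChanged = (InStr(1, pageSource, expectedLink, vbTextCompare) = 0)
End Function

Sub CheckLink()
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "GET", "http://www.gugg.com", False   ' synchronous request
    http.send
    ' Cell A1 holds the link that was current at the last check
    If LinkHasChanged(http.responseText, CStr(Range("A1").Value)) Then
        MsgBox "The link has been updated."
    End If
End Sub
```

Keeping the comparison in its own function makes it easy to try against a saved copy of the page before pointing it at the live site.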
I would like to retrieve contents of a web page, be it HTML or XML, into VBA variable!
Later, I would chop, cut, parse or extract the data I need.
Both importing as XML or WebQueries is unsatisfactory for a certain number of pages I need. XML has bad schema, WebQuery tells me it can't find any data.
I tried with WinHTTPRequest, but Excel gives me back the error "User-defined type not defined"; in other words, it doesn't recognize that object.
Basically I want the source of a web page to become a string in my VBA code. In other words, that would be a replication of the functionality of
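That compile error usually means the variable was declared with a type from a library that is not referenced in the project. A minimal sketch that sidesteps it by declaring the request As Object and creating it late bound follows. The function name PageSourceAsString is hypothetical; the alternative is to tick "Microsoft WinHTTP Services, version 5.1" under Tools > References.

```vba
Option Explicit

' Return the body of a web page (HTML or XML) as a VBA string.
Function PageSourceAsString(ByVal url As String) As String
    Dim req As Object                                  ' As Object: no reference required
    Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
    req.Open "GET", url, False                         ' synchronous request
    req.Send
    PageSourceAsString = req.ResponseText
End Function
```

The returned string can then be chopped up with InStr, Mid$ and Split as needed.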
What I need to do seems quite simple: I want to grab the source of a webpage into a string (where I'll then do some fiddling about with it to strip it down to the information I need). Currently I'm trying to do it using the WebBrowser object and meddling around with the .Document properties, but I can't figure it out.
UserForm1.WebBrowser1.Navigate UserForm1.address.Value
grr = UserForm1.WebBrowser1.Document.body.innerText
UserForm1.sourceoutput.Value = CStr(grr)
I am looking to read the source code for a website that keeps the stats for a hockey league in Sweden
For other sites I can use the code below and it works fine, but the site I am using to get the Sweden stats seems to keep the data in some type of Java app (sorry, still somewhat of a newbie) and doesn't work the same as the others.
When I view the source code just by right-clicking the page, all the data I want shows up. When I try to use my code, it doesn't get the stuff I want.
I have tried both objDoc.body.innerHTML and objDoc.body.outerHTML and I get different results, but not the same as right-clicking on the page and viewing the source. Is there another command that I can use to get it all?
the website is
HTML [url]
Sub Get_Stats()
Const strURIpre As String = [url]
Set ie = CreateObject("internetexplorer.application")
ie.Navigate strURIpre
Do
If ie.ReadyState = 4 Then
ie.Visible = False
Exit Do
Else
Goal: I have data that was copied to my clipboard from the webpage source in a Chrome browser. I would like to get that data over to my excel worksheet and insert it starting at "A1".
Issue: All of the pasted data is ending up in ONLY cell "A1" when using VBA.
When I just click in cell "A1" and CTRL-V, the data gets spread across a lot of cells, which is what I am after.
Code:
'------------------------------------------
'Start The Process
'------------------------------------------
' Assigning clipboard data to string variable strClip
Dim MyData As DataObject
Dim strClip As String
Set MyData = New DataObject
MyData.GetFromClipboard
[Code] .....
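Assigning a string to a single cell puts the whole string in that cell, which would explain the result. One way around it, sketched here under the assumption that the clipboard text uses line breaks between rows and tabs between columns, is to split the string and write each piece to its own cell. The name TextToCells is hypothetical.

```vba
Option Explicit

' Write tab/newline-delimited text into a block of cells starting at topLeft.
Sub TextToCells(ByVal text As String, ByVal topLeft As Range)
    Dim textLines() As String, fields() As String
    Dim r As Long, c As Long

    text = Replace(text, vbCrLf, vbLf)        ' normalise Windows line endings
    textLines = Split(text, vbLf)
    For r = 0 To UBound(textLines)
        fields = Split(textLines(r), vbTab)
        For c = 0 To UBound(fields)
            topLeft.Offset(r, c).Value = fields(c)
        Next c
    Next r
End Sub
```

With the variables from the post this would be called as TextToCells strClip, Range("A1"). If the string is not needed in VBA at all, ActiveSheet.Paste Destination:=Range("A1") pastes the clipboard the same way CTRL-V does.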
I have modified the export-a-range-to-HTML code from Mr Walkenbach's excellent book and it all works well (still learning my trade with VBA!). The only issue I have is that when the code runs, a Save As dialog box appears. As I am looking to automate this process, I was hoping to get this code to save automatically, preferably to a path ("c:dailyrange.htm" for example). I have tried various permutations but am really struggling with the concept.
I am using excel 2003.
The code
Sub ExportToHTML()
' Dim ws As Worksheet
Dim Filename As Variant
Dim TDOpenTag As String, TDCloseTag As String
Dim CellContents As String
Dim Rng As Range
Dim r As Long, c As Integer
'Create 7 htmls one for each column of the specified range
For Column = 1 To 7
Range(Cells(14, Column), Cells(40, Column)).Select
' Use the selected range of cells
Set Rng = Application.Intersect(ActiveSheet.UsedRange, Selection)
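The dialog normally comes from a call such as Application.GetSaveAsFilename, so assigning the file name directly removes it. The following is a simplified sketch of that idea, not the book's code: it builds a bare table per column and writes it to a fixed path. The folder C:\daily and the file naming are hypothetical, the folder must already exist, and cell text is not HTML-escaped.

```vba
Option Explicit

' Build a plain HTML table from the displayed text of a range.
Function RangeToHtml(ByVal rng As Range) As String
    Dim r As Long, c As Long, s As String
    s = "<table>" & vbCrLf
    For r = 1 To rng.Rows.Count
        s = s & "<tr>"
        For c = 1 To rng.Columns.Count
            s = s & "<td>" & rng.Cells(r, c).Text & "</td>"
        Next c
        s = s & "</tr>" & vbCrLf
    Next r
    RangeToHtml = s & "</table>"
End Function

Sub ExportColumnsToHtml()
    Dim col As Long, f As Integer
    For col = 1 To 7
        f = FreeFile
        ' Fixed, hypothetical path: one file per column, no dialog shown
        Open "C:\daily\range" & col & ".htm" For Output As #f
        Print #f, RangeToHtml(Range(Cells(14, col), Cells(40, col)))
        Close #f
    Next col
End Sub
```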
I have Excel 2003 and I have a macro that sorts data and then saves it as an HTML page. When I was upgraded to Excel 2003, it started saving the sheet as 'mhtml', which is causing me other problems. Using:
With ActiveWorkbook.PublishObjects.Add(xlSourceRange, _
"C:Documents and Settings holg1My DocumentsQCMA events2007 est.htm" _
, "Event", saverange, xlHtmlStatic, "total_points_2007_8160", "")
.Publish (True)
.AutoRepublish = False
End With
with a defined document name (test.htm), it works (saves as html doc). using:
eventname = "jan22"
With ActiveWorkbook.PublishObjects.Add(xlSourceRange, _
eventname, "Event", _
saverange, xlHtmlStatic, "total_points_24036", "")
.Publish (True)
.AutoRepublish = False
End With
with a variable document name (jan22), it saves as mhtml. How do I make it save as an HTML doc instead of an MHTML doc?
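One plausible cause, which the post does not confirm, is that "jan22" carries no .htm extension while the working example ends in .htm, so Excel falls back to its default web format. A small helper that guarantees the extension is sketched below; the name WithHtmExtension is hypothetical.

```vba
Option Explicit

' Append ".htm" unless the name already ends in .htm or .html.
Function WithHtmExtension(ByVal fileName As String) As String
    Dim lowerName As String
    lowerName = LCase$(fileName)
    If Right$(lowerName, 4) = ".htm" Or Right$(lowerName, 5) = ".html" Then
        WithHtmExtension = fileName
    Else
        WithHtmExtension = fileName & ".htm"
    End If
End Function
```

In the second snippet the file name argument would then become WithHtmExtension(eventname), ideally with a full folder path in front of it.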
How i can save my excel file as HTML but keep my formatting stay exactly the same as my excel file?
I have previously used the following code to successfully pull out IE webpage source code for string manipulation.
It's a crude example to demonstrate the principle:
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Public IE As Object
Sub Sample()
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
[Code] ......
However when I substitute in a Google websites address into the IE.Navigate command, the code runs to the "Source_Code = IE.document ...." line then flags up a Microsoft Visual Basic error. "Run-time error '438': Object doesn't support this property or method"
The webpage that I am trying to access is a confidential company site, so you won't be able to access it yourself, but starts with [URL] ......
The one thing that I have noticed about this website is the Privacy Report icon in the lower right status window (Picture of an eye with a restricted symbol in front). I don't know whether this is the cause of my problem, or purely an incidental observation.
Is there something peculiar with Google sites that means that the source code cannot be extracted in general, or is this an issue specific to my site ? Does the Privacy Report icon have any relevance, and if so how do I switch that off ?
Using :
MS Excel 2010
IE Explorer 8.0
I have a folder containing 40 single sheet excel workbooks and I would like to automate following tasks:
- Open each excel file (need to open the file so as to update it since it gets the data from another workbook through =formulas)
- Copy paste as values
- Save this as excel html in the same folder as original excel files (keeping the original file name)
- Close (original excel file should not be changed ie formulas should remain in place, only the html file will contain values)
- Since there will always be xHtml files with the same name, the macro needs to replace the existing file
My abilities with excel are limited to functions, no VBA knowledge other than finding ready codes and pasting them in the module.
Since this routine is to be run almost daily the macro should run all files, instead of one by one.
I just hope that I am not asking too much for excel to handle and I hope that explanation is clear.
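The steps listed above could be sketched roughly as follows. This is an outline under assumptions: the folder path is hypothetical, each workbook has a single sheet as described, and the macro lives in a workbook stored outside that folder.

```vba
Option Explicit

Sub SaveFolderAsHtml()
    Const SRC_FOLDER As String = "C:\Reports\"     ' hypothetical folder, trailing backslash required
    Dim wbName As String, htmlPath As String
    Dim wb As Workbook

    Application.DisplayAlerts = False              ' overwrite existing .htm files without asking
    wbName = Dir(SRC_FOLDER & "*.xls*")
    Do While wbName <> ""
        ' UpdateLinks:=3 asks Excel to refresh links to the other workbook on open
        Set wb = Workbooks.Open(SRC_FOLDER & wbName, UpdateLinks:=3)
        With wb.Worksheets(1).UsedRange
            .Value = .Value                        ' formulas become values, in memory only
        End With
        htmlPath = SRC_FOLDER & Left$(wbName, InStrRev(wbName, ".") - 1) & ".htm"
        wb.SaveAs Filename:=htmlPath, FileFormat:=xlHtml
        wb.Close SaveChanges:=False                ' the original file on disk keeps its formulas
        wbName = Dir                               ' next matching file
    Loop
    Application.DisplayAlerts = True
End Sub
```

Because the values-only version is saved under a new name and the workbook is closed without saving, the original files are never written to.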
When I copy from Excel 2003 (values & formulas) and paste special into Excel 2007, I get the option screen to select Unicode Text, SYLK, etc. instead of the other screen with the options for values, formulas, formats, etc. How can I select the option for values or formulas?
Sorry cannot attach a screen shot as it is above the allowed limit.
I've got a shared workbook, which will be distributed among several users and stored in different places. The workbook uses the following
Sub savemeas()
Worksheets("data").Visible = True
Sheets("data").Copy
Application.DisplayAlerts = False
ActiveWorkbook.SaveAs Filename:="d:" & Range("a1").Value, _
FileFormat:=xlText, CreateBackup:=False
Application.DisplayAlerts = True
ActiveWorkbook.Close SaveChanges:=True
End Sub
to save the sheet "data" as a text file with a name based on the value of cell a1. All I need is to modify the code so that the target path would not be
ActiveWorkbook.SaveAs Filename:="d:" & Range("a1").Value
but be the same as the source workbook's, so that I wouldn't have to modify the code for each user separately, because the sheet would always be saved in the same folder as the current path of the source workbook.
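ThisWorkbook.Path returns the folder of the workbook that contains the macro, so it can stand in for the hard-coded drive. A sketch of the modified sub follows, assuming the file name sits in cell A1 of the "data" sheet and that the source workbook has been saved at least once (otherwise Path is empty).

```vba
Option Explicit

Sub savemeas()
    Dim targetPath As String

    ' Build the path first: after .Copy the new one-sheet workbook becomes active
    targetPath = ThisWorkbook.Path & "\" & _
                 ThisWorkbook.Worksheets("data").Range("a1").Value

    ThisWorkbook.Worksheets("data").Visible = True
    ThisWorkbook.Worksheets("data").Copy            ' creates a new workbook holding the copy
    Application.DisplayAlerts = False
    ActiveWorkbook.SaveAs Filename:=targetPath, _
        FileFormat:=xlText, CreateBackup:=False
    Application.DisplayAlerts = True
    ActiveWorkbook.Close SaveChanges:=False         ' already saved by SaveAs
End Sub
```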
I currently have a macro to import user selected .Dat files into a new workbook, each on its own worksheet. My problem comes in trying to save this new workbook in the same folder as the imported .Dat files. I was thinking there should be a way to gather the file path from the imported files and use that in the Save As command.
[Code]......
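The folder can be cut out of any full file path by keeping everything up to the last backslash. A minimal sketch, with the hypothetical name FolderOf:

```vba
Option Explicit

' Return the folder part of a full path, including the trailing backslash.
' Returns an empty string when the path contains no backslash.
Function FolderOf(ByVal fullPath As String) As String
    FolderOf = Left$(fullPath, InStrRev(fullPath, "\"))
End Function
```

If the selected .Dat paths are held in an array from Application.GetOpenFilename, the new workbook could then be saved with something like newBook.SaveAs FolderOf(selectedFiles(1)) & "Imported.xls", where newBook, selectedFiles and the file name are placeholders.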
I'm using Microsoft Office 2003 and have tried everything I can think of to strip the formatting from data I exported into Excel from the internet. I've tried DATA / TEXT TO COLUMNS, Formatting, LEFT, RIGHT, exporting to NotePad and back again... nothing works?
I have 85 HTML files that I open in Excel. The files have a bunch of columns with numbers. Excel handles most of them properly, but if the number looks like a date, it is imported as a date (which it shouldn't be). For example, if the number is 13-1, Excel handles it fine, but if the number is 12-1, Excel thinks it is a date and imports Dec-01. How do I get Excel to import it as 12-1?
I have some code that loops through a bunch of text files, finding any that contain an href, and printing that entire line (if found) into Excel. These text files are source code for a website. What I need to do, within the line being pasted, is grab only a few things from within some tags, such as the info between the <title>This is the title</title> tags, and print it into a column. I do not want the entire line, just certain things that are in the line. I have supplied the code that I currently have. I have it so that 'WholeLine' contains the entire line. Can I manipulate that with something like Cells(myR, 3).Value = WholeFile(?).
Sub CheckTextFilesForHREFs()
MsgBox "Press OK to begin report"
Dim WholeLine As String
Dim myPath As String
Dim workfile As String
Dim myR As Long
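Pulling the text between two known tags out of a line can be done with InStr and Mid$. A small sketch follows; the function name TextBetween is hypothetical and it returns only the first match in the line.

```vba
Option Explicit

' Return the text between the first openTag and the following closeTag,
' or an empty string when either tag is missing.
Function TextBetween(ByVal source As String, ByVal openTag As String, _
                     ByVal closeTag As String) As String
    Dim startPos As Long, endPos As Long

    startPos = InStr(1, source, openTag, vbTextCompare)
    If startPos = 0 Then Exit Function
    startPos = startPos + Len(openTag)              ' first character after the opening tag

    endPos = InStr(startPos, source, closeTag, vbTextCompare)
    If endPos = 0 Then Exit Function

    TextBetween = Mid$(source, startPos, endPos - startPos)
End Function
```

Inside the loop from the post this would be used as Cells(myR, 3).Value = TextBetween(WholeLine, "<title>", "</title>").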
Here is my dilemma: I am opening an HTML file with Excel. There are text boxes that appear in excel from the HTML file with text in them. I would like to write a macro to copy the text from each text box and paste it into a cell. I have attached the excel file with the html text boxes in question.
Many lines on my sheet have the following text in col B.
****** http-equiv="Content-Type" content="text/html; charset=UTF-8"> ****** name="generator" content="http://www.movabletype.org/"> Church Marketing Sucks: Evangelism & Outreach Archives
Is there a way to extract all the text or words between the and tags and put the extracted text into col D?
I am trying to log a specific portion of code from a webpage. The line of code looks like this:
View Details
I need to extract the userid portion, the part between "=" and "'target...." and then
I am attempting to extract a particular piece of data from a webpage. I was not able to use a webquery because the data can only be reached by searching an online database and the URL remains static throughout this process.
http://gisims2.co.miami-dade.fl.us/myhome/proptext.asp
The data of interest is contained in a simple, 2-column table with item descriptions in the first column and item values in the second. The code below is my closest attempt. I am attempting to look through the innertext of all the tables on the results page and see if any contain the text "CLUC", which is the description of the data I'm trying to retrieve. The code never finds any qualifying tables.
Sub PropInfo()
Dim appIE As SHDocVw.InternetExplorer
Set appIE = New SHDocVw.InternetExplorer
Dim varTables, varTable
Dim varRows, varRow
Dim varCells, varCell
Dim lngRow As Long, lngColumn As Long
'OPEN INTERNET EXPLORER, GO TO WEBPAGE
appIE.Visible = True
appIE.navigate "http://gisims2.miamidade.gov/MyHome/proptext.asp"
Do While appIE.Busy: DoEvents: Loop
Do While appIE.readyState <> 4: DoEvents: Loop.........................
I have this export containing HTML content in each cell. I need to filter out a specific code from the links included in the HTML.
HTML Code:
<some HTML>
<a href="http://site.com/content/GE6053" class="button"><span>Text</span></a>
<a href="http://site.com/content/GE123" class="button"><span>Text</span></a>
<some HTML>
I need to get the string GE#### before each " class="button"> and copy it on a cell on the right. There are other links of this format [URL] ..... in the cell, but I am interested only the ones that have " class="button"> after it.
The length of the ID after GE can be 2, 3, 4 or 6 characters long. But I am ok with getting GE + 6 characters following it as that means I would get something like GE12" cl and I will delete the extra character by doing a find/replace.
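A regular expression can anchor on the class="button" text directly, which avoids the fixed-length grab and the clean-up afterwards. The sketch below uses the late-bound VBScript.RegExp object available on Windows; the function name ButtonCodes is hypothetical, and the pattern assumes the code is the last path segment of the link, as in the two examples.

```vba
Option Explicit

' Collect every GE code that is immediately followed by " class="button".
Function ButtonCodes(ByVal html As String) As Collection
    Dim re As Object, m As Object

    Set ButtonCodes = New Collection
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True                      ' find all matches, not just the first
    re.IgnoreCase = True
    ' Matches: /GE<letters or digits>" class="button"
    re.Pattern = "/(GE\w+)""\s+class=""button"""

    For Each m In re.Execute(html)
        ButtonCodes.Add m.SubMatches(0)   ' the captured GE code only
    Next m
End Function
```

Each item in the returned collection can then be written to the cells on the right of the source cell.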
I have a cell in which I have the following data (for example):
<a href="http://www.trucks.com">Ford Trucks</a>
I need to export the sheet as a tab delimited txt file for import into another program. When excel saves the file as .txt, it add extra data so that the cell is represented as:
"<a href=""http://www.trucks.com"">Ford Trucks</a>"
Note the set of two additional inverted commas. This extra data interferes with the parsing of the data in the other program. I've tried formatting the cells to "general" and "text", however, it does not seem to affect the txt output.
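Excel adds those quotes itself when saving as text, so cell formatting does not affect them. One workaround is to write the file from VBA with Print #, which outputs each string exactly as given. A minimal sketch, with the hypothetical name ExportTabDelimited:

```vba
Option Explicit

' Write the used range of a sheet as tab-delimited text without added quotes.
Sub ExportTabDelimited(ByVal ws As Worksheet, ByVal savePath As String)
    Dim rng As Range, r As Long, c As Long
    Dim f As Integer, lineText As String

    Set rng = ws.UsedRange
    f = FreeFile
    Open savePath For Output As #f
    For r = 1 To rng.Rows.Count
        lineText = ""
        For c = 1 To rng.Columns.Count
            If c > 1 Then lineText = lineText & vbTab
            lineText = lineText & rng.Cells(r, c).Text   ' text as displayed in the cell
        Next c
        Print #f, lineText        ' Print # writes the string verbatim; Write # would quote it
    Next r
    Close #f
End Sub
```

It could be called as ExportTabDelimited ActiveSheet, "C:\export.txt", where the path is only an example.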
So I have (some sort of standard) code to generate a Html emailbody.
Problem is I have data and on this data there is a chart.
Now when I copy and paste the range of these 2 sections it only gives me the data but not the chart (leaves that space blank).
How I can adjust this code so it also will paste the chart?
This is the code :
[Code] .....
Getting some web page data into Excel 2010 using VBA. My scenario however is set up with the following titles in cell A1, B1, C1, D1 and E1 : POST CODE, OUTLET, ADDRESS, TELEPHONE, EMAIL
The result I want to achieve is I enter a post code into cell A2 for example, Excel then uses IE to navigate to the relevant web page as defined in the VBA code. I then want the following to happen:
- The InnerText of the web page's h1 tag is then inserted into the OUTLET cell (B2)
- The first instance of the p tag is then inserted into the ADDRESS cell (C2)
- The second instance of the p tag is then inserted into the TELEPHONE cell (D2)
- The third instance of the p tag is then inserted into the EMAIL cell (E2)
All instances of the p tag are contained in a div element called div class="adBox_content" . There are also 5 other DIVs above that DIV in the hierarchy.
Using the YouTube tutorial link, the method has worked for me using the getElementsByTagName("h1").innerText
However, when I try adding a second getElementsByTagName("p")(01).innerText the whole thing fails.
So I'm left with two problems; I can't make the VBA get more than one element at a time from the page, I can only either have the h1 or the first instance of the p tag. I've tried all the getElementBy methods and none of them seem to work in getting the second and third instances to show.
I also need the code to put the data ONLY on the same row as where the post code was entered. In this scenario, for example, when entering a post code into A2, the OUTLET needs to land in cell B2 only, the ADDRESS in C2 only, etc.
By following the youtube tutorial above by giving the cells names to refer to in the code, the data ends up being inputted in all further rows with identical cell names. I need it to not do that.
The code is needed for around 300 rows of post codes that will be entered and refreshed every week or so.
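getElementsByTagName returns a collection, so each element has to be picked out by a zero-based index, and writing with Cells(row, column) keeps the output on one row. The sketch below parses an HTML string with a late-bound "htmlfile" document instead of driving IE. The name FillOutletRow is hypothetical, and it takes the first three p tags of the whole page, which may need narrowing to the adBox_content div on the real site.

```vba
Option Explicit

' Parse html and write the h1 and the first three p texts to columns B:E of targetRow.
Sub FillOutletRow(ByVal html As String, ByVal targetRow As Long)
    Dim doc As Object, heads As Object, paras As Object
    Dim i As Long

    Set doc = CreateObject("htmlfile")          ' MSHTML document, late bound
    doc.body.innerHTML = html

    Set heads = doc.getElementsByTagName("h1")
    Set paras = doc.getElementsByTagName("p")

    If heads.Length > 0 Then Cells(targetRow, 2).Value = heads(0).innerText
    For i = 0 To 2                              ' indexes 0, 1, 2 = first, second, third p
        If i < paras.Length Then Cells(targetRow, 3 + i).Value = paras(i).innerText
    Next i
End Sub
```

For around 300 post codes this could be called in a loop over the rows of column A, passing the page source fetched for each post code and the row number.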
I've got the following code and have been trying to make the cells in column 1 align TOP LEFT but haven't been able to.
[Code] .......
The problem is, these identifiers are in no discernable or predictable pattern. I cannot open the page directly in excel, nor can I use the Import Data from Web function (2007) ... results are simply a blank page.
What I thought I could do, then, is automate the procedure that obtains the source code, which I can parse and look for the current date. Once I have the line with the current date, I can extract the unique identifier, then paste it back into a string and resubmit to the browser.
I just can't figure out how to get to the source code... anybody out there have a way to get to it? Since this is going to ultimately be distributed to 20 or so analysts in different countries, I don't think I can use other tools (like the HTML Extractor from Iconico).
I have a workbook for employee time keeping. I have designed an Input Box that has 15 text boxes (7 different hour types, 2 weeks, one total box). I have everything working properly except I want to make the control source relative. When the user clicks on the name of an employee (column A), then clicks the macro button, the Input Box appears. I need the text boxes to be linked to the cells E:S on the same row as the active cell. I've tried typing in ActiveCell.Offset(0,4) and variants of it, but all are rejected. How can I link the text boxes using active cell and offset?
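The ControlSource box in the Properties window only takes a fixed address, so an expression typed there is rejected. The property can be set in code when the form loads instead. A sketch, assuming the form is a UserForm and the boxes are named TextBox1 to TextBox15 in column order (both are assumptions, not stated in the post):

```vba
Option Explicit

' Goes in the UserForm's code module.
Private Sub UserForm_Initialize()
    Dim i As Long
    For i = 1 To 15
        ' Column 5 (E) through column 19 (S) on the active cell's row
        Me.Controls("TextBox" & i).ControlSource = _
            ActiveCell.EntireRow.Cells(1, 4 + i).Address(External:=True)
    Next i
End Sub
```

Because the addresses are computed when the form opens, the user has to select the employee's row before clicking the macro button, as described.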
I work in my Client's office and assist in settling construction disputes. Part of this work is to browse/search their server for documents that may assist in strengthening their case.
During this review I have found an excel document which is a text-only version of a pivot table, ie someone has done a copy, paste special, values into this sheet. I need to extract the original source data from this table back into the list format, as the original source of the data cannot be located
The row titles on the left are activity descriptions, the column headers are dates and the data in the body of the table is hours. As an idea of size, the data is spread over 213 columns and 45 rows. There are more blank cells in the table than entries.
What I would like to do is create the data in its original form, i.e.
Column A; Date
Column B; Activity Description
Column C; Hours
and have a separate row for each instance of an entry of hours from the pivot table.
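Turning the table back into a list is a matter of walking every body cell and writing one output row per non-blank entry. A minimal sketch, assuming the first row of the source range holds the dates and the first column holds the activity descriptions; the name UnpivotToList is hypothetical.

```vba
Option Explicit

' Write one Date / Activity Description / Hours row per non-blank body cell of src.
Sub UnpivotToList(ByVal src As Range, ByVal dest As Range)
    Dim r As Long, c As Long, outRow As Long

    Set dest = dest.Cells(1, 1)                       ' use the top-left cell only
    dest.Resize(1, 3).Value = Array("Date", "Activity Description", "Hours")
    outRow = 1
    For r = 2 To src.Rows.Count                       ' row 1 holds the date headers
        For c = 2 To src.Columns.Count                ' column 1 holds the activities
            If Not IsEmpty(src.Cells(r, c).Value) Then
                dest.Offset(outRow, 0).Value = src.Cells(1, c).Value
                dest.Offset(outRow, 1).Value = src.Cells(r, 1).Value
                dest.Offset(outRow, 2).Value = src.Cells(r, c).Value
                outRow = outRow + 1
            End If
        Next c
    Next r
End Sub
```

It could be called as UnpivotToList Worksheets("Pivot").Range("A1").CurrentRegion, Worksheets("List").Range("A1"), where both sheet names are placeholders.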