Training videos for learning VBA - SCRAPING WEBSITES

VBA - scraping websites - 8 videos

Excel VBA Part 46 - Querying Web Pages with Query Tables

Posted by Andrew Gould on 14 November 2016

Querying web pages in Excel VBA is simple when you use Query Tables. This video shows you how to create a query table linked to a web page, how to choose which tables from the web page to return, and how to pass values to a URL query string to create dynamic, updatable web queries in your Excel workbooks.

Excel VBA Part 47.1 - Browsing to Websites and Scraping Web Page

Posted by Andrew Gould on 14 November 2016

Scraping a web page in VBA involves getting a reference to an HTML document and then processing its elements to extract the useful parts and write them to another location in a readable format. This video shows you how to do that using both Internet Explorer and basic XML HTTP requests. You'll learn a bit about the Document Object Model, as well as how to identify HTML elements by name and by tag type. You'll also see how to loop through various collections of HTML elements and their children using an example involving exchange rate tables.

Excel VBA Part 47.2 - Scraping Website Tables and Clicking Links

Posted by Andrew Gould on 07 May 2019

In this video you'll learn how to write VBA code to click links on a webpage to view different tables and then scrape the contents of those tables into new Excel worksheets.

Excel VBA Part 47.3 - Internet Explorer vs XML HTTP Request

Posted by Andrew Gould on 03 June 2019

This video shows you how to use VBA to scrape websites using Internet Explorer and XML HTTP Requests. You'll learn how to reference the correct object libraries, how to create an XML HTTP Request and capture the response text. You'll compare the performance of the XML HTTP Request with Internet Explorer by scraping a website of gambling odds and parsing an HTML table, writing the results to a new Excel worksheet.

Excel VBA Part 47.4 - Logging in to a Website with Windows Security

Posted by Andrew Gould on 03 June 2019

This video explains how to use VBA to enter Windows Security credentials when you connect to a website. You'll learn how to pass a username and password when you send an XML HTTP request. You'll also learn how to use the Windows Script Host Shell object to access the security dialog box and write a username and password using the SendKeys method.

Excel VBA Part 47.5 - Basic HTTP GET and POST Requests

Posted by Andrew Gould on 05 June 2019

This video explains the basic differences between the GET and POST methods and how you can use each method to send data along with an HTTP request. You'll learn about adding a query string to a URL for GET requests and how to send values in the body of a POST request. You'll also see how to ensure the values you pass to a request are properly encoded using the Excel EncodeURL function. Throughout the video you'll also learn the basics of using the Chrome Developer Tools to work out how requests are sent by the Chrome browser which helps when translating this to VBA code.

Excel VBA Part 48 - Scraping Multiple Web Pages

Posted by Andrew Gould on 17 November 2016

PLEASE NOTE - The design of the website used in this video has changed since the video was recorded. This means that the code shown in the video no longer works. The downloadable file contains both the original version of the code and a version which works with the current version of the website. What's better than scraping one web page? Scraping lots of them with the same procedure, of couse! This video explains how to loop over multiple pages using Microsoft's HTML and XML object libraries. You'll learn about HTML tags and classes, the Document Object Model and how to loop over elements on a page.

Excel VBA Part 49 - Downloading Files from Websites

Posted by Andrew Gould on 21 November 2016

PLEASE NOTE - The design of the website used in this video has changed since the video was recorded. This means that the code shown in the video no longer works. The downloadable file contains both the original version of the code and a version which works with the current version of the website. Excel VBA doesn't have a native method for downloading files from websites but you can declare an API function that will enable you to do this. This video takes you through the process of declaring the API function and using it in your code, along with a bunch of other useful techniques such as using folder pickers, creating folders with FileSystemObjects and opening a Windows Explorer window using the Shell function.

This page has 8 threads Add post
25 Jul 19 at 08:09

Hello Andrew,

Thanks alot for the videos, i have a problem when i login to the a website a new pop up window will be opened and all the required content to be gathered will be there in the new window. Please help me out how to activate the sencod window and get data from the newly opened window.

Thanks in advance.

 

08 May 19 at 15:18

Hi Andrew,

First of all, thank you very much for your videos, I've learnt a lot with them.

I have a problem with this web page

https://www.cenace.gob.mx/SIM/VISTA/REPORTES/PreEnergiaSisMEM.aspx

I'm able to web thorught the web, but I can't make click on the checkbox which is near the text "Precios de la Energia". With you videos I was able to change the status of the checkbox from rtUnchecked to rtChecked, but the result is very different than when I click on the web   ..

I think when I click on the web page some code is executed, but I can see how to do it ...

09 May 19 at 08:00

Hi there,

I've had a very quick look at the site you linked to but I'm going to need a little more time to work out how it works!  I'm teaching a training course for the next two days and so I won't get chance to do this until next week I'm afraid!  If you're in need of a quick answer can I suggest perhaps posting a question on stackoverflow?  They have lots of people eager to answer questions exactly like this!

I hope that helps and thanks for the question!

15 Sep 18 at 01:06

Hi Andrew

Are you able to help on below, I am attempting to scrape the "Bid Price" from a website but get "[object] as a return value.

I'm sure have faulted with element collections, would appreciate you assistance.

Thnak you

Sub Price_from_BB()

Dim XMLPriceRequest As New MSXML2.XMLHTTP60
    Dim HTMLPriceSheet As New MSHTML.HTMLDocument         
    Dim ProductPrice As MSHTML.IHTMLElementCollection   

    XMLPriceRequest.Open "GET", "https://barxis.barcap.com/ZA/10/en/instruments.app?statusId=4,5#/details/350041", False
    XMLPriceRequest.send

    HTMLPriceSheet.body.innerHTML = XMLPriceRequest.responseText

    Set ProductPrice = HTMLPriceSheet.getElementById("buttonSELL350041")

    Sheet1.Range("A1").Value = HTMLPriceSheet
    Sheet1.Range("B1").Value = Now
    Set XMLPriceRequest = Nothing
   
End Sub

 

 

 

15 Sep 18 at 09:46

Hi Brendon,

The reason that you're seeing [object] in cell A1 is that you've assigned the HTMLDocument to it.  I imagine that what you're arrempting to do is write the table of assets into the worksheet?  In which case you'll probably need to loop over the individual elements of the table to identify the values you need.  The techniques you need to do this are explained in the Excel VBA Part 47 - Browsing to Websites and Scraping Web Page video.

I hope that helps!

06 Sep 18 at 09:21

Hi Andrew,

I am trying to upload a file to a web page, the following are the steps I followed:

1. Open the web page 

2. Wait for until the page is getting loaded

3. In this webpage I am uploading the file into the first “Upload a File” browser.

4. Get the input element by tag name as “input”

5. Hit the “browse” button, since the paste portion is disabled.

6. Enter the file path in the “Choose File to Upload” window (path is in excel eg :D:\MArgin Discovery (Businness Assurance)\practice2.xlsx )

7. Enter After 5th step, I am not able to enter the file path in the “Choose File to Upload” window, looks like the macro is not supporting for this.

Here is my code:

Sub File_Test()
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim HTMLButtons As MSHTML.IHTMLElementCollection
    Dim HTMLButton As MSHTML.IHTMLElement
    Dim ie As Object

    Set ie = CreateObject("internetexplorer.application")
    ie.Visible = True
    ie.navigate "http://www.htmlquick.com/reference/tags/input-file.html"

    Do While ie.readyState <> READYSTATE_COMPLETE
    Loop

    Set HTMLDoc = ie.document
    Set HTMLButtons = HTMLDoc.getElementsByTagName("input")

    For Each HTMLButton In HTMLButtons
        If HTMLButton.Type = "file" Then
            HTMLButton.Click
            HTMLButton.Value = "C:\Documents\Test\Temp.txt"
            Exit For
        End If
    Next


Kindly help how to enter path name when input type is file.

Thank you

Regards,

Jouhar

06 Sep 18 at 10:02

I was going to suggest Stack Overflow but I see that you've already posted the question there!  You might try some of the suggestions at this link https://www.ozgrid.com/forum/forum/help-forums/excel-general/100265-automate-internet-explorer-file-upload

18 Aug 18 at 22:32

 Andrew,

Thanks again for your awesome videos. 

On VBA part 47 video, you show us how to input a value, click the "Go" search button, and scrape contents from queried URL.  I used the same technique to scrape the page after the search button is clicked.  It didn't return value (innertext).

Wonder if you can help me out.

Thanks,

Suntao

20 Aug 18 at 07:27

You can reference the Document property of the IE object after navigating to the second page to access the HTML and text of the page.  You may need to introduce a delay to your code to allow IE to catch up with the instruction to navigate to a new page before you attempt to read its contents.  The example below shows how you could do this:

Sub NavigateToAnotherPage()

    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim HTMLParagraph As MSHTML.IHTMLElement
    Dim HTMLParagraphs As MSHTML.IHTMLElementCollection
    
    IE.Visible = True
    IE.navigate "en.wikipedia.org/wiki/Main_Page"
    
    Do While IE.ReadyState <> READYSTATE_COMPLETE
    Loop
    
    Debug.Print "First Page = ", IE.LocationName, IE.LocationURL
    
    IE.Document.forms("searchform").elements("search").Value = "Document Object Model"
    IE.Document.forms("searchform").elements("go").Click
    
    Do While IE.ReadyState <> READYSTATE_COMPLETE
    Loop
    
    'Wait two seconds
    Application.Wait Now + TimeValue("00:00:02")
    
    Set HTMLDoc = IE.Document
    Set HTMLParagraphs = HTMLDoc.getElementsByTagName("p")
    
    Debug.Print "Second Page = ", IE.LocationName, IE.LocationURL
    
    For Each HTMLParagraph In HTMLParagraphs
        Debug.Print HTMLParagraph.innerText
    Next HTMLParagraph
    
End Sub

21 Aug 18 at 06:11

Andrew, 

Thanks for your comment.  It solved my problem! 

Suntao

Andrew G  
21 Aug 18 at 07:39

No problem, happy to hear it helped!

12 Jun 18 at 14:37

Thank you sir for your tutorials

 please help me, i'm trying to run your example code, but it gives me error "access denied",

how can I solve this kind of error?

13 Jun 18 at 07:01

Hi, the most likely reason for the error is that the Wise Owl website has been updated to use https rather than http since the video was made.  You should be able to replace any reference to "http" in the code to "https" to solve the problem.

I hope that helps!

25 May 18 at 09:17

Viewed Video #47 on youtube.

Any plans to expand on XHR in Excel VBA, especially with session cookies?

25 May 18 at 12:34

There aren't, I'm afraid!

25 May 18 at 15:35

Please could you point in the direction of any instructional material or course or video that details only XHR and session cookies within an Excel VBA environment?

Andy B  
28 May 18 at 14:25

'Fraid i don't know of any, beyond Googling!

17 May 18 at 10:29

Hi Andrew,

Thank you so much for the 3 videos about Web scraping. It did guide me a lot on how to identify HTML code. Your approach is great. I seem unable to find any routine of auto login (username, password) to web site via VBA code.

Greetings,

Dude6571

18 May 18 at 07:53

Hi, we don't have a video on how to do this but there are plenty of suggestions out there for ways to do this:

https://stackoverflow.com/questions/48919491/login-to-a-website-using-vba

https://stackoverflow.com/questions/21244051/using-vba-to-automate-ie-login-having-errors

https://www.ozgrid.com/forum/forum/help-forums/excel-general/69057-automate-username-password-log-in-to-web-page

https://stackoverflow.com/questions/24038230/fill-user-name-and-password-in-a-webpage-using-vba

Of course the exact technique you'll use will depend on the page you're using to login but hopefully that's enough to point you in the right direction!