Training videos for learning VBA - SCRAPING WEBSITES

We have 8 VBA - scraping websites videos listed below. You might also like to consider booking a place on one of our online or classroom VBA courses, where you will meet the owls behind the videos!

Excel VBA Part 46 - Querying Web Pages with Query Tables

Posted by Andrew Gould on 14 November 2016

Querying web pages in Excel VBA is simple when you use Query Tables. This video shows you how to create a query table linked to a web page, how to choose which tables from the web page to return, and how to pass values to a URL query string to create dynamic, updatable web queries in your Excel workbooks.

Excel VBA Part 47.1 - Browsing to Websites and Scraping Web Page

Posted by Andrew Gould on 14 November 2016

Scraping a web page in VBA involves getting a reference to an HTML document and then processing its elements to extract the useful parts and write them to another location in a readable format. This video shows you how to do that using both Internet Explorer and basic XML HTTP requests. You'll learn a bit about the Document Object Model, as well as how to identify HTML elements by name and by tag type. You'll also see how to loop through various collections of HTML elements and their children using an example involving exchange rate tables.

Excel VBA Part 47.2 - Scraping Website Tables and Clicking Links

Posted by Andrew Gould on 07 May 2019

In this video you'll learn how to write VBA code to click links on a webpage to view different tables and then scrape the contents of those tables into new Excel worksheets.

Excel VBA Part 47.3 - Internet Explorer vs XML HTTP Request

Posted by Andrew Gould on 03 June 2019

This video shows you how to use VBA to scrape websites using Internet Explorer and XML HTTP Requests. You'll learn how to reference the correct object libraries, how to create an XML HTTP Request and capture the response text. You'll compare the performance of the XML HTTP Request with Internet Explorer by scraping a website of gambling odds and parsing an HTML table, writing the results to a new Excel worksheet.

Excel VBA Part 47.4 - Logging in to a Website with Windows Security

Posted by Andrew Gould on 03 June 2019

This video explains how to use VBA to enter Windows Security credentials when you connect to a website. You'll learn how to pass a username and password when you send an XML HTTP request. You'll also learn how to use the Windows Script Host Shell object to access the security dialog box and write a username and password using the SendKeys method.

Excel VBA Part 47.5 - Basic HTTP GET and POST Requests

Posted by Andrew Gould on 05 June 2019

This video explains the basic differences between the GET and POST methods and how you can use each method to send data along with an HTTP request. You'll learn about adding a query string to a URL for GET requests and how to send values in the body of a POST request. You'll also see how to ensure the values you pass to a request are properly encoded using the Excel EncodeURL function. Throughout the video you'll also learn the basics of using the Chrome Developer Tools to work out how requests are sent by the Chrome browser which helps when translating this to VBA code.

Excel VBA Part 48 - Scraping Multiple Web Pages

Posted by Andrew Gould on 17 November 2016

PLEASE NOTE - The design of the website used in this video has changed since the video was recorded. This means that the code shown in the video no longer works. The downloadable file contains both the original version of the code and a version which works with the current version of the website. What's better than scraping one web page? Scraping lots of them with the same procedure, of couse! This video explains how to loop over multiple pages using Microsoft's HTML and XML object libraries. You'll learn about HTML tags and classes, the Document Object Model and how to loop over elements on a page.

Excel VBA Part 49 - Downloading Files from Websites

Posted by Andrew Gould on 21 November 2016

PLEASE NOTE - The design of the website used in this video has changed since the video was recorded. This means that the code shown in the video no longer works. The downloadable file contains both the original version of the code and a version which works with the current version of the website. Excel VBA doesn't have a native method for downloading files from websites but you can declare an API function that will enable you to do this. This video takes you through the process of declaring the API function and using it in your code, along with a bunch of other useful techniques such as using folder pickers, creating folders with FileSystemObjects and opening a Windows Explorer window using the Shell function.

This page has 11 threads Add post
08 May 20 at 19:36

Hi Andrew, 

I hope you and your family are keeping safe during these tough times.

First of all, we can’t really thank you enough for videos and another resources you made available in this very informative site. I have really benefited from your videos - and in particular during the current lockdown - (which was a great opportunity for many to improve their overall skills). I have already started using what I have learned to automate many of the tasks related to my work.

With regards to web scrapping videos: there is still two/three things that I can’t find: 

1. Filling an auto complete/auto select text. For example : entering a city name in Booking.com ( data entered should also be verified/matches from the auto- complete saved city , which is normally the first city they appear in top, example, putting London > London, then Greater London, United Kingdom option would be the correct one). As you can see, the data entered needs to be verified / matches first before hitting the search button otherwise an error would occur. 

2. Choosing travel/return data from the pop-up selection board.(this one is a direct choice rather than an auto-complete option like the city choice).

3. Pretty close to 2:  choosing date, month and then year before hiiting go on https://www.imf.org/external/np/fin/data/rms_sdrv.aspx.. (I can then parse the result in Excel using one of the method you explained in your videos) .  Note please that I was able to use URL concatenation on other section on the site, but this one as you can see requires to use the drop down method instead. 

Many thanks once again and looking forward to hearing from you

Stay safe

Marklah

09 May 20 at 11:02

We sadly haven't got time to answer, but I've left this post up in case any other users of this website can help?

23 Apr 20 at 16:31

Hi Andrew,

I'm Kartick from India. I've learned a lot of things about VBA from your Videos on YouTube. You're absolutely genius. But, recently I've faced a problem. Can you please give a solution on how should I count the number of tables on a certain Webpage? For your information, I tried the below code. But, instead of giving me the count of Total no of Tables on that Webpage, it's giving the count of Total no of Rows of all Tables on that page. Please, Andrew, give me a solution.  I'm counting on you. I'm giving my code below... ------------

Sub Count_Tables_on_a_Webpage ()

Dim IE as SHDocVw.InternetExplorer

Set IE = New ShDocVw.InternetExplorer

 

IE.Visible = True

IE.Navigate "https://www.wsj.com/market-data/quotes/AAPL/financials/annual/balance-sheet"

 

Do While IE.ReadyState <> ReadyState_Complete

Loop

 

Dim Doc As MSHTML.HTMLDocument

Set Doc = IE.Document

 

Dim TBL as MSHTML.IHTMLElement

 

Dim TBLs as _ MSHTML.IHTMLElementCollection

Set TBLs = Doc.getElementsbyTagName("Table")

 

MsgBox TBLs.Length

End Sub

23 Apr 20 at 19:29

Hi Kartick,

Happy to hear you've been finding the videos useful!

When I run your code it reports that there are 115 tables.

If I open the webpage in Google Chrome I can press CTRL + U to view the page source in a new tab.  If I press CTRL + F and search for "<table" it also reports that there are 115 tables.

So your code appears to be working correctly.  I think that the page simply contains many more tables than it seems when you look at the web page in a browser.  For example, each cell in the main table which displays a small column chart actually contains a nested table.

I don't know how many tables you expect to find but you might try looking for elements based on a class name rather than tag name.  For example, if you use this code to set the TBLs variable:

Set TBLs = Doc.getElementsByClassName("cr_dataTable")

When I run the code it reports that there are only 2 tables.

I hope that helps!

27 Apr 20 at 13:02

Thank you so much Sir, I am really happy to get the solution from you. You are just awesome. Sir, I'm looking for a good teacher for Oracle and Python. Will you teach me? My Whatsapp No is +91 8777425150. I'm from India. I wish you were born in India. Our country really really needs teachers like you. May God fulfill your every wish!

Andy B  
27 Apr 20 at 15:01

We're hoping to publish self-paced tutorials on Python online some time in the next few months - make sure you subscribe to our monthly email newsletter, and yoi'll hear about this as soon as it happens!

03 Mar 20 at 16:34

Hi  i was wondering if you have anything on scraping websites using Chrome and Excel VBA?

Regards

Colin

 

04 Mar 20 at 07:29

Hi Colin,

I'm afraid that we don't have anything on that topic but you could check out Selenium if automating Chrome is what you want https://codingislove.com/browser-automation-in-excel-selenium/

I hope that helps!

25 Jul 19 at 08:09

Hello Andrew,

Thanks alot for the videos, i have a problem when i login to the a website a new pop up window will be opened and all the required content to be gathered will be there in the new window. Please help me out how to activate the sencod window and get data from the newly opened window.

Thanks in advance.

 

08 May 19 at 15:18

Hi Andrew,

First of all, thank you very much for your videos, I've learnt a lot with them.

I have a problem with this web page

https://www.cenace.gob.mx/SIM/VISTA/REPORTES/PreEnergiaSisMEM.aspx

I'm able to web thorught the web, but I can't make click on the checkbox which is near the text "Precios de la Energia". With you videos I was able to change the status of the checkbox from rtUnchecked to rtChecked, but the result is very different than when I click on the web   ..

I think when I click on the web page some code is executed, but I can see how to do it ...

09 May 19 at 08:00

Hi there,

I've had a very quick look at the site you linked to but I'm going to need a little more time to work out how it works!  I'm teaching a training course for the next two days and so I won't get chance to do this until next week I'm afraid!  If you're in need of a quick answer can I suggest perhaps posting a question on stackoverflow?  They have lots of people eager to answer questions exactly like this!

I hope that helps and thanks for the question!

15 Sep 18 at 01:06

Hi Andrew

Are you able to help on below, I am attempting to scrape the "Bid Price" from a website but get "[object] as a return value.

I'm sure have faulted with element collections, would appreciate you assistance.

Thnak you

Sub Price_from_BB()

Dim XMLPriceRequest As New MSXML2.XMLHTTP60
    Dim HTMLPriceSheet As New MSHTML.HTMLDocument         
    Dim ProductPrice As MSHTML.IHTMLElementCollection   

    XMLPriceRequest.Open "GET", "https://barxis.barcap.com/ZA/10/en/instruments.app?statusId=4,5#/details/350041", False
    XMLPriceRequest.send

    HTMLPriceSheet.body.innerHTML = XMLPriceRequest.responseText

    Set ProductPrice = HTMLPriceSheet.getElementById("buttonSELL350041")

    Sheet1.Range("A1").Value = HTMLPriceSheet
    Sheet1.Range("B1").Value = Now
    Set XMLPriceRequest = Nothing
   
End Sub

 

 

 

15 Sep 18 at 09:46

Hi Brendon,

The reason that you're seeing [object] in cell A1 is that you've assigned the HTMLDocument to it.  I imagine that what you're arrempting to do is write the table of assets into the worksheet?  In which case you'll probably need to loop over the individual elements of the table to identify the values you need.  The techniques you need to do this are explained in the Excel VBA Part 47 - Browsing to Websites and Scraping Web Page video.

I hope that helps!

06 Sep 18 at 09:21

Hi Andrew,

I am trying to upload a file to a web page, the following are the steps I followed:

1. Open the web page 

2. Wait for until the page is getting loaded

3. In this webpage I am uploading the file into the first “Upload a File” browser.

4. Get the input element by tag name as “input”

5. Hit the “browse” button, since the paste portion is disabled.

6. Enter the file path in the “Choose File to Upload” window (path is in excel eg :D:\MArgin Discovery (Businness Assurance)\practice2.xlsx )

7. Enter After 5th step, I am not able to enter the file path in the “Choose File to Upload” window, looks like the macro is not supporting for this.

Here is my code:

Sub File_Test()
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim HTMLButtons As MSHTML.IHTMLElementCollection
    Dim HTMLButton As MSHTML.IHTMLElement
    Dim ie As Object

    Set ie = CreateObject("internetexplorer.application")
    ie.Visible = True
    ie.navigate "http://www.htmlquick.com/reference/tags/input-file.html"

    Do While ie.readyState <> READYSTATE_COMPLETE
    Loop

    Set HTMLDoc = ie.document
    Set HTMLButtons = HTMLDoc.getElementsByTagName("input")

    For Each HTMLButton In HTMLButtons
        If HTMLButton.Type = "file" Then
            HTMLButton.Click
            HTMLButton.Value = "C:\Documents\Test\Temp.txt"
            Exit For
        End If
    Next


Kindly help how to enter path name when input type is file.

Thank you

Regards,

Jouhar

06 Sep 18 at 10:02

I was going to suggest Stack Overflow but I see that you've already posted the question there!  You might try some of the suggestions at this link https://www.ozgrid.com/forum/forum/help-forums/excel-general/100265-automate-internet-explorer-file-upload

18 Aug 18 at 22:32

 Andrew,

Thanks again for your awesome videos. 

On VBA part 47 video, you show us how to input a value, click the "Go" search button, and scrape contents from queried URL.  I used the same technique to scrape the page after the search button is clicked.  It didn't return value (innertext).

Wonder if you can help me out.

Thanks,

Suntao

20 Aug 18 at 07:27

You can reference the Document property of the IE object after navigating to the second page to access the HTML and text of the page.  You may need to introduce a delay to your code to allow IE to catch up with the instruction to navigate to a new page before you attempt to read its contents.  The example below shows how you could do this:

Sub NavigateToAnotherPage()

    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim HTMLParagraph As MSHTML.IHTMLElement
    Dim HTMLParagraphs As MSHTML.IHTMLElementCollection
    
    IE.Visible = True
    IE.navigate "en.wikipedia.org/wiki/Main_Page"
    
    Do While IE.ReadyState <> READYSTATE_COMPLETE
    Loop
    
    Debug.Print "First Page = ", IE.LocationName, IE.LocationURL
    
    IE.Document.forms("searchform").elements("search").Value = "Document Object Model"
    IE.Document.forms("searchform").elements("go").Click
    
    Do While IE.ReadyState <> READYSTATE_COMPLETE
    Loop
    
    'Wait two seconds
    Application.Wait Now + TimeValue("00:00:02")
    
    Set HTMLDoc = IE.Document
    Set HTMLParagraphs = HTMLDoc.getElementsByTagName("p")
    
    Debug.Print "Second Page = ", IE.LocationName, IE.LocationURL
    
    For Each HTMLParagraph In HTMLParagraphs
        Debug.Print HTMLParagraph.innerText
    Next HTMLParagraph
    
End Sub

21 Aug 18 at 06:11

Andrew, 

Thanks for your comment.  It solved my problem! 

Suntao

Andrew G  
21 Aug 18 at 07:39

No problem, happy to hear it helped!

12 Jun 18 at 14:37

Thank you sir for your tutorials

 please help me, i'm trying to run your example code, but it gives me error "access denied",

how can I solve this kind of error?

13 Jun 18 at 07:01

Hi, the most likely reason for the error is that the Wise Owl website has been updated to use https rather than http since the video was made.  You should be able to replace any reference to "http" in the code to "https" to solve the problem.

I hope that helps!

25 May 18 at 09:17

Viewed Video #47 on youtube.

Any plans to expand on XHR in Excel VBA, especially with session cookies?

25 May 18 at 12:34

There aren't, I'm afraid!

25 May 18 at 15:35

Please could you point in the direction of any instructional material or course or video that details only XHR and session cookies within an Excel VBA environment?

Andy B  
28 May 18 at 14:25

'Fraid i don't know of any, beyond Googling!

17 May 18 at 10:29

Hi Andrew,

Thank you so much for the 3 videos about Web scraping. It did guide me a lot on how to identify HTML code. Your approach is great. I seem unable to find any routine of auto login (username, password) to web site via VBA code.

Greetings,

Dude6571

18 May 18 at 07:53

Hi, we don't have a video on how to do this but there are plenty of suggestions out there for ways to do this:

https://stackoverflow.com/questions/48919491/login-to-a-website-using-vba

https://stackoverflow.com/questions/21244051/using-vba-to-automate-ie-login-having-errors

https://www.ozgrid.com/forum/forum/help-forums/excel-general/69057-automate-username-password-log-in-to-web-page

https://stackoverflow.com/questions/24038230/fill-user-name-and-password-in-a-webpage-using-vba

Of course the exact technique you'll use will depend on the page you're using to login but hopefully that's enough to point you in the right direction!