VBA - scraping websites videos | Excel VBA Part 49 - Downloading Files from Websites

Posted by Andrew Gould on 21 November 2016


PLEASE NOTE - The design of the website used in this video has changed since the video was recorded. This means that the code shown in the video no longer works. The downloadable file contains both the original version of the code and a version which works with the current version of the website.

Excel VBA doesn't have a native method for downloading files from websites, but you can declare an API function that will enable you to do this. This video takes you through the process of declaring the API function and using it in your code, along with a bunch of other useful techniques such as using folder pickers, creating folders with FileSystemObjects and opening a Windows Explorer window using the Shell function.
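
As a rough illustration of the supporting techniques mentioned above (folder picker, FileSystemObject and Shell), here is a minimal sketch. It is not the code from the video: the procedure names PickFolder, EnsureSubfolder and PrepareDownloadFolder and the subfolder name "Downloaded Files" are made up for this example.

```vba
Option Explicit

' Returns the folder chosen by the user, or "" if the dialog was cancelled.
Function PickFolder() As String
    With Application.FileDialog(msoFileDialogFolderPicker)
        .Title = "Choose a folder for the downloaded files"
        .AllowMultiSelect = False
        If .Show = -1 Then PickFolder = .SelectedItems(1)
    End With
End Function

' Creates SubfolderName inside ParentFolder if it does not already exist
' and returns the full path. ParentFolder itself must already exist.
Function EnsureSubfolder(ByVal ParentFolder As String, _
                         ByVal SubfolderName As String) As String
    Dim FSO As Object
    Dim FullPath As String

    ' Late binding, so no reference to Microsoft Scripting Runtime is needed
    Set FSO = CreateObject("Scripting.FileSystemObject")
    FullPath = FSO.BuildPath(ParentFolder, SubfolderName)

    If Not FSO.FolderExists(FullPath) Then FSO.CreateFolder FullPath
    EnsureSubfolder = FullPath
End Function

' Hypothetical entry point: pick a folder, create a subfolder, show it.
Sub PrepareDownloadFolder()
    Dim ParentFolder As String
    Dim Target As String

    ParentFolder = PickFolder()
    If ParentFolder = "" Then Exit Sub   ' user cancelled

    Target = EnsureSubfolder(ParentFolder, "Downloaded Files")

    ' Open a Windows Explorer window on the new folder
    Shell "explorer.exe """ & Target & """", vbNormalFocus
End Sub
```

Late binding is used for the FileSystemObject so that the sketch runs without setting a reference; with a reference to Microsoft Scripting Runtime you could declare the variable as Scripting.FileSystemObject instead.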


This video has the following accompanying files:

File name | Type
OBSOLETE - Download Files from URL.xlsm | Excel workbook with macros
REVISED 2018-08-21 - Download Files from URL.xlsm | Excel workbook with macros

Click to download a zipped copy of the above files.

There are no exercises for this video.


This page has 9 threads
16 Oct 21 at 18:05

Hi Andy,

First, these are excellent videos. I have done a ton of coding in Microsoft Office but am very new to navigating internet files via VBA. These have been great introductory videos and I am still working through them all.

I have a specific question I'm hoping you can address:

Is it possible to navigate to your Google Drive and loop through all Sheet files in a designated folder and export them as .CSV? I have tried to do things such as "Inspect" or "Inspect Element" in both Chrome and Edge, but those methods do not exist.

I have read around on the internet and there are posts which say this is not possible, but others that say it is. Any support would be greatly appreciated!

Thanks,

Josh

18 Oct 21 at 06:46

Hi Josh,

It's not something I've done so I don't have any advice on this one, sorry. Perhaps one of our other viewers can shed some light?

03 Jun 21 at 07:16

Hi Andrew, thanks for all the great explanations and videos you have on VBA scripting. Recently I was trying to copy an Excel file from my Microsoft 365 SharePoint path to a local desktop folder, using the same code that you explain in your video. The file is created in the local path, but when I open the copied file it is blank and Excel shows the error message: "Excel cannot open the file 'XXXXXXX.xlsx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file."

I have used the below code to copy file from sharepoint.

Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" ( _
        ByVal pCaller As LongPtr, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As LongPtr, _
        ByVal lpfnCB As LongPtr _
    ) As Long

Sub URLDownloadURLWeb()

    Dim FileURL As String
    Dim DestinationFile As String

    FileURL = "https://sharepoint.com/sites/WestESG/_layouts/15/Doc.aspx?sourcedoc=%7BDA181359-66E3-4C4F-BD2E-8E2DE44D5CB3%7D&file=Tracker_data.xlsx&action=default&mobileredirect=true&CT=1622699547701&OR=ItemsView"
    DestinationFile = "C:\Desktop\macro\Tracler_Data.xlsx"

    URLDownloadToFile 0, FileURL, DestinationFile, 0, 0

End Sub

 

Note: I am using Microsoft Office 365 ProPlus (Excel).

The OS is 64-bit.

Kindly help me sort out this issue.

03 Jun 21 at 19:27

Hi there! I don't have any experience with downloading files from SharePoint, but the answer on this StackOverflow post appears to have worked for at least one person:

https://stackoverflow.com/questions/42419486/how-to-download-a-file-from-sharepoint-with-vba

I hope it helps!
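
One way to narrow a problem like this down, offered only as a sketch: a genuine .xlsx file is a zip archive, so its first two bytes are the characters "PK". If the saved file starts with something else, the server may well have sent back a web page (for example a document viewer or sign-in page) rather than the workbook itself, which would explain the "file format or file extension is not valid" message. The function name LooksLikeXlsx and the procedure CheckDownloadedFile are hypothetical.

```vba
Option Explicit

' Returns True if the file at FilePath begins with the zip signature "PK",
' which every .xlsx workbook does. Returns False if the file is missing,
' shorter than two bytes, or starts with anything else (such as HTML).
Function LooksLikeXlsx(ByVal FilePath As String) As Boolean
    Dim FileNum As Integer
    Dim Signature(0 To 1) As Byte

    If Dir(FilePath) = "" Then Exit Function
    If FileLen(FilePath) < 2 Then Exit Function

    FileNum = FreeFile
    Open FilePath For Binary Access Read As #FileNum
    Get #FileNum, 1, Signature      ' read the first two bytes
    Close #FileNum

    ' &H50 = "P", &H4B = "K"
    LooksLikeXlsx = (Signature(0) = &H50 And Signature(1) = &H4B)
End Function

' Hypothetical usage: call this straight after the download finishes.
Sub CheckDownloadedFile(ByVal DestinationFile As String)
    If LooksLikeXlsx(DestinationFile) Then
        Debug.Print "File has a zip signature: " & DestinationFile
    Else
        Debug.Print "Not a workbook (possibly a web page): " & DestinationFile
    End If
End Sub
```

It is also worth capturing the value that URLDownloadToFile returns, since 0 means the call reported success and any other value means it failed.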

25 Dec 20 at 14:54

Hello Sir, Hello Mam,

Could anyone write VBA code to download files from this web page (https://www.nseindia.com/companies-listing/corporate-filings-insider-trading)?

Thanks,

Shadab

28 Dec 20 at 06:39

Hi, is there a particular thing that you can't get to work?  How far have you managed to get with your existing code?

14 May 20 at 17:45

Hi Andrew,

In the video "Excel VBA Part 49 - Downloading Files from Websites", you said that if the user has 32-bit or 64-bit Microsoft Office 2010 or later, the PtrSafe API function (the first API function in the video) applies, but if the user has a version of Microsoft Office before 2010 (such as Office 2003 or 2007), the second API function (which is only for 32-bit versions of Office) applies. My question is: what if someone uses a 64-bit version of Microsoft Office 2007 or earlier? Is either of these API functions usable then, or do I have to write extra code to get this to work correctly? Please reply to my questions, Andrew.

Thank you for these wonderful videos. I really owe my career to you and Mike Girvin (Excelisfun). Whatever I am today is because of the two of you. I would say that learning the front end of Excel is incomplete without watching the Excelisfun videos, and learning VBA is incomplete without watching yours. Both of you are doing noble work to educate us. I regard both of you as my gurus.

15 May 20 at 06:30

Hi! The good news is that you don't need to worry about this for Office 2007, as there isn't a 64-bit edition of the software! As for earlier versions of Office, I'm afraid I don't know.

I'm very happy to hear that you've found the videos so useful, thank you for the kind comments!

15 May 20 at 19:47

Hi Andrew,

I know that just saying "Thank you" is not enough for what you've been doing for us over the past couple of years. I can only pray to God that all your dreams come true in this life. To speak truly, I've never seen teachers like you and Excelisfun in my whole life so far.

Andrew G  
16 May 20 at 08:24

That's very kind of you to say but you should give yourself the credit for taking the time to learn - it takes a lot of dedication so well done to you as well!

14 May 19 at 11:47

Hi Andrew,

Great video and great explanation.

I have a problem downloading a file from a web page; maybe you can help me. The problem is that there is no link to use with the URLDownloadToFile function. All I have is a click event. The page below is not the original website, but it lets me explain the problem:

https://www.cenace.gob.mx/SIM/VISTA/REPORTES/PreEnergiaSisMEM.aspx

This is the code I'm analysing:

Sub Downloadparamostrarproblema()
    Dim IE As InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim URL As String
    Dim Pepe As Object
    URL = "https://www.cenace.gob.mx/SIM/VISTA/REPORTES/PreEnergiaSisMEM.aspx"
    Set IE = CreateObject("InternetExplorer.Application")
    With IE
        .navigate URL
        Do Until .readyState = 4: DoEvents: Loop
    End With
    IE.Visible = True
    Set HTMLdoc = IE.document
    Dim HTMLdiv As HTMLDivElement
    Set HTMLdiv = HTMLdoc.getElementById("ctl00_ContentPlaceHolder1_treePrincipal")
    Dim HTMLspan As HTMLSpanElement
    Set HTMLspan = HTMLdiv.getElementsByTagName("span").Item(2)
    HTMLspan.Click
    Set Pepe = HTMLdoc.getElementById("ctl00_ContentPlaceHolder1_ListViewNodos_ctrl0_ListViewArchivosSIN_ctrl0_linkCSV")
    Pepe.Click
    Set HTMLdiv = Nothing
    Set HTMLdoc = Nothing
    Set IE = Nothing
End Sub

I know that in this case I have a Href I could use to download the file, but in the web page I'm working on it has NO link, and I want to download automatically the file.

Any idea how to download the file automatically after Pepe.Click?

Regards, Santos.

13 Dec 17 at 08:25

Hi, I have one question. Partly by using your super useful videos, I started trying to extract data from websites into Excel. There is one thing, though, that just keeps failing. Here is an example of the code I've written. The line:

Debug.Print Classnames.Length

keeps returning 0, whatever search type I use: class name, tag name or id. Can you find any incorrect use of code in here?

Thanks in advance for your answer.

Jonathan

Const HappyEmpURL As String = "https://happyemployees.nle.nl/"

Sub GetDataFromWebsite()

Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim Article1 As MSHTML.IHTMLElement
Dim Classnames As MSHTML.IHTMLElementCollection

XMLReq.Open "GET", HappyEmpURL, False
XMLReq.send

If XMLReq.Status <> 200 Then
    MsgBox "Problem" & vbNewLine & XMLReq.Status & "-" & XMLReq.statusText
    Exit Sub
End If

HTMLDoc.Body.innerHTML = XMLReq.responseText
Set Classnames = HTMLDoc.getElementsByClassName("mx-layoutcontainer-center mx-scrollcontainer-center ")
Debug.Print Classnames.Length
For Each Article1 In Classnames
Article1.getElementsByid ("159b6aef-dbc3-5b23-a735-cf99f8341771-1")
Debug.Print Article1.getAttribute("href"), Article1.innerText, Article1.className
Next Article1

End Sub

14 Dec 17 at 09:21

Hi Jonathan,

I think the main problem here is that most of the page content appears to be generated by JavaScript. I believe this means that an XML HTTP request won't contain the elements you're looking for, because the JavaScript won't be executed to create them in the first place. An alternative would be to use Internet Explorer as described in this video.

So, for example, this code will return a couple of elements from the login form on the page you've linked to:

Sub GetHTMLDocument()

    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim Article1 As MSHTML.IHTMLElement
    Dim Classnames As MSHTML.IHTMLElementCollection
    
    IE.Visible = True
    IE.navigate "https://happyemployees.nle.nl/"
    
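    ' Wait for the page to finish loading, yielding to Windows while we wait
    Do While IE.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
    
    ' Allow a little extra time for JavaScript to build the page content
    Application.Wait Now + TimeValue("00:00:02")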
    
    Set HTMLDoc = IE.Document
    Set Classnames = HTMLDoc.getElementsByClassName("form-control")
    
    Debug.Print Classnames.Length
    
    For Each Article1 In Classnames
        Debug.Print Article1.className, Article1.getAttribute("name")
    Next Article1
    
End Sub

The results I see are:

form-control mx-focus       username
form-control  password

I hope that helps!

19 Jun 17 at 08:57

I'm trying to scrape data from the Racing Post website, but some of their class names are ridiculously long. For instance "ui-table__body
                  js-sortableTable__body
                  js-formFilterBody"

 

taken from https://www.racingpost.com/profile/horse/836785/danish-duke#race-id=676975.

I can't get my scraper to recognise these names.

Can you help?

19 Jun 17 at 11:49

I'd be tempted to use the VBA REPLACE function before you start scraping, to replace the long class names with shorter, more manageable ones.  If you find this doesn't work, you may find that the long class names include some strange carriage return character codes.  If you find a good solution, please do reply to this thread letting me (and everyone else) know.
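
A possible alternative, sketched under a couple of assumptions: the class attribute of an HTML element is a whitespace-separated list of class names, so the example quoted above is three separate classes split across lines, not one long name. Searching for any single one of them (for example "ui-table__body") may be enough. If you want to match all of them, you can collapse the line breaks and extra spaces into single spaces, or build a CSS selector for querySelectorAll. The function names NormaliseClassNames and ClassSelector are made up for this example, and it assumes the document object you are using exposes querySelectorAll.

```vba
Option Explicit

' Collapses line breaks, tabs and repeated spaces into single spaces,
' so that "a" & vbCrLf & "    b" becomes "a b".
Function NormaliseClassNames(ByVal RawClass As String) As String
    Dim Cleaned As String

    Cleaned = Replace(RawClass, vbCr, " ")
    Cleaned = Replace(Cleaned, vbLf, " ")
    Cleaned = Replace(Cleaned, vbTab, " ")

    ' Keep halving runs of spaces until only single spaces remain
    Do While InStr(Cleaned, "  ") > 0
        Cleaned = Replace(Cleaned, "  ", " ")
    Loop

    NormaliseClassNames = Trim$(Cleaned)
End Function

' Turns "a b c" into the CSS selector ".a.b.c", which matches elements
' that have all three classes. Returns "" if there are no class names.
Function ClassSelector(ByVal RawClass As String) As String
    Dim Cleaned As String

    Cleaned = NormaliseClassNames(RawClass)
    If Len(Cleaned) > 0 Then ClassSelector = "." & Replace(Cleaned, " ", ".")
End Function
```

With an HTMLDocument variable called HTMLDoc (as in the other examples on this page), the result could then be used as HTMLDoc.getElementsByClassName(NormaliseClassNames(LongName)) or HTMLDoc.querySelectorAll(ClassSelector(LongName)).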

07 Jun 17 at 20:51

I'm using Excel 2007 and it doesn't recognise 'PtrSafe'. I downloaded the workbook and the "Private Declare PtrSafe Function" line shows as all red. However, if I remove the 'PtrSafe', it's OK?

 

08 Jun 17 at 10:21

You could use conditional compilation:

#If VBA7 Then
    Declare PtrSafe Function ...
#Else
    Declare Function ...
#End If

I'd love to claim the credit for this, but I got it from this MSDN page.
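
Written out in full, the idea might look something like the sketch below. The PtrSafe branch repeats the URLDownloadToFile declaration quoted elsewhere on this page; the other branch is the usual 32-bit form with Long in place of LongPtr. The wrapper function DownloadFile is a hypothetical helper added for the example.

```vba
Option Explicit

#If VBA7 Then
    ' Office 2010 and later, 32-bit or 64-bit
    Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" ( _
            ByVal pCaller As LongPtr, _
            ByVal szURL As String, _
            ByVal szFileName As String, _
            ByVal dwReserved As LongPtr, _
            ByVal lpfnCB As LongPtr _
        ) As Long
#Else
    ' Office 2007 and earlier (32-bit only), where PtrSafe is not recognised
    Private Declare Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" ( _
            ByVal pCaller As Long, _
            ByVal szURL As String, _
            ByVal szFileName As String, _
            ByVal dwReserved As Long, _
            ByVal lpfnCB As Long _
        ) As Long
#End If

' Hypothetical wrapper: returns True when the API call reports success.
' URLDownloadToFile returns 0 on success and a non-zero code on failure.
Function DownloadFile(ByVal FileURL As String, _
                      ByVal DestinationFile As String) As Boolean
    DownloadFile = (URLDownloadToFile(0, FileURL, DestinationFile, 0, 0) = 0)
End Function
```

In an older editor the PtrSafe lines may still be displayed in red, but because they sit in the branch that isn't compiled, the code should run regardless.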

25 Nov 16 at 15:12

Kaspersky keeps blocking the file from being downloaded; the message I get says: HEUR:Trojan-Downloader.Script.Generic

Regards,

 

25 Nov 16 at 22:26

Hi Joni,

Do you see this message for a specific file or for every file that it downloads?  If the former then I'd bet that it's a false-positive from Kaspersky. If you're concerned, however, please don't use the file. You can easily recreate the code simply by following along with the video.

Thanks for bringing it to our attention!

11 Apr 17 at 17:11

I have the same problem.

Kaspersky Internet Security said: The object is infected by HEUR:Trojan-Downloader.Script.Generic.

This happens only for this one file. Other files can be downloaded successfully.

Andrew G  
11 Apr 17 at 21:31

It seems to be a Kaspersky specific issue - the HEUR indicates that a heuristic analysis has detected code in the file which matches a particular pattern of potentially malicious code. The file contains code which automatically downloads files from URLs which, as you can imagine, out of context could be construed as malicious. As I wrote the code I know that it's not and can only assume that Kaspersky is returning a false positive but, as I said in the previous response, if you're at all concerned by it, don't download it. You can easily recreate the code by simply following along with the video.

I hope that helps!