Advice on how to scrape tables of data or HTML from webpages using VBA
Part two of a three-part series of blogs

You can use VBA to extract data from web pages, either as whole tables or by parsing the underlying HTML elements. This blog shows you how to code both methods (the technique is often called "web-scraping").

  1. Two ways to get data from websites using Excel VBA
  2. Extracting a table of data from a website using a VBA query (this blog)
  3. Scraping a website's HTML in VBA

Posted by Andy Brown on 13 January 2014

You need a minimum screen resolution of about 700 pixels width to see our blogs. This is because they contain diagrams and tables which would not be viewable easily on a mobile phone or small laptop. Please use a larger tablet, notebook or desktop computer, or change your screen resolution settings.

Extracting a table of data from a website using a VBA query

Pretty much anything you can do in Excel, you can do in VBA too - and that includes extracting information from a website's table.

 The Excel Principle

Supposing that you want to create an Excel workbook with an up-to-date list of all of the Wise Owl Excel VBA courses (who wouldn't want to do this?).  You could do this in 3 easy stages.  First, choose to get data from a website:

Getting data from the web

Choose the option shown on the Data tab of the Excel ribbon.

Secondly, type or paste in a URL (website address) and click on Go:

Going to a website address

Set the address you want to go to and click on Go.

Thirdly, click on the yellow-and-black arrow next to the table you want to import, and then click on the Import button:

Importing a table

The arrow gets a green background when you select it; the Import button is at the bottom right of the form.

After going through one more fairly self-explanatory dialog box, you'll see the imported data:

Imported list of courses

The standard import doesn't bring in any formatting.

Reproducing this in VBA

To import this table of data in VBA, first tell your program where the website can be found:

Sub GetCourseList()

'add a table of courses into an Excel workbook

'the website containing the files listing courses

Const prefix As String = WiseOwlWebsiteAddress & "/training/dates/"

Const FileName As String = "microsoft-excel-advanced"

Dim qt As QueryTable

Dim ws As Worksheet

'using a worksheet variable means autocompletion works better

Set ws = ActiveSheet

The next thing to do is to create a query table, specifying where the results should go and where it should get its data from:

'set up a table import (the URL; tells Excel that this query comes from a website)

Set qt = ws.QueryTables.Add( _

Connection:="URL;" & prefix & FileName & ".htm", _

Destination:=Range("A1"))

Note the URL; prefix, to tell VBA that it should get the data from a website, rather than a database or any other file format.

You can now set various parameters:

Some query table parameters

Just a few of the many ways in which you can customise a query table (although the default settings usually work OK).

 

Here is an example of some customisation for your query table:

'tell Excel to refresh the query whenever you open the file

qt.RefreshOnFileOpen = True

'giving the query a name can help you refer to it later

qt.Name = "ExcelAdvancedCoursesFromWiseOwl"

'you want to import column headers

qt.FieldNames = True

'need to know name or number of table to bring in

'(we'll bring in the first table)

qt.WebSelectionType = xlSpecifiedTables

qt.WebTables = 1

The website might contain several tables; here we're choosing to import the first one only.

Finally, we need to run the query!

'import the data

qt.Refresh BackgroundQuery:=False

End Sub

When you run this macro, you should get a current list of all Wise Owl Excel VBA courses in your active worksheet.

 Using the Imported Data

Now that you've got a set of rows, it's up to you how you use them.  Here's one example, which displays the imported course dates in a message box:

List of Wise Owl course dates

The code below would display this message at the time of writing (clearly by the time you read this, the list will be different!).

 

Here's some sample code to create and display this message:

Sub UsingCourseList()

Dim StartCell As Range

Dim c As Range

Dim msg As String

'go to top left cell

Set StartCell = Cells.find("Start date")

If StartCell Is Nothing Then

MsgBox "No start cell found"

Exit Sub

End If

'start message to show

msg = "Wise Owl Excel VBA courses are:" & vbCrLf

'loop over all cells in this column, adding to list

For Each c In Range(StartCell.Offset(1, 0), StartCell.End(xlDown))

'add this course into message, with blank line

msg = msg & vbCrLf & Format(c.Value, "dd/mm/yyyy")

Next c

'display results

MsgBox msg

End Sub

Importing data tables like this is much the easier of the two methods covered by this blog, but what happens when you want to get less structured data from a web page?  For that, you need to parse the web page's HTML (a process known as scraping websites).

  1. Two ways to get data from websites using Excel VBA
  2. Extracting a table of data from a website using a VBA query (this blog)
  3. Scraping a website's HTML in VBA
This blog has 0 threads Add post