Advice on how to scrape tables of data or HTML from webpages using VBA
Part one of a three-part series of blogs

You can use VBA to extract data from web pages, either as whole tables or by parsing the underlying HTML elements. This blog shows you how to code both methods (the technique is often called "web-scraping").

  1. Two ways to get data from websites using Excel VBA (this blog)
  2. Extracting a table of data from a website using a VBA query
  3. Scraping a website's HTML in VBA

Posted by Andy Brown on 13 January 2014

You need a minimum screen resolution of about 700 pixels width to see our blogs. This is because they contain diagrams and tables which would not be viewable easily on a mobile phone or small laptop. Please use a larger tablet, notebook or desktop computer, or change your screen resolution settings.

Two ways to get data from websites using Excel VBA

There are two ways to get information from websites programmatically: by downloading their data, or by parsing their HTML

Downloading websites' data

If you want to get at tables of data published to a website (such as currency exchange rates, fantasy football tables or weather forecast data), the easiest way to do it is by adding a linked table into Excel:

Importing course data

An example - importing a list of all forthcoming advanced Excel courses from the Wise Owl site.

Parsing a website's HTML

This is a bit harder, since you need to understand a bit about HTML, and be prepared to work hard to parse data in VBA.  The example we'll cover will download a list of all of the questions from the StackOverflow home page:

StackOverflow home page

What the downloaded data will look like after running our macro.

 

The rest of this blog shows how to do each of these tasks, beginning with the simpler one: querying a table of data on a website.

  1. Two ways to get data from websites using Excel VBA (this blog)
  2. Extracting a table of data from a website using a VBA query
  3. Scraping a website's HTML in VBA
This blog has 0 threads Add post