VBA - scraping websites videos | Excel VBA Part 57.4 - Finding Web Elements in Selenium

Posted by Andrew Gould on 20 January 2021


This video is a reference for the web element selection strategies in Selenium Basic. You'll learn how to refer to elements by ID, Name, Tag, Class, Link Text and Partial Link Text. You'll also learn the basics of writing CSS Selectors and XPath expressions.

This video has the following accompanying files:

File name Type Description
Pt57_4 Finding Web Elements in Selenium.xlsm Excel workbook with macros

Click to download a zipped copy of the above files.

There are no exercises for this video.

Making a  video bigger

You can increase the size of your video to make it fill the screen like this:

View full screen

Play your video (the icons shown won't appear until you do), then click on the full screen icon which appears as shown at its bottom right-hand corner.

 

When you've finished viewing a video in full screen mode, just press the Esc key to return to normal view.

Improving the quality of a video

To improve the quality of a video, first click on the Settings icon:

Settings icon

Make sure you're playing your video so that the icons shown appear, then click on this gear icon at the bottom right-hand corner.

 

Choose to change the video quality:

Video quality

Click on Quality as shown to bring up the submenu.

 

The higher the number you choose, the better will be your video quality (but the slower the connection speed):

Connection speed

Don't choose the HD option unless you have a fast enough connection speed to support it!

 

Is your Wise Owl speaking too slowly (or too quickly)?  You can also use the Settings menu above to change your playback speed.

This page has 5 threads Add post
12 Aug 21 at 09:31

Hi Andrew

First of all I want to give THANKS for your videos, I have learnt a lot about web scrapping.

I have tried lots of attemps with this web page, but no way to make it work ... may be you can help me

https://memsim.cenace.gob.mx/Produccion/Participantes/LOGIN/

I'm trying to upload a file, but i am unable to select the button

I try with 

    Set CH = New Selenium.ChromeDriver
    
    CH.Start
    
    CH.Get "https://memsim.cenace.gob.mx/Produccion/Participantes/LOGIN/"

    CH.FindElementById("uploadCerfile0").Click

And the excel gives a mistake

Selenium Error

invalid argument

(session info: chrome 92.0.4515.131)

(Driver info: chromedriver=91.0.4472.101

(af52a90bf87030dd1523486a1cd3ae25c5d76c9b-refs/branch-heads/4 NT 10.0.19043 x84_64)

 

13 Aug 21 at 08:24

Hi, it looks like you have an older version of the Chrome driver installed compared to the version of Chrome. Try installing the matching version of the Chrome driver.

I hope that helps!

13 Aug 21 at 09:28

Hi Andrew,

Chromedriver updated to 92.0.4515.107  and the same mistake

 

 

Andrew G  
13 Aug 21 at 10:05
07 Aug 21 at 16:17

Hi Andrew,

I cannot begin to tell you how much your videos have helped me out in not only expanding my knowledge of VBA but also just making my excel macros more effective.  I have spent a few hours walking through your videos on Selenium step by step and now I'm trying to use the techniques that I've learned from these videos, particularly this one, to extract data from the following google page: us vaccine tracker - Google Search.  In looking at the HTML, the div tag that I am interested in grabbing has the following HTML code:

<div jsname="iXWWee" class="enWFYd KDN9Hf" style="left: 39px; display: block; transform: translate3d(381px, 0px, 0px);">May 26, 2021</div>

As you can see from the HTML, I should be able to get the element(s) by specifying the class name.  But upon further inspection, the area on the page I am interested in has an innerHTML value formatted like "MMM DD, YYYY" (shown in the above example as "May 26, 2021") which changes as you move your cursor across the page.  I tried to account for this one of two ways:

1) Dim Element as Selenium.WebElement

Set Element = ch.FindElementByCss("div[class='enWFYd KDN9Hf'] [innerHTML='May26,2021']")

Debug.Print Element.InnerHTML

2) Dim Elements As Selenium.WebElements

Set Elements = ch.FindElementsByCss("div[class='enWFYd KDN9Hf']")

For Each Element In Elements

Debug.Print Element.Attribute("innerHTML")
    Next Element

08 Aug 21 at 16:54

Hi! I'm happy to hear that you've found the videos useful but I'm not sure that my reply here is going to be!
It looks like there's an event listener on the mouse move event for the div element with class SjK1gc. This is triggering a javascript function which I assume is producing the data for the table that appears as you move the mouse over the chart. Finding and reverse-enginerding(!) that function would be a tall order I think. It might make more sense to see if you can find the source data at the Our World in Data site which provides the data for the chart.

Sorry if that's not much help!

08 Aug 21 at 17:39

You know, I actually was able to find the Our World In Data raw data here: https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/us_state_vaccinations.csv but I can't figure out how to Ctrl+A, Ctrl+C, and Ctrl+V the data.  I thought it would work using the following code but all it ended up copying was the URL

Sub Test2()
    Set ch = New Selenium.ChromeDriver
    
    ch.Start
    ch.Get "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/us_state_vaccinations.csv"
    
    ch.SendKeys "^A"
    ch.SendKeys "^C"
    
    Sheet1.Range("A1").PasteSpecial xlPasteAll
End Sub

Is there something I have to change about the SendKeys rows in order to make sure it copies the site data and not the URL?

Andrew G  
08 Aug 21 at 18:24

Good stuff, I can think of three ways to get the data from that URL.

Option 1 - keyboard shortcuts

If you want to stick with the keyboard shortcut method you'll need to tab from the address bar into the page before selecting and copying. A warning that pasting the data will create a single column so you'll need to apply TextToColumns after pasting:

Dim cd As Selenium.ChromeDriver

Sub CopyAndPaste()
    
    Dim ks As Selenium.Keys
    
    Set cd = New Selenium.ChromeDriver
    Set ks = New Selenium.Keys
    
    cd.Start
    cd.Get "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/us_state_vaccinations.csv"
    
    cd.SendKeys ks.Tab
    cd.SendKeys ks.Control, "A"
    cd.SendKeys ks.Control, "C"
    
    ThisWorkbook.Worksheets.Add.Range("A1").PasteSpecial
    
End Sub

Option 2 - write the text to a text file using the ScriptingRuntime library

You'll need to set a reference to the Microsoft Scripting Runtime object library to use the code below. You could add a line to open the csv file as a workbook at the end.

Dim cd As Selenium.ChromeDriver

Sub GetElementText()

    Dim pre As Selenium.WebElement
    Dim fso As Scripting.FileSystemObject
    Dim ts As Scripting.TextStream
    
    Set cd = New Selenium.ChromeDriver
    Set fso = New Scripting.FileSystemObject
    Set ts = fso.OpenTextFile(ThisWorkbook.Path & "\Covid.csv", ForWriting, True)
    
    cd.Start
    cd.Get "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/us_state_vaccinations.csv"
    
    Set pre = cd.FindElementByCss("pre")
    
    ts.Write pre.Text
    ts.Close
    
End Sub

Option 3 - use the URLDownloadToFile function from the URLmon.dll

We have a video which explains this technique in detail. The advantage is that you don't need a browser at all - you have the URL of the resource you want to download so you can just get it.

Option Explicit

#If VBA7 Then
    Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As LongPtr, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As LongPtr, _
        ByVal lpfnCB As LongPtr) As LongPtr
#Else
    Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
        ByVal pCaller As Long, _
        ByVal szURL As String, _
        ByVal szFileName As String, _
        ByVal dwReserved As Long, _
        ByVal lpfnCB As Long) As Long
#End If

Sub DownloadCovidFile()

    Dim FileURL As String
    Dim DestinationFile As String
    
    FileURL = "https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/us_state_vaccinations.csv"
    DestinationFile = ThisWorkbook.Path & "\Covid from URL.csv"
    
    If URLDownloadToFile(0, FileURL, DestinationFile, 0, 0) <> 0 Then
        Debug.Print "File download not started"
        Exit Sub
    End If
    
End Sub

Pick your favourite!

15 Jun 21 at 05:58

Hi Andrew,

I would like to down download of element with you.

WebPage: https://sg.finance.yahoo.com/quote/MSFT/key-statistics?p=MSFT

Data: Market cap (intra-day) : 1.96T

I manage to download as 1.96 but how do I get the full underlying data of 1,957,379,833,856.0

 

Thanks

15 Jun 21 at 06:12

Hi Ronnie,

The underlying number doesn't appear to be on the page so I don't know, sorry!

14 Jun 21 at 03:54

Hi Andrew,

 

Thanks for the reply.  I wrote some macro in excel VBA using Selenium. but the download seems very slow, anyway to solve the issue?

 

Sub GetCurrency()

Call ClearCurrency

    Dim ws As Worksheet
    Dim SingleRow As Range
    Dim ListofRows As Range
    Dim SingleColumn As Range
    Dim ListofColumns As Range
    Dim MyRange As Range
    Dim RowNum As Integer
    Dim ColNum As Integer

    Set ws = ThisWorkbook.Worksheets("Currency")

'=================================================================
'Setting Range
'=================================================================
    With ws
        Set ListofRows = Range("A3", Range("A2").End(xlDown))
        Set ListofColumns = Range("B1", Range("B1").End(xlToRight))
        Set MyRange = ws.Range(ListofRows, ListofColumns)
    End With
'=================================================================
'Loop Through RIC
'=================================================================
    RowNum = 3
    
    For Each SingleRow In ListofRows

        Dim URL As String: URL = "https://sg.finance.yahoo.com/quote/SGD=X"
        Set cd = New Selenium.ChromeDriver
        cd.AddArgument "--headless"
        cd.Start
        cd.Get URL
        Application.Wait Now + TimeValue("00:00:20")
            
'=================================================================
'Loop Through XPATH
'=================================================================
        
        Dim PathInput As Selenium.WebElement
        Dim PathResult As Variant
        
        ColNum = 2
        
        For Each SingleColumn In ListofColumns
            Set PathInput = cd.FindElementByXPath(SingleColumn)
            cd.Wait 1000
            PathResult = cd.FindElementByXPath(SingleColumn).Text
            
'=================================================================
'Output to Excel
'=================================================================
           
            Cells(RowNum, ColNum).Value = PathResult
            ColNum = ColNum + 1
        Next SingleColumn
        
        RowNum = RowNum + 1
    
    Next SingleRow
    
    Range("A1").CurrentRegion.Font.Size = 9
    Range("A1").CurrentRegion.Font.Name = "Calibri"
    Range(Range("c3", Range("c2").End(xlDown)), Range("C3", Range("C3").End(xlToRight))).Select
    With Selection
        .HorizontalAlignment = xlCenter
        .VerticalAlignment = xlBottom
        .WrapText = False
        .Orientation = 0
        .AddIndent = False
        .IndentLevel = 0
        .ShrinkToFit = False
        .ReadingOrder = xlContext
        .MergeCells = False
    End With
        

End Sub

14 Jun 21 at 06:19

Hi Ronnie,

The part that stands out to me is that you have a wait of 20 seconds for each row in your list - I think this would be a good place to start.

I hope it helps!

10 Jun 21 at 06:57

Hi,

 

I am trying to download some specific data from Yahoo finance, I would like to check, how do I copy the data into excel using Selenium

 

Sub GetData()

    Dim myrng As Range
    Dim lastRow As Long
    Dim row_count As Long
    Dim ws As Worksheet
    Dim ticker As Variant
        
    Set ws = Sheets("Main")
   
'=================================================================
'Find Last Row
'=================================================================
    With ws
        lastRow = .Range("A" & .Rows.Count).End(xlUp).Row
    End With
'=================================================================
'Find Ticker Range
'=================================================================
    Set myrng = ws.Range(Cells(2, 1), Cells(lastRow, 1))
'=================================================================
'Loop Through Tickers
'=================================================================
  
        Set cd = New Selenium.ChromeDriver
        cd.Start
        cd.Get https://finance.yahoo.com/quote/MSFT?p=MSFT&.tsrc=fin-srch
    
'=================================================================
'Finding Elements
'=================================================================
    
        Dim SearchInput As Selenium.WebElement
                
        Set SearchInput = cd.FindElementByXPath("/html/body/div[1]/div/div/div[1]/div/div[2]/div/div/div[5]/div/div/div/div[3]/div[1]/div/span[1]")

 

 

 

Thanks       

10 Jun 21 at 08:48

Hi Ronnie,

You might find this video useful as it uses Yahoo Finance as an example https://www.wiseowl.co.uk/vba-macros/videos/vba-scrape-websites/selenium-multiple-tabs/

I hope it helps!

11 Jun 21 at 07:24

Hi Andrew,

 

Thanks for the reply. As I am quite new at this, would appreciate if you could help me outh with this.

I manage to get to the URL & get the a particular Element, would like to check how i could export it to excel sheet cell b3

 

Option Explicit

Private cd As Selenium.ChromeDriver

Sub GetData()

    Dim ws As Worksheet
    Dim SingleRow As Range
    Dim ListofRows As Range
    Dim SingleColumn As Range
    Dim ListofColumns As Range
    Dim MyRange As Range
    Dim RowNum As Integer
    Dim ColNum As Integer

    Set ws = ThisWorkbook.Worksheets("Main")

'=================================================================
'Setting Range
'=================================================================
    With ws
        Set ListofRows = Range("A3", Range("A2").End(xlDown))
        Set ListofColumns = Range("B1", Range("B1").End(xlToRight))
        Set MyRange = ws.Range(ListofRows, ListofColumns)
    End With
'=================================================================
'Loop Through RIC
'=================================================================

        Dim URL As String: URL = "https://sg.finance.yahoo.com/quote/MSFT/"
        Set cd = New Selenium.ChromeDriver
        cd.Start
        cd.Get URL
        cd.Wait "1000"
            
'=================================================================
'Loop Through XPATH
'=================================================================
        
        Dim SearchInput As Selenium.WebElement

        Set SearchInput = cd.FindElementByXPath("/html/body/div[1]/div/div/div[1]/div/div[2]/div/div/div[5]/div/div/div/div[2]/div[1]/div[1]/h1")
        
        Debug.Print SearchInput.Text

'        SearchInput.Value = ThisWorkbook.Worksheets("Mains").Range("B3")

End Sub
 

 

Andrew G  
11 Jun 21 at 08:34

Hi Ronnie,

You can reverse the last line of code in your subroutine to change the contents of cell B3. If your Debug.Print statement is returning the correct information using the Text property then something like this should do the trick:

ThisWorkbook.Worksheets("Mains").Range("B3").Value = SearchInput.Text

I hope it helps!