Details for a_sh

a_sh has participated in the following threads:

Added by a_sh on 07 Apr 2022 at 22:22

I'm writing VBA code to parse data from some websites using an XMLHTTP request, but I noticed that the getElementsByTagName method works only for certain tags.

I prepared the following simple code to test it. The code works with some tags such as "span", "div", "ls", "a" and "b", but not with others: "lx", "pos", "c" and so on don't work for me. (You can change the TestTag variable to test it.)

I don't know the reason. I searched for an answer and came across some material about namespaces. I read it, but I couldn't see how it relates to my problem. Please help me understand the problem.

I changed the code to late binding so that it can be used and tested without setting object references. (Both versions give the same results.)

Sub TEST()
    
    'Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim HTMLDoc As Object
    Set HTMLDoc = CreateObject("htmlfile")
    'Dim HTMLDiv As MSHTML.IHTMLElement
    Dim HTMLDiv As Object
    Dim TestTag As String
    
    TestTag = "div"
    HTMLDoc.body.innerHTML = "<BODY><" & TestTag & _
    "><FIRSTTAG><SPAN class=firstclass>FirstContent<SECTAG>-SecContent</SECTAG></SPAN></FIRSTTAG> </" & _
    TestTag & "></BODY>"
    
    Set HTMLDiv = HTMLDoc.getElementsByTagName(TestTag).Item(0)
    MsgBox HTMLDiv.innerText

End Sub
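
For comparison, here is a minimal sketch that reads the same fragment with an XML parser instead of the HTML parser. It does not explain why the HTML document behaves as described above. It only shows that an XML parser places no restriction on tag names. It assumes MSXML 6.0 is installed and that the fragment is well-formed XML (attribute values quoted, every tag closed, tag names matched case-sensitively), which real web pages often are not. The procedure name UnknownTagViaXmlSketch is made up for illustration.

```vba
Sub UnknownTagViaXmlSketch()

    Dim TestTag As String
    TestTag = "lx"

    ' Same structure as the test above, but with the attribute value quoted
    ' so that the fragment is well-formed XML (an assumption of this sketch).
    Dim Fragment As String
    Fragment = "<" & TestTag & "><FIRSTTAG><SPAN class=""firstclass"">FirstContent" & _
               "<SECTAG>-SecContent</SECTAG></SPAN></FIRSTTAG></" & TestTag & ">"

    Dim XmlDoc As Object
    Set XmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
    XmlDoc.async = False

    ' loadXML returns False when the text is not well-formed XML.
    If Not XmlDoc.loadXML(Fragment) Then
        MsgBox "Not well-formed XML: " & XmlDoc.parseError.reason
        Exit Sub
    End If

    ' Text returns the combined text of the element and its descendants.
    MsgBox XmlDoc.getElementsByTagName(TestTag).Item(0).Text

End Sub
```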

Added by a_sh on 29 Mar 2022 at 19:41

Hello again and let me ask another question.

I wrote some functions to search the header text, but I have trouble finding the destination URL when I use MSXML to submit a search through a website's search form and that search redirects me to a new URL.

Is there any way to obtain the destination address? I really need that.
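
One option that may help, sketched under assumptions: the MSXML2.ServerXMLHTTP object (unlike MSXML2.XMLHTTP) exposes a getOption method, and option -1 (SXH_OPTION_URL) holds the URL of the resource after the request has been sent, so it can be compared with the URL that was requested. The sketch assumes that switching to ServerXMLHTTP is acceptable and that the search is a simple GET request. The address in SearchUrl and the procedure name FinalUrlSketch are made up for illustration.

```vba
Sub FinalUrlSketch()

    ' Hypothetical search address, used only for illustration.
    Const SearchUrl As String = "https://example.com/search?q=test"

    ' SXH_OPTION_URL: after send, the URL of the resource that was returned.
    Const SXH_OPTION_URL As Long = -1

    Dim Request As Object
    Set Request = CreateObject("MSXML2.ServerXMLHTTP.6.0")

    Request.Open "GET", SearchUrl, False    ' False = synchronous request
    Request.send

    Debug.Print "Status: " & Request.Status
    Debug.Print "Requested URL: " & SearchUrl
    Debug.Print "Final URL: " & Request.getOption(SXH_OPTION_URL)

End Sub
```

If the two addresses printed in the Immediate window differ, the request was redirected along the way.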

Another side issue is that using HTML with MSXML doesn't allow us to chain two or more getElementsBy... calls in one line of code. For example, when I use IE I can write the following code:

Set HTMLDiv = HTMLDoc1.getElementsByClassName("Class1")(0).getElementsByClassName(" Class2")(0)

but when using MSXML I have to break it into the following lines:

HTMLDoc2.body.innerHTML = HTMLDoc1.getElementsByClassName("Class1")(0).innerHTML

Set HTMLDiv = HTMLDoc2.getElementsByClassName(" Class2")(0)

Is there a simpler way to write it?
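
One possibility, as a hedged sketch rather than a confirmed fix: a CSS descendant selector passed to querySelector can express "an element with Class2 inside an element with Class1" in a single call. The sketch assumes early binding with a reference to the Microsoft HTML Object Library, and that the installed MSHTML version supports querySelector. Note that the selector returns the first Class2 element inside any Class1 element, which is not always the same as looking only inside the first Class1 element. The sample HTML and the procedure name NestedClassSketch are made up for illustration.

```vba
Sub NestedClassSketch()

    ' Requires a reference to the Microsoft HTML Object Library.
    Dim HTMLDoc As New MSHTML.HTMLDocument

    ' Sample markup for illustration only.
    HTMLDoc.body.innerHTML = _
        "<div class=""Class1""><p class=""Class2"">Inner text</p></div>"

    ' ".Class1 .Class2" = an element with Class2 nested inside one with Class1.
    Dim HTMLDiv As Object
    Set HTMLDiv = HTMLDoc.querySelector(".Class1 .Class2")

    If HTMLDiv Is Nothing Then
        MsgBox "No matching element"
    Else
        MsgBox HTMLDiv.innerText
    End If

End Sub
```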

Added by a_sh on 29 Mar 2022 at 10:25

Hello, and thank you for your lessons, which I can say are the best of all the related tutorials I have seen.

I have a problem parsing data from the "HEAD" part of the HTML.

By using "HTMLDoc.body.innerHTML = XMLRequest.responseText" we lose this part of the HTML, which is exactly the part I need to scrape data from.

I searched the whole internet but couldn't find a solution.

I need to extract a lot of data from this part, so I can't use the InStr function and ...

I would be very grateful if you could help me with this.
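
A possible approach, as a minimal sketch: writing the whole response into a late-bound "htmlfile" document with its write method, instead of assigning it to body.innerHTML, lets the parser build the HEAD section as well. The sketch uses a hard-coded sample string in place of XMLRequest.responseText. The sample markup and the procedure name ReadHeadSketch are made up for illustration.

```vba
Sub ReadHeadSketch()

    ' Stand-in for XMLRequest.responseText (sample markup for illustration).
    Dim Html As String
    Html = "<html><head><title>Sample title</title>" & _
           "<meta name=""description"" content=""Sample description"">" & _
           "</head><body><p>Body text</p></body></html>"

    Dim HTMLDoc As Object
    Set HTMLDoc = CreateObject("htmlfile")

    ' Write the complete page so that the HEAD section is parsed too.
    HTMLDoc.write Html
    HTMLDoc.Close

    Debug.Print "Title: " & HTMLDoc.Title

    ' List the name and content of each META tag found in the document.
    Dim Meta As Object
    For Each Meta In HTMLDoc.getElementsByTagName("meta")
        Debug.Print Meta.getAttribute("name"), Meta.getAttribute("content")
    Next Meta

End Sub
```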

© Wise Owl Business Solutions Ltd 2024. All Rights Reserved.
