Our training courses

Other training resources

Our training venues

Why we are different

Details for BYIB

BYIB has participated in the following threads:

Added by BYIB on 26 Jun 2017 at 04:32

Many thanks for a great post which I've now successfully modified to scrape 2 sites I need data from. I can get the innertext but am struggling to get some additional data like an href giving the URL.

In the past, I have copied the HTML into a spreadsheet and parsed the html lines there to get at what I want. I now want to automate this hence my interest in your post. Two questions:

Downloading the HTML of one line so I can manually parse it

Using the element.innerText, I can extract the innerText from the following but what I want to do is to get the URL and each piece of innerText individually to put in fields in Excel:

<h3 class="title">
<a title="I need a freelance data entry &amp;amp; admin " class="job js-paragraph-crop" data-height="65" href="https://www.peopleperhour.com/job/i-need-a-freelance-data-entry-admin-1619501">I need a freelance data entry &amp; admin </a>           
<span class="job-etiquettes">
<span class="etiquette orange">Urgent</span>
/span>
</h3>
<ul class="clearfix member-info horizontal crop">
<li class="hidden-phone">
<i class="fpph fpph-clock-wall"></i>
<span class="hidden-xs">Posted</span>
<time class="crop value" title="26 June 2017">3 hours ago</time>
</li>
<li class="hidden-xs job-location crop js-tooltip" title="The Job can be done remotely from any location">
<i class="fpph fpph-location"></i>
<span class="">Remote</span>
</li>
<li class="hidden-phone">
<i class="fa fa-dot-circle-o"></i>Proposals<span class="value proposal-count">19</span>
</li>

The innerText shows:

Title, Urgent, Time ago posted, Remote, # proposals, etc.

Is there a command I can use to get all the source html text from the "<a" line so I can parse it myself or can you tell me how to access the individual fields including the URL?

Downloading the Full HTML into Excel

A I mentioned, I can manipulate HTML once I paste it into Excel. If I cannot easily do the above, is there a way of pasting all the HTML from a webpage into Excel so I can use my manual methods?

Many thanks in advance for your help; I have spent many hours trawling the Internet for answers but couldn't find anything!

Nick

Head office

Kingsmoor House

Railway Street

GLOSSOP

SK13 2AA

London

Landmark Offices

99 Bishopsgate

LONDON

EC2M 3XD

Manchester

Holiday Inn

25 Aytoun Street

MANCHESTER

M1 3AE

© Wise Owl Business Solutions Ltd 2024. All Rights Reserved.

End of small page here
Please be aware that our website uses cookies!
I'm OK with this Tell me more ...