Posted by Andy Brown on 04 November 2021
This tutorial covers all aspects of extracting data from websites using Python: from the ethics and legality of web scraping, via the structure of a web page's document object model through to using the Python requests and BeautifulSoup modules to extract meaning from HTML.
See our full range of Python training resources, or test your knowledge of Python with one of our Python skills assessment tests.
This video has the following accompanying files:
File name | Type | Description |
---|---|---|
wyndham.htm | HTML page | The example web page for this tutorial |
wiseowl-logo.png | Image | Image referenced by example web page |
Useful websites.txt | Text file | Useful website addresses |
The requests module.py | Python code | Using the requests module |
HTML from file.py | Python code | Getting HTML from a file |
Basic scraping.py | Python code | Basic web scraping using BeautifulSoup |
Possible parsers.png | Image | Possible parsers for BeautifulSoup |
Scraping 1- chaining elements copy.py | Python code | Chaining HTML tags |
Scraping 2- navigable strings copy.py | Python code | Getting visible text using navigable strings |
Scraping 3 - relatives.py | Python code | Children, descendants and other relatives |
Scraping 3 - relatives.png | Image | The possible relatives you can reference |
Scraping 4 - finding.py | Python code | Using find_all to get matching elements |
Scraping 5 - CSS selectors.py | Python code | Using select with CSS selectors |
Click to download a zipped copy of the above files.
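As a rough illustration of the techniques named in the file list above (chaining tags, navigable strings, relatives, `find_all` and `select`), here is a minimal sketch. It is not taken from the downloadable files: the HTML snippet and every variable name are invented for illustration, and it assumes the `beautifulsoup4` package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this sketch (not the tutorial's wyndham.htm)
html = """
<html>
  <body>
    <h1 id="title">Books</h1>
    <ul class="books">
      <li class="book"><a href="/triffids">The Day of the Triffids</a></li>
      <li class="book"><a href="/chrysalids">The Chrysalids</a></li>
    </ul>
  </body>
</html>
"""

# "html.parser" is Python's built-in parser; lxml and html5lib are alternatives
soup = BeautifulSoup(html, "html.parser")

# Chaining tags: the first <h1> inside <body>, then its navigable string
heading = soup.body.h1.string

# find_all returns every matching element
titles_found = [a.get_text() for a in soup.find_all("a")]

# select does the same job using a CSS selector
titles_selected = [a.get_text() for a in soup.select("ul.books li.book a")]

# Relatives: move from the first list item to its parent and next sibling
first_item = soup.find("li", class_="book")
parent_name = first_item.parent.name
next_title = first_item.find_next_sibling("li").get_text()

print(heading, titles_found, parent_name, next_title)
```

To work from a saved page instead, the same `BeautifulSoup(...)` call accepts the text read from an HTML file.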
After watching this video, you may like to test your understanding by doing one or more of the following exercises:
Exercise | Level |
---|---|
Use BeautifulSoup to compile a list of external links for any website | Standard |
You can also download the answers to each exercise from the links above.
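If you would like a starting point before attempting the exercise, one possible approach is sketched below. This is not the official answer: the function name `external_links`, the sample HTML and the rule used to decide what counts as "external" (an http or https link whose host differs from the page's own host) are all assumptions made for illustration.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup


def external_links(html, base_url):
    """Return the distinct external links in html, in order of appearance.

    Assumption: a link is "external" if it uses http or https and its
    host differs from the host of base_url (so subdomains count as external).
    """
    base_host = urlparse(base_url).netloc.lower()
    soup = BeautifulSoup(html, "html.parser")

    links = []
    # href=True skips anchors that have no href attribute at all
    for anchor in soup.find_all("a", href=True):
        # Resolve relative links such as "/about" against the page address
        absolute = urljoin(base_url, anchor["href"])
        parts = urlparse(absolute)
        is_web_link = parts.scheme in ("http", "https")
        if is_web_link and parts.netloc.lower() != base_host:
            if absolute not in links:
                links.append(absolute)
    return links


# Hypothetical page content, invented for this sketch
sample_html = """
<a href="/about">About us</a>
<a href="https://www.python.org/">Python</a>
<a href="mailto:someone@example.com">Email</a>
<a href="https://example.com/contact">Contact</a>
<a href="https://www.python.org/">Python again</a>
<a name="no-href">Not a link</a>
"""

found = external_links(sample_html, "https://example.com/index.html")
print(found)
```

For a live website, the HTML could come from `requests.get(url).text` instead of a hard-coded string, subject to the ethical and legal considerations covered in the video.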
© Wise Owl Business Solutions Ltd 2024. All Rights Reserved.