Learn how to extract all links from any website in Python

Kalebu Jordan
2 min read · May 23, 2020

Hello Pythonistas,
In this tutorial, you’re going to learn how to extract all links from a given website or URL using BeautifulSoup and requests.

If you’re new to web scraping, I recommend starting with the beginner tutorial on web scraping and coming back to this one once you are comfortable with the basics.

How do we extract all links?

We will use the requests library to download the raw HTML page from the website, and then use BeautifulSoup to extract all the links from that page.
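To see the extraction step on its own, here is a minimal sketch that parses a small HTML string instead of a downloaded page. The HTML snippet and the variable names are made up for illustration; only find_all and get come from BeautifulSoup itself.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, used here instead of a downloaded page.
html = """
<html><body>
  <a href="https://example.com/">Home</a>
  <a href="/about">About</a>
  <a name="top">Anchor without href</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
anchors = soup.find_all("a")              # every <a> tag in the document
links = [a.get("href") for a in anchors]  # .get() returns None when href is missing
print(links)
```

Note that an anchor without an href attribute shows up as None in the list, so real pages can produce a mix of strings and None values.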

Requirements

To follow along with this tutorial, you need to have the requests and Beautiful Soup libraries installed.

Installation

$ pip install requests
$ pip install beautifulsoup4

link_spider.py

The script below prompts you for a website URL, uses requests to send a GET request to the server for the HTML page, and then uses BeautifulSoup to extract the href of every link (<a>) tag in the HTML.

import requests
from bs4 import BeautifulSoup


def extract_all_links(site):
    # Download the raw HTML of the page
    html = requests.get(site).text
    # Collect every <a> tag in the page
    soup = BeautifulSoup(html, 'html.parser').find_all('a')
    # Read the href attribute of each tag (None if it is missing)
    links = [link.get('href') for link in soup]
    return links


site_link = input('Enter URL of the site : ')
all_links = extract_all_links(site_link)
print(all_links)
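link_spider.py sends its request without a timeout and does not check the HTTP status, so a slow or failing server can leave it hanging or parsing an error page. A slightly more defensive variant might look like the sketch below. The function name extract_all_links_safe and the 10-second default are my own choices for illustration, not part of the original script.

```python
import requests
from bs4 import BeautifulSoup


def extract_all_links_safe(site, timeout=10):
    """Hypothetical variant of extract_all_links with basic safety checks."""
    # Give up if the server does not respond within `timeout` seconds
    response = requests.get(site, timeout=timeout)
    # Raise requests.exceptions.HTTPError on 4xx/5xx responses
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # href=True keeps only <a> tags that actually carry an href attribute
    return [a["href"] for a in soup.find_all("a", href=True)]
```

You would call it the same way as the original function, for example extract_all_links_safe('https://kalebujordan.com/'), and wrap the call in try/except requests.exceptions.RequestException if you want to handle network errors yourself.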

Output:

Running link_spider.py produces a result similar to the one shown below:

kalebu@kalebu-PC:~/$ python3 link_spider.py
Enter URL of the site : https://kalebujordan.com/
['#main-content', 'mailto://kalebjordan.kj@gmail.com', 'https://web.facebook.com/kalebu.jordan', 'https://twitter.com/j_kalebu', 'https://kalebujordan.com/'
.....]
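As the output shows, the list mixes page fragments such as '#main-content', mailto links, and full URLs. If you only want web links in absolute form, one possible approach is to resolve each href against the page URL with the standard library's urllib.parse. This is a sketch: the helper name absolute_http_links is hypothetical, and '/about' and None are added to the sample only to show a relative link and a missing href.

```python
from urllib.parse import urljoin, urlparse


def absolute_http_links(base_url, hrefs):
    """Resolve hrefs against base_url and keep only http(s) links."""
    resolved = []
    for href in hrefs:
        if not href:  # skip None and empty strings
            continue
        absolute = urljoin(base_url, href)  # relative links become absolute
        if urlparse(absolute).scheme in ("http", "https"):
            resolved.append(absolute)
    return resolved


# Sample based on the output above, plus two made-up entries
sample = [
    "#main-content",
    "mailto://kalebjordan.kj@gmail.com",
    "https://twitter.com/j_kalebu",
    "/about",  # hypothetical relative link
    None,      # hypothetical anchor without href
]
print(absolute_http_links("https://kalebujordan.com/", sample))
```

The mailto entry and the None value are dropped, while the fragment and the relative path are expanded against the site URL.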

In case you find it interesting, don’t be shy to share it with your peers on other dev communities.

For comments, suggestions, or difficulties, drop them in the comment box below and I will get back to you ASAP.

Originally published at https://kalebujordan.com on May 23, 2020.
