Python: 3 Ways to Check If a String Is a Valid URL

Updated: June 2, 2023 By: Khue

This concise, example-based article walks you through three ways to check whether a string is a valid URL in Python. The first two approaches use only built-in features of Python, while the last one relies on a third-party library.

Using regular expressions

You can declare a regular expression pattern to match valid URLs as follows:

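pattern = r'^https?://[\w-]+(\.[\w-]+)+(/[\w.-]*)*$'

The re.match() or re.search() function can then be used to check if the input string matches the pattern.

Code example:

import re

def is_valid_url(url):
    # http or https, a host with at least one dot, then optional path segments.
    # Each path segment starts with '/', which avoids nested quantifiers
    # that can backtrack badly on long non-matching input.
    pattern = r'^https?://[\w-]+(\.[\w-]+)+(/[\w.-]*)*$'
    return bool(re.match(pattern, url))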

# Usage
url1 = 'https://www.slingacademy.com/cat/sample-data'
url2 = "https://api.slingacademy.com/v1/sample-data/"
url3 = "abcxyz"

print(is_valid_url(url1)) # True
print(is_valid_url(url2)) # True
print(is_valid_url(url3)) # False

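This approach is flexible because you can adapt the pattern to your own requirements. However, regular expressions get complex quickly, and it is hard to cover every URL form. The pattern above, for example, accepts only http and https URLs whose host contains a dot, so it rejects http://localhost:3000, and it does not allow ports, query strings, or fragments.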
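As an illustration of such customization, here is a minimal sketch of a pattern that also tolerates an optional port, query string, and fragment. The name matches_extended_pattern and the exact character classes are assumptions made for this example, not a complete URL grammar; re.fullmatch() is used so that no ^ and $ anchors are needed.

```python
import re

# Hypothetical extended pattern (an assumption for illustration only).
URL_PATTERN = re.compile(
    r"https?://"              # scheme: http or https
    r"[\w-]+(?:\.[\w-]+)+"    # host with at least one dot
    r"(?::\d{1,5})?"          # optional port
    r"(?:/[\w.~%+-]*)*"       # optional path segments, each starting with '/'
    r"(?:\?[\w.~%+&=-]*)?"    # optional query string
    r"(?:#[\w.~%+-]*)?"       # optional fragment
)

def matches_extended_pattern(url):
    # fullmatch() succeeds only if the entire string fits the pattern
    return URL_PATTERN.fullmatch(url) is not None

# Usage
print(matches_extended_pattern("https://api.slingacademy.com:8443/v1/sample-data/?page=2&limit=10#top"))  # True
print(matches_extended_pattern("http://localhost:3000"))  # False (host has no dot)
print(matches_extended_pattern("abcxyz"))  # False
```

Even with these additions the pattern still rejects hosts without a dot, such as localhost, which shows how quickly the rules pile up when you go the regex route.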

Using the urllib module

The urllib module, part of the Python standard library, includes a URL parser in urllib.parse that you can use for basic URL validation. The steps are as follows:

  1. Import the urlparse() function from urllib.parse.
  2. Use the urlparse() function to parse the input URL string.
  3. Check if the scheme and netloc attributes of the parsed result are non-empty, indicating a valid URL.
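To see what steps 2 and 3 work with, the short sketch below prints the scheme, netloc, and path that urlparse() extracts. The helper name describe is made up for this illustration.

```python
from urllib.parse import urlparse

def describe(url):
    # Return the three components that matter most for a basic validity check
    parts = urlparse(url)
    return parts.scheme, parts.netloc, parts.path

print(describe("https://api.slingacademy.com/v1/sample-data/"))
# ('https', 'api.slingacademy.com', '/v1/sample-data/')

print(describe("abcxyz"))
# ('', '', 'abcxyz')
```

A string with no scheme ends up entirely in path, leaving scheme and netloc empty, which is what the check in step 3 detects.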

Code example:

from urllib.parse import urlparse

def is_valid_url(url):
    try:
        result = urlparse(url)
        return all([result.scheme, result.netloc])
    except ValueError:
        return False

# Usage
url1 = 'https://www.slingacademy.com/cat/sample-data'
url2 = "https://api.slingacademy.com/v1/sample-data/"
url3 = "http://localhost:3000"

print(is_valid_url(url1)) # True
print(is_valid_url(url2)) # True
print(is_valid_url(url3)) # True

This approach is simple, but it is lenient: it only checks that a scheme and a netloc are present, not that they are well-formed, which may not be desired depending on your requirements. For instance, it judges https://slingacademy as a valid URL even though .com is omitted, and it accepts any scheme, not only http and https.
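If you need something stricter, one option is to restrict the scheme to an allow-list and inspect the hostname and port of the parsed result. The sketch below is only one possible set of rules: the name is_valid_http_url, the ALLOWED_SCHEMES set, and the allow_localhost flag are assumptions for this example, and the simple "contains a dot" test rejects some legitimate hosts such as IPv6 literals.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}  # assumption: only web URLs are wanted

def is_valid_http_url(url, allow_localhost=True):
    try:
        result = urlparse(url)
        hostname = result.hostname  # host without the port; None if missing
        result.port  # reading .port raises ValueError for an invalid port
    except ValueError:
        return False

    if result.scheme not in ALLOWED_SCHEMES or not hostname:
        return False
    if allow_localhost and hostname == "localhost":
        return True
    # Crude heuristic: expect at least one dot in the host name
    return "." in hostname

# Usage
print(is_valid_http_url("https://www.slingacademy.com/cat/sample-data"))  # True
print(is_valid_http_url("http://localhost:3000"))  # True
print(is_valid_http_url("https://slingacademy"))   # False
print(is_valid_http_url("ftp://example.com/file")) # False
```

Passing allow_localhost=False makes the function reject http://localhost:3000 as well, which may suit checks on user-submitted public links.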

Using the validators package

validators is a popular open-source library, designed specifically for data validation purposes in Python. You can install it by running the following command:

pip install validators

Then use its validators.url() function to check whether a given string is a valid URL like this:

import validators

url1 = 'https://www.slingacademy.com/cat/sample-data'
url2 = "https://api.slingacademy.com/v1/sample-data/"
url3 = "http://localhost:3000"
url4 = "localhost:3000"

print(validators.url(url1))
print(validators.url(url2))
print(validators.url(url3))
print(validators.url(url4))

Output:

True
True
True
ValidationFailure(func=url, args={'value': 'localhost:3000', 'public': False})

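As you can see, validators.url() returns True for a valid URL and a ValidationFailure object otherwise. That object evaluates to False in a boolean context, so you can use the result directly in an if statement or wrap it in bool(). The tutorial ends here. Happy coding & enjoy your day!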