Introduction to Selenium

This guide builds on our first guide, found here, which covered web scraping with Selenium. It extends that guide to proxies, which you may need in some cases to support your web scraping activities.

Proxies in Selenium mask your IP address, allowing for better anonymity when scraping or automating tasks across multiple websites. They help bypass geo-restrictions, avoid rate-limiting, and reduce the chances of being blocked by the target site. Using rotating proxies also ensures that requests come from different IP addresses, improving the success rate of large-scale data extraction.

Proxy authentication can be a long and tedious process in Selenium. To get around this, we’ll use a package called Selenium Wire. Selenium Wire lets you configure proxies with credentials directly: you pass the proxy server's username and password within the proxy options, and authentication then happens automatically.

Installing Selenium with Python

Assuming you already have both Python and Chrome set up, you’ll need to install the Selenium package in Python. If you don’t have Python installed already, you can download it free here, available on macOS, Windows, or Linux.

Use the following pip command in your terminal:

pip install selenium

This installs the Selenium package in Python. In addition to this, install the Selenium Wire package:

pip install selenium-wire

We’d also recommend installing the following packages, as they’ll come in useful later on:

pip install webdriver-manager

pip install requests

Having webdriver-manager installed is helpful because it automatically handles downloading, installing, and updating the appropriate WebDriver binaries for our browser, removing the need for any manual setup. This simplifies the process, ensures compatibility with the latest browser versions, and reduces potential errors in our Selenium automation.

You can check through the terminal if you’re unsure whether the packages are installed already. Open up your terminal, and type the following command:

pip freeze

This will list all installed packages along with their version numbers.
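
If you’d rather check from Python than scan the pip freeze output, a minimal sketch using the standard library’s importlib.metadata could look like the following. The helper name installed_versions is our own choice for illustration, not part of any package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment
            found[name] = None
    return found


if __name__ == "__main__":
    wanted = ["selenium", "selenium-wire", "webdriver-manager", "requests"]
    for name, ver in installed_versions(wanted).items():
        print(f"{name}: {ver or 'not installed'}")
```

Any package reported as not installed can then be added with the pip commands above.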

Common Errors

There are two common errors you may run into when using Selenium Wire. The first comes from the fact that Selenium Wire is no longer maintained, so it can fall out of step with the packages it depends on. One example is the blinker package: Selenium Wire depends on it, but the latest blinker releases no longer contain functionality that Selenium Wire relies on.

To get around this issue, you can force a downgrade to an older, compatible version of blinker:

pip install blinker==1.7.0

Another common issue you might encounter is “No module named 'pkg_resources'”. This is easily fixed by installing setuptools in your terminal:

pip install setuptools
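
Before launching a browser, it can help to check whether either problem applies to your environment. The sketch below is a rough diagnostic, not an official check: module_available is a hypothetical helper, and the module name blinker._saferef is our assumption about the piece of blinker that Selenium Wire imports.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no usable spec
        return False


if __name__ == "__main__":
    # Assumed module names; adjust if your traceback mentions something else
    checks = {
        "blinker._saferef": "pip install blinker==1.7.0",
        "pkg_resources": "pip install setuptools",
    }
    for module, fix in checks.items():
        if module_available(module):
            print(f"{module}: OK")
        else:
            print(f"{module}: missing, try '{fix}'")
```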

Building The Script

We can then move on to building the script using Selenium Wire. We’ll use a public IP checking site, in this case icanhazip, to verify the IP of our proxy and check that it's working.

The first part of the script imports the required packages and defines the four components of the proxy. In this case, we’re using rotating Oxylabs proxies from the UK. As a quick tip, you can easily verify the script works by checking that the IP changes on each request.

from seleniumwire import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

username = "bXTeUdho-cc-us-pool-oproxy"
password = "CaSwJqZV"
ip = "pr.rampageproxies.com"
port = "8888"

def chrome_proxy(user: str, password: str, host: str, port: str) -> dict:
    return {
        'proxy': {
            'http': f'http://{user}:{password}@{host}:{port}',
            'https': f'https://{user}:{password}@{host}:{port}',
        }
    }

We begin by providing the username and password of the proxy, the host/IP, and the port. The “chrome_proxy” function then returns a dictionary of proxy settings built from your username, password, host, and port, with URLs for both HTTP and HTTPS proxies. This dictionary tells Selenium Wire to route the browser traffic through the proxy.
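
Selenium Wire’s proxy options also accept a no_proxy entry listing hosts that should bypass the proxy. As a small sketch, the variant below adds that key; the function name chrome_proxy_with_bypass, the default bypass hosts, and the placeholder credentials are our own assumptions for illustration.

```python
def chrome_proxy_with_bypass(user: str, password: str, host: str, port: str,
                             bypass=("localhost", "127.0.0.1")) -> dict:
    """Build Selenium Wire proxy options, skipping the proxy for local hosts."""
    return {
        "proxy": {
            "http": f"http://{user}:{password}@{host}:{port}",
            "https": f"https://{user}:{password}@{host}:{port}",
            # Comma-separated hosts that should not go through the proxy
            "no_proxy": ",".join(bypass),
        }
    }


if __name__ == "__main__":
    # Placeholder values, not real credentials
    print(chrome_proxy_with_bypass("user", "secret", "proxy.example.com", "8888"))
```

The returned dictionary is passed to the driver in the same way, through the seleniumwire_options argument.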

Next, we’ll create another function called “ip_testing” to set up a Chrome WebDriver with a proxy using Selenium Wire. It runs the Chrome browser in headless mode by default (to run headed instead, remove the headless argument) and uses the proxy credentials from before to route traffic when loading our selected site.

def ip_testing():
    try:
        manage_driver = Service(ChromeDriverManager().install())
        options = webdriver.ChromeOptions()
        # Recent Selenium releases dropped the options.headless setter, so pass the flag instead
        options.add_argument("--headless=new")
        proxies = chrome_proxy(username, password, ip, port)
        driver = webdriver.Chrome(service=manage_driver, options=options, seleniumwire_options=proxies)

Finally, we’ll use Selenium’s built-in automation capabilities to extract the IP address from the page. Once done, the script automatically closes the browser and prints the IP address of the proxy in the console:

        try:
            driver.get("https://icanhazip.com/")
            ip_address = driver.find_element("tag name", "body").text.strip()
            return f"Your IP is: {ip_address}"
        
        finally:
            driver.quit()
    
    except Exception as e:
        return f"An error occurred: {str(e)}"

print(ip_testing())

The finished script looks something like this:

from seleniumwire import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

USERNAME = "bXTeUdho-cc-de-pool-oproxy"
PASSWORD = "CaSwJqZV"
PROXY_HOST = "pr.rampageproxies.com"
PROXY_PORT = "8888"

def chrome_proxy(user: str, password: str, host: str, port: str) -> dict:
    return {
        'proxy': {
            'http': f'http://{user}:{password}@{host}:{port}',
            'https': f'https://{user}:{password}@{host}:{port}',
        }
    }

def ip_testing():
    try:
        manage_driver = Service(ChromeDriverManager().install())
        options = webdriver.ChromeOptions()
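        # Recent Selenium releases dropped the options.headless setter, so pass the flag instead
        options.add_argument("--headless=new")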
        proxies = chrome_proxy(USERNAME, PASSWORD, PROXY_HOST, PROXY_PORT)
        driver = webdriver.Chrome(service=manage_driver, options=options, seleniumwire_options=proxies)

        try:
            driver.get("https://icanhazip.com/")
            ip_address = driver.find_element("tag name", "body").text.strip()
            return f"Your IP is: {ip_address}"
        
        finally:
            driver.quit()
    
    except Exception as e:
        return f"An error occurred: {str(e)}"

print(ip_testing())
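
To confirm that a rotating proxy really does change IP between requests, you could run the check a few times and compare the results. The sketch below keeps that logic separate from the browser: fetch_ip is a hypothetical callable you supply (for example, a variant of ip_testing that returns only the bare IP string), and both helper names are our own.

```python
from typing import Callable, List


def collect_ips(fetch_ip: Callable[[], str], attempts: int = 3) -> List[str]:
    """Call fetch_ip the given number of times and keep every result."""
    return [fetch_ip() for _ in range(attempts)]


def is_rotating(ips: List[str]) -> bool:
    """A proxy is treated as rotating if more than one distinct IP was seen."""
    return len(set(ips)) > 1


if __name__ == "__main__":
    # Stand-in for a real browser call, used here only to show the flow
    fake_results = iter(["203.0.113.10", "203.0.113.27", "203.0.113.10"])
    seen = collect_ips(lambda: next(fake_results), attempts=3)
    print(seen, "rotating" if is_rotating(seen) else "not rotating")
```

Bear in mind that a small pool can occasionally return the same IP twice in a row, so a few attempts give a more reliable answer than two.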

Using a Proxy List

You may also want to use a list of proxies instead of hard coding a single proxy. Using a list allows for easier proxy rotation, improved anonymity, and better management of rate limits, especially when dealing with tasks that require multiple requests or connections. Proxies can be randomly selected, cycled through, or tested for availability, enhancing the capabilities of your Selenium scripts.

You might consider using a mix of providers in your script. Rampage provides access to 10 different residential proxy providers, all accessible under one dashboard.

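Here’s how we’ll use a proxy list stored in a text file called “proxies.txt”:

import random

def get_random_proxy(file_path: str) -> str:
    # Read every non-empty line so a trailing blank line is never selected
    with open(file_path, 'r') as file:
        proxies = [line.strip() for line in file if line.strip()]
    return random.choice(proxies)

def use_random_proxy(proxy_file: str) -> dict:
    # Each line is expected in ip:port:username:password format
    random_proxy = get_random_proxy(proxy_file)
    ip, port, username, password = random_proxy.split(':')
    # Reuse chrome_proxy from the main script to build the Selenium Wire options
    return chrome_proxy(username, password, ip, port)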

This code reads a list of proxies from our file “proxies.txt” and randomly selects one to use. The “get_random_proxy” function reads the proxies, strips any extra whitespace, and randomly picks one. The “use_random_proxy” function then splits the selected proxy into its components (IP, port, username, and password) to configure it for use in the Selenium session.
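
Random selection is only one option. If you’d prefer to cycle through the list in order, a minimal sketch using itertools.cycle might look like this. It assumes the same ip:port:username:password line format; parse_proxy_line and proxy_cycle are hypothetical helper names, and a password containing a colon would need a different format.

```python
from itertools import cycle
from typing import Dict, Iterable, Iterator


def parse_proxy_line(line: str) -> Dict[str, str]:
    """Split one ip:port:username:password line into named parts."""
    parts = line.strip().split(":")
    if len(parts) != 4:
        raise ValueError(f"Expected ip:port:username:password, got {len(parts)} field(s)")
    ip, port, username, password = parts
    return {"ip": ip, "port": port, "username": username, "password": password}


def proxy_cycle(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Return an endless round-robin iterator over the parsed proxies."""
    parsed = [parse_proxy_line(line) for line in lines if line.strip()]
    if not parsed:
        raise ValueError("No proxies found")
    return cycle(parsed)


if __name__ == "__main__":
    # Placeholder entries, not real proxies
    rotation = proxy_cycle(["192.0.2.1:8888:user1:pass1\n", "192.0.2.2:8888:user2:pass2\n"])
    for _ in range(3):
        print(next(rotation)["ip"])
```

Each call to next() gives the proxy for the following session, wrapping back to the start once the list is exhausted.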

Improvements

Although proxy support in both Chrome and Selenium is inherently terrible, the script above proves that authenticated proxies can be used in Selenium with the help of the Selenium Wire package. One improvement would be to print the site's response code, which makes error handling much more manageable. For example, a 407 status code would likely indicate that the proxy credentials are wrong, so you'd need to check the username and/or password.
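
Selenium Wire records the requests the browser makes in driver.requests, and each captured request exposes its url and, where one was received, a response with a status_code. As a rough sketch of the improvement described above, the helper below turns those into readable lines. The name summarise_responses and the hint wording are our own, and we have not confirmed that every proxy failure surfaces as a captured response.

```python
from types import SimpleNamespace

# Hints for a few status codes that commonly point at proxy trouble
STATUS_HINTS = {
    407: "proxy authentication required: check username and password",
    403: "forbidden: the target site may be blocking this IP",
    429: "too many requests: slow down or rotate proxies",
}


def summarise_responses(captured_requests):
    """Describe each captured request as 'url -> status (hint)'."""
    lines = []
    for request in captured_requests:
        response = getattr(request, "response", None)
        if response is None:
            lines.append(f"{request.url} -> no response captured")
            continue
        code = response.status_code
        hint = STATUS_HINTS.get(code)
        lines.append(f"{request.url} -> {code}" + (f" ({hint})" if hint else ""))
    return lines


if __name__ == "__main__":
    # Stand-in objects shaped like captured requests, for demonstration only
    demo = [SimpleNamespace(url="https://icanhazip.com/",
                            response=SimpleNamespace(status_code=407))]
    print("\n".join(summarise_responses(demo)))
```

In the script above, you would call summarise_responses(driver.requests) after driver.get() and before driver.quit().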

Conclusion

You’re now equipped with the knowledge and tools to use proxies authenticated with usernames and passwords within Selenium, using the Selenium Wire package. In this guide, we’ve covered the basics of the package and how it’s used with Python to authenticate either a single proxy or a list of proxies.

We’d recommend reading our other guides: using proxies with Python requests here, or how to web scrape with Selenium in Python here.

Frequently asked questions

Rampage allows purchase from 10 of the largest residential providers on one dashboard, starting at just 1GB. There's no need to commit to any large bandwidth packages. Through our dashboard, you're also given options such as static or rotating proxies and various targeting options, all for a single price per provider.

All purchases are made through the Rampage dashboard.

Rampage also offers high-quality, lightning-fast ISP and DC proxies available in the US, UK, and DE regions.

If you're unsure which provider would best suit your use case, please contact our support; we'll gladly assist.

Rampage Blogs

Using Proxies with Selenium

Owen Crisp

