
How to avoid IP bans with rotating proxies [with example code]

When scraping websites or automating online activities, IP bans can be a major obstacle. Many websites implement anti-scraping measures that block repeated requests from the same IP address. To bypass this, using rotating proxies is a common and effective strategy. Rotating proxies automatically switch your IP address with each request, making it harder for websites to detect and block your activity.

Why Use Rotating Proxies?

  • Avoid IP Bans: Changing IPs helps prevent your IP from being flagged or blocked.
  • Bypass Geo-restrictions: Access content restricted to certain regions by rotating through proxies in different locations (see the short sketch after this list).
  • Increase Success Rate: Improves the chances of successful requests by mimicking more natural browsing behavior.
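
For the geo-restriction point above, a simple way to organize a pool is to key proxies by region. Here is a minimal sketch with hypothetical region-tagged addresses (the RFC 5737 documentation ranges stand in for real proxies):

import random

# Hypothetical region-tagged proxy pool; replace these placeholder
# addresses with your provider's region-specific endpoints
proxies_by_region = {
    'us': ['http://192.0.2.10:8080', 'http://192.0.2.11:8080'],
    'de': ['http://198.51.100.10:8080'],
    'jp': ['http://203.0.113.10:8080'],
}

def pick_proxy(region):
    """Return a random proxy that exits from the requested region."""
    return random.choice(proxies_by_region[region])

print(pick_proxy('de'))  # e.g. http://198.51.100.10:8080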

How to Implement Rotating Proxies
Proxy list example

Here’s a simple example using Python and the popular requests library along with a list of proxy addresses:

import requests
import random

# List of proxies (placeholders in the RFC 5737 documentation ranges;
# replace with your own proxy addresses)
proxies_list = [
    'http://192.0.2.10:8080',
    'http://198.51.100.10:8080',
    'http://203.0.113.10:8080',
]

# Target URL
url = 'http://example.com'

for i in range(10):  # Make 10 requests
    # Select a random proxy from the list
    proxy = random.choice(proxies_list)
    proxies = {
        'http': proxy,
        'https': proxy,
    }
    try:
        response = requests.get(url, proxies=proxies, timeout=5)
        print(f"Request {i+1} succeeded via proxy: {proxy}")
        print(f"Response status code: {response.status_code}")
    except requests.exceptions.RequestException as e:
        print(f"Request {i+1} failed via proxy: {proxy}. Error: {e}")

Best Practices
  • Use a large pool of proxies: The more IPs you have, the less likely you are to get banned.
  • Implement delays: Avoid making rapid requests; add random sleep intervals between them.
  • Monitor responses: Detect when an IP gets blocked (for example, repeated HTTP 403 or 429 responses) and rotate proxies accordingly; both practices are shown in the sketch after this list.
  • Validate proxies: Regularly test proxies for availability and speed.
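
Putting the delay and monitoring practices together: the sketch below reuses the proxies_list from the earlier example, treats HTTP 403 and 429 as likely ban or rate-limit signals (the exact signal varies by site), and drops a blocked proxy from the pool:

import requests
import random
import time

proxies_list = [
    'http://192.0.2.10:8080',
    'http://198.51.100.10:8080',
    'http://203.0.113.10:8080',
]

url = 'http://example.com'

for i in range(10):
    if not proxies_list:
        print("All proxies exhausted.")
        break
    proxy = random.choice(proxies_list)
    proxies = {'http': proxy, 'https': proxy}
    try:
        response = requests.get(url, proxies=proxies, timeout=5)
        if response.status_code in (403, 429):
            # Likely blocked or rate-limited: drop this proxy from the pool
            print(f"Proxy {proxy} looks blocked, removing it.")
            proxies_list.remove(proxy)
        else:
            print(f"Request {i+1} succeeded via proxy: {proxy} "
                  f"(status {response.status_code})")
    except requests.exceptions.RequestException as e:
        print(f"Request {i+1} failed via proxy: {proxy}. Error: {e}")
    # Random pause between requests to mimic natural browsing
    time.sleep(random.uniform(1, 5))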

Advanced Rotating Proxy Example with Error Handling
Proxy service integration

Below is a more advanced example demonstrating how to integrate with a proxy service (such as ProxyMesh or Bright Data), handle proxy failures gracefully, and rotate proxies dynamically. This approach makes proxy management more reliable and minimizes downtime.

import requests
import time
import random

# List of proxy service endpoints or proxies
# (placeholders; substitute the host:port values from your provider)
proxies_list = [
    'http://proxy1.example.com:8080',
    'http://proxy2.example.com:8080',
    'http://proxy3.example.com:8080',
    # Add more proxies or use a proxy API endpoint
]

# Function to find a working proxy
def get_working_proxy(proxies):
    # Test candidates in random order; sample a copy so the caller's
    # list order is left untouched
    candidates = random.sample(proxies, len(proxies))
    for proxy in candidates:
        try:
            # httpbin.org/ip simply echoes the IP the request came from
            response = requests.get('http://httpbin.org/ip',
                                    proxies={'http': proxy, 'https': proxy},
                                    timeout=5)
            if response.status_code == 200:
                print(f"Proxy {proxy} is working.")
                return proxy
        except requests.RequestException:
            print(f"Proxy {proxy} failed. Trying next.")
    return None

# Main scraping function
def scrape_with_proxies(url, proxies):
    for attempt in range(10):
        proxy = get_working_proxy(proxies)
        if not proxy:
            print("No working proxies available.")
            break
        try:
            response = requests.get(url, proxies={'http': proxy, 'https': proxy}, timeout=10)
            if response.status_code == 200:
                print(f"Successfully fetched data with proxy: {proxy}")
                return response.text
            else:
                print(f"Received status code {response.status_code} with proxy: {proxy}")
        except requests.RequestException as e:
            print(f"Request failed with proxy {proxy}: {e}")
        # Wait a bit before trying again
        time.sleep(random.uniform(1, 3))
    print("Failed to fetch data after multiple attempts.")
    return None

# Usage example
target_url = 'http://example.com'
page_content = scrape_with_proxies(target_url, proxies_list)

if page_content:
    print("Page fetched successfully.")
    # Process the page content here
else:
    print("Failed to fetch the page.")

Key Features:
  • Proxy Validation: Before using a proxy, the script checks that it works by requesting http://httpbin.org/ip through it.
  • Graceful Failures: If a proxy fails, it moves to the next one instead of crashing.
  • Dynamic Rotation: Selects a new proxy for each attempt, reducing the chance of detection.
  • Retries & Delays: Implements retries with random delays to mimic natural browsing behavior.
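
Note that many proxy services (ProxyMesh and Bright Data among them) also offer a single rotating gateway endpoint: you send every request to one host, and the provider assigns a different exit IP each time, so there is no list to manage. A minimal sketch, assuming a hypothetical gateway address and credentials (substitute the endpoint from your provider's dashboard):

import requests

# Hypothetical gateway endpoint and credentials; replace with the
# values supplied by your proxy provider
GATEWAY = 'http://username:password@gateway.example.com:31280'

proxies = {'http': GATEWAY, 'https': GATEWAY}

# Every request goes to the same gateway, but the provider rotates
# the exit IP, so each response should report a different origin
for i in range(3):
    try:
        response = requests.get('http://httpbin.org/ip', proxies=proxies, timeout=10)
        print(f"Request {i+1} exit IP: {response.json()['origin']}")
    except requests.RequestException as e:
        print(f"Request {i+1} failed: {e}")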

More reading

Reliable rotating proxies for business directories scrape

Choosing reliable [rotating] residential proxies

Conclusion

Rotating proxies are essential for maintaining continuous, undetected access to websites during scraping or automation tasks. By randomly switching IP addresses with each request, you significantly reduce the risk of IP bans and improve your chances of success. Remember to respect website terms of service and use proxies responsibly.
