How to Stop Selenium Scrapers from Getting Blocked
Websites block Selenium because default configurations broadcast automation signatures (such as the navigator.webdriver flag) and exhibit predictable, non-human behavior. To prevent blocks, you must modify your browser fingerprint, rotate IP addresses, randomize request timing, and send realistic headers.
1. Mask Browser Fingerprints with Undetected ChromeDriver
Stock ChromeDriver leaves JavaScript variables in the page (like cdc_adoQpoasnfa76pfcZLmcfl_Array) that anti-bot scripts can check for. The easiest way around this is the undetected-chromedriver library, which patches these variables out automatically.
import undetected_chromedriver as uc
options = uc.ChromeOptions()
# Headless mode is easier to detect than headful mode; remove this line unless you need it
options.add_argument('--headless')
driver = uc.Chrome(options=options)
driver.get("https://targetwebsite.com")
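To confirm the patch took effect, you can read the navigator.webdriver flag mentioned above back from the page. The sketch below is a minimal check, not part of undetected-chromedriver: looks_automated is a hypothetical helper name, and FakeDriver is a stand-in used only so the logic can be tried without launching a browser. With a real session you would pass the driver created above instead.

```python
def looks_automated(driver) -> bool:
    """Return True if the page still exposes a truthy navigator.webdriver flag."""
    # execute_script is the standard Selenium call for running JavaScript.
    # An undefined JavaScript value comes back to Python as None.
    flag = driver.execute_script("return navigator.webdriver")
    return bool(flag)


class FakeDriver:
    """Hypothetical stand-in for a Selenium driver (illustration only)."""

    def __init__(self, flag):
        self._flag = flag

    def execute_script(self, script):
        # Ignores the script and returns the canned flag value.
        return self._flag


print(looks_automated(FakeDriver(True)))   # flag exposed, as in a stock session
print(looks_automated(FakeDriver(None)))   # flag hidden
```

A False result only means this one signal is hidden; other fingerprint checks may still apply.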
2. Configure Realistic Headers and User-Agents
If your User-Agent string indicates an outdated browser or doesn't match your actual browser engine, you will be flagged. Set a modern User-Agent and ensure your request headers match those of a standard consumer browser.
import undetected_chromedriver as uc
options = uc.ChromeOptions()
# Use a real User-Agent string whose Chrome version matches the browser you actually run
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"
options.add_argument(f'--user-agent={user_agent}')
driver = uc.Chrome(options=options)
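One way to catch a mismatched User-Agent before a site does is to compare the Chrome major version in the string with the version the browser reports. The following is a rough sketch under that assumption; chrome_major_from_ua and ua_matches_browser are hypothetical helper names, not library functions. In Selenium 4 the running version is usually available as driver.capabilities["browserVersion"].

```python
import re
from typing import Optional


def chrome_major_from_ua(user_agent: str) -> Optional[int]:
    """Extract the Chrome major version from a User-Agent string, if present."""
    match = re.search(r"Chrome/(\d+)\.", user_agent)
    return int(match.group(1)) if match else None


def ua_matches_browser(user_agent: str, browser_version: str) -> bool:
    """Return True when the UA's Chrome major version equals the browser's."""
    ua_major = chrome_major_from_ua(user_agent)
    browser_major = re.match(r"(\d+)", browser_version)
    if ua_major is None or browser_major is None:
        return False
    return ua_major == int(browser_major.group(1))


UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36")

# With a live session, pass driver.capabilities["browserVersion"] instead
# of the example version strings used here.
print(ua_matches_browser(UA, "122.0.6261.95"))
print(ua_matches_browser(UA, "120.0.6099.71"))
```

Only the major version is compared, because recent Chrome User-Agent strings report the minor fields as zeros.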
3. Implement Randomized Delays and Human Interactions
Repetitive, rapid actions trigger rate limits. Avoid static wait times (like time.sleep(5)). Instead, use randomized intervals and simulate basic human interactions like scrolling.
import random
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
# Assumes `driver` is the browser instance created in step 1
# 1. Randomized sleep intervals
time.sleep(random.uniform(2.0, 6.0))
# 2. Simulate natural scrolling (jump to the middle of the page)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight * 0.5);")
time.sleep(random.uniform(1.0, 3.0))
# 3. Use explicit waits instead of hardcoded sleeps for elements
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "content-loaded"))
)
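A single jump to the middle of the page is still fairly mechanical. One possible refinement is to break the scroll into several uneven steps with a random pause after each. The sketch below only plans the offsets and pauses; scroll_plan is a hypothetical helper, and the step count and pause range are arbitrary assumptions you would tune per site.

```python
import random


def scroll_plan(page_height, steps=5, rng=None):
    """Return (offset, pause) pairs with non-decreasing offsets ending at page_height."""
    rng = rng or random.Random()
    # Random cut points, sorted so the scroll position only moves downward.
    cuts = sorted(rng.uniform(0, page_height) for _ in range(steps - 1))
    offsets = [int(cut) for cut in cuts] + [page_height]
    # Pair every offset with a pause between 0.5 and 2.0 seconds.
    return [(offset, rng.uniform(0.5, 2.0)) for offset in offsets]


for offset, pause in scroll_plan(3000, steps=4):
    print(f"scroll to {offset}px, then wait {pause:.2f}s")

# With a live session, each step would be applied roughly like this:
#   driver.execute_script("window.scrollTo(0, arguments[0]);", offset)
#   time.sleep(pause)
```

Keeping the planning separate from the browser calls makes the timing logic easy to test on its own.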
4. Rotate IPs with Proxies
Making hundreds of requests from a single IP address will result in an IP ban. For production scrapers, route your traffic through a rotating proxy service (preferably residential or mobile proxies).
import undetected_chromedriver as uc
options = uc.ChromeOptions()
# Format: IP:PORT or proxy provider gateway (placeholder address; substitute your own)
PROXY = "192.168.1.100:8080"
options.add_argument(f'--proxy-server={PROXY}')
driver = uc.Chrome(options=options)
Note: If your proxy requires username/password authentication, standard Chrome command-line arguments do not support it directly. You will need to use a proxy-auth extension or a tool like Selenium Wire.
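The snippet above pins one proxy for the whole session. Because --proxy-server is a launch argument, rotating usually means starting a new browser session for each proxy. Here is a minimal round-robin sketch under that assumption; proxy_arguments is a hypothetical helper, and the addresses are documentation-range placeholders rather than real proxies.

```python
import itertools


def proxy_arguments(proxies):
    """Return an endless iterator of --proxy-server arguments, cycling the pool."""
    pool = list(proxies)
    if not pool:
        raise ValueError("proxy pool is empty")
    return (f"--proxy-server={proxy}" for proxy in itertools.cycle(pool))


# Placeholder addresses (assumption); replace with your provider's endpoints.
POOL = ["203.0.113.10:8080", "203.0.113.11:8080", "203.0.113.12:8080"]
args = proxy_arguments(POOL)

for _ in range(4):
    # Each value would be passed to options.add_argument() for a new session.
    print(next(args))
```

If your provider exposes a single rotating gateway, the rotation happens on their side and a pool like this is unnecessary.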
Summary Checklist for Evading Detection
- Disable the WebDriver flag: Use undetected-chromedriver or manually exclude the switches.
- Match your headers: Ensure your User-Agent matches the browser version you are running.
- Use Residential Proxies: Datacenter IPs are easily flagged and blocked by Cloudflare and Akamai.
- Limit concurrent requests: Distribute your scraping load over time to mimic organic traffic.
While these techniques mitigate blocks, highly secured sites using advanced behavioral analysis may still detect automated browsers. No single method guarantees 100% bypass rates indefinitely.