
Playwright anti-bot detection

Bot detection test sites

Antibot: https://bot.sannysoft.com/

Playwright: https://playwright.net.cn/python/

A regular browser shows the following:

[Screenshot: detection results in a regular browser]

When the page is opened with Playwright, it shows the following:

playwright cr https://bot.sannysoft.com/

[Screenshot: detection results when opened with Playwright]

As the results show, with a default Playwright launch the WebDriver row fails the check.
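
The failing row can also be inspected from code. The following is a minimal sketch that assumes the WebDriver row reflects the page's `navigator.webdriver` property, which is how such checks are commonly implemented but is not confirmed for this site. The helper names `describe_webdriver_flag` and `read_webdriver_flag` are hypothetical, and Playwright is imported lazily so the helpers load even where it is not installed.

```python
import asyncio


def describe_webdriver_flag(value):
    """Interpret navigator.webdriver roughly the way a detection page might (assumption)."""
    if value is True:
        return "flagged as automated"
    # False or missing (None) is what an unautomated browser usually reports.
    return "looks like a regular browser"


async def read_webdriver_flag(url="https://bot.sannysoft.com/"):
    """Open the page with a default Playwright launch and read navigator.webdriver."""
    # Lazy import: only needed when a browser is actually started.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        page = await browser.new_page()
        await page.goto(url)
        value = await page.evaluate("() => navigator.webdriver")
        await browser.close()
        return value


# Usage (starts a real browser):
#   flag = asyncio.run(read_webdriver_flag())
#   print(flag, "->", describe_webdriver_flag(flag))
```

Running the same check after applying either bypass method below is a quick way to see whether the flag has changed, without reading the page by eye.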

Bypass method 1: the Stealth plugin

See the playwright-stealth documentation.

pip install playwright-stealth

Test code:

import asyncio
from playwright.async_api import async_playwright
from playwright_stealth import Stealth

async def main():
    async with Stealth().use_async(async_playwright()) as p:  # the most common usage
        browser = await p.chromium.launch(headless=False)  # launch the browser
        context = await browser.new_context()
        page = await context.new_page()
        await page.goto("https://bot.sannysoft.com/")
        print(await page.title())
        # continue working with the page here
        await page.wait_for_timeout(10000)  # keep the window open for 10 seconds

        await browser.close()

asyncio.run(main())

Bypass method 2: use your existing browser

Quit the browser, then relaunch it in debug mode (with remote debugging enabled):

"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222 --user-data-dir=/tmp/test/

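Before connecting, it can help to confirm that the debug port is actually listening. The sketch below polls Chrome's `/json/version` DevTools HTTP endpoint on the port used above. The helper names `cdp_version_url` and `wait_for_cdp`, along with the timeout and interval defaults, are illustrative assumptions and not part of Playwright.

```python
import json
import time
import urllib.error
import urllib.request


def cdp_version_url(port=9222, host="localhost"):
    """Build the URL of the DevTools version endpoint for a debug port."""
    return f"http://{host}:{port}/json/version"


def wait_for_cdp(port=9222, timeout=10.0, interval=0.5):
    """Poll the debug port until it answers; return the version info dict, or None on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(cdp_version_url(port), timeout=interval) as resp:
                return json.load(resp)
        except (urllib.error.URLError, OSError, ValueError):
            # Not listening yet (or not valid JSON): wait and retry.
            time.sleep(interval)
    return None


# Usage:
#   info = wait_for_cdp(9222)
#   print(info["Browser"] if info else "Chrome is not listening on port 9222")
```
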
Then write code to connect to it:

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp("http://localhost:9222")  # connect to the browser started above
        context = await browser.new_context()
        page = await context.new_page()
        await page.goto("https://bot.sannysoft.com/")
        print(await page.title())
        # continue working with the page here
        await page.wait_for_timeout(10000)  # keep the page open for 10 seconds

        await browser.close()

asyncio.run(main())
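
The code above opens a fresh context with `new_context()`, which does not share cookies or logins with the profile Chrome was started with. As a possible variation, the sketch below reuses the browser's default context, which Playwright exposes through `browser.contexts` after `connect_over_cdp`. The helper name `get_default_context` is hypothetical.

```python
import asyncio


async def get_default_context(browser):
    """Return the browser's existing default context, or create one if none exists."""
    contexts = browser.contexts
    if contexts:
        return contexts[0]
    return await browser.new_context()


async def main():
    # Lazy import: only needed when actually connecting to a browser.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp("http://localhost:9222")
        context = await get_default_context(browser)
        page = await context.new_page()
        await page.goto("https://bot.sannysoft.com/")
        print(await page.title())


# Usage (requires Chrome running with --remote-debugging-port=9222):
#   asyncio.run(main())
```

Whether to reuse the default context or create a new one depends on whether the session should carry over the state of the profile given by `--user-data-dir`.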