Proxy and Scrape.do Mode
Every request in Scrapeman can route through a proxy. You can use any standard HTTP/HTTPS proxy, or flip into Scrape.do native mode to access residential rotation, JS rendering, geo targeting, and automatic ban retry — all from the Settings tab.
Standard Proxy
Configure per-request in the Settings tab of the request builder.
- Protocol — HTTP or HTTPS
- Host and Port
- Auth — username and password for proxy basic authentication
The proxy is applied via undici's ProxyAgent. All fields support {{var}} variable interpolation, so you can store proxy credentials in environment variables.
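As a rough illustration of how {{var}} interpolation could turn the Settings fields into a proxy URL, here is a minimal sketch. The names `interpolate`, `buildProxyUrl`, and `ProxySettings` are hypothetical and are not Scrapeman's actual internals; the placeholder syntax rules (word characters, dots, and dashes inside the braces) are also an assumption.

```typescript
// Hypothetical helpers for illustration; not Scrapeman's real implementation.
type Vars = Record<string, string>;

interface ProxySettings {
  protocol: "http" | "https";
  host: string;
  port: string;
  username?: string;
  password?: string;
}

// Replace {{name}} placeholders with values from `vars`.
// Unknown variables are left untouched so typos stay visible.
function interpolate(text: string, vars: Vars): string {
  return text.replace(/\{\{\s*([\w.-]+)\s*\}\}/g, (match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(vars, key)) return match;
    return vars[key] ?? match;
  });
}

// Build a proxy URL of the form protocol://user:pass@host:port.
// Credentials are percent-encoded so characters like "@" stay unambiguous.
function buildProxyUrl(settings: ProxySettings, vars: Vars): string {
  const host = interpolate(settings.host, vars);
  const port = interpolate(settings.port, vars);
  let auth = "";
  if (settings.username !== undefined) {
    const user = encodeURIComponent(interpolate(settings.username, vars));
    const pass = encodeURIComponent(interpolate(settings.password ?? "", vars));
    auth = `${user}:${pass}@`;
  }
  return `${settings.protocol}://${auth}${host}:${port}`;
}

const proxyUrl = buildProxyUrl(
  {
    protocol: "http",
    host: "{{PROXY_HOST}}",
    port: "8080",
    username: "{{PROXY_USER}}",
    password: "{{PROXY_PASS}}",
  },
  { PROXY_HOST: "proxy.example.com", PROXY_USER: "alice", PROXY_PASS: "p@ss" }
);
// proxyUrl === "http://alice:p%40ss@proxy.example.com:8080"
```

A URL of this shape is the kind of value one would hand to undici's ProxyAgent; how Scrapeman passes credentials to it internally is not documented here.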
Scrape.do Native Mode
Flip the Scrape.do toggle in the Settings tab to route the request through Scrape.do's infrastructure instead of sending it directly.
When enabled, the main process rewrites the target URL to api.scrape.do and injects the configured parameters. Your Scrape.do token is stored as a secret environment variable and never appears in history on disk.
Residential Rotation
Automatically rotates the outgoing IP address from Scrape.do's residential pool on every request. No manual proxy list management required.
JS Rendering
Spins up a headless browser on Scrape.do's infrastructure to fully render the target page before returning the response. Useful for SPAs, lazy-loaded content, and anti-bot pages that check for browser fingerprints.
Geo Targeting
Route the request through a specific country by selecting a country code. The outgoing IP will appear to originate from that country.
Ban Retry
When enabled, Scrape.do automatically retries the request if it detects that the target server blocked it. Retries are handled server-side and are transparent to Scrapeman: you receive the final successful response.
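The URL rewrite behind Scrape.do native mode can be pictured roughly as follows. This is a sketch under stated assumptions: the function `buildScrapeDoUrl` is hypothetical, and the query parameter names used here (`token`, `url`, `render`, `geoCode`) should be checked against Scrape.do's API reference rather than taken from this example.

```typescript
// Hypothetical sketch of rewriting a target URL to go through api.scrape.do.
// Parameter names (token, url, render, geoCode) are assumptions for illustration.
function buildScrapeDoUrl(
  targetUrl: string,
  token: string,
  params: Record<string, string>
): string {
  const pairs: string[] = [
    `token=${encodeURIComponent(token)}`,
    // The original target travels as an encoded query parameter.
    `url=${encodeURIComponent(targetUrl)}`,
  ];
  for (const key of Object.keys(params)) {
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(params[key] ?? "")}`);
  }
  return `https://api.scrape.do/?${pairs.join("&")}`;
}

const rewritten = buildScrapeDoUrl(
  "https://example.com/products?page=2",
  "YOUR_TOKEN",
  { render: "true", geoCode: "us" }
);
// rewritten ===
// "https://api.scrape.do/?token=YOUR_TOKEN&url=https%3A%2F%2Fexample.com%2Fproducts%3Fpage%3D2&render=true&geoCode=us"
```

Passing the feature options as a plain map keeps the sketch independent of any particular parameter naming; the real mapping from toggles to parameters lives in Scrapeman's main process.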
Rotating Proxy
Supply a list of proxy URLs and let Scrapeman rotate through them automatically. In the Settings tab toggle Rotate through multiple proxies and add proxy URLs one per line. Pick a strategy:
- Round-robin — cycles through the list in order. The position is shared across all concurrent slots in a run, so the Collection Runner rotates per request and the Load Runner rotates per concurrent slot.
- Random — picks a random proxy for each request.
When the rotate list is non-empty, the single URL field is ignored. If the list is empty, the single URL is used.
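The two strategies and the fallback to the single URL can be sketched like this. `createRotator` is a hypothetical name, and the injectable `rng` parameter exists only to make the example testable; neither is taken from Scrapeman's source.

```typescript
type Strategy = "round-robin" | "random";

// Returns a function that yields the proxy to use for the next request.
// The position counter lives in the closure, so every caller that shares
// the returned function also shares the round-robin position.
function createRotator(
  list: string[],
  singleUrl: string | undefined,
  strategy: Strategy,
  rng: () => number = Math.random
): () => string | undefined {
  let position = 0;
  return () => {
    // Empty rotate list: fall back to the single URL field.
    if (list.length === 0) return singleUrl;
    if (strategy === "random") {
      return list[Math.floor(rng() * list.length)];
    }
    const proxy = list[position % list.length];
    position += 1;
    return proxy;
  };
}

const nextProxy = createRotator(
  ["http://p1.example.com:8080", "http://p2.example.com:8080"],
  "http://single.example.com:8080",
  "round-robin"
);
const firstThree = [nextProxy(), nextProxy(), nextProxy()];
// firstThree cycles p1, p2, p1; the single URL is never used here.
```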
User-Agent Presets
The Settings tab has a User-Agent picker with 9 presets. Select one to set the User-Agent header
for that request. A preview shows the exact UA string below the picker. A custom User-Agent in
the Headers tab always overrides the preset.
| Preset | Label |
|---|---|
| scrapeman | Scrapeman <version> (default) |
| chrome-macos | Chrome 124 macOS |
| chrome-windows | Chrome 124 Windows |
| firefox-macos | Firefox 125 macOS |
| firefox-windows | Firefox 125 Windows |
| safari-macos | Safari 17 macOS |
| safari-ios | Safari 17 iOS |
| googlebot | Googlebot 2.1 |
| curl | curl 8.7 |
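The precedence between a preset and a custom header can be expressed in a few lines. In this sketch `resolveUserAgent` is a hypothetical helper, the case-insensitive header match is an assumption, and the UA strings are placeholders rather than the real preset values.

```typescript
// Returns the User-Agent that would be sent: a custom header from the
// Headers tab wins, otherwise the preset string is used.
// Assumption: header names are compared case-insensitively.
function resolveUserAgent(headers: Record<string, string>, presetUa: string): string {
  for (const headerName of Object.keys(headers)) {
    if (headerName.toLowerCase() === "user-agent") {
      const custom = headers[headerName];
      if (custom !== undefined) return custom;
    }
  }
  return presetUa;
}

// Placeholder strings, not the actual preset values.
const fromPreset = resolveUserAgent({ Accept: "*/*" }, "ExamplePreset/1.0");
const fromHeader = resolveUserAgent({ "user-agent": "MyCrawler/2.0" }, "ExamplePreset/1.0");
// fromPreset === "ExamplePreset/1.0", fromHeader === "MyCrawler/2.0"
```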
Anti-Bot Detection
After every request, Scrapeman inspects the response for anti-bot signals and shows a dismissable banner above the body when one is found.
| Signal | Trigger |
|---|---|
| Cloudflare | cf-ray header present, or HTTP 403 with a Cloudflare browser-check body |
| Rate limited | HTTP 429 or a Retry-After header |
| CAPTCHA | Body contains hcaptcha, recaptcha, captcha-container, or turnstile |
| Bot block | HTTP 403 with body matching access denied, bot detected, automated access, or automated request |
When a Retry-After header is present, the banner shows a countdown of the seconds left to wait.
Cloudflare is checked before rate limit, rate limit before CAPTCHA, CAPTCHA before bot block. Only one
signal is shown per response.
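The trigger table and the check order can be read as a single first-match function. The sketch below makes several assumptions that are not stated in the docs: matching is case-insensitive, header names arrive lowercased, the Cloudflare browser-check body is recognized by the placeholder phrase in `CLOUDFLARE_BODY_MARKER`, and only the numeric form of Retry-After is parsed. All names are hypothetical.

```typescript
type AntiBotSignal = "cloudflare" | "rate-limited" | "captcha" | "bot-block";

interface HttpResult {
  status: number;
  headers: Record<string, string>; // assumed to use lowercase header names
  body: string;
}

const CAPTCHA_MARKERS = ["hcaptcha", "recaptcha", "captcha-container", "turnstile"];
const BOT_BLOCK_PHRASES = ["access denied", "bot detected", "automated access", "automated request"];
// Assumption: the docs do not give the exact browser-check pattern.
const CLOUDFLARE_BODY_MARKER = "checking your browser";

// Checks run in the documented order and stop at the first match,
// so at most one signal is reported per response.
function detectSignal(res: HttpResult): AntiBotSignal | null {
  const body = res.body.toLowerCase();
  const isCloudflareBody = res.status === 403 && body.includes(CLOUDFLARE_BODY_MARKER);
  if (res.headers["cf-ray"] !== undefined || isCloudflareBody) return "cloudflare";
  if (res.status === 429 || res.headers["retry-after"] !== undefined) return "rate-limited";
  if (CAPTCHA_MARKERS.some((marker) => body.includes(marker))) return "captcha";
  if (res.status === 403 && BOT_BLOCK_PHRASES.some((phrase) => body.includes(phrase))) {
    return "bot-block";
  }
  return null;
}

// Seconds to wait from a numeric Retry-After header, or null if the header
// is absent or not a plain number (the HTTP-date form is not handled here).
function retryAfterSeconds(res: HttpResult): number | null {
  const raw = res.headers["retry-after"];
  if (raw === undefined || !/^\d+$/.test(raw.trim())) return null;
  return Number(raw.trim());
}
```

For example, a 429 response that also carries a cf-ray header is reported as Cloudflare rather than as rate limited, because the Cloudflare check runs first.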
Rate Limiting
The per-request rate limit controls the delay that the Collection Runner and Load Runner insert between requests. It has no effect on a single send. Configure it under Settings → Rate limit:
- Fixed delay — wait this many milliseconds after each request.
- Jitter min / max — add a random extra delay between min and max ms on top.
Run-level delay (from the Load Runner config) and per-request rate limit do not stack: if the run-level delay is greater than 0, the per-request rate limit is not added on top.
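One way to read the delay rules is the small function below. It is a sketch, not the runner's actual code: `delayBeforeNextRequest` and the field names are hypothetical, jitter is assumed to be drawn uniformly between min and max, and the run-level delay is assumed to be used on its own whenever it is greater than 0.

```typescript
interface RateLimit {
  fixedDelayMs: number;
  jitterMinMs: number;
  jitterMaxMs: number;
}

// Delay in milliseconds to wait after a request inside a run.
// `rng` is injectable so the jitter can be made deterministic in tests.
function delayBeforeNextRequest(
  runLevelDelayMs: number,
  limit: RateLimit,
  rng: () => number = Math.random
): number {
  // A positive run-level delay takes over; the per-request limit is not added.
  if (runLevelDelayMs > 0) return runLevelDelayMs;
  // Otherwise: fixed delay plus a random extra between jitter min and max.
  const jitter = limit.jitterMinMs + rng() * (limit.jitterMaxMs - limit.jitterMinMs);
  return limit.fixedDelayMs + Math.round(jitter);
}

const exampleLimit: RateLimit = { fixedDelayMs: 200, jitterMinMs: 50, jitterMaxMs: 150 };
const withoutRunDelay = delayBeforeNextRequest(0, exampleLimit, () => 0.5);
const withRunDelay = delayBeforeNextRequest(500, exampleLimit, () => 0.5);
// withoutRunDelay === 300 (200 fixed + 100 jitter), withRunDelay === 500
```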