Scrapoxy

What is Scrapoxy?

Scrapoxy is a super proxy aggregator, allowing you to manage all proxies in one place 🎯, rather than spreading them across multiple scrapers 🕸️.

It also smartly handles traffic routing 🔀 to minimize bans and increase success rates 🚀.

🚀🚀 GO TO SCRAPOXY.IO FOR MORE INFORMATION! 🚀🚀

Features

☁️ Datacenter Providers with easy installation ☁️

Scrapoxy supports many datacenter providers like AWS, Azure, or GCP.

It installs a proxy image in each datacenter so that proxy instances can be launched quickly. Traffic is routed through these instances to provide many IP addresses.

Scrapoxy handles the startup/shutdown of proxy instances to rotate IP addresses effectively.

🌐 Proxy Services 🌐

Scrapoxy supports many proxy services like Rayobyte, IPRoyal or Zyte.

It connects to these services and uses parameters such as country or OS type to create a diverse set of proxies.

💻 Proxy Hardware 💻

Scrapoxy supports many types of 4G proxy farm hardware, such as Proxidize.

It uses their APIs to handle IP rotation on 4G networks.

📜 Free Proxy Lists 📜

Scrapoxy supports lists of HTTP/HTTPS proxies and SOCKS4/SOCKS5 proxies.

It takes care of testing their connectivity to aggregate them into the proxy pool.

⏰ Timeout free ⏰

Scrapoxy only routes traffic to online proxies.

This feature is useful with residential proxies. Sometimes, proxies may be too slow or inactive. Scrapoxy detects these offline nodes and excludes them from the proxy pool.
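
As a rough illustration of the idea (not Scrapoxy's actual implementation), a pool can record when each proxy last passed a health check and route only to those seen recently. The `Proxy` class, the `online_proxies` function, and the 10-second threshold are all hypothetical names and values chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Proxy:
    name: str
    last_seen: float  # timestamp (seconds) of the last successful health check


def online_proxies(pool, now, max_silence=10.0):
    """Return the proxies whose last successful check is recent enough."""
    return [p for p in pool if now - p.last_seen <= max_silence]


pool = [Proxy("fast", last_seen=100.0), Proxy("stalled", last_seen=80.0)]
names = [p.name for p in online_proxies(pool, now=105.0)]
print(names)  # only "fast" answered within the last 10 seconds
```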

🔄 Auto-Rotate proxies 🔄

Scrapoxy automatically changes IP addresses at regular intervals.

Scrapers can have thousands of IP addresses without managing proxy rotation.

🏃 Auto-Scale proxies 🏃

Scrapoxy monitors incoming traffic and automatically scales the number of proxies according to your needs.

It also reduces proxy count to minimize your costs.
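
One simple way to picture such a policy, under assumptions of my own, is to run at full capacity while traffic is recent and fall back to a minimum once the scrapers go quiet. The function name, the parameters, and the 10-minute idle delay below are hypothetical and only sketch the principle.

```python
def target_proxy_count(last_request_at, now, min_proxies=1, max_proxies=10, idle_delay=600.0):
    """Pick a proxy count: full capacity while traffic is recent, minimum once idle."""
    if last_request_at is not None and now - last_request_at < idle_delay:
        return max_proxies
    return min_proxies


busy = target_proxy_count(last_request_at=1000.0, now=1030.0)  # a request 30 s ago
idle = target_proxy_count(last_request_at=1000.0, now=2000.0)  # silent for 1000 s
print(busy, idle)
```

Keeping the minimum above zero means a proxy is already available when traffic resumes; setting it to zero would instead favor cost.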

🍪 Sticky sessions on Browser 🍪

Scrapoxy can keep the same IP address for a scraping session, even for browsers.

It includes an HTTP request/response interception mechanism that injects a session cookie, so the same IP address is used throughout the browser session.
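
The cookie-based pinning can be sketched as follows. This is a minimal model, not Scrapoxy's code: the `StickyRouter` class, the `session_id` cookie name, and the round-robin choice of proxy are assumptions made for illustration.

```python
import itertools
import uuid


class StickyRouter:
    """Pins every session ID to one proxy for as long as the session lasts."""

    def __init__(self, proxies):
        self._next_proxy = itertools.cycle(proxies)
        self._session_to_proxy = {}

    def route(self, cookies):
        """Return (proxy, session_id); a new ID is created when the cookie is missing."""
        session_id = cookies.get("session_id")  # hypothetical cookie name
        if session_id not in self._session_to_proxy:
            session_id = session_id or uuid.uuid4().hex
            self._session_to_proxy[session_id] = next(self._next_proxy)
        return self._session_to_proxy[session_id], session_id


router = StickyRouter(["proxy-a", "proxy-b"])
first_proxy, sid = router.route({})                 # no cookie yet: one is issued
again_proxy, _ = router.route({"session_id": sid})  # cookie sent back: same proxy
other_proxy, _ = router.route({})                   # a new session gets the next proxy
```

In a real interception layer, the returned `session_id` would be written into the response as a cookie so that the browser sends it back on every later request.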

🚨 Ban management 🚨

Scrapoxy injects the name of the proxy into the HTTP responses.

When a scraper detects that a ban has occurred, it can notify Scrapoxy to remove the proxy from the pool.
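
On the scraper side, the detection step might look like the sketch below. The header name `x-scrapoxy-proxyname` and the status codes treated as bans are assumptions to verify against the documentation on scrapoxy.io; the call that actually asks Scrapoxy to remove the proxy is deliberately left out.

```python
BAN_STATUS_CODES = {403, 429}  # assumption: what counts as a ban depends on the target site
PROXY_NAME_HEADER = "x-scrapoxy-proxyname"  # assumed header name; check the docs


def banned_proxy(status_code, headers):
    """Return the proxy name to report if the response looks like a ban, else None."""
    if status_code not in BAN_STATUS_CODES:
        return None
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    return lowered.get(PROXY_NAME_HEADER)


ok = banned_proxy(200, {"X-Scrapoxy-Proxyname": "proxy-42"})
banned = banned_proxy(429, {"X-Scrapoxy-Proxyname": "proxy-42"})
print(ok, banned)
# A scraper would pass `banned` to Scrapoxy here so the proxy is removed from the pool.
```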

📡 Traffic interception 📡

Scrapoxy intercepts HTTP requests and responses to modify headers, keeping your scraping stack consistent. It can add session cookies or specific headers such as User-Agent.

📊 Traffic monitoring 📊

Scrapoxy measures incoming and outgoing traffic to provide an overview of your scraping session.

It tracks metrics such as the number of requests, active proxy count, requests per proxy, and more.
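
To make these metrics concrete, here is a small hypothetical collector that counts requests per proxy and sums traffic in both directions. The `TrafficStats` class and its field names are invented for this sketch and do not reflect Scrapoxy's internal data model.

```python
from collections import Counter


class TrafficStats:
    """Accumulates per-proxy request counts and total bytes in each direction."""

    def __init__(self):
        self.requests_per_proxy = Counter()
        self.bytes_sent = 0
        self.bytes_received = 0

    def record(self, proxy, sent, received):
        """Register one request that went through `proxy`."""
        self.requests_per_proxy[proxy] += 1
        self.bytes_sent += sent
        self.bytes_received += received

    @property
    def total_requests(self):
        return sum(self.requests_per_proxy.values())


stats = TrafficStats()
stats.record("proxy-a", sent=200, received=5000)
stats.record("proxy-a", sent=150, received=3000)
stats.record("proxy-b", sent=100, received=1000)
print(stats.total_requests, dict(stats.requests_per_proxy), stats.bytes_received)
```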

🌍 Coverage monitoring 🌍

Scrapoxy displays the geographic coverage of your proxies so you can see how they are distributed around the world.

🚀 Easy-to-use and production-ready 🚀

Scrapoxy is suitable for both beginners and experts.

It can be started in seconds using Docker, or be deployed in a complex, distributed environment with Kubernetes.

🔓 Free and Open Source 🔓

And of course, Scrapoxy remains free and open source, under the MIT license.

I simply ask you to give me credit if you redistribute or use it in a project 🙌.

A warm thank-you message is appreciated as well 😃🙏.

Documentation

More information is available on scrapoxy.io.

Contributors

Want to contribute? Check out the guide!

Here is my contact on

Sponsorship

Scrapoxy is an open-source project. The project is free for users, but it does come with costs for me.

I invest significant time and resources into maintaining and improving this project, covering expenses for hosting, promotion, and more.

If you appreciate the value Scrapoxy provides and wish to support its continued development, discuss new features, access the roadmap, or receive professional support, please consider becoming a sponsor!

Your support would greatly contribute to the project's sustainability and growth:

Licence

See The MIT License (MIT)

Acknowledgements

I would like to thank all the contributors to the project and the open-source community for their support.
