The open-source browser API for AI agents & apps.
The best way to build live web agents and browser automation tools.
Steel.dev is an open-source browser API that makes it easy to build AI apps and agents that interact with the web. Instead of building automation infrastructure from scratch, you can focus on your AI application while Steel handles the complexity.
This repo is the core building block behind Steel: a production-ready, containerized browser sandbox that you can deploy anywhere. It includes built-in stealth capabilities, text-to-markdown conversion, session management, a UI to view and debug sessions, and full browser control through standard automation frameworks like Puppeteer, Playwright, and Selenium 🔥
Steel is in public beta and evolving every day. Your suggestions, ideas, and reported bugs help us immensely. Do not hesitate to join in the conversation on Discord or raise a GitHub issue. We read everything, respond to most, and love you.
The easiest way to get started with Steel is by creating a Steel Cloud account.
Installation methods | Link |
---|---|
Run locally with GHCR | |
1-click deploy to Railway | |
1-click deploy to Render | |
The steel-browser provides a REST API to control, run, and manage a production-ready browser environment. Under the hood, it manages browser instances, sessions, and pages, allowing you to perform complex browsing tasks programmatically without any of the headaches:
- Full Browser Control: Uses Puppeteer and CDP for complete control over Chrome instances -- allowing you to connect using Puppeteer, Playwright, or Selenium.
- Session Management: Maintains browser state, cookies, and local storage across requests
- Proxy Support: Built-in proxy chain management for IP rotation
- Extension Support: Load custom Chrome extensions for enhanced functionality
- Debugging Tools: Built-in request logging and session recording capabilities
- Anti-Detection: Includes stealth plugins and fingerprint management
- Resource Management: Automatic cleanup and browser lifecycle management
- Browser Tools: Exposes APIs to quickly convert pages to markdown, readability, screenshots, or PDFs.
For detailed API documentation and examples, check out our API reference or explore the Swagger UI directly at http://0.0.0.0:3000/documentation.
The fastest way to get started is to build and run the Docker image:
# Clone and build the Docker image
git clone https://github.com/steel-dev/steel-browser
cd steel-browser
docker compose up
Alternatively, if you have Node.js and Chrome installed, you can run the server directly:
npm install
npm run dev
This will start the Steel server on port 3000.
Make sure you have the Chrome executable installed and in one of these paths:
- Linux: /usr/bin/google-chrome
- macOS: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
- Windows: C:\Program Files\Google\Chrome\Application\chrome.exe OR C:\Program Files (x86)\Google\Chrome\Application\chrome.exe
For more details on where this is checked, look at api/src/utils/browser.ts.
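As a rough illustration of this kind of lookup, the sketch below checks the default locations listed above for a given platform. It is not the logic in api/src/utils/browser.ts; `chromeCandidates` and `findChrome` are hypothetical helper names introduced only for this example.

```javascript
const fs = require("node:fs");

// Default Chrome locations from the list above, keyed by Node's process.platform values.
const CHROME_PATHS = {
  linux: ["/usr/bin/google-chrome"],
  darwin: ["/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"],
  win32: [
    "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe",
    "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe",
  ],
};

// Hypothetical helper: candidate paths for a platform (empty list if unknown).
function chromeCandidates(platform = process.platform) {
  return CHROME_PATHS[platform] ?? [];
}

// Hypothetical helper: first candidate that exists on disk, or null if none does.
// The `exists` parameter defaults to fs.existsSync and can be swapped out in tests.
function findChrome(platform = process.platform, exists = fs.existsSync) {
  return chromeCandidates(platform).find((p) => exists(p)) ?? null;
}
```

Calling `findChrome()` with no arguments checks the current platform and returns `null` when none of the listed paths exists.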
The Steel browser provides a REST API to control a headless browser powered by Puppeteer. Under the hood, it manages browser instances, sessions, and pages, allowing you to perform complex browsing tasks programmatically.
The full REST API documentation can be found on your Steel instance at /documentation (e.g., http://0.0.0.0:3000/documentation).
Steel provides three main ways to let your agents do browser automation:
The /scrape, /screenshot, and /pdf endpoints let you quickly extract clean, well-formatted data from any webpage using the running Steel server. Ideal for simple, read-only, on-demand jobs:
Extract the HTML content of a web page.
# Example using the Actions API
curl -X POST http://0.0.0.0:3000/v1/scrape \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"waitFor": 1000
}'
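The same request can be issued from Node.js. The sketch below assumes Node 18 or later (for the global `fetch`) and a Steel server listening at http://0.0.0.0:3000. `buildScrapeRequest` and `scrape` are hypothetical helper names, and the response is returned as raw text because its shape is not described here.

```javascript
// Assumed base URL of a locally running Steel server.
const STEEL_URL = "http://0.0.0.0:3000";

// Hypothetical helper: builds the endpoint and fetch options for POST /v1/scrape,
// mirroring the curl example (same "url" and "waitFor" fields).
function buildScrapeRequest(url, waitFor = 1000, baseUrl = STEEL_URL) {
  return {
    endpoint: `${baseUrl}/v1/scrape`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url, waitFor }),
    },
  };
}

// Hypothetical helper: sends the request and returns the response body as text.
async function scrape(url, waitFor) {
  const { endpoint, init } = buildScrapeRequest(url, waitFor);
  const res = await fetch(endpoint, init);
  if (!res.ok) throw new Error(`Scrape failed with HTTP ${res.status}`);
  return res.text();
}
```

With the server running, `scrape("https://example.com").then(console.log)` would print whatever the endpoint returns.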
Take a screenshot of a web page.
# Example using the Actions API
curl -X POST http://0.0.0.0:3000/v1/screenshot \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"fullPage": true
}' --output screenshot.png
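A Node.js counterpart to the curl command might look like the sketch below. It assumes Node 18 or later and, as the `--output screenshot.png` flag suggests, that the response body is the raw image. `buildScreenshotRequest` and `saveScreenshot` are hypothetical helper names.

```javascript
const { writeFile } = require("node:fs/promises");

// Hypothetical helper: builds the endpoint and fetch options for POST /v1/screenshot,
// using the same "url" and "fullPage" fields as the curl example.
function buildScreenshotRequest(url, fullPage = true, baseUrl = "http://0.0.0.0:3000") {
  return {
    endpoint: `${baseUrl}/v1/screenshot`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url, fullPage }),
    },
  };
}

// Hypothetical helper: requests a screenshot and writes the response bytes to disk.
// Assumption: the server responds with the image itself rather than a JSON wrapper.
async function saveScreenshot(url, outPath = "screenshot.png") {
  const { endpoint, init } = buildScreenshotRequest(url);
  const res = await fetch(endpoint, init);
  if (!res.ok) throw new Error(`Screenshot failed with HTTP ${res.status}`);
  await writeFile(outPath, Buffer.from(await res.arrayBuffer()));
  return outPath;
}
```

Keeping the request builder separate from the network call makes the payload easy to inspect before anything is sent.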
The Browser Sessions API lets you relaunch the browser with custom options or extensions (e.g. with a custom proxy) and also reset the browser state. Perfect for complex, stateful workflows that need fine-grained control:
# Launch a new browser session
# Add any other custom launch options alongside "proxy"
curl -X POST http://0.0.0.0:3000/v1/sessions \
-H "Content-Type: application/json" \
-d '{
  "options": {
    "proxy": "user:pass@host:port"
  }
}'
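Because JSON does not allow comments, it can be easier to assemble the session payload in code and serialize it. The sketch below assumes Node 18 or later and the `options.proxy` field shown in the curl example; `buildSessionBody` and `createSession` are hypothetical helper names, and any extra option names passed through are not defined by this document.

```javascript
// Hypothetical helper: builds the JSON body for POST /v1/sessions.
// Extra launch options are merged in as-is; "proxy" is only set when provided.
function buildSessionBody(proxy, extraOptions = {}) {
  const options = { ...extraOptions };
  if (proxy) options.proxy = proxy;
  return JSON.stringify({ options });
}

// Hypothetical helper: launches a session and returns the raw response text.
async function createSession(proxy, extraOptions, baseUrl = "http://0.0.0.0:3000") {
  const res = await fetch(`${baseUrl}/v1/sessions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildSessionBody(proxy, extraOptions),
  });
  if (!res.ok) throw new Error(`Session launch failed with HTTP ${res.status}`);
  return res.text();
}
```

For example, `createSession("user:pass@host:port")` sends the same body as the curl command.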
Note: The Selenium integration does not support all the features of the CDP-based browser sessions API.
For teams with existing Selenium workflows, the Steel browser provides a drop-in replacement that adds enhanced features while maintaining compatibility:
# Launch a Selenium session
# Put any Selenium-compatible options inside "options"
curl -X POST http://0.0.0.0:3000/v1/selenium/launch \
-H "Content-Type: application/json" \
-d '{
  "options": {}
}'
The Selenium API is fully compatible with Selenium's WebDriver protocol, so you can use any existing Selenium clients to connect to the Steel browser.
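// Example using the Selenium API
// Requires the selenium-webdriver package (npm install selenium-webdriver)
const { Builder } = require("selenium-webdriver");

async function main() {
  // Point the WebDriver client at the Steel server instead of a local driver
  const builder = new Builder()
    .forBrowser("chrome")
    .usingServer(`http://0.0.0.0:3000/selenium`);
  const driver = await builder.build();
  console.log("Navigating to Google");
  await driver.get("https://www.google.com");
  // The rest of your Selenium code here...
}

main();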
The Steel browser is an open-source project, and we welcome contributions!