🚧 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you automate the web without worrying about infrastructure.


syllogy/steel-browser

 
 



Steel

The open-source browser API for AI agents & apps.
The best way to build live web agents and browser automation tools.

Steel.dev is an open-source browser API that makes it easy to build AI apps and agents that interact with the web. Instead of building automation infrastructure from scratch, you can focus on your AI application while Steel handles the complexity.

This repo is the core building block behind Steel: a production-ready, containerized browser sandbox that you can deploy anywhere. It includes built-in stealth capabilities, text-to-markdown conversion, session management, a UI to view and debug sessions, and full browser control through standard automation frameworks like Puppeteer, Playwright, and Selenium 🔥

Steel is in public beta and evolving every day. Your suggestions, ideas, and reported bugs help us immensely. Do not hesitate to join in the conversation on Discord or raise a GitHub issue. We read everything, respond to most, and love you.

⚡ Installation

The easiest way to get started with Steel is by creating a Steel Cloud account.

Installation methods | Link
Run locally with GHCR | Deploy with GitHub Container Registry
1-click deploy to Railway | Deploy on Railway
1-click deploy to Render | Deploy to Render

✨ Highlights

The steel-browser provides a REST API to control, run, and manage a production-ready browser environment. Under the hood, it manages browser instances, sessions, and pages, allowing you to perform complex browsing tasks programmatically without any of the headaches:

  • Full Browser Control: Uses Puppeteer and CDP for complete control over Chrome instances -- allowing you to connect using Puppeteer, Playwright, or Selenium.
  • Session Management: Maintains browser state, cookies, and local storage across requests
  • Proxy Support: Built-in proxy chain management for IP rotation
  • Extension Support: Load custom Chrome extensions for enhanced functionality
  • Debugging Tools: Built-in request logging and session recording capabilities
  • Anti-Detection: Includes stealth plugins and fingerprint management
  • Resource Management: Automatic cleanup and browser lifecycle management
  • Browser Tools: Exposes APIs to quickly convert pages to markdown, readability, screenshots, or PDFs

For detailed API documentation and examples, check out our API reference or explore the Swagger UI directly at http://0.0.0.0:3000/documentation.

Make sure to give us a star ⭐

Star us on GitHub!

Getting Started

The fastest way to get started is to build and run the Docker image:

# Clone and build the Docker image
git clone https://github.com/steel-dev/steel-browser
cd steel-browser
docker build -t steel .

# Run the server
docker run -p 3000:3000 -p 5173:5173 -p 9223:9223 steel

Alternatively, if you have Node.js and Chrome installed, you can run the server directly:

npm install
npm run dev

This will start the Steel server on port 3000.

Make sure you have the Chrome executable installed and in one of these paths:

  • Linux: /usr/bin/google-chrome

  • macOS: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome

  • Windows:

    • C:\Program Files\Google\Chrome\Application\chrome.exe OR
    • C:\Program Files (x86)\Google\Chrome\Application\chrome.exe

For more details on where this is checked, look at src/server/utils/browser.ts.
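
If the server cannot find Chrome, a quick check like the one below can confirm whether any of the default paths exists on your machine. This is a minimal sketch, not the logic in src/server/utils/browser.ts, which may differ; `CHROME_PATHS` and `findChrome` are hypothetical names introduced only for illustration.

```javascript
// Sketch only: the real lookup lives in src/server/utils/browser.ts and may differ.
const fs = require("node:fs");

// Default Chrome locations listed above, keyed by Node's process.platform values.
const CHROME_PATHS = {
  linux: ["/usr/bin/google-chrome"],
  darwin: ["/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"],
  win32: [
    "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe",
    "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe",
  ],
};

// Hypothetical helper: returns the first candidate path that exists, or null.
// `exists` is injectable so the lookup can be exercised without touching the disk.
function findChrome(platform = process.platform, exists = fs.existsSync) {
  const candidates = CHROME_PATHS[platform] || [];
  return candidates.find((p) => exists(p)) || null;
}

console.log(findChrome() ?? "Chrome not found in the default locations");
```

Run it with Node.js before `npm run dev`; if it reports that Chrome was not found, install Chrome or check the paths above.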

Usage

The Steel browser provides a REST API to control a headless browser powered by Puppeteer. Under the hood, it manages browser instances, sessions, and pages, allowing you to perform complex browsing tasks programmatically.

The full REST API documentation can be found on your Steel instance at /documentation (e.g., http://0.0.0.0:3000/documentation).

Steel provides three main ways to let your agents do browser automation:

Quick Actions API

The /scrape, /screenshot, and /pdf endpoints let you quickly extract clean, well-formatted data from any webpage using the running Steel server. Ideal for simple, read-only, on-demand jobs:

Scrape a Web Page

Extract the HTML content of a web page.

# Example using the Actions API
curl -X POST http://0.0.0.0:3000/v1/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "waitFor": 1000
  }'
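
The same request can be sent from Node.js. The sketch below assumes Node 18+ (for the global `fetch`), a Steel server reachable at http://localhost:3000, and a JSON response body; the helper names `buildScrapeRequest` and `scrape` are hypothetical and not part of Steel.

```javascript
// Builds the same request the curl example sends to /v1/scrape.
// baseUrl is an assumption: adjust it to wherever your Steel server runs.
function buildScrapeRequest(url, waitFor = 1000, baseUrl = "http://localhost:3000") {
  return {
    endpoint: `${baseUrl}/v1/scrape`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url, waitFor }),
    },
  };
}

// Sends the request; needs a running Steel server and Node 18+ (global fetch).
async function scrape(url, waitFor) {
  const { endpoint, init } = buildScrapeRequest(url, waitFor);
  const res = await fetch(endpoint, init);
  if (!res.ok) throw new Error(`Scrape failed: HTTP ${res.status}`);
  // Assumption: the response is JSON; check /documentation for the exact shape.
  return res.json();
}
```

With the server running, `scrape("https://example.com").then(console.log)` should print whatever the endpoint returns for that page.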

Take a Screenshot

Take a screenshot of a web page.

# Example using the Actions API
curl -X POST http://0.0.0.0:3000/v1/screenshot \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "fullPage": true
  }' --output screenshot.png
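
As with curl's `--output`, a Node.js client has to write the raw response body to disk. The sketch below assumes Node 18+, a server at http://localhost:3000, and that the endpoint returns the image bytes directly (as the curl example implies); `buildScreenshotRequest` and `saveScreenshot` are hypothetical helper names.

```javascript
const fs = require("node:fs/promises");

// Builds the same request the curl example sends to /v1/screenshot.
// baseUrl is an assumption: adjust it to wherever your Steel server runs.
function buildScreenshotRequest(url, fullPage = true, baseUrl = "http://localhost:3000") {
  return {
    endpoint: `${baseUrl}/v1/screenshot`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url, fullPage }),
    },
  };
}

// Requests a screenshot and writes the response bytes to outFile.
// Returns the number of bytes written.
async function saveScreenshot(url, outFile = "screenshot.png") {
  const { endpoint, init } = buildScreenshotRequest(url);
  const res = await fetch(endpoint, init);
  if (!res.ok) throw new Error(`Screenshot failed: HTTP ${res.status}`);
  // The curl example saves the raw body, so do the same here.
  const bytes = Buffer.from(await res.arrayBuffer());
  await fs.writeFile(outFile, bytes);
  return bytes.length;
}
```

With the server running, `saveScreenshot("https://example.com")` writes screenshot.png to the current directory.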

Browser Sessions API

The Browser Sessions API lets you relaunch the browser with custom options or extensions (e.g. with a custom proxy) and also reset the browser state. Perfect for complex, stateful workflows that need fine-grained control:

# Launch a new browser session
# Add any other custom launch options next to "proxy" inside "options"
curl -X POST http://0.0.0.0:3000/v1/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "options": {
      "proxy": "user:pass@host:port"
    }
  }'
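
In an agent, the session request is usually built in code so the proxy can be set per run. The sketch below mirrors the curl payload above and assumes Node 18+, a server at http://localhost:3000, and a JSON response; `buildSessionBody` and `createSession` are hypothetical names, and only the `proxy` option shown above is used.

```javascript
// Hypothetical helper: builds the body for POST /v1/sessions.
// Only "proxy" is taken from the curl example; other options are not assumed.
function buildSessionBody({ proxy } = {}) {
  const options = {};
  if (proxy) options.proxy = proxy; // e.g. "user:pass@host:port"
  return { options };
}

// Launches a session; needs a running Steel server and Node 18+ (global fetch).
async function createSession(settings, baseUrl = "http://localhost:3000") {
  const res = await fetch(`${baseUrl}/v1/sessions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildSessionBody(settings)),
  });
  if (!res.ok) throw new Error(`Session launch failed: HTTP ${res.status}`);
  // Assumption: the response is JSON; check /documentation for the exact shape.
  return res.json();
}
```

Building the body separately keeps the payload valid JSON, which matters because JSON does not allow comments or trailing commas.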

Selenium Integration

Note: This integration does not support all the features of the CDP-based browser sessions API.

For teams with existing Selenium workflows, the Steel browser provides a drop-in replacement that adds enhanced features while maintaining compatibility:

# Launch a Selenium session
# Put any Selenium-compatible options inside "options"
curl -X POST http://0.0.0.0:3000/v1/selenium/launch \
  -H "Content-Type: application/json" \
  -d '{
    "options": {}
  }'

The Selenium API is fully compatible with Selenium's WebDriver protocol, so you can use any existing Selenium client to connect to the Steel browser.

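// Example using the Selenium API
// Requires the selenium-webdriver package and an ES module context (top-level await)
import { Builder } from "selenium-webdriver";

// Point the WebDriver client at the Steel server instead of a local driver
const builder = new Builder()
  .forBrowser("chrome")
  .usingServer("http://0.0.0.0:3000/selenium");

const driver = await builder.build();

console.log("Navigating to Google");
await driver.get("https://www.google.com");

// The rest of your Selenium code here...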

Get involved

The Steel browser is an open-source project, and we welcome contributions!

  • Questions/ideas/feedback? Come hang out on Discord
  • Found a bug? Open an issue on GitHub

License

Apache 2.0
