Description
Describe the Bug
When a page fails to scrape, the SDK does not raise an exception; instead, the returned content is simply empty.
To Reproduce
Steps to reproduce the issue:
- Wait for a website to raise the exception `All scraping methods failed for url:` on the dashboard
- View the return result from the `.scrape_url(url)` method
- Example:
```python
{'content': '', 'markdown': '', 'linksOnPage': [], 'metadata': {'sourceURL': 'https://ycombinator.com/people', 'pageStatusCode': 200}}
```
Expected Behavior
The `scrape_url` method should raise an exception instead of returning empty content.
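Until the SDK raises on failed scrapes, a client-side guard can approximate the expected behavior. This is a minimal sketch, not the SDK's own API: the `check_scrape_result` helper and `ScrapeFailedError` are hypothetical names, and the result shape assumed here matches the empty payload shown above.

```python
class ScrapeFailedError(Exception):
    """Raised when scrape_url silently returns an empty result."""


def check_scrape_result(result: dict) -> dict:
    """Raise if a scrape failed silently (both content and markdown empty)."""
    if not result.get('content') and not result.get('markdown'):
        url = result.get('metadata', {}).get('sourceURL', '<unknown>')
        raise ScrapeFailedError(f"All scraping methods failed for URL: {url}")
    return result


# The empty payload observed in this report:
empty = {'content': '', 'markdown': '', 'linksOnPage': [],
         'metadata': {'sourceURL': 'https://ycombinator.com/people',
                      'pageStatusCode': 200}}
# check_scrape_result(empty) raises ScrapeFailedError
```

In practice the helper would wrap each `app.scrape_url(url)` call, turning the silent empty result into an exception the caller can handle or retry.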
Environment (please complete the following information):
- OS: Linux (python:3.12.3-bookworm image)
- Firecrawl Version: ^0.0.20 Python SDK
- Node.js Version: v23.1.0
Logs
```json
{
  "url": "https://ycombinator.com/people",
  "type": "scrape",
  "method": "fetch",
  "result": {
    "error": null,
    "success": false,
    "time_taken": 591,
    "response_code": 200,
    "response_size": 55917
  },
  "createdAt": "2024-10-31T05:50:34.653524+00:00"
}
{
  "type": "error",
  "stack": "Error: All scraping methods failed for URL: https://ycombinator.com/people\n    at scrapSingleUrl (/app/dist/src/scraper/WebScraper/single_url.js:378:19)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async /app/dist/src/scraper/WebScraper/index.js:66:32\n    at async Promise.all (index 0)\n    at async WebScraperDataProvider.convertUrlsToDocuments (/app/dist/src/scraper/WebScraper/index.js:64:13)\n    at async Promise.all (index 0)\n    at async WebScraperDataProvider.processLinks (/app/dist/src/scraper/WebScraper/index.js:208:40)\n    at async WebScraperDataProvider.handleSingleUrlsMode (/app/dist/src/scraper/WebScraper/index.js:174:25)\n    at async runWebScraper (/app/dist/src/main/runWebScraper.js:77:23)\n    at async startWebScraperPipeline (/app/dist/src/main/runWebScraper.js:13:13)\n    at async processJob (/app/dist/src/services/queue-worker.js:236:44)\n    at async processJobInternal (/app/dist/src/services/queue-worker.js:72:24)\n    at async /app/dist/src/services/queue-worker.js:174:39\n    at async /app/dist/src/services/queue-worker.js:161:25",
  "message": "All scraping methods failed for URL: https://ycombinator.com/people",
  "createdAt": "2024-10-31T05:50:35.242789+00:00"
}
```
Additional Context
The error is inconsistent to reproduce, and I am not hitting the rate limit for my API key.