This project includes two Python-based scrapers that use the Crawlbase Crawling API to extract product data from Google Shopping:
- A SERP scraper to collect multiple products from the shopping search results.
- A product page scraper to extract detailed info from individual product listings.
📖 Read the full blog post here: How to Scrape Google Shopping Data
## Tech Stack

- Crawlbase Crawling API
- `requests` (handled internally by the Crawlbase SDK)
- BeautifulSoup for HTML parsing
- `json` for structured data output
## Setup

Install the required dependencies:

```bash
pip install crawlbase beautifulsoup4
```

Then add your Crawlbase token to the script(s):

```python
crawling_api = CrawlingAPI({'token': 'YOUR_CRAWLBASE_TOKEN'})
```

## SERP Scraper

- Scrapes multiple pages of Google Shopping search results.
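Once the token is configured, every request goes through the API client. As an illustration, here is a small fetch helper; the function name `fetch_html` is a hypothetical, and it assumes the SDK's documented response shape (a dict with a `status_code` and a bytes `body`):

```python
def fetch_html(api, url):
    """Fetch a page through the Crawling API client and return decoded HTML,
    or None on a non-200 status.

    `api` is a crawlbase CrawlingAPI instance (or anything with a
    compatible .get()); the SDK returns a dict with 'status_code'
    and a bytes 'body'.
    """
    response = api.get(url)
    if response["status_code"] == 200:
        return response["body"].decode("utf-8", errors="replace")
    return None
```

Checking the status before parsing avoids feeding error pages into BeautifulSoup.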
- Extracts:
- Product Title
- Price
- Image URL
- Retailer
- Product URL
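To illustrate the extraction step, here is a BeautifulSoup sketch. The class names below are placeholders — Google Shopping's real markup uses obfuscated, frequently changing class names, so the selectors in the actual scraper will differ:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for demonstration only; real class names differ.
SAMPLE = """
<div class="sh-dgr__content">
  <h3 class="product-title">Louis Vuitton Neverfull MM</h3>
  <span class="product-price">$2,030.00</span>
  <img class="product-image" src="https://example.com/image.jpg">
  <div class="product-retailer">Louis Vuitton</div>
  <a class="product-link" href="/shopping/product/123456789"></a>
</div>
"""

def parse_products(html):
    """Extract one dict per product card from a SERP HTML page."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for card in soup.select("div.sh-dgr__content"):
        link = card.select_one("a.product-link")
        products.append({
            "title": card.select_one("h3.product-title").get_text(strip=True),
            "price": card.select_one("span.product-price").get_text(strip=True),
            "image": card.select_one("img.product-image")["src"],
            "retailer": card.select_one("div.product-retailer").get_text(strip=True),
            # Product links are relative, so prepend the Google domain.
            "product_url": "https://www.google.com" + link["href"],
        })
    return products
```

The pattern — select each product card, then drill into it with `select_one` — stays the same even when the selectors change.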
Uses the `start` parameter to paginate (20 products per page).
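The pagination above can be sketched as a URL builder. The `tbm=shop` query shape is an assumption about the search URL, not taken from the scraper itself:

```python
from urllib.parse import quote_plus

def build_serp_urls(query, pages, per_page=20):
    """Build one Google Shopping URL per results page.

    Pagination uses the `start` offset in steps of `per_page` (20),
    so page N starts at N * per_page.
    """
    base = "https://www.google.com/search?tbm=shop&q="
    return [f"{base}{quote_plus(query)}&start={page * per_page}"
            for page in range(pages)]
```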
Run it:

```bash
python google_shopping_serp_scraper.py
```

Results are saved to `products.json`:

```json
[
  {
    "title": "Louis Vuitton Neverfull MM",
    "price": "$2,030.00",
    "image": "https://example.com/image.jpg",
    "retailer": "Louis Vuitton",
    "product_url": "https://www.google.com/shopping/product/123456789"
  },
  ...
]
```

## Product Page Scraper

Extracts detailed info from a single Google Shopping product page:
- Title
- Price
- Description
- Image URLs
Update the `product_url` in the script, then run:

```bash
python google_shopping_product_scraper.py
```

Details are saved to `product_details.json`:

```json
{
  "title": "Louis Vuitton Neverfull MM",
  "price": "$2,030.00",
  "description": "Iconic Louis Vuitton tote with a timeless design...",
  "images": ["https://example.com/image1.jpg", "https://example.com/image2.jpg"]
}
```

## Why Crawlbase?

Google Shopping employs strict bot protection. Routing requests through the Crawlbase Crawling API provides:
- IP rotation
- JavaScript rendering
- User-Agent spoofing
- Geo-targeting support
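These features are driven by request options. As a sketch, the hypothetical helper below passes Crawlbase's documented `country` (geo-targeting) and `page_wait` (render wait) parameters; check the current API docs before relying on exact names:

```python
def fetch_with_options(api, url, country="US", page_wait_ms=2000):
    """Fetch a page with geo-targeting and a render wait.

    `api` is a crawlbase CrawlingAPI instance (or any object with a
    compatible .get(url, options)). Option names mirror Crawlbase's
    documented `country` and `page_wait` parameters.
    """
    options = {"country": country, "page_wait": str(page_wait_ms)}
    response = api.get(url, options)
    return response if response["status_code"] == 200 else None
```

Note that JavaScript rendering also depends on which Crawlbase token you use (normal vs. JavaScript token).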
## Future Improvements

- Add CLI support for dynamic search terms and product links
- Combine both scrapers into a single flow
- Output data in CSV
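The CSV idea can be sketched with the standard library alone; `products_json_to_csv` is a hypothetical helper, not part of the current scripts:

```python
import csv
import json

def products_json_to_csv(json_path, csv_path):
    """Convert the scraper's products.json (a list of flat dicts)
    into a CSV file. Returns the number of rows written."""
    with open(json_path, encoding="utf-8") as f:
        products = json.load(f)
    if not products:
        return 0
    # Use the first product's keys as the CSV header.
    fieldnames = list(products[0].keys())
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(products)
    return len(products)
```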