News crawling with StormCrawler - stores content as WARC
Updated Dec 13, 2023 - Java
Easily crawl news portals or blog sites using StormCrawler.
Demonstration of how to use the Crawling Framework to set up a simple science news crawler and store the results in Elasticsearch. Use this configuration as a starting point for your own crawler.