This script searches detik news and returns a list of matching results.
The code that fetches detik news has been moved into a separate library, dn_scraper, and this repo now uses that library.
Install it with pip install dn-scraper
and import it with from dn_scraper import DetikNewsScraper
The goal is to separate the library that fetches the data from the interface that exposes it, such as the Flask web API this repo demonstrates.
You can take the library and build whatever interface you like on top of it, for example with FastAPI, Django, or a command-line interface.
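As an illustration of that separation, here is a minimal sketch of a command-line interface that receives the search function as an argument. The `fake_search` stand-in, its `(query, detail, limit)` signature, and the `run_cli` helper are all hypothetical; they are not part of dn_scraper, whose actual method names are not listed in this README. In practice you would pass in a small wrapper around `DetikNewsScraper` instead.

```python
import argparse
import json
from typing import Callable, List, Optional

# Hypothetical signature: (query, detail, limit) -> list of article dicts.
SearchFn = Callable[[str, bool, Optional[int]], List[dict]]


def fake_search(query: str, detail: bool, limit: Optional[int]) -> List[dict]:
    """Stand-in for the real scraper call; returns canned data only."""
    articles = [
        {"judul": f"Result {i} for {query}",
         "link": f"https://news.detik.com/example-{i}"}
        for i in range(1, 4)
    ]
    return articles if limit is None else articles[:limit]


def run_cli(argv: List[str], search_fn: SearchFn = fake_search) -> str:
    """Parse CLI arguments, call the injected search function, return JSON text."""
    parser = argparse.ArgumentParser(description="Search news from the command line")
    parser.add_argument("query", help="search query")
    parser.add_argument("--detail", action="store_true", help="include article body")
    parser.add_argument("--limit", type=int, default=None, help="maximum results")
    args = parser.parse_args(argv)
    results = search_fn(args.query, args.detail, args.limit)
    return json.dumps({"data": results, "length": len(results)}, indent=2)


print(run_cli(["makan siang gratis", "--limit", "2"]))
```

Because the interface only depends on a callable, the same search function could be reused behind a Flask, FastAPI, or Django view without changes.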
- Search for news articles by query
- Retrieve detailed article content
- Limit the number of search results
- Python 3.x
- Pip (Python package installer)
- Virtual environment (optional but recommended)
- Clone the repository:
  git clone <repository-url>
  cd <repository-directory>
- Create and activate a virtual environment (optional):
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install the required packages:
  pip install dn-scraper Flask
It should work without any further configuration. If something goes wrong, we would really appreciate it if you opened a new issue.
- Run the Flask application:
  python main.py
- Access the API endpoints:
  - Search for news articles:
    - With details and limit:
      GET /search?q=<query>&detail=<true|false>&limit=<number>
    - Without details and limit:
      GET /search?q=<query>
  - Parameters:
    - q: The search query (required).
    - detail: Whether to include the full article body in the response (true or false). Defaults to false.
    - limit: The maximum number of results to return. Defaults to None.
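The parameters above can be assembled into a request URL as in the following sketch. The base address `http://127.0.0.1:5000` is an assumption (Flask's default development address) and `build_search_url` is a hypothetical helper, not something this repo provides.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Assumption: Flask's default development address; adjust to your setup.
BASE_URL = "http://127.0.0.1:5000"


def build_search_url(query: str, detail: bool = False,
                     limit: Optional[int] = None,
                     base_url: str = BASE_URL) -> str:
    """Build a /search URL, omitting optional parameters left at their defaults."""
    if not query:
        raise ValueError("q is required")
    params = {"q": query}
    if detail:
        params["detail"] = "true"
    if limit is not None:
        params["limit"] = limit
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{base_url}/search?{urlencode(params, quote_via=quote)}"


print(build_search_url("makan siang gratis", detail=True, limit=5))
```

The resulting URL can be opened in a browser or passed to any HTTP client.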
  - Example Request:
    GET /search?q=makan%20siang%20gratis&detail=true&limit=5
  - Example Response:
    {
      "status": 200,
      "data": [
        {
          "judul": "Example Title",
          "link": "https://news.detik.com/example",
          "gambar": "https://example.com/image.jpg",
          "body": "Full article body...",
          "waktu": "2024-08-04T12:34:56"
        }
        // Additional articles...
      ],
      "length": 5
    }
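A response shaped like the example above could be consumed as in this sketch. It works on a hard-coded sample string rather than a live request. It assumes that `waktu` is always an ISO 8601 timestamp, as in the example, and that `body` may be missing when `detail` is false; neither is confirmed by this README. The `summarize` helper is hypothetical.

```python
import json
from datetime import datetime

# Sample payload mirroring the example response (one article only).
SAMPLE = """
{
  "status": 200,
  "data": [
    {
      "judul": "Example Title",
      "link": "https://news.detik.com/example",
      "gambar": "https://example.com/image.jpg",
      "body": "Full article body...",
      "waktu": "2024-08-04T12:34:56"
    }
  ],
  "length": 1
}
"""


def summarize(payload_text: str) -> list:
    """Return one small dict per article: title, URL, publish time, body flag."""
    payload = json.loads(payload_text)
    if payload.get("status") != 200:
        raise ValueError(f"unexpected status: {payload.get('status')}")
    rows = []
    for article in payload.get("data", []):
        rows.append({
            "title": article["judul"],   # "judul" is Indonesian for "title"
            "url": article["link"],
            # Assumes ISO 8601, as in the example response.
            "published": datetime.fromisoformat(article["waktu"]),
            "has_body": "body" in article,
        })
    return rows


for row in summarize(SAMPLE):
    print(row["published"].date(), row["title"], row["url"])
```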
-