Commit 6ddc226

feat: add README

Author: Encore
1 parent 24346de commit 6ddc226

1 file changed: +44 -0 lines changed
README.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# GitHub Scraper

This project is a GitHub scraper that uses Puppeteer to extract information about GitHub organizations and their repositories. It collects data such as organization details, top languages, and repository information.

## Installation

1. Clone the repository:

```bash
git clone https://github.com/ranbot-ai/github-scraper.git
cd github-scraper
```

2. Install the dependencies:

```bash
npm install
```

## Usage

1. Set the `ORG_NAME` environment variable to the GitHub organization you want to scrape:

```bash
export ORG_NAME=your-organization-name
```

2. Run the scraper:

```bash
npm start
```

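As a rough illustration of how a script might consume `ORG_NAME`, the sketch below validates the variable and builds the organization URL. It is not taken from the project's source: `resolveOrgUrl`, `GITHUB_BASE_URL`, and the error message are hypothetical names, and it assumes the scraper starts from the organization's page at `https://github.com/<organization>`.

```typescript
// Hypothetical helper; names and error text are illustrative, not the project's code.
const GITHUB_BASE_URL = "https://github.com";

// A minimal view of the environment, so the function can be tested without Node globals.
type Env = Record<string, string | undefined>;

function resolveOrgUrl(env: Env): string {
  // Treat a missing or whitespace-only value as "not set".
  const orgName = (env["ORG_NAME"] ?? "").trim();
  if (orgName === "") {
    throw new Error("ORG_NAME is not set; export it before running the scraper");
  }
  // Encode the name so unexpected characters cannot alter the URL path.
  return `${GITHUB_BASE_URL}/${encodeURIComponent(orgName)}`;
}
```

In a Node script this would typically be called as `resolveOrgUrl(process.env)`, failing early with a clear message instead of navigating to an invalid page.
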
## Features

- Extracts organization information including name, description, top languages, employee count, website, and social links.
- Scrapes repository data such as name, link, description, stars, forks, and pull requests.
- Handles pagination to scrape multiple pages of repositories.

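The pagination behavior listed above can be pictured with a minimal sketch. Everything here is an assumption for illustration: the `Repository` fields simply mirror the feature list, while `RepositoryPage`, `FetchPage`, `scrapeAllRepositories`, and the `maxPages` limit are hypothetical and may not match how the project is actually written. The page loader is passed in as a function, so the loop itself does not depend on Puppeteer.

```typescript
// Field names mirror the feature list; the project's real types may differ.
interface Repository {
  name: string;
  link: string;
  description: string;
  stars: number;
  forks: number;
  pullRequests: number;
}

// One page of results plus a flag saying whether another page follows.
interface RepositoryPage {
  repositories: Repository[];
  hasNextPage: boolean;
}

// Hypothetical page loader. In a Puppeteer-based scraper this would navigate
// to the given page number and read the repository list from the DOM.
type FetchPage = (pageNumber: number) => Promise<RepositoryPage>;

async function scrapeAllRepositories(
  fetchPage: FetchPage,
  maxPages = 50, // safety cap so a faulty "next page" signal cannot loop forever
): Promise<Repository[]> {
  const all: Repository[] = [];
  for (let pageNumber = 1; pageNumber <= maxPages; pageNumber++) {
    const page = await fetchPage(pageNumber);
    all.push(...page.repositories);
    if (!page.hasNextPage) break; // stop once the last page has been read
  }
  return all;
}
```

Keeping the loop separate from the browser code means it can be exercised with a fake `fetchPage` in tests, without launching a browser.
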
## Contributing

Contributions are welcome! Please fork the repository and submit a pull request for any improvements or bug fixes.

## License

This project is licensed under the MIT License.
