GitHub Scraper

This project is a GitHub scraper that uses Puppeteer to extract information about GitHub organizations and their repositories. It collects data such as organization details, top languages, and repository information.

Installation

  1. Clone the repository:

    git clone https://github.com/ranbot-ai/github-scraper.git
    cd github-scraper
  2. Install the dependencies:

    npm install

Usage

  1. Set the ORG_NAME environment variable to the GitHub organization you want to scrape, then run the script with ts-node:

    env ORG_NAME=ranbot-ai npx ts-node scraper.ts
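
As a minimal sketch of how a script might consume this variable before launching a browser, the value can be read, validated, and turned into a profile URL. The helper names `resolveOrgName` and `orgUrl` are hypothetical and are not taken from scraper.ts, and the name check is a deliberately conservative assumption rather than GitHub's official rule:

```typescript
// Hypothetical helpers for illustration; scraper.ts may handle ORG_NAME differently.

// Reads ORG_NAME from an environment-like object and rejects empty or odd values.
function resolveOrgName(env: Record<string, string | undefined>): string {
  const name = (env.ORG_NAME ?? "").trim();
  if (name === "") {
    throw new Error("ORG_NAME is not set, e.g. env ORG_NAME=ranbot-ai npx ts-node scraper.ts");
  }
  // Assumption: accept only letters, digits, and inner hyphens.
  if (!/^[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$/.test(name)) {
    throw new Error(`ORG_NAME looks invalid: ${name}`);
  }
  return name;
}

// Builds the public profile URL for the given organization or user name.
function orgUrl(name: string): string {
  return `https://github.com/${encodeURIComponent(name)}`;
}
```

In a Node.js script this would typically be called as `orgUrl(resolveOrgName(process.env))`, so a missing variable fails fast with a readable message instead of a navigation error.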

Features

  • Extracts organization information including name, description, top languages, employee count, website, and social links.
  • Scrapes repository data such as name, link, description, stars, forks, and pull requests.
  • Handles pagination to scrape multiple pages of repositories.
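
The features above can be sketched as data shapes plus a pagination loop. Everything here is an assumption for illustration: the interface and field names, the `parseCount` helper for abbreviated counters such as "1.2k", and the `collectAllPages` loop are hypothetical and may not match what scraper.ts does. The page fetcher is passed in as a function so the loop stays independent of Puppeteer:

```typescript
// Hypothetical shapes mirroring the fields listed above; not taken from scraper.ts.
interface Organization {
  name: string;
  description: string;
  topLanguages: string[];
  employeeCount: number;
  website: string;
  socialLinks: string[];
}

interface Repository {
  name: string;
  link: string;
  description: string;
  stars: number;
  forks: number;
  pullRequests: number;
}

// Converts counter text such as "3,456" or "1.2k" into a number.
// Returns 0 when the text is not recognized (an assumed fallback).
function parseCount(text: string): number {
  const cleaned = text.trim().toLowerCase().replace(/,/g, "");
  const match = cleaned.match(/^(\d+(?:\.\d+)?)([km]?)$/);
  if (!match) return 0;
  const scale = match[2] === "k" ? 1000 : match[2] === "m" ? 1000000 : 1;
  return Math.round(parseFloat(match[1]) * scale);
}

// Requests pages 1, 2, 3, ... until a page comes back empty or maxPages is hit.
// fetchPage is supplied by the caller, e.g. a function that drives a browser page.
async function collectAllPages<T>(
  fetchPage: (page: number) => Promise<T[]>,
  maxPages = 50
): Promise<T[]> {
  const all: T[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const items = await fetchPage(page);
    if (items.length === 0) break; // no more results
    all.push(...items);
  }
  return all;
}
```

With these helpers, `parseCount("1.2k")` yields 1200, and a caller would pass `collectAllPages` a function that loads one page of repositories and returns the parsed `Repository` objects. The `maxPages` cap is a safety limit so a page that never comes back empty cannot loop forever.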

Contributing

Contributions are welcome! Please fork the repository and submit a pull request for any improvements or bug fixes.

License

This project is licensed under the MIT License.

About

A Node.js script that scrapes data from public GitHub organization and user profiles.
