beautifulsoup

apify / crawlee-python

Crawlee: a web scraping and browser automation library for Python for building reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP, in both headful and headless modes, with proxy rotation.

python crawler scraper automation web-crawler headless scraping crawling pip web-scraping beautifulsoup web-crawling hacktoberfest headless-chrome apify playwright

Updated Nov 13, 2024
Python

lorae / roundup

Web scraper which aggregates pre-print academic economics papers from 20+ sources; presents titles, abstracts, authors and hyperlinks on an online dashboard. Auto-updates daily.

selenium economics microeconomics requests web-scraping beautifulsoup macroeconomics beautifulsoup4 html-scraping github-actions streamlit streamlit-dashboard streamlit-webapp api-scraping

Updated Nov 13, 2024
Python

M-LAai-ai / Crawler

A robust, modular web crawler built in Python for extracting and saving content from websites. This crawler is specifically designed to extract text content from both HTML and PDF files, saving them in a structured format with metadata.

html pdf crawler scraper scraping crawling beautifulsoup

Updated Nov 13, 2024
Python

kcsoc / society-email-scrape

Scrapes Every Email Address of Every Society in Every University

python scraper university email webscraper web-scraper web-scraping beautifulsoup webscraping hacktoberfest beautifulsoup4

Updated Nov 13, 2024
Python

ozgesadet / silver-invention

AI-based tender finding

website artificial-intelligence beautifulsoup

Updated Nov 12, 2024

DinhHuy2010 / beautifulsoup

Mirror of beautifulsoup from https://git.launchpad.net/~leonardr/beautifulsoup with complete git history, branches, and tags.

mirror web-scraping beautifulsoup mirrored-repository beautifulsoup4

Updated Nov 12, 2024
HTML

ashvardanian / StringZilla

Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc. 🦖

html parser json information-retrieval csv string simd dataset string-manipulation sorting-algorithms beautifulsoup pattern-recognition ndjson substring string-matching string-search string-parsing common-crawl laion

Updated Nov 12, 2024
C++

sadadYes / post-archiver

A tool to scrape YouTube community posts

python scraper youtube python3 beautifulsoup beautifulsoup4 playwright playwright-python

Updated Nov 12, 2024
Python

005-bot / monitor

The service periodically scans the page listing current outages, detects changes, and publishes them to Redis PubSub.

python redis monitoring mvp pubsub web-scraping beautifulsoup pipenv httpx

Updated Nov 11, 2024
Python
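
The scan-compare-publish loop described for this service can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it uses the standard library's `html.parser` in place of BeautifulSoup so it runs without dependencies, and the names `page_fingerprint`, `check_for_change`, the `"outages"` channel, and the in-memory `publish` stand-in for Redis PubSub are all assumptions.

```python
import hashlib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document with whitespace normalized."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text:
            self.chunks.append(text)


def page_fingerprint(html: str) -> str:
    """Return a SHA-256 hash of the page text, ignoring markup and spacing."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    joined = "\n".join(parser.chunks)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def check_for_change(html, previous_fingerprint, publish):
    """Call publish() if the page text changed; return the new fingerprint."""
    current = page_fingerprint(html)
    if previous_fingerprint is not None and current != previous_fingerprint:
        publish("outages", current)  # hypothetical channel name
    return current


# Hypothetical usage: a list stands in for a Redis PubSub channel.
messages = []


def publish(channel, payload):
    messages.append((channel, payload))


state = check_for_change("<p>Outage: Main St</p>", None, publish)
state = check_for_change("<p>Outage:   Main St</p>", state, publish)  # same text
state = check_for_change("<p>Outage: Oak Ave</p>", state, publish)    # changed
print(len(messages))  # prints 1
```

In a real deployment the `publish` callable would presumably wrap a Redis client, and the HTML would come from a periodic HTTP fetch; both are left out here to keep the sketch self-contained.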

devallasaitej / WebScrapers

Python code for scraping Amazon bestselling book reviews and BestBuy product reviews.

python web beautifulsoup selenium-webdriver scraping-websites selenium-python scraping-python

Updated Nov 11, 2024
Jupyter Notebook

twowannabe / multiweatherbot

Telegram bot providing weather updates, water temperatures, solar flare notifications, and daily horoscopes.

python weather telegram-bot nasa asyncio openweathermap beautifulsoup horoscope

Updated Nov 14, 2024
Python

DearingData / Web-Scraping-BjjHeroes

This was a project using Python and BeautifulSoup to scrape all of the athlete match results off the BJJHeroes.com website.

python beautifulsoup webscraping jiu-jitsu

Updated Nov 11, 2024
Jupyter Notebook

SamarMst / Projet-traitement-des-donnees

This project focuses on applying data analysis and machine learning techniques through a hands-on application.

python machine-learning data-analysis beautifulsoup camel-tools

Updated Nov 11, 2024
Python

Pradip-p / lazy-py-crawler

Lazy Crawler is a Python package that simplifies web scraping tasks. It builds upon Scrapy, a powerful web crawling and scraping framework, providing additional utilities and features for easier data extraction. With Lazy Crawler, you can quickly set up and deploy web scraping projects, saving time and effort.

python scraper requests scrapy beautifulsoup

Updated Nov 11, 2024
Python

sunshineplan / node

HTML parsing library, the alternative to BeautifulSoup in Golang.

go golang generic html-parser xpath beautifulsoup css-selectors xpath-query

Updated Nov 11, 2024
Go

Here are 4,408 public repositories matching this topic...

sawyerclick / scrapers

MuhammadAliyan10 / Automate-Email-Python

jasocami / health-insurance-spider

tbeidlershenk / pdga-rating-bot

lfmramos / tool-stock-visualization

apify / crawlee-python
