firecrawl
🔥 The Web Data API for AI - Power AI agents with clean web data
scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
D4Vinci
🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
ScrapeGraphAI
Python scraper based on AI
apify
Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
Evil0ctal
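🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.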
getmaxun
🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in minutes 🔥
spider-rs
Web crawler and scraper for Rust
0x676e67
An ergonomic Python HTTP client with TLS fingerprinting
0xMassi
Fast, local-first web content extraction for LLMs. Scrape, crawl, extract structured data — all from Rust. CLI, REST API, and MCP server.
orangecoding
❤️ Fredy - [F]ind [R]eal [E]state [D]amn Eas[y] - Fredy keeps searching for new apartments, houses, and flats in Germany on platforms like ImmoScout24, Immowelt, Immonet, eBay Kleinanzeigen, and WG-Gesucht and instantly delivers the results to you via Slack, Telegram, Email, Discord or ntfy, so you can focus on the more important things in life ;)
crwlrsoft
Library for Rapid (Web) Crawler and Scraper Development
xiyuan-fengyu
A web spider framework built on Puppeteer. It provides flexible task-queue management and task scheduling through decorators, convenient data storage (nedb / MongoDB), and support for data visualization and user interaction.
s0rg
The unix-way web crawler
mascanho
SEO/GEO toolkit to analyse, crawl, parse and optimise websites & logs (Nginx & Apache)