crawler

Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.

Featured

apify, automation, crawler

Douyin_TikTok_Download_API

Evil0ctal

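🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.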

Featured

api, async, crawler

Python 18.6K

maxun

getmaxun

🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in minutes 🔥

Featured

agents, api, automation

TypeScript 16.1K

spider

spider-rs

Web crawler and scraper for Rust

ai-agent, automation, crawler

Rust 2.6K 142176d ago

webclaw

0xMassi

Fast, local-first web content extraction for LLMs. Scrape, crawl, extract structured data — all from Rust. CLI, REST API, and MCP server.

ai, ai-agents, ai-scraping

Rust 1.6K 1441771d ago

wreq-python

0x676e67

An ergonomic Python HTTP Client with TLS fingerprint

akamai, akamai-fingerprint, censorship-resistant

Rust 1.4K 71091d ago

fredy

orangecoding

❤️ Fredy - [F]ind [R]eal [E]state [D]amn Eas[y] - Fredy keeps searching for new apartments, houses, and flats in Germany on platforms like ImmoScout24, Immowelt, Immonet, eBay Kleinanzeigen, and WG-Gesucht and instantly delivers the results to you via Slack, Telegram, Email, Discord or ntfy, so you can focus on the more important things in life ;)

appartment, appartments, crawler

JavaScript 1.2K 1917812h ago

seonaut

StJudeWasHere

Open source SEO audit tool.

crawler, crawling, docker

Go73411221mo ago

scrapple

AlexMathew

A framework for creating semi-automatic web content extractors

beautifulsoup, crawler, css-selector

Python502415mo ago

fundus

flairNLP

A very simple news crawler with a funny name

cc-news, commoncrawl, corpus

Python4631109h ago

crawler

crwlrsoft

Library for Rapid (Web) Crawler and Scraper Development

crawler, crawling, hacktoberfest

PHP3691131mo ago

crawley

s0rg

The unix-way web crawler

cli, crawler, golang

Go3391811d ago

sitemap-generator-cli

lgraubner

Creates an XML-Sitemap by crawling a given site.

Abandoned

cli, crawler, google

JavaScript3391443y ago

ppspider

xiyuan-fengyu

A web spider framework built on puppeteer. It provides flexible task-queue management and task scheduling via decorators, convenient data storage (nedb / mongodb), and data visualization with user interaction.

Abandoned

angular, cheerio, crawler

TypeScript338744y ago

RustySEO

mascanho

SEO/GEO toolkit to analyse, crawl, parse and optimise websites & logs (Nginx & Apache)

ai, crawler, cwv

TypeScript3145562d ago

site-audit-seo

viasite

Web service and CLI tool for SEO site audits: crawl a site, run Lighthouse on all pages, and view public reports in the browser. Can also output to the console, JSON, CSV, and XLSX.

audit, cli, crawl-site

JavaScript295444mo ago