gpt-crawler

Content scraper

Automates generating knowledge files from website content, for use in building custom AI models

Crawl a site to generate knowledge files to create your own custom GPT from a URL
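As a rough illustration of the idea described above (start at a URL, follow matching links, and collect page text into a single knowledge file), here is a minimal TypeScript sketch. It is not gpt-crawler's actual code or API: the names `crawlToKnowledge`, `PageLoader`, `KnowledgeEntry`, and `fakeSite` are hypothetical, and the page loader is a synchronous in-memory stand-in so the sketch runs without network access.

```typescript
// Hypothetical types for illustration only; not gpt-crawler's real API.
interface CrawledPage {
  title: string;
  text: string;
  links: string[];
}

interface KnowledgeEntry {
  title: string;
  url: string;
  text: string;
}

// The loader is injected so the crawl logic can run against fake data.
type PageLoader = (url: string) => CrawledPage | undefined;

// Breadth-first crawl: visit pages whose URL starts with matchPrefix,
// stopping once maxPages entries have been collected.
function crawlToKnowledge(
  startUrl: string,
  matchPrefix: string,
  maxPages: number,
  loadPage: PageLoader,
): KnowledgeEntry[] {
  const queue: string[] = [startUrl];
  const seen = new Set<string>([startUrl]);
  const entries: KnowledgeEntry[] = [];

  while (entries.length < maxPages) {
    const url = queue.shift();
    if (url === undefined) break; // nothing left to visit

    const page = loadPage(url);
    if (page === undefined) continue; // skip pages that could not be loaded

    entries.push({ title: page.title, url, text: page.text });

    for (const link of page.links) {
      if (link.startsWith(matchPrefix) && !seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return entries;
}

// In-memory stand-in for a small website (assumed data).
const fakeSite = new Map<string, CrawledPage>([
  ["https://example.com/docs", {
    title: "Docs",
    text: "Welcome to the docs.",
    links: ["https://example.com/docs/setup", "https://example.com/blog"],
  }],
  ["https://example.com/docs/setup", {
    title: "Setup",
    text: "Install and configure.",
    links: ["https://example.com/docs"],
  }],
  ["https://example.com/blog", { title: "Blog", text: "News.", links: [] }],
]);

const loadFakePage: PageLoader = (url) => fakeSite.get(url);

const knowledge = crawlToKnowledge(
  "https://example.com/docs",
  "https://example.com/docs",
  10,
  loadFakePage,
);

// The "knowledge file" is just the collected entries serialized as JSON.
const knowledgeFile = JSON.stringify(knowledge, null, 2);
console.log(knowledgeFile);
```

In this sketch the blog page is skipped because its URL does not start with the match prefix. A real crawler would load pages asynchronously, typically in a headless browser, and extract the text with a selector.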

GitHub

19k stars
124 watching
2k forks
Language: TypeScript
Last commit: 3 months ago
ai

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| gpt-engineer-org/gpt-engineer | An AI-powered development tool that uses natural language to generate and execute code. | 52,392 |
| apify/crawlee | A tool for building reliable web scraping and browser automation pipelines in Node.js. | 15,604 |
| gitbookio/gitbook | A Next.js-based web application for managing and hosting documentation sites using Markdown format. | 27,211 |
| spatie/crawler | A powerful web crawler written in PHP that can execute JavaScript and crawl multiple URLs concurrently. | 2,537 |
| code4craft/webmagic | A scalable framework for building web crawlers in Java. | 11,432 |
| whoiskatrin/chart-gpt | An AI tool to generate charts from text input. | 3,554 |
| ricklamers/gpt-code-ui | An interactive code generation and execution tool using AI models. | 3,561 |
| yasserg/crawler4j | A Java-based web crawler for extracting and processing web page content. | 4,555 |
| git-bug/git-bug | A distributed, offline-first bug tracker embedded in git that allows collaborative development without vendor lock-in. | 8,148 |
| builderio/figma-html | A tool for converting Figma designs into live webpages and code, supporting various frameworks and languages. | 3,173 |
| builderio/builder | Enables developers to visually create and generate code for various frontend frameworks. | 7,548 |
| gitextensions/gitextensions | A standalone UI tool for managing Git repositories, integrating with Windows Explorer and Visual Studio. | 7,788 |
| jitpack/jitpack.io | Provides a package repository and build service for JVM and Android projects. | 2,535 |
| yujiosaka/headless-chrome-crawler | A distributed crawling framework that leverages Headless Chrome to scrape dynamic websites. | 5,527 |
| unclecode/crawl4ai | A tool for web crawling and data extraction, designed to work with large language models. | 16,180 |