Firecrawl recently launched a new feature, the LLMs.txt Generator API (Alpha), which turns any website's content into clean, LLM-ready text files. Provide a website URL, and Firecrawl crawls the site and its linked pages and generates two text files, llms.txt and llms-full.txt, for subsequent analysis and training.


The generator's workflow is straightforward. Users provide a URL, and the system automatically crawls the website and extracts clean, meaningful text. Two files can be generated: llms.txt is a concise summary of the website's key information, while llms-full.txt is a more detailed, complete text version suited to in-depth analysis.

Several key parameters can be set. First is "url," the website address for which you want to generate the LLMs.txt file. You can also adjust "maxUrls," controlling the maximum number of pages crawled (1-100, defaulting to 10). Additionally, you can choose whether to generate llms-full.txt (disabled by default).
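As a rough illustration of these parameters, the sketch below builds and validates a request body in Python. The names "url" and "maxUrls" come from the description above; "showFullText" is an assumed name for the llms-full.txt switch, and build_llmstxt_request is a hypothetical helper, so check both against the Firecrawl documentation before relying on them.

```python
def build_llmstxt_request(url: str, max_urls: int = 10,
                          show_full_text: bool = False) -> dict:
    """Build a request body for the generator (hypothetical helper)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")
    # The article gives 1-100 as the allowed range, with 10 as the default.
    if not 1 <= max_urls <= 100:
        raise ValueError("maxUrls must be between 1 and 100")
    return {
        "url": url,
        "maxUrls": max_urls,
        # Assumed field name for the llms-full.txt option (off by default).
        "showFullText": show_full_text,
    }


payload = build_llmstxt_request("https://example.com", max_urls=25)
print(payload)
```

Validating the range locally avoids submitting a job that would be rejected or that would crawl more pages than intended.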

The LLMs.txt generator operates asynchronously: users submit a request and then monitor the generation status in real time. The system reports status updates, such as "in progress" or "completed," so progress is easy to track.
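The submit-then-check pattern can be sketched as a simple polling loop. This is a minimal sketch, not Firecrawl's client: wait_for_completion is a hypothetical helper, the status strings "processing", "completed", and "failed" and the "llmstxt" field are assumptions, and a stub stands in for the real status request so the example runs without network access.

```python
import time
from typing import Callable


def wait_for_completion(fetch_status: Callable[[], dict],
                        interval: float = 2.0,
                        timeout: float = 300.0) -> dict:
    """Call fetch_status repeatedly until the job reaches a final state."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status()
        # Assumed terminal status values; the real API may use others.
        if job.get("status") in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("generation job did not finish in time")
        time.sleep(interval)


# Stub that imitates a job finishing on the third status check.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "llmstxt": "# Example"},
])
result = wait_for_completion(lambda: next(_responses), interval=0.01)
print(result["status"])
```

In real use, fetch_status would wrap an HTTP call that looks up the job by the identifier returned when the request was submitted.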

Because the feature is in Alpha, it has limitations. It supports only publicly accessible pages; login-protected or paywalled content is not handled. Processing is capped at 5000 URLs during the Alpha phase, and the output format and processing workflow may change based on user feedback.

Pricing is based on the number of URLs processed, with a base cost of 1 credit per URL. Control costs by adjusting the maxUrls parameter.
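The pricing rule reduces to simple arithmetic, sketched below. The function name estimate_credits is hypothetical, and the estimate covers only the base cost of 1 credit per URL stated above; any additional charges are outside what this article describes.

```python
def estimate_credits(urls_processed: int, credits_per_url: int = 1) -> int:
    """Estimate the base cost: one credit for each URL processed."""
    if urls_processed < 0:
        raise ValueError("urls_processed cannot be negative")
    return urls_processed * credits_per_url


# With the default maxUrls of 10, a job costs at most 10 base credits.
print(estimate_credits(10))   # 10
# Raising maxUrls to its upper bound of 100 raises the ceiling to 100.
print(estimate_credits(100))  # 100
```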

Access point: https://docs.firecrawl.dev/features/alpha/llmstxt

Key Highlights:

🌐 Provide a website URL to quickly generate LLM-ready text files.

📝 Two text formats are generated for different user needs.

🔒 Only supports publicly accessible pages, with quantity limits during the Alpha phase.