How to Create a Knowledge Base by Scraping Any Website in Under a Minute
Creating a knowledge base from any website has never been easier. With the right tools, anyone can scrape website data and transform it into useful information in less than a minute. This guide walks you through the process using Relevance AI and FileCrawl Web Scraper.
Getting Started with Relevance AI
After signing up for Relevance AI, you’ll be presented with an intuitive interface. The first step is to click on “Create New Tool” and provide a title for your project – for example, “Research Company.”
The description matters in Relevance AI because it states what the tool does. Specify that your tool takes a company URL and scrapes the content. When setting up the inputs, add a field for the company URL with formatting instructions, for example that the address should be the full URL including https://.
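If you also want to check a company URL before pasting it into the input field, the formatting rule can be sketched in a few lines of Python. This is only an illustration under assumed rules (default to https, require a dotted hostname, drop query strings and fragments); the function name normalize_company_url is hypothetical and is not part of Relevance AI.

```python
from urllib.parse import urlparse


def normalize_company_url(raw: str) -> str:
    """Return a cleaned URL, or raise ValueError if no usable hostname is found."""
    candidate = raw.strip()
    # Assumption: default to https when the scheme is omitted (e.g. "example.com").
    if "://" not in candidate:
        candidate = "https://" + candidate
    parsed = urlparse(candidate)
    host = parsed.hostname
    if parsed.scheme not in ("http", "https") or not host or "." not in host:
        raise ValueError(f"Not a usable company URL: {raw!r}")
    # Keep scheme, host and path; drop the query string and fragment.
    path = parsed.path.rstrip("/")
    return f"{parsed.scheme}://{parsed.netloc.lower()}{path}"


print(normalize_company_url("  Example.com/ "))
```

A check like this catches empty input or a pasted phrase before any scraping credits are spent.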
Setting Up Web Scraping
Once your tool is configured, add a step that uses FileCrawl Web Scraper. This step extracts content from a website when you provide its URL.
For demonstration purposes, you can limit the number of pages to scrape to five, though the tool can handle many more. New users receive 500 credits with FileCrawl, which translates to approximately 526 API calls, more than enough to get started.
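To see how far the free allowance goes at a five-page limit, here is a rough budgeting sketch. It uses only the figures quoted above and assumes, for illustration, that each scraped page counts as one API call; the actual pricing may differ, so treat the result as an estimate.

```python
FREE_CREDITS = 500      # free credits quoted above
ESTIMATED_CALLS = 526   # approximate API calls quoted above
PAGE_LIMIT = 5          # pages scraped per run in this walkthrough


def runs_available(total_calls: int, pages_per_run: int) -> int:
    """Whole runs that fit in the call budget, assuming one call per page."""
    if pages_per_run <= 0:
        raise ValueError("pages_per_run must be positive")
    return total_calls // pages_per_run


# Implied cost of a single call, derived from the two quoted figures.
credits_per_call = FREE_CREDITS / ESTIMATED_CALLS

print(runs_available(ESTIMATED_CALLS, PAGE_LIMIT))  # 105
```

Raising the page limit shrinks the number of runs proportionally, which is worth keeping in mind before scraping a large site.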
Connecting Your API Keys
The process requires two API keys:
- FileCrawl API key: Create this in the Playground section of FileCrawl
- OpenAI API key: Generate this from platform.openai.com (requires a funded account)
After setting up both keys in Relevance AI’s settings section, you’re ready to proceed with the scraping process.
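Relevance AI stores both keys in its settings, so no code is needed there. If you later call the same services from your own script, a common pattern is to read the keys from environment variables instead of hard-coding them. The sketch below assumes the variable names FILECRAWL_API_KEY (hypothetical) and OPENAI_API_KEY.

```python
import os
from typing import Mapping

# Assumed variable names; adjust them to match your own setup.
REQUIRED_KEYS = ("FILECRAWL_API_KEY", "OPENAI_API_KEY")


def load_api_keys(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return both keys, or raise RuntimeError naming the ones that are missing."""
    missing = [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
    return {name: env[name].strip() for name in REQUIRED_KEYS}
```

Failing early with the name of the missing key is easier to debug than an authentication error halfway through a run.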
Running the Scraper
When you run the scraper, it extracts the content of the target website's pages, up to the page limit you set. The speed depends on your internet connection and the size of the website, but the process is typically quick.
For this example, scraping Warp.com yields comprehensive information about the company’s platform, which helps users discover communities, learn new skills, and explore business opportunities online.
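To give a feel for what a scraping step does with each page, the following sketch pulls the visible text out of an HTML string using only Python's standard library. It is a simplified stand-in written for illustration, not FileCrawl's actual method, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of script and style tags."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the page's visible text as a single space-separated string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

A hosted scraper additionally handles things this sketch ignores, such as following links across pages and rendering content loaded by JavaScript.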
Processing the Data
Once the data is scraped, you can add a prompt step to process the information according to your needs. For instance, you might ask the AI to summarize the content in 300 words.
When crafting your prompt, make sure to reference the scraped data using the proper syntax: file_crawl.data.
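The idea behind a reference like file_crawl.data is that the prompt step swaps the reference for the output of the earlier step before the text reaches the model. A minimal sketch of that substitution follows; the double-brace placeholder style and the function names are assumptions made for illustration, so check the Relevance AI editor for the exact syntax it expects.

```python
import re

# Assumed placeholder style: {{step_name.field}}
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def resolve(path: str, outputs: dict):
    """Follow a dotted path such as 'file_crawl.data' through nested dicts."""
    value = outputs
    for part in path.split("."):
        value = value[part]  # raises KeyError if the step or field is missing
    return value


def render_prompt(template: str, outputs: dict) -> str:
    """Replace every placeholder with the matching step output."""
    return PLACEHOLDER.sub(lambda m: str(resolve(m.group(1), outputs)), template)


template = "Summarize the following content in 300 words:\n\n{{file_crawl.data}}"
outputs = {"file_crawl": {"data": "Acme sells widgets."}}
print(render_prompt(template, outputs))
```

If the reference is misspelled, the lookup fails instead of silently sending an empty prompt, which is the behavior you want when debugging a tool.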
Final Results
The end result is a collection of information from your target website, organized according to the prompt you wrote. You can view all the extracted data within the Relevance AI interface.
This method provides a simple yet powerful way to create knowledge bases from any website. The scraped information can be saved for future use or shared with others, making it an invaluable tool for research, content creation, and business intelligence.
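If you copy the results out of Relevance AI, one simple way to keep them for later is a JSON file per website. The sketch below assumes a hypothetical record layout (source URL, scraped text, summary); nothing about it is prescribed by Relevance AI or FileCrawl.

```python
import json
from pathlib import Path


def save_entry(path: Path, url: str, scraped_text: str, summary: str) -> None:
    """Write one knowledge-base entry to a JSON file."""
    entry = {"url": url, "scraped_text": scraped_text, "summary": summary}
    path.write_text(json.dumps(entry, ensure_ascii=False, indent=2), encoding="utf-8")


def load_entry(path: Path) -> dict:
    """Read an entry previously written by save_entry."""
    return json.loads(path.read_text(encoding="utf-8"))
```

Storing the source URL next to the text makes it easy to re-scrape a site later and compare the results.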