Scrape any page and get formatted data
The Scrape API lets you get the data you want from a web page in a single call. You can scrape page content and capture its data in various formats. For detailed usage, check out the Scrape API Reference.
Basic Markdown Scraping
The simplest way to scrape a webpage is to extract its content as markdown. This is useful when you want to preserve the page’s structure and formatting.
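As a minimal sketch of such a call, the helper below asks a client for a page's markdown. It assumes the client exposes a `scrape(url)` method that returns markdown text by default; check the Scrape API Reference for the exact signature. The `MarkdownScraper` protocol, the `fetch_markdown` helper, and the `FakeClient` stub are hypothetical names, introduced only so the example runs without network access or an API key.

```python
from typing import Protocol


class MarkdownScraper(Protocol):
    """Hypothetical interface: any client with a scrape(url) -> str method."""

    def scrape(self, url: str) -> str: ...


def fetch_markdown(client: MarkdownScraper, url: str) -> str:
    """Scrape one page and return its content as markdown."""
    markdown = client.scrape(url)
    if not markdown.strip():
        # Treat an empty result as an error instead of passing it downstream.
        raise ValueError(f"No content returned for {url}")
    return markdown


class FakeClient:
    """Offline stand-in for a real client, used only for this demo."""

    def scrape(self, url: str) -> str:
        return f"# Example page\n\nScraped from {url}\n"


page = fetch_markdown(FakeClient(), "https://www.notte.cc")
print(page.splitlines()[0])
```

With a real SDK client, the assumption is that you would pass that client object in place of `FakeClient()`.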
Structured Data Extraction
For more sophisticated use cases, you can extract structured data from web pages by defining a schema using Pydantic models. This is particularly useful when you need to extract specific information like product details, pricing plans, or article metadata.
Example: Extracting Pricing Plans from notte.cc
Let’s say you want to extract pricing information from a website. First, define your data models, then use these models to extract structured data.
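The two steps can be sketched as follows. The docs use Pydantic models; to stay dependency-free, this sketch uses standard-library dataclasses, which have a similar declaration shape. The field names (`name`, `price_per_month`, `features`) and the sample payload are hypothetical and are not taken from notte.cc.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class PricingPlan:
    """One plan on a pricing page (hypothetical fields)."""

    name: str
    price_per_month: Optional[float]  # None for "contact us" style plans
    features: list[str] = field(default_factory=list)


@dataclass
class PricingPlans:
    """Top-level container: the shape you want the scraper to return."""

    plans: list[PricingPlan]


def parse_pricing(payload: dict[str, Any]) -> PricingPlans:
    """Build typed models from a JSON-like payload; raises on missing keys."""
    return PricingPlans(plans=[PricingPlan(**item) for item in payload["plans"]])


# Made-up payload standing in for what a structured scrape might return.
sample = {
    "plans": [
        {"name": "Free", "price_per_month": 0.0, "features": ["100 requests"]},
        {"name": "Enterprise", "price_per_month": None},
    ]
}

pricing = parse_pricing(sample)
for plan in pricing.plans:
    print(plan.name, plan.price_per_month, plan.features)
```

With Pydantic, the assumption is that the same two classes would subclass `BaseModel` and the top-level model would be passed as the response format.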
Agent Scraping
Agent Scraping is a more powerful way to scrape web pages. It allows you to navigate through the page, fill forms, and extract data from dynamic content.
Topics & Tips
Scrape API vs Agent Scrape
Scrape API
Perfect for:
1. One-off scraping tasks
2. Simple data extraction
3. Static content
Agent Scrape
Perfect for:
1. Authentication or login flows
2. Form filling and submission
3. Dynamic content
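This comparison can be encoded as a small decision rule. The sketch below is illustrative only; `PageNeeds` and `choose_scraper` are hypothetical names and are not part of any SDK.

```python
from dataclasses import dataclass


@dataclass
class PageNeeds:
    """What the target page requires before its data is reachable."""

    login: bool = False
    form_submission: bool = False
    dynamic_content: bool = False


def choose_scraper(needs: PageNeeds) -> str:
    """Return "agent" if any interactive requirement is set, else "scrape"."""
    if needs.login or needs.form_submission or needs.dynamic_content:
        return "agent"
    return "scrape"


print(choose_scraper(PageNeeds()))                      # scrape
print(choose_scraper(PageNeeds(dynamic_content=True)))  # agent
```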
Response Format Best Practices
Tips for designing schemas:
- Try a few different schemas to find what works best
- If you ask for a `company_name` field but there is no `company_name` on the page, LLM scraping will fail
- Design your schema carefully based on the actual content structure
- Response format is available for both `scrape` and `agent.run`
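The `company_name` failure mode can be illustrated with a small sketch. It assumes extraction fails whenever a required schema field has no value on the page, and shows one possible mitigation: giving uncertain fields a default of `None`. Standard-library dataclasses stand in for Pydantic models here, and `StrictCompany`, `LenientCompany`, and `missing_required` are hypothetical names.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Optional


@dataclass
class StrictCompany:
    """Every field is required, so an absent company_name is an error."""

    company_name: str
    description: str


@dataclass
class LenientCompany:
    """company_name may be absent from the page, so it defaults to None."""

    description: str
    company_name: Optional[str] = None


def missing_required(schema: type, extracted: dict[str, Any]) -> list[str]:
    """Return names of required fields (no default) absent from the data."""
    return [
        f.name
        for f in fields(schema)
        if f.default is MISSING
        and f.default_factory is MISSING
        and f.name not in extracted
    ]


# Hypothetical result for a page that never names the company.
extracted = {"description": "An example company description"}

print(missing_required(StrictCompany, extracted))   # ['company_name']
print(missing_required(LenientCompany, extracted))  # []
```

Running a check like this against a few sample pages before settling on a schema is one way to act on the first tip above.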

