Python SDK

The faircrawl Python package mirrors the main REST surface for scrape, crawl, map, research, interact, agent, enrichment, and freshness checks.

Install

pip install faircrawl

Sync client

from faircrawl import FairCrawl

fc = FairCrawl()

doc = fc.scrape(
    "https://stripe.com/pricing",
    format="markdown",
    only_main_content=True,
)

research = fc.research(
    "best browser automation stacks",
    sources=["web", "reddit", "hackernews"],
    synthesize=True,
)

Async client

import asyncio
from faircrawl import AsyncFairCrawl

async def main() -> None:
    async with AsyncFairCrawl() as fc:
        doc = await fc.scrape(
            "https://docs.python.org/3/",
            format="markdown",
        )
        print(doc.text[:500])

asyncio.run(main())

Agent, interact, and enrichment

from faircrawl import FairCrawl

fc = FairCrawl()

session = fc.interact(
    "https://example.com/login",
    [
        {"type": "fill", "selector": "#email", "value": "user@example.com"},
        {"type": "fill", "selector": "#password", "value": "secret"},
        {"type": "click", "selector": "button[type='submit']"},
        {"type": "wait_for_selector", "selector": ".dashboard", "timeout_ms": 10000},
        {"type": "extract", "format": "markdown"},
    ],
)

answer = fc.agent(
    "Find the latest public pricing page for Stripe and summarize it.",
    max_steps=6,
    budget_usd=0.05,
)

job = fc.enrich_company("stripe.com")

Raw HTTP for endpoints without a convenience wrapper

The current Python SDK intentionally stays focused. For raw typed routes that are not wrapped yet, use httpx against the REST API.

import httpx
import os

api_key = os.environ["FAIRCRAWL_API_KEY"]

response = httpx.get(
    "https://api.faircompany.ai/v1/crawl/linkedin/company",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"url": "https://www.linkedin.com/company/microsoft/"},
    timeout=30,
)
response.raise_for_status()
print(response.json())