Python SDK

The faircrawl Python package mirrors the main REST surface for scrape, crawl, map, research, interact, agent, enrichment, and freshness checks.

Install

pip install faircrawl

Sync client

from faircrawl import FairCrawl

fc = FairCrawl()

# Scrape a single page and request markdown output.
doc = fc.scrape(
    "https://stripe.com/pricing",
    format="markdown",
    only_main_content=True,
)

# Run a research query across several sources.
research = fc.research(
    "best browser automation stacks",
    sources=["web", "reddit", "hackernews"],
    synthesize=True,
)

Async client

import asyncio

from faircrawl import AsyncFairCrawl


async def main() -> None:
    async with AsyncFairCrawl() as fc:
        doc = await fc.scrape(
            "https://docs.python.org/3/",
            format="markdown",
        )
        print(doc.text[:500])


asyncio.run(main())
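
The async client is most useful when several pages are fetched at once. The following is a minimal sketch of bounded concurrency using only asyncio. The helper `scrape_all` and the stub `fake_scrape` are hypothetical names and not part of the SDK; the stub stands in for a call such as `fc.scrape(url, format="markdown")` so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def scrape_all(
    urls: Sequence[str],
    scrape: Callable[[str], Awaitable[object]],
    max_concurrency: int = 5,
) -> list[object]:
    """Run `scrape` over every URL, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> object:
        # The semaphore caps how many scrape calls are in flight.
        async with semaphore:
            return await scrape(url)

    # gather preserves input order. With return_exceptions=True a failed
    # call is returned as an exception object instead of being raised,
    # so callers should check each result.
    return await asyncio.gather(
        *(bounded(url) for url in urls),
        return_exceptions=True,
    )


async def fake_scrape(url: str) -> str:
    # Hypothetical stand-in for the real client call.
    await asyncio.sleep(0)
    return f"scraped {url}"


if __name__ == "__main__":
    print(asyncio.run(scrape_all(["a", "b"], fake_scrape, max_concurrency=1)))
```

With the real client, one option is to pass `lambda url: fc.scrape(url, format="markdown")` as the `scrape` argument from inside the `async with AsyncFairCrawl() as fc:` block. A suitable concurrency limit depends on your plan and is not specified here.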

Agent, interact, and enrichment

from faircrawl import FairCrawl

fc = FairCrawl()

# Scripted browser session: fill the login form, submit it,
# wait for the dashboard, then extract the page as markdown.
session = fc.interact(
    "https://example.com/login",
    [
        {"type": "fill", "selector": "#email", "value": "user@example.com"},
        {"type": "fill", "selector": "#password", "value": "secret"},
        {"type": "click", "selector": "button[type='submit']"},
        {"type": "wait_for_selector", "selector": ".dashboard", "timeout_ms": 10000},
        {"type": "extract", "format": "markdown"},
    ],
)

# Give the agent a task with a step limit and a spend limit.
answer = fc.agent(
    "Find the latest public pricing page for Stripe and summarize it.",
    max_steps=6,
    budget_usd=0.05,
)

# Request company enrichment for a domain.
job = fc.enrich_company("stripe.com")
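
Because interact steps are plain dictionaries, a typo in a key only surfaces once the request is sent. The sketch below checks a step list locally first. The required keys are inferred solely from the example above and are an assumption, not the API's actual schema; `REQUIRED_KEYS` and `check_steps` are hypothetical names.

```python
from typing import Any

# Required keys per step type, inferred only from the example above.
# This is an assumption, not the documented schema.
REQUIRED_KEYS: dict[str, set[str]] = {
    "fill": {"selector", "value"},
    "click": {"selector"},
    "wait_for_selector": {"selector"},
    "extract": {"format"},
}


def check_steps(steps: list[dict[str, Any]]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: list[str] = []
    for index, step in enumerate(steps):
        step_type = step.get("type")
        if step_type not in REQUIRED_KEYS:
            problems.append(f"step {index}: unknown type {step_type!r}")
            continue
        # Report each required key that the step does not provide.
        missing = REQUIRED_KEYS[step_type] - set(step)
        for key in sorted(missing):
            problems.append(f"step {index}: {step_type} is missing {key!r}")
    return problems
```

Calling `check_steps(steps)` before `fc.interact(url, steps)` lets a script stop early when the list is malformed. Step types that exist in the API but are not listed in `REQUIRED_KEYS` would be reported as unknown, so the table needs extending to match the real schema.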

Raw HTTP for endpoints without a convenience wrapper

The Python SDK is intentionally small and currently wraps only the main endpoints. For typed routes that do not have a wrapper yet, call the REST API directly with httpx.

import os

import httpx

# Read the API key from the environment; raises KeyError if it is unset.
api_key = os.environ["FAIRCRAWL_API_KEY"]

response = httpx.get(
    "https://api.faircompany.ai/v1/crawl/linkedin/company",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"url": "https://www.linkedin.com/company/microsoft/"},
    timeout=30,
)
response.raise_for_status()  # Raise on 4xx and 5xx responses.
print(response.json())
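
When several raw calls share the same authentication, it can help to build the header in one place and fail with a clear message if the key is missing. This is a small standard-library sketch; `auth_headers` is a hypothetical helper, and it assumes the same `FAIRCRAWL_API_KEY` variable and Bearer scheme shown in the example above.

```python
import os
from typing import Mapping


def auth_headers(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build the Authorization header, failing early if the key is unset."""
    # Treat a missing or whitespace-only value as "not configured".
    api_key = env.get("FAIRCRAWL_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set FAIRCRAWL_API_KEY before calling the REST API.")
    return {"Authorization": f"Bearer {api_key}"}
```

The result can be passed as `headers=auth_headers()` to `httpx.get`. Accepting the environment as a parameter keeps the helper easy to test without touching the real process environment.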