Welcome to SpiderIQ Guides

This section provides comprehensive guides and tutorials to help you get the most out of SpiderIQ’s web scraping and Google Maps data extraction capabilities.

What is SpiderIQ?

SpiderIQ is a high-performance API service that provides two specialized scraping capabilities:

SpiderSite

Website Scraping: Extract content from any website using the Crawl4AI library, with optional AI-powered data extraction. A sample request sketch follows the feature list below.
  • Full-page markdown conversion
  • AI-powered content extraction
  • Screenshot capture
  • Metadata extraction
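
For a concrete sense of the options above, here is a minimal sketch of a SpiderSite job submission. The base URL, endpoint path, and field names are illustrative assumptions, not the documented schema; see the API reference for the real request format.

```python
import requests

API_BASE = "https://api.example.com"  # assumption: your SpiderIQ base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Hypothetical payload mirroring the capabilities listed above.
payload = {
    "url": "https://example.com/article",
    "markdown": True,       # full-page markdown conversion
    "ai_extraction": True,  # optional AI-powered content extraction
    "screenshot": True,     # screenshot capture (stored in Cloudflare R2)
    "metadata": True,       # page metadata extraction
}

response = requests.post(f"{API_BASE}/spidersite/jobs", json=payload, headers=HEADERS)
print(response.json())  # typically includes a job ID to poll for results
```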

SpiderMaps

Google Maps Scraping: Extract business information from Google Maps using the Places API. A sample request sketch follows the feature list below.
  • Business details (name, address, phone)
  • Reviews and ratings
  • Business hours
  • Categories and photos
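
Similarly, a SpiderMaps submission might look like the sketch below; again, the endpoint and field names are placeholders, and the API reference is authoritative.

```python
import requests

API_BASE = "https://api.example.com"  # assumption: your SpiderIQ base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Hypothetical payload mirroring the data points listed above.
payload = {
    "query": "coffee shops in Austin, TX",  # business search term
    "include_reviews": True,                # reviews and ratings
    "include_hours": True,                  # business hours
    "include_photos": True,                 # categories and photos
}

response = requests.post(f"{API_BASE}/spidermaps/jobs", json=payload, headers=HEADERS)
print(response.json())
```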

Common Use Cases

Content Aggregation

Extract articles, blog posts, and documentation from multiple sources for content analysis or aggregation platforms. Examples: news monitoring, competitor content analysis, research aggregation.

E-commerce Data

Scrape product information, prices, and reviews from e-commerce sites for price monitoring or market research. Examples: price comparison tools, inventory monitoring, product catalog building.

Local Business Research

Extract business information from Google Maps for lead generation, market research, or directory creation. Examples: B2B prospecting, competitive analysis, local SEO research.

Real Estate & Property Data

Gather property listings, prices, and details for real estate analysis and market trend research. Examples: property aggregators, market analysis tools, investment research.

Job Board Aggregation

Collect job postings from multiple sources to build comprehensive job search platforms. Examples: job aggregators, salary analysis, hiring trend research.

How SpiderIQ Works

Processing Flow

  1. Submit - Client submits a job via API
  2. Queue - Job is queued for processing
  3. Process - Available worker picks up and processes the job
  4. Store - Results are saved (screenshots to Cloudflare R2, structured data to the database)
  5. Retrieve - Client polls for results and receives data
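
From the client's side, that flow reduces to submit, then poll. The sketch below assumes a hypothetical endpoint layout and a `job_id` field in the submission response; substitute the real paths and fields from the API reference.

```python
import time
import requests

API_BASE = "https://api.example.com"  # assumption: your SpiderIQ base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# 1. Submit - send the job (steps 2-4 happen server-side)
job = requests.post(f"{API_BASE}/spidersite/jobs",
                    json={"url": "https://example.com"}, headers=HEADERS).json()
job_id = job["job_id"]  # assumption: the submission response returns a job identifier

# 5. Retrieve - poll until the job reaches a terminal state
while True:
    result = requests.get(f"{API_BASE}/jobs/{job_id}", headers=HEADERS).json()
    if result.get("status") in ("completed", "failed"):
        break
    time.sleep(3)  # 2-5 second intervals balance responsiveness and rate limits

print(result)
```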

Architecture

SpiderIQ is built on a scalable, distributed architecture:
  • API Gateway - FastAPI-based REST API
  • Message Queue - Job distribution system
  • Workers - Distributed scraping workers (Docker containers)
  • Database - Stores job metadata and results
  • Cache - Redis for performance optimization
  • CDN Storage - Cloudflare R2 for screenshots

Worker Types

  • SpiderSite Workers - 4+ workers for website scraping
  • SpiderMaps Workers - 2+ workers for Google Maps scraping

Performance & Limits

Rate Limits

Standard Rate Limit: 100 requests per minute per client, with a burst allowance of 20 requests for occasional spikes. Contact us for higher limits.
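
When a request does exceed the limit, the API answers with 429 (Too Many Requests). One reasonable client-side pattern is a retry wrapper with exponential backoff; the Retry-After handling below is an assumption about the response headers, not a documented guarantee.

```python
import time
import requests

def request_with_backoff(method, url, max_retries=5, **kwargs):
    """Retry a request with exponential backoff whenever the API returns 429."""
    delay = 1.0
    response = None
    for _ in range(max_retries):
        response = requests.request(method, url, **kwargs)
        if response.status_code != 429:
            return response
        # Prefer the server's Retry-After hint if present; otherwise back off exponentially.
        retry_after = response.headers.get("Retry-After")
        time.sleep(float(retry_after) if retry_after else delay)
        delay *= 2
    return response  # still rate-limited after max_retries attempts
```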

Processing Times

Job Type                   Average Time   Range
SpiderSite (simple page)   5-15s          3-30s
SpiderSite (with AI)       10-25s         5-45s
SpiderMaps                 3-8s           2-15s

Queue Capacity

  • Normal load: < 20 jobs queued
  • Moderate load: 20-50 jobs queued
  • High load: > 50 jobs queued
Use the /system/queue-stats endpoint to monitor current load.
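
Before a bulk run, a quick check against those thresholds can decide whether to submit now or wait. The field name in the sketch below is a placeholder; use whatever the queue-stats response actually returns.

```python
import requests

API_BASE = "https://api.example.com"  # assumption: your SpiderIQ base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

stats = requests.get(f"{API_BASE}/system/queue-stats", headers=HEADERS).json()
queued = stats.get("queued_jobs", 0)  # assumption: field holding the queued-job count

if queued > 50:
    print("High load: defer the bulk submission or throttle heavily.")
elif queued >= 20:
    print("Moderate load: submit in smaller batches.")
else:
    print("Normal load: safe to submit the batch.")
```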

Best Practices

  • Poll efficiently: Use 2-5 second intervals when polling for results to balance responsiveness and rate limit compliance.
  • Handle rate limits: Implement exponential backoff when you receive 429 (Too Many Requests) responses.
  • Check queue load: Use /system/queue-stats before submitting bulk jobs to avoid overwhelming the queue.
  • Store job IDs: Save job IDs in your database so you can retrieve results later if needed (a minimal persistence sketch follows this list).
  • Respect robots.txt: While SpiderIQ can scrape most sites, ensure you have permission and respect robots.txt directives.
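
For the job ID practice, any datastore you already run is fine; as a minimal, self-contained sketch, here is a local SQLite table for remembering submitted jobs.

```python
import sqlite3

conn = sqlite3.connect("spideriq_jobs.db")
conn.execute("""CREATE TABLE IF NOT EXISTS jobs (
    job_id       TEXT PRIMARY KEY,
    submitted_at TEXT DEFAULT CURRENT_TIMESTAMP,
    status       TEXT DEFAULT 'pending'
)""")

def remember_job(job_id: str) -> None:
    """Record a job ID right after submission so results can be retrieved later."""
    conn.execute("INSERT OR IGNORE INTO jobs (job_id) VALUES (?)", (job_id,))
    conn.commit()
```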

Next Steps

  1. Get Credentials - Contact admin@di-atomic.com to get your API credentials.
  2. Read the Quickstart - Follow our 5-minute quickstart guide to submit your first job.
  3. Explore Guides - Learn about website scraping and explore the API reference.
  4. Build Your Integration - Use the API reference to build your integration.