Overview
Retrieve the complete results for a scraping job. This endpoint returns different status codes based on job state.
Path Parameters
The unique identifier of the job (UUID format)
Example: 550e8400-e29b-41d4-a716-446655440000
Response Status Codes
200: Job completed successfully - results available
202: Job still processing - poll again later
410: Job failed or was cancelled
404: Job ID does not exist
Response Structure
Flat Structure (v2.7.1): Responses now use a simplified 2-3 level nesting structure (previously 5 levels). All fields are always present - fields not applicable to your request will be null.
Top-Level Response Fields
true if job completed successfully, false if failed
Unique job identifier (UUID format)
Job type: spiderSite or spiderMaps
Job status: completed, failed, processing, queued, or cancelled
Time taken to process the job (null if not completed)
Worker identifier that processed the job
Completion timestamp in ISO 8601 format
Additional context about job state (e.g., “Job is being processed”)
Job results data (structure varies by job type, see below)
Error message if job failed (null otherwise)
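The job status values listed above fall into two groups: states that can still change and states that are final. A minimal sketch of a helper that makes this explicit, assuming the status string has already been read from the response (the function name `is_terminal` is hypothetical and not part of the API):

```python
# Status values exactly as listed in the field descriptions above.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
IN_PROGRESS_STATUSES = {"processing", "queued"}


def is_terminal(status: str) -> bool:
    """Return True once a job can no longer change state."""
    if status in TERMINAL_STATUSES:
        return True
    if status in IN_PROGRESS_STATUSES:
        return False
    # Unknown values are surfaced rather than silently treated as finished.
    raise ValueError(f"unexpected job status: {status!r}")
```

Raising on unknown values is a design choice for this sketch; a client that prefers to keep polling on unrecognized states could return False instead.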
SpiderSite Data Fields
Flat Structure: Social media fields are at the top level of data (e.g., data.linkedin), not nested under data.contact_info.social_media.linkedin.
Basic Information
Website URL that was crawled
Number of pages successfully crawled
Crawl result: success, partial, or failed
Contact Information (Flat - Top Level)
Email addresses found (filtered - tracking emails removed)
Phone numbers found
Physical addresses found
Social Media Profiles (All Flat - Top Level)
LinkedIn company/profile URL (null if not found)
Twitter/X profile URL (null if not found)
Facebook page URL (null if not found)
Instagram profile URL (null if not found)
YouTube channel URL (null if not found)
GitHub organization/user URL (null if not found)
TikTok profile URL (null if not found)
Pinterest profile URL (null if not found)
Medium profile URL (null if not found)
Discord server invite URL (null if not found)
WhatsApp contact/business URL (null if not found)
Telegram contact/channel URL (null if not found)
Snapchat profile URL (null if not found)
Reddit profile/subreddit URL (null if not found)
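Because each social profile field sits directly under data and is null when nothing was found, collecting the profiles that exist is a single pass over the keys. The sketch below assumes `data` is the parsed data object as a Python dict. Only the `linkedin` key is named in the text above; `twitter` and `github` are assumed key names used for illustration, so the caller passes the key list explicitly.

```python
from typing import Any, Dict, Iterable


def found_profiles(data: Dict[str, Any], keys: Iterable[str]) -> Dict[str, str]:
    """Return only the social profile URLs that are present (non-null)."""
    return {key: data[key] for key in keys if data.get(key) is not None}


# "linkedin" is confirmed by the text; "twitter" and "github" are assumed names.
sample = {"linkedin": "https://www.linkedin.com/company/example", "twitter": None}
profiles = found_profiles(sample, ["linkedin", "twitter", "github"])
```

Using `data.get(key)` treats a missing key the same as a null value, which keeps the helper tolerant if a key name turns out to differ from the assumption.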
Markdown Compendium
AI-generated markdown summary of the website (if enabled)
Compendium metadata including size, cleanup level, and storage location
AI Features (Always Present - Null If Not Enabled)
Company information extracted with AI (null if extract_company_info: false)
Business pain points identified by AI (null if extract_pain_points: false)
Team members found with AI extraction (empty array if extract_team: false)
CHAMP framework lead scoring (null if product/ICP not provided)
Personalization data for outreach (null if not available)
Technical Metadata
Crawl metadata and statistics including:
- browser_rendering_available: Whether SPA rendering was used
- spa_enabled: Whether SPA detection was enabled
- sitemap_used: Whether sitemap-first crawling was used
- crawl_strategy: Strategy used (sitemap, bestfirst, bfs, dfs)
- total_emails_found: Total emails before filtering
- total_phones_found: Total phone numbers found
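Since total_emails_found counts emails before filtering while the returned email list has tracking emails removed, the difference gives the number of addresses that were filtered out. A minimal sketch, assuming the metadata object and the email list have already been pulled from the response as a dict and a list (the function name is hypothetical, and the field names that hold these two values are not given on this page):

```python
from typing import Any, Dict, List


def tracking_emails_removed(metadata: Dict[str, Any], emails: List[str]) -> int:
    """Count emails dropped by filtering: total found minus those returned."""
    # A missing or null total is treated as zero in this sketch.
    total = metadata.get("total_emails_found") or 0
    # Clamp at zero in case the two numbers are ever inconsistent.
    return max(total - len(emails), 0)
```

For example, a crawl reporting total_emails_found of 5 with three addresses in the filtered list would yield 2.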
SpiderMaps Data Fields
Basic Information
Search query used for the scrape
Number of business listings returned
Array of business listings (see structure below)
Search metadata (max_results, extract_reviews, language, etc.)
Business Listing Structure
Each business in the businesses array contains:
Business name
Google Place ID
Average rating (1.0-5.0)
Number of reviews
Full street address
Phone number
Business website URL
Business categories/types
Latitude and longitude coordinates
Google Maps link to the business
Status: OPERATIONAL, CLOSED_TEMPORARILY, etc.
Price range: $, $$, $$$, or $$$$
Working hours by day of week
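The price range symbols are awkward to sort or filter on directly, so a client may want to convert them to a number. The sketch below is one way to do that, assuming the value arrives as a string or null; the function name `price_level` is hypothetical, and whether the API ever returns null for this field is an assumption.

```python
from typing import Optional

# The four values listed in the business listing structure above.
VALID_PRICE_RANGES = {"$", "$$", "$$$", "$$$$"}


def price_level(price_range: Optional[str]) -> Optional[int]:
    """Map "$".."$$$$" to 1..4, passing null through unchanged."""
    if price_range is None:
        return None
    if price_range not in VALID_PRICE_RANGES:
        raise ValueError(f"unexpected price range: {price_range!r}")
    return len(price_range)
```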
Example Request
Example Responses
- SpiderSite - Minimal
- SpiderSite - With AI
- SpiderSite - Full CHAMP
- SpiderMaps
- Processing (202)
- Failed (410)
- Not Found (404)
Basic contact extraction without AI features:
200 OK - Minimal Request
Handling Different Status Codes
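Going by the example response labels above (200 OK, Processing 202, Failed 410, Not Found 404), a client can branch on the HTTP status code before looking at the body. The following is a minimal sketch of that dispatch; the function name and the action strings are hypothetical, chosen only for illustration.

```python
# Status codes taken from the example response labels on this page.
ACTIONS = {
    200: "read_results",      # job completed, results are in the body
    202: "poll_again",        # job still processing
    410: "stop_job_failed",   # job failed or was cancelled
    404: "stop_unknown_job",  # job ID does not exist
}


def next_action(status_code: int) -> str:
    """Translate a response status code into what the client should do next."""
    try:
        return ACTIONS[status_code]
    except KeyError:
        raise ValueError(f"unhandled status code: {status_code}") from None
```

Only 202 calls for another request; 410 and 404 are final, so retrying them would spend rate limit without changing the outcome.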
Data Storage
Screenshot Storage: SpiderSite job screenshots are stored in Cloudflare R2 and accessible via CDN at cdn.spideriq.di-atomic.com. URLs are permanent and do not expire.
Best Practices
Don’t poll too frequently: Respect the 100 requests/minute rate limit. Polling every 3-5 seconds keeps a client responsive while staying well within that limit.
Save job IDs: Store job IDs in your database to retrieve results later. Results remain available indefinitely.
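The polling advice above can be sketched as a small loop that repeats while the endpoint answers 202 and waits a fixed interval between attempts. This is an illustration under stated assumptions, not an official client: `poll_job`, `fetch`, and the default values are hypothetical, and `fetch` stands in for whatever HTTP call your code makes to this endpoint, returning a `(status_code, body)` pair.

```python
import time
from typing import Any, Callable, Tuple


def poll_job(
    fetch: Callable[[], Tuple[int, Any]],
    interval_seconds: float = 4.0,   # within the suggested 3-5 second range
    max_attempts: int = 60,          # assumed cap so the loop cannot run forever
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call fetch() until it returns something other than 202."""
    for attempt in range(max_attempts):
        status_code, body = fetch()
        if status_code != 202:
            return status_code, body
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job still processing after {max_attempts} attempts")
```

At a 4 second interval a single job uses 15 requests per minute, so if the 100 requests/minute limit is shared across all of a client's jobs (an assumption), about six jobs can be polled concurrently at this rate.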
