Overview
This guide covers common errors you may encounter when using the SpiderIQ API and how to handle them gracefully in your applications.
HTTP Status Codes
200 OK - Success
Job completed successfully and results are available.
{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "completed",
  "data": { ... }
}
Action: Process the results
201 Created - Job Submitted
Job was successfully submitted and queued for processing.
{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "queued",
  "message": "Job submitted successfully"
}
Action: Save the job_id and poll for results
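Once the 201 response arrives, store the job_id before polling. A small helper for pulling it out of the response payload (the helper name is ours, not part of the API; the payload matches the example above):

```python
def extract_job_id(payload: dict) -> str:
    """Return the job_id from a 201 submission response, or raise if absent."""
    if not payload.get("success") or "job_id" not in payload:
        raise ValueError(f"Unexpected submission response: {payload}")
    return payload["job_id"]

# Example with the 201 response shown above
resp = {
    "success": True,
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "type": "spiderSite",
    "status": "queued",
    "message": "Job submitted successfully",
}
job_id = extract_job_id(resp)  # save this for polling
```

Failing fast on a malformed payload here keeps the polling loop from chasing a missing or invalid ID.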
202 Accepted - Job Processing
Job is still being processed. Results not yet available.
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "message": "Job is still being processed"
}
Action: Wait and poll again
Handling Example:
import time
import requests

def wait_for_job(job_id, headers, max_wait=120):
    """Poll for job completion with timeout"""
    start_time = time.time()
    while time.time() - start_time < max_wait:
        response = requests.get(
            f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
            headers=headers
        )
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 202:
            print("⏳ Job still processing...")
            time.sleep(3)
        else:
            raise Exception(f"Error: {response.status_code}")
    raise TimeoutError("Job did not complete within timeout period")
400 Bad Request - Invalid Request
The request was malformed or contains invalid data.
Common causes:
Error:
{
  "detail": "Missing required field: url"
}
Solution:
Ensure all required fields are present in your request:
data = {
    "url": "https://example.com",  # Required
    "instructions": "Extract..."   # Optional
}
Error:
{
  "detail": "Invalid job_type. Must be 'spiderSite' or 'spiderMaps'"
}
Solution:
Use correct job type values:
# Correct
data = {"url": "...", "job_type": "spiderSite"}
data = {"url": "...", "job_type": "spiderMaps"}

# Incorrect
data = {"url": "...", "job_type": "scrape"}  # ❌ Invalid
401 Unauthorized - Authentication Failed
Your credentials are missing, invalid, or malformed.
Error Response:
{
  "detail": "Invalid authentication token format. Expected: client_id:api_key:api_secret"
}
Common causes:
Missing Authorization Header
Ensure you're sending the Authorization header:
# Correct
headers = {
    "Authorization": "Bearer <your_token>"
}

# Incorrect - missing header
headers = {}
Incorrect Token Format
SpiderIQ expects a three-part token format: Authorization: Bearer client_id:api_key:api_secret
# Correct
token = "cli_abc123:sk_def456:secret_ghi789"
headers = {"Authorization": f"Bearer {token}"}

# Incorrect - missing parts
token = "cli_abc123:sk_def456"  # ❌ Missing secret
Expired or Invalid Credentials
Contact support if your credentials still fail after verifying the format above.
Handling Example:
import os
import requests

def make_authenticated_request(url, data):
    """Make request with proper error handling"""
    headers = {
        "Authorization": f"Bearer {os.getenv('SPIDERIQ_TOKEN')}",
        "Content-Type": "application/json"
    }
    response = requests.post(url, headers=headers, json=data)
    if response.status_code == 401:
        raise Exception(
            "Authentication failed. Please check your credentials."
        )
    return response
403 Forbidden - Access Denied
Your account exists but is inactive or lacks permission.
Error Response:
{
  "detail": "Client account is inactive"
}
Action: Contact support at admin@di-atomic.com to reactivate your account
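Since 401 and 403 call for different fixes (credential format vs. account status), it can help to route them explicitly. A sketch with illustrative messages (the function name is ours):

```python
def auth_error_hint(status_code: int, detail: str = "") -> str:
    """Map auth-related status codes to a suggested next step."""
    if status_code == 401:
        # Token missing or malformed - fixable on the client side
        return "Check the Bearer token format: client_id:api_key:api_secret"
    if status_code == 403:
        # Account-level problem - needs support intervention
        return f"Account issue ({detail or 'access denied'}); contact admin@di-atomic.com"
    return "Not an auth error"

print(auth_error_hint(403, "Client account is inactive"))
```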
404 Not Found - Resource Doesn’t Exist
The requested job ID doesn’t exist.
Error Response:
{
  "detail": "Job not found"
}
Common causes:
Typo in job ID
Job ID from different environment
Very old job that was cleaned up
Solution:
def get_job_results(job_id, headers):
    """Get results with 404 handling"""
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
        headers=headers
    )
    if response.status_code == 404:
        print(f"❌ Job {job_id} not found")
        return None
    return response.json()
410 Gone - Job Failed or Cancelled
The job has failed, been cancelled, or encountered an error during processing.
Error Response:
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "error": "Target URL is not accessible",
  "message": "Job failed during processing"
}
Common failure reasons:
URL Not Accessible
Website is down
URL is invalid or broken
Site requires authentication
Connection timeout
Scraping Blocked
Site blocks bots
Rate limiting by target site
CAPTCHA protection
IP blocked
Timeout
Page took too long to load
Large website with many pages
Slow server response
Worker Error
Internal processing error
Resource constraints
Unexpected page structure
Handling Example:
def handle_job_result(job_id, headers):
    """Handle all job result scenarios"""
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
        headers=headers
    )
    if response.status_code == 200:
        # Success
        return response.json()
    elif response.status_code == 202:
        # Still processing
        print("⏳ Job still processing...")
        return None
    elif response.status_code == 410:
        # Job failed
        error_data = response.json()
        print(f"❌ Job failed: {error_data.get('error', 'Unknown error')}")
        # Check if we should retry
        if "timeout" in error_data.get('error', '').lower():
            print("💡 Try submitting again with longer timeout")
        elif "not accessible" in error_data.get('error', '').lower():
            print("💡 Check if the URL is valid and publicly accessible")
        return None
    else:
        print(f"⚠️ Unexpected status: {response.status_code}")
        return None
429 Too Many Requests - Rate Limited
You’ve exceeded the rate limit (100 requests per minute).
Error Response:
{
  "detail": "Rate limit exceeded. Maximum 100 requests per minute."
}
Response Headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1698345678
Retry-After: 42
Handling with Exponential Backoff:
import time
import requests

def make_request_with_backoff(url, headers, data, max_retries=3):
    """Make request with exponential backoff on rate limits"""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        if response.status_code == 429:
            # Rate limited
            retry_after = int(response.headers.get('Retry-After', 60))
            if attempt < max_retries - 1:
                wait_time = min(retry_after, 2 ** attempt * 5)
                print(f"⏳ Rate limited. Waiting {wait_time}s before retry...")
                time.sleep(wait_time)
            else:
                raise Exception("Max retries exceeded")
        else:
            return response
    raise Exception("Request failed after all retries")
Best Practices for Rate Limiting:
Track your rate: Monitor the X-RateLimit-Remaining header to know how many requests you have left
Implement backoff: Always use exponential backoff when you hit rate limits
Batch wisely: Submit jobs in controlled batches (e.g., 10-20 at a time) rather than all at once
Respect Retry-After: Always check and respect the Retry-After header value
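The first two practices can be combined into a pre-flight check that pauses before the limit is hit, rather than reacting to a 429 afterwards. A sketch (the threshold and pause values are illustrative; `headers` is the `response.headers` mapping from your previous request):

```python
import time

def throttle_if_needed(headers, min_remaining=5, pause=10):
    """Pause proactively when the remaining rate-limit quota runs low.

    Returns True if a pause was taken, False otherwise.
    """
    # Missing header: assume we have headroom rather than stalling
    remaining = int(headers.get("X-RateLimit-Remaining", min_remaining + 1))
    if remaining <= min_remaining:
        print(f"Only {remaining} requests left; pausing {pause}s")
        time.sleep(pause)
        return True
    return False

# Example: headers captured from a near-limit response
throttle_if_needed({"X-RateLimit-Remaining": "3"}, pause=0)
```

Calling this after each response keeps batch jobs under the 100 requests/minute ceiling without ever triggering the backoff path.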
500 Internal Server Error - Server Issue
An unexpected error occurred on the server.
Error Response:
{
  "detail": "Internal server error"
}
Action:
Retry the request after a brief delay
If error persists, contact support
Include your job ID or request details when reporting
def make_request_with_retry(url, headers, data, max_retries=3):
    """Retry on server errors"""
    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=data)
            if response.status_code == 500:
                if attempt < max_retries - 1:
                    wait_time = (attempt + 1) * 5
                    print(f"⚠️ Server error. Retrying in {wait_time}s...")
                    time.sleep(wait_time)
                else:
                    raise Exception("Server error persists after retries")
            else:
                return response
        except requests.exceptions.RequestException as e:
            print(f"❌ Request failed: {e}")
            if attempt < max_retries - 1:
                time.sleep(5)
            else:
                raise
    return None
Job Status Errors
Checking Job Status
Use the status endpoint to check if a job failed:
def check_job_status(job_id, headers):
    """Check current job status"""
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/status",
        headers=headers
    )
    if response.status_code == 200:
        status_data = response.json()
        status = status_data['status']
        if status == 'completed':
            print("✓ Job completed successfully")
        elif status == 'processing':
            print("⏳ Job is being processed")
        elif status == 'queued':
            print("📋 Job is queued, waiting for worker")
        elif status == 'failed':
            print(f"❌ Job failed: {status_data.get('error')}")
        elif status == 'cancelled':
            print("⚠️ Job was cancelled")
        return status_data
    return None
Comprehensive Error Handler
Here’s a complete error handling implementation:
import requests
import time
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class SpiderIQClient:
    def __init__(self, token):
        self.base_url = "https://spideriq.di-atomic.com/api/v1"
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }

    def submit_job(self, job_type, url, instructions=None, max_retries=3):
        """Submit job with error handling and retries"""
        data = {"url": url}
        if instructions:
            data["instructions"] = instructions
        endpoint = f"{self.base_url}/jobs/{job_type}/submit"
        for attempt in range(max_retries):
            try:
                response = requests.post(endpoint, headers=self.headers, json=data)
                if response.status_code == 201:
                    result = response.json()
                    logger.info(f"✓ Job submitted: {result['job_id']}")
                    return result['job_id']
                elif response.status_code == 400:
                    error = response.json()
                    logger.error(f"❌ Bad request: {error['detail']}")
                    return None  # Don't retry on client errors
                elif response.status_code == 401:
                    logger.error("❌ Authentication failed")
                    return None  # Don't retry on auth errors
                elif response.status_code == 429:
                    retry_after = int(response.headers.get('Retry-After', 60))
                    logger.warning(f"⏳ Rate limited. Waiting {retry_after}s...")
                    time.sleep(retry_after)
                elif response.status_code == 500:
                    if attempt < max_retries - 1:
                        wait = (attempt + 1) * 5
                        logger.warning(f"⚠️ Server error. Retrying in {wait}s...")
                        time.sleep(wait)
                    else:
                        logger.error("❌ Server error persists")
                        return None
            except requests.exceptions.Timeout:
                logger.warning(f"⏱️ Request timeout (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None
            except requests.exceptions.ConnectionError:
                logger.warning(f"🔌 Connection error (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None
        return None
    def get_results(self, job_id, max_wait=120, poll_interval=3):
        """Poll for results with timeout"""
        endpoint = f"{self.base_url}/jobs/{job_id}/results"
        start_time = time.time()
        while time.time() - start_time < max_wait:
            try:
                response = requests.get(endpoint, headers=self.headers)
                if response.status_code == 200:
                    logger.info(f"✓ Job {job_id} completed")
                    return response.json()
                elif response.status_code == 202:
                    logger.info(f"⏳ Job {job_id} still processing...")
                    time.sleep(poll_interval)
                elif response.status_code == 404:
                    logger.error(f"❌ Job {job_id} not found")
                    return None
                elif response.status_code == 410:
                    error_data = response.json()
                    logger.error(f"❌ Job {job_id} failed: {error_data.get('error')}")
                    return None
                else:
                    logger.warning(f"⚠️ Unexpected status: {response.status_code}")
                    return None
            except requests.exceptions.RequestException as e:
                logger.error(f"❌ Request error: {e}")
                time.sleep(poll_interval)
        logger.error(f"⏱️ Timeout waiting for job {job_id}")
        return None
# Usage
client = SpiderIQClient("<your_token>")

# Submit job
job_id = client.submit_job(
    job_type="spiderSite",
    url="https://example.com",
    instructions="Extract contact information"
)

if job_id:
    # Get results
    results = client.get_results(job_id)
    if results:
        print("Success!")
        print(results['data'])
    else:
        print("Job failed or timed out")
Debugging Tips
Enable Verbose Logging
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# Now all requests will be logged
response = requests.post(url, headers=headers, json=data)

# Check rate limit status
print(f"Rate limit remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Rate limit resets at: {response.headers.get('X-RateLimit-Reset')}")

# Check response details
print(f"Status: {response.status_code}")
print(f"Body: {response.text}")
Test with System Health
Before submitting jobs, verify API connectivity:
def check_api_health():
    """Test API connectivity"""
    try:
        response = requests.get(
            "https://spideriq.di-atomic.com/api/v1/system/health",
            timeout=5
        )
        if response.status_code == 200:
            health = response.json()
            print("✓ API is healthy")
            print(f"  Database: {health.get('database')}")
            print(f"  Queue: {health.get('queue')}")
            return True
        else:
            print(f"⚠️ API returned status {response.status_code}")
            return False
    except requests.exceptions.RequestException as e:
        print(f"❌ Cannot reach API: {e}")
        return False

# Run before submitting jobs
if check_api_health():
    # Proceed with job submission
    pass
Save Failed Requests
import json
from datetime import datetime

def log_failed_request(url, data, response):
    """Log failed requests for debugging"""
    timestamp = datetime.now().isoformat()
    log_entry = {
        "timestamp": timestamp,
        "url": url,
        "request_data": data,
        "status_code": response.status_code,
        "response": response.text
    }
    with open('failed_requests.log', 'a') as f:
        f.write(json.dumps(log_entry) + '\n')
    print("⚠️ Failed request logged to failed_requests.log")
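Because each log line is a self-contained JSON object (JSON Lines format), the file can be read back later for analysis. A small reader sketch (the function name is illustrative):

```python
import json

def load_failed_requests(path="failed_requests.log"):
    """Read back failed-request entries, one JSON object per line."""
    entries = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                entries.append(json.loads(line))
    return entries
```

This makes it easy to, say, count failures by status code before opening a support ticket.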
When to Contact Support
Contact support at admin@di-atomic.com if:
✉️ Authentication errors persist after verifying credentials
✉️ Server errors (500) continue for extended periods
✉️ Jobs consistently fail with the same error
✉️ You need higher rate limits for your use case
✉️ You encounter unexpected behavior not covered in docs
Include in your support request:
Your client ID (NOT your API key or secret)
Job ID(s) if applicable
Error messages received
Steps to reproduce the issue
Timestamp of when the issue occurred
Next Steps