Overview

This guide covers common errors you may encounter when using the SpiderIQ API and how to handle them gracefully in your applications.

HTTP Status Codes

200 OK - Success

Job completed successfully and results are available.
{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "completed",
  "data": { ... }
}
Action: Process the results
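For example, a minimal processing sketch (the shape of data depends on the job type):
result = response.json()
if result.get("status") == "completed":
    data = result["data"]  # the scraped payload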

201 Created - Job Submitted

Job was successfully submitted and queued for processing.
{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "queued",
  "message": "Job submitted successfully"
}
Action: Save the job_id and poll for results
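A minimal submission sketch (the submit endpoint here matches the one used by the comprehensive client later in this guide):
import requests

headers = {
    "Authorization": "Bearer <your_token>",
    "Content-Type": "application/json"
}

# Submit a spiderSite job; a 201 response means queued, not finished
response = requests.post(
    "https://spideriq.di-atomic.com/api/v1/jobs/spiderSite/submit",
    headers=headers,
    json={"url": "https://example.com"}
)

if response.status_code == 201:
    job_id = response.json()["job_id"]  # save this for polling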

202 Accepted - Job Processing

Job is still being processed. Results not yet available.
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "message": "Job is still being processed"
}
Action: Wait and poll again
Handling Example:
import time
import requests

def wait_for_job(job_id, headers, max_wait=120):
    """Poll for job completion with timeout"""
    start_time = time.time()

    while time.time() - start_time < max_wait:
        response = requests.get(
            f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
            headers=headers
        )

        if response.status_code == 200:
            return response.json()
        elif response.status_code == 202:
            print("⏳ Job still processing...")
            time.sleep(3)
        else:
            raise Exception(f"Error: {response.status_code}")

    raise TimeoutError("Job did not complete within timeout period")
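
A quick usage sketch, assuming job_id came from a 201 response:
headers = {"Authorization": "Bearer <your_token>"}
results = wait_for_job("550e8400-e29b-41d4-a716-446655440000", headers)
print(results["data"])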

400 Bad Request - Invalid Input

The request was malformed or contains invalid data. Common errors and their fixes:
Error:
{
  "detail": "Invalid URL format. Please provide a valid HTTP/HTTPS URL."
}
Solution:
  • Ensure URL starts with http:// or https://
  • Check for typos in the URL
  • Validate URL format before submitting
from urllib.parse import urlparse

def is_valid_url(url):
    """Return True if the URL has both a scheme and a host."""
    try:
        result = urlparse(url)
        return all([result.scheme, result.netloc])
    except ValueError:
        return False

url = "https://example.com"
if is_valid_url(url):
    # Submit job
    pass
else:
    print("❌ Invalid URL format")
Error:
{
  "detail": "Missing required field: url"
}
Solution: Ensure all required fields are present in your request:
data = {
    "url": "https://example.com",  # Required
    "instructions": "Extract..."     # Optional
}
Error:
{
  "detail": "Invalid job_type. Must be 'spiderSite' or 'spiderMaps'"
}
Solution: Use correct job type values:
# Correct
data = {"url": "...", "job_type": "spiderSite"}
data = {"url": "...", "job_type": "spiderMaps"}

# Incorrect
data = {"url": "...", "job_type": "scrape"}  # ❌ Invalid

401 Unauthorized - Authentication Failed

Your credentials are missing, invalid, or malformed. Error Response:
{
  "detail": "Invalid authentication token format. Expected: client_id:api_key:api_secret"
}
Common causes:

1. Missing Authorization Header

Ensure you’re sending the Authorization header:
# Correct
headers = {
    "Authorization": "Bearer <your_token>"
}

# Incorrect - missing header
headers = {}
2. Incorrect Token Format

SpiderIQ expects a three-part token format:
Authorization: Bearer client_id:api_key:api_secret
# Correct
token = "cli_abc123:sk_def456:secret_ghi789"
headers = {"Authorization": f"Bearer {token}"}

# Incorrect - missing parts
token = "cli_abc123:sk_def456"  # ❌ Missing secret
3. Expired or Invalid Credentials

Contact support if your credentials are not working.
Handling Example:
import os
import requests

def make_authenticated_request(url, data):
    """Make request with proper error handling"""
    headers = {
        "Authorization": f"Bearer {os.getenv('SPIDERIQ_TOKEN')}",
        "Content-Type": "application/json"
    }

    response = requests.post(url, headers=headers, json=data)

    if response.status_code == 401:
        raise Exception(
            "Authentication failed. Please check your credentials."
        )

    return response

403 Forbidden - Access Denied

Your account exists but is inactive or lacks permission. Error Response:
{
  "detail": "Client account is inactive"
}
Action: Contact support at admin@di-atomic.com to reactivate your account
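A minimal handling sketch (the detail string follows the example above and may vary):
def handle_forbidden(response):
    """Detect a 403 and surface the reason to the caller."""
    if response.status_code == 403:
        detail = response.json().get("detail", "Access denied")
        print(f"🚫 {detail}. Contact admin@di-atomic.com to resolve.")
        return True
    return False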

404 Not Found - Resource Doesn’t Exist

The requested job ID doesn’t exist. Error Response:
{
  "detail": "Job not found"
}
Common causes:
  • Typo in job ID
  • Job ID from different environment
  • Very old job that was cleaned up
Solution:
def get_job_results(job_id, headers):
    """Get results with 404 handling"""
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
        headers=headers
    )

    if response.status_code == 404:
        print(f"❌ Job {job_id} not found")
        return None

    return response.json()

410 Gone - Job Failed or Cancelled

The job has failed, been cancelled, or encountered an error during processing. Error Response:
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "error": "Target URL is not accessible",
  "message": "Job failed during processing"
}
Common failure reasons:

URL Not Accessible

  • Website is down
  • URL is invalid or broken
  • Site requires authentication
  • Connection timeout

Scraping Blocked

  • Site blocks bots
  • Rate limiting by target site
  • CAPTCHA protection
  • IP blocked

Timeout

  • Page took too long to load
  • Large website with many pages
  • Slow server response

Worker Error

  • Internal processing error
  • Resource constraints
  • Unexpected page structure
Handling Example:
def handle_job_result(job_id, headers):
    """Handle all job result scenarios"""
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
        headers=headers
    )

    if response.status_code == 200:
        # Success
        return response.json()

    elif response.status_code == 202:
        # Still processing
        print("⏳ Job still processing...")
        return None

    elif response.status_code == 410:
        # Job failed
        error_data = response.json()
        print(f"❌ Job failed: {error_data.get('error', 'Unknown error')}")

        # Check if we should retry
        if "timeout" in error_data.get('error', '').lower():
            print("💡 Try submitting again with longer timeout")
        elif "not accessible" in error_data.get('error', '').lower():
            print("💡 Check if the URL is valid and publicly accessible")

        return None

    else:
        print(f"⚠️ Unexpected status: {response.status_code}")
        return None

429 Too Many Requests - Rate Limited

You’ve exceeded the rate limit (100 requests per minute). Error Response:
{
  "detail": "Rate limit exceeded. Maximum 100 requests per minute."
}
Response Headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1698345678
Retry-After: 42
Handling with Exponential Backoff:
import time
import requests

def make_request_with_backoff(url, headers, data, max_retries=3):
    """Make request with exponential backoff on rate limits"""

    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)

        if response.status_code == 429:
            # Rate limited
            retry_after = int(response.headers.get('Retry-After', 60))

            if attempt < max_retries - 1:
                wait_time = max(retry_after, 2 ** attempt * 5)  # never wait less than Retry-After
                print(f"⏳ Rate limited. Waiting {wait_time}s before retry...")
                time.sleep(wait_time)
            else:
                raise Exception("Max retries exceeded")
        else:
            return response

    raise Exception("Request failed after all retries")
Best Practices for Rate Limiting:
  • Track your rate: Monitor the X-RateLimit-Remaining header to know how many requests you have left (see the sketch after this list)
  • Implement backoff: Always use exponential backoff when you hit rate limits
  • Batch wisely: Submit jobs in controlled batches (e.g., 10-20 at a time) rather than all at once
  • Respect Retry-After: Always check and respect the Retry-After header value
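A small sketch of using these headers proactively, pausing before the limit is hit (the reserve threshold is an arbitrary illustration, not an API value):
import time
import requests

def throttled_post(url, headers, data, reserve=5):
    """Pause when the remaining-request budget runs low, before a 429 occurs."""
    response = requests.post(url, headers=headers, json=data)

    remaining = int(response.headers.get("X-RateLimit-Remaining", 100))
    if remaining <= reserve:
        # X-RateLimit-Reset is a Unix timestamp for when the window resets
        reset = int(response.headers.get("X-RateLimit-Reset", 0))
        wait = max(0, reset - time.time())
        print(f"⏳ {remaining} requests left; sleeping {wait:.0f}s until reset...")
        time.sleep(wait)

    return response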

500 Internal Server Error - Server Issue

An unexpected error occurred on the server. Error Response:
{
  "detail": "Internal server error"
}
Action:
  • Retry the request after a brief delay
  • If error persists, contact support
  • Include your job ID or request details when reporting
def make_request_with_retry(url, headers, data, max_retries=3):
    """Retry on server errors"""

    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=data)

            if response.status_code == 500:
                if attempt < max_retries - 1:
                    wait_time = (attempt + 1) * 5
                    print(f"⚠️ Server error. Retrying in {wait_time}s...")
                    time.sleep(wait_time)
                else:
                    raise Exception("Server error persists after retries")
            else:
                return response

        except requests.exceptions.RequestException as e:
            print(f"❌ Request failed: {e}")
            if attempt < max_retries - 1:
                time.sleep(5)
            else:
                raise

    return None

Job Status Errors

Checking Job Status

Use the status endpoint to check if a job failed:
def check_job_status(job_id, headers):
    """Check current job status"""
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/status",
        headers=headers
    )

    if response.status_code == 200:
        status_data = response.json()
        status = status_data['status']

        if status == 'completed':
            print("✓ Job completed successfully")
        elif status == 'processing':
            print("⏳ Job is being processed")
        elif status == 'queued':
            print("📋 Job is queued, waiting for worker")
        elif status == 'failed':
            print(f"❌ Job failed: {status_data.get('error')}")
        elif status == 'cancelled':
            print("⚠️ Job was cancelled")

        return status_data

    return None

Comprehensive Error Handler

Here’s a complete error handling implementation:
import requests
import time
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class SpiderIQClient:
    def __init__(self, token):
        self.base_url = "https://spideriq.di-atomic.com/api/v1"
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }

    def submit_job(self, job_type, url, instructions=None, max_retries=3):
        """Submit job with error handling and retries"""

        data = {"url": url}
        if instructions:
            data["instructions"] = instructions

        endpoint = f"{self.base_url}/jobs/{job_type}/submit"

        for attempt in range(max_retries):
            try:
                response = requests.post(endpoint, headers=self.headers, json=data)

                if response.status_code == 201:
                    result = response.json()
                    logger.info(f"✓ Job submitted: {result['job_id']}")
                    return result['job_id']

                elif response.status_code == 400:
                    error = response.json()
                    logger.error(f"❌ Bad request: {error['detail']}")
                    return None  # Don't retry on client errors

                elif response.status_code == 401:
                    logger.error("❌ Authentication failed")
                    return None  # Don't retry on auth errors

                elif response.status_code == 429:
                    retry_after = int(response.headers.get('Retry-After', 60))
                    logger.warning(f"⏳ Rate limited. Waiting {retry_after}s...")
                    time.sleep(retry_after)

                elif response.status_code == 500:
                    if attempt < max_retries - 1:
                        wait = (attempt + 1) * 5
                        logger.warning(f"⚠️ Server error. Retrying in {wait}s...")
                        time.sleep(wait)
                    else:
                        logger.error("❌ Server error persists")
                        return None

                else:
                    logger.warning(f"⚠️ Unexpected status: {response.status_code}")
                    return None

            except requests.exceptions.Timeout:
                logger.warning(f"⏱️ Request timeout (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None

            except requests.exceptions.ConnectionError:
                logger.warning(f"🔌 Connection error (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None

        return None

    def get_results(self, job_id, max_wait=120, poll_interval=3):
        """Poll for results with timeout"""

        endpoint = f"{self.base_url}/jobs/{job_id}/results"
        start_time = time.time()

        while time.time() - start_time < max_wait:
            try:
                response = requests.get(endpoint, headers=self.headers)

                if response.status_code == 200:
                    logger.info(f"✓ Job {job_id} completed")
                    return response.json()

                elif response.status_code == 202:
                    logger.info(f"⏳ Job {job_id} still processing...")
                    time.sleep(poll_interval)

                elif response.status_code == 404:
                    logger.error(f"❌ Job {job_id} not found")
                    return None

                elif response.status_code == 410:
                    error_data = response.json()
                    logger.error(f"❌ Job {job_id} failed: {error_data.get('error')}")
                    return None

                else:
                    logger.warning(f"⚠️ Unexpected status: {response.status_code}")
                    return None

            except requests.exceptions.RequestException as e:
                logger.error(f"❌ Request error: {e}")
                time.sleep(poll_interval)

        logger.error(f"⏱️ Timeout waiting for job {job_id}")
        return None

# Usage
client = SpiderIQClient("<your_token>")

# Submit job
job_id = client.submit_job(
    job_type="spiderSite",
    url="https://example.com",
    instructions="Extract contact information"
)

if job_id:
    # Get results
    results = client.get_results(job_id)

    if results:
        print("Success!")
        print(results['data'])
    else:
        print("Job failed or timed out")

Debugging Tips

Enable Verbose Logging

import logging

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# urllib3 (used by requests) will now emit DEBUG logs for each connection

Inspect Response Headers

response = requests.post(url, headers=headers, json=data)

# Check rate limit status
print(f"Rate limit remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Rate limit resets at: {response.headers.get('X-RateLimit-Reset')}")

# Check response details
print(f"Status: {response.status_code}")
print(f"Body: {response.text}")

Test with System Health

Before submitting jobs, verify API connectivity:
def check_api_health():
    """Test API connectivity"""
    try:
        response = requests.get(
            "https://spideriq.di-atomic.com/api/v1/system/health",
            timeout=5
        )

        if response.status_code == 200:
            health = response.json()
            print(f"✓ API is healthy")
            print(f"  Database: {health.get('database')}")
            print(f"  Queue: {health.get('queue')}")
            return True
        else:
            print(f"⚠️ API returned status {response.status_code}")
            return False

    except requests.exceptions.RequestException as e:
        print(f"❌ Cannot reach API: {e}")
        return False

# Run before submitting jobs
if check_api_health():
    # Proceed with job submission
    pass

Save Failed Requests

import json
from datetime import datetime

def log_failed_request(url, data, response):
    """Log failed requests for debugging"""
    timestamp = datetime.now().isoformat()
    log_entry = {
        "timestamp": timestamp,
        "url": url,
        "request_data": data,
        "status_code": response.status_code,
        "response": response.text
    }

    with open('failed_requests.log', 'a') as f:
        f.write(json.dumps(log_entry) + '\n')

    print(f"⚠️ Failed request logged to failed_requests.log")

When to Contact Support

Contact support at admin@di-atomic.com if:
  • ✉️ Authentication errors persist after verifying credentials
  • ✉️ Server errors (500) continue for extended periods
  • ✉️ Jobs consistently fail with the same error
  • ✉️ You need higher rate limits for your use case
  • ✉️ You encounter unexpected behavior not covered in docs
Include in your support request:
  • Your client ID (NOT your API key or secret)
  • Job ID(s) if applicable
  • Error messages received
  • Steps to reproduce the issue
  • Timestamp of when the issue occurred

Next Steps