Overview
This guide covers common errors you may encounter when using the SpiderIQ API and how to handle them gracefully in your applications.
HTTP Status Codes
200 OK - Success
Job completed successfully and results are available.
{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "completed",
  "data": { ... }
}
Action: Process the results
201 Created - Job Submitted
Job was successfully submitted and queued for processing.
{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "queued",
  "message": "Job submitted successfully"
}
Action: Save the job_id and poll for results
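Once the 201 response arrives, store the job_id before polling. A small helper for pulling it out of the response payload (the helper name is ours, not part of the API; the payload matches the example above):

```python
def extract_job_id(payload: dict) -> str:
    """Return the job_id from a 201 submission response, or raise if absent."""
    if not payload.get("success") or "job_id" not in payload:
        raise ValueError(f"Unexpected submission response: {payload}")
    return payload["job_id"]

# Example with the 201 response shown above
resp = {
    "success": True,
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "type": "spiderSite",
    "status": "queued",
    "message": "Job submitted successfully",
}
job_id = extract_job_id(resp)  # save this for polling
```

Failing fast on a malformed payload here keeps the polling loop from chasing a missing or invalid ID.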
202 Accepted - Job Processing
Job is still being processed. Results not yet available.
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "message": "Job is still being processed"
}
Action: Wait and poll again
Handling Example:
import time
import requests

def wait_for_job(job_id, headers, max_wait=120):
    """Poll for job completion with timeout"""
    start_time = time.time()
    while time.time() - start_time < max_wait:
        response = requests.get(
            f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
            headers=headers
        )
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 202:
            print("⏳ Job still processing...")
            time.sleep(3)
        else:
            raise Exception(f"Error: {response.status_code}")
    raise TimeoutError("Job did not complete within timeout period")
400 Bad Request - Invalid Request
The request was malformed or contains invalid data.
Common causes:
Error:
{
  "detail": "Missing required field: url"
}
Solution:
Ensure all required fields are present in your request:
data = {
    "url": "https://example.com",  # Required
    "instructions": "Extract..."   # Optional
}
Error:
{
  "detail": "Invalid job_type. Must be 'spiderSite' or 'spiderMaps'"
}
Solution:
Use correct job type values:
# Correct
data = {"url": "...", "job_type": "spiderSite"}
data = {"url": "...", "job_type": "spiderMaps"}

# Incorrect
data = {"url": "...", "job_type": "scrape"}  # ❌ Invalid
401 Unauthorized - Authentication Failed
Your credentials are missing, invalid, or malformed.
Error Response:
{
  "detail": "Invalid authentication token format. Expected: client_id:api_key:api_secret"
}
Common causes:
Missing Authorization Header
Ensure you're sending the Authorization header:
# Correct
headers = {
    "Authorization": "Bearer <your_token>"
}

# Incorrect - missing header
headers = {}
Incorrect Token Format
SpiderIQ expects a three-part token format: Authorization: Bearer client_id:api_key:api_secret
# Correct
token = "cli_abc123:sk_def456:secret_ghi789"
headers = {"Authorization": f"Bearer {token}"}

# Incorrect - missing parts
token = "cli_abc123:sk_def456"  # ❌ Missing secret
Expired or Invalid Credentials
Contact support if your credentials still fail after verifying the format above.
Handling Example:
import os
import requests

def make_authenticated_request(url, data):
    """Make request with proper error handling"""
    headers = {
        "Authorization": f"Bearer {os.getenv('SPIDERIQ_TOKEN')}",
        "Content-Type": "application/json"
    }
    response = requests.post(url, headers=headers, json=data)
    if response.status_code == 401:
        raise Exception(
            "Authentication failed. Please check your credentials."
        )
    return response
403 Forbidden - Access Denied
Your account exists but is inactive or lacks permission.
Error Response:
{
  "detail": "Client account is inactive"
}
Action: Contact support at admin@di-atomic.com to reactivate your account
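Since 401 and 403 call for different fixes (credential format vs. account status), it can help to route them explicitly. A sketch with illustrative messages (the function name is ours):

```python
def auth_error_hint(status_code: int, detail: str = "") -> str:
    """Map auth-related status codes to a suggested next step."""
    if status_code == 401:
        # Token missing or malformed - fixable on the client side
        return "Check the Bearer token format: client_id:api_key:api_secret"
    if status_code == 403:
        # Account-level problem - needs support intervention
        return f"Account issue ({detail or 'access denied'}); contact admin@di-atomic.com"
    return "Not an auth error"

print(auth_error_hint(403, "Client account is inactive"))
```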
404 Not Found - Resource Doesn’t Exist
The requested job ID doesn’t exist.
Error Response:
{
  "detail": "Job not found"
}
Common causes:
Typo in job ID
Job ID from different environment
Very old job that was cleaned up
Solution:
def get_job_results(job_id, headers):
    """Get results with 404 handling"""
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
        headers=headers
    )
    if response.status_code == 404:
        print(f"❌ Job {job_id} not found")
        return None
    return response.json()
410 Gone - Job Failed or Cancelled
The job has failed, been cancelled, or encountered an error during processing.
Error Response:
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "error": "Target URL is not accessible",
  "message": "Job failed during processing"
}
Common failure reasons:
URL Not Accessible
Website is down
URL is invalid or broken
Site requires authentication
Connection timeout
Scraping Blocked
Site blocks bots
Rate limiting by target site
CAPTCHA protection
IP blocked
Timeout
Page took too long to load
Large website with many pages
Slow server response
Worker Error
Internal processing error
Resource constraints
Unexpected page structure
Handling Example:
def handle_job_result(job_id, headers):
    """Handle all job result scenarios"""
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
        headers=headers
    )
    if response.status_code == 200:
        # Success
        return response.json()
    elif response.status_code == 202:
        # Still processing
        print("⏳ Job still processing...")
        return None
    elif response.status_code == 410:
        # Job failed
        error_data = response.json()
        print(f"❌ Job failed: {error_data.get('error', 'Unknown error')}")
        # Check if we should retry
        if "timeout" in error_data.get('error', '').lower():
            print("💡 Try submitting again with longer timeout")
        elif "not accessible" in error_data.get('error', '').lower():
            print("💡 Check if the URL is valid and publicly accessible")
        return None
    else:
        print(f"⚠️ Unexpected status: {response.status_code}")
        return None
429 Too Many Requests - Rate Limited
You’ve exceeded the rate limit (100 requests per minute).
Error Response:
{
  "detail": "Rate limit exceeded. Maximum 100 requests per minute."
}
Response Headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1698345678
Retry-After: 42
Handling with Exponential Backoff:
import time
import requests

def make_request_with_backoff(url, headers, data, max_retries=3):
    """Make request with exponential backoff on rate limits"""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        if response.status_code == 429:
            # Rate limited
            retry_after = int(response.headers.get('Retry-After', 60))
            if attempt < max_retries - 1:
                wait_time = min(retry_after, 2 ** attempt * 5)
                print(f"⏳ Rate limited. Waiting {wait_time}s before retry...")
                time.sleep(wait_time)
            else:
                raise Exception("Max retries exceeded")
        else:
            return response
    raise Exception("Request failed after all retries")
Best Practices for Rate Limiting:
Track your rate: Monitor the X-RateLimit-Remaining header to know how many requests you have left
Implement backoff: Always use exponential backoff when you hit rate limits
Batch wisely: Submit jobs in controlled batches (e.g., 10-20 at a time) rather than all at once
Respect Retry-After: Always check and respect the Retry-After header value
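The first two practices can be combined into a pre-flight check that pauses before the limit is hit, rather than reacting to a 429 afterwards. A sketch (the threshold and pause values are illustrative; `headers` is the `response.headers` mapping from your previous request):

```python
import time

def throttle_if_needed(headers, min_remaining=5, pause=10):
    """Pause proactively when the remaining rate-limit quota runs low.

    Returns True if a pause was taken, False otherwise.
    """
    # Missing header: assume we have headroom rather than stalling
    remaining = int(headers.get("X-RateLimit-Remaining", min_remaining + 1))
    if remaining <= min_remaining:
        print(f"Only {remaining} requests left; pausing {pause}s")
        time.sleep(pause)
        return True
    return False

# Example: headers captured from a near-limit response
throttle_if_needed({"X-RateLimit-Remaining": "3"}, pause=0)
```

Calling this after each response keeps batch jobs under the 100 requests/minute ceiling without ever triggering the backoff path.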
500 Internal Server Error - Server Issue
An unexpected error occurred on the server.
Error Response:
{
  "detail": "Internal server error"
}
Action:
Retry the request after a brief delay
If error persists, contact support
Include your job ID or request details when reporting
def make_request_with_retry(url, headers, data, max_retries=3):
    """Retry on server errors"""
    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=data)
            if response.status_code == 500:
                if attempt < max_retries - 1:
                    wait_time = (attempt + 1) * 5
                    print(f"⚠️ Server error. Retrying in {wait_time}s...")
                    time.sleep(wait_time)
                else:
                    raise Exception("Server error persists after retries")
            else:
                return response
        except requests.exceptions.RequestException as e:
            print(f"❌ Request failed: {e}")
            if attempt < max_retries - 1:
                time.sleep(5)
            else:
                raise
    return None
Job Status Errors
Checking Job Status
Use the status endpoint to check if a job failed:
def check_job_status(job_id, headers):
    """Check current job status"""
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/status",
        headers=headers
    )
    if response.status_code == 200:
        status_data = response.json()
        status = status_data['status']
        if status == 'completed':
            print("✓ Job completed successfully")
        elif status == 'processing':
            print("⏳ Job is being processed")
        elif status == 'queued':
            print("📋 Job is queued, waiting for worker")
        elif status == 'failed':
            print(f"❌ Job failed: {status_data.get('error')}")
        elif status == 'cancelled':
            print("⚠️ Job was cancelled")
        return status_data
    return None
Comprehensive Error Handler
Here’s a complete error handling implementation:
import requests
import time
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class SpiderIQClient:
    def __init__(self, token):
        self.base_url = "https://spideriq.di-atomic.com/api/v1"
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }

    def submit_job(self, job_type, url, instructions=None, max_retries=3):
        """Submit job with error handling and retries"""
        data = {"url": url}
        if instructions:
            data["instructions"] = instructions
        endpoint = f"{self.base_url}/jobs/{job_type}/submit"
        for attempt in range(max_retries):
            try:
                response = requests.post(endpoint, headers=self.headers, json=data)
                if response.status_code == 201:
                    result = response.json()
                    logger.info(f"✓ Job submitted: {result['job_id']}")
                    return result['job_id']
                elif response.status_code == 400:
                    error = response.json()
                    logger.error(f"❌ Bad request: {error['detail']}")
                    return None  # Don't retry on client errors
                elif response.status_code == 401:
                    logger.error("❌ Authentication failed")
                    return None  # Don't retry on auth errors
                elif response.status_code == 429:
                    retry_after = int(response.headers.get('Retry-After', 60))
                    logger.warning(f"⏳ Rate limited. Waiting {retry_after}s...")
                    time.sleep(retry_after)
                elif response.status_code == 500:
                    if attempt < max_retries - 1:
                        wait = (attempt + 1) * 5
                        logger.warning(f"⚠️ Server error. Retrying in {wait}s...")
                        time.sleep(wait)
                    else:
                        logger.error("❌ Server error persists")
                        return None
            except requests.exceptions.Timeout:
                logger.warning(f"⏱️ Request timeout (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None
            except requests.exceptions.ConnectionError:
                logger.warning(f"🔌 Connection error (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None
        return None
    def get_results(self, job_id, max_wait=120, poll_interval=3):
        """Poll for results with timeout"""
        endpoint = f"{self.base_url}/jobs/{job_id}/results"
        start_time = time.time()
        while time.time() - start_time < max_wait:
            try:
                response = requests.get(endpoint, headers=self.headers)
                if response.status_code == 200:
                    logger.info(f"✓ Job {job_id} completed")
                    return response.json()
                elif response.status_code == 202:
                    logger.info(f"⏳ Job {job_id} still processing...")
                    time.sleep(poll_interval)
                elif response.status_code == 404:
                    logger.error(f"❌ Job {job_id} not found")
                    return None
                elif response.status_code == 410:
                    error_data = response.json()
                    logger.error(f"❌ Job {job_id} failed: {error_data.get('error')}")
                    return None
                else:
                    logger.warning(f"⚠️ Unexpected status: {response.status_code}")
                    return None
            except requests.exceptions.RequestException as e:
                logger.error(f"❌ Request error: {e}")
                time.sleep(poll_interval)
        logger.error(f"⏱️ Timeout waiting for job {job_id}")
        return None
# Usage
client = SpiderIQClient("<your_token>")

# Submit job
job_id = client.submit_job(
    job_type="spiderSite",
    url="https://example.com",
    instructions="Extract contact information"
)

if job_id:
    # Get results
    results = client.get_results(job_id)
    if results:
        print("Success!")
        print(results['data'])
    else:
        print("Job failed or timed out")
Debugging Tips
Enable Verbose Logging
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# Now all requests will be logged
response = requests.post(url, headers=headers, json=data)

# Check rate limit status
print(f"Rate limit remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Rate limit resets at: {response.headers.get('X-RateLimit-Reset')}")

# Check response details
print(f"Status: {response.status_code}")
print(f"Body: {response.text}")
Test with System Health
Before submitting jobs, verify API connectivity:
def check_api_health():
    """Test API connectivity"""
    try:
        response = requests.get(
            "https://spideriq.di-atomic.com/api/v1/system/health",
            timeout=5
        )
        if response.status_code == 200:
            health = response.json()
            print("✓ API is healthy")
            print(f"  Database: {health.get('database')}")
            print(f"  Queue: {health.get('queue')}")
            return True
        else:
            print(f"⚠️ API returned status {response.status_code}")
            return False
    except requests.exceptions.RequestException as e:
        print(f"❌ Cannot reach API: {e}")
        return False

# Run before submitting jobs
if check_api_health():
    # Proceed with job submission
    pass
Save Failed Requests
import json
from datetime import datetime

def log_failed_request(url, data, response):
    """Log failed requests for debugging"""
    timestamp = datetime.now().isoformat()
    log_entry = {
        "timestamp": timestamp,
        "url": url,
        "request_data": data,
        "status_code": response.status_code,
        "response": response.text
    }
    with open('failed_requests.log', 'a') as f:
        f.write(json.dumps(log_entry) + '\n')
    print("⚠️ Failed request logged to failed_requests.log")
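Because each log line is a self-contained JSON object (JSON Lines format), the file can be read back later for analysis. A small reader sketch (the function name is illustrative):

```python
import json

def load_failed_requests(path="failed_requests.log"):
    """Read back failed-request entries, one JSON object per line."""
    entries = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                entries.append(json.loads(line))
    return entries
```

This makes it easy to, say, count failures by status code before opening a support ticket.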
When to Contact Support
Contact support at admin@di-atomic.com if:
✉️ Authentication errors persist after verifying credentials
✉️ Server errors (500) continue for extended periods
✉️ Jobs consistently fail with the same error
✉️ You need higher rate limits for your use case
✉️ You encounter unexpected behavior not covered in docs
Include in your support request:
Your client ID (NOT your API key or secret)
Job ID(s) if applicable
Error messages received
Steps to reproduce the issue
Timestamp of when the issue occurred
Next Steps