GET /api/v1/jobs/list
List Jobs
curl --request GET \
  --url https://spideriq.di-atomic.com/api/v1/jobs/list \
  --header 'Authorization: Bearer <token>'
{
  "total": 123,
  "page": 123,
  "page_size": 123,
  "total_pages": 123,
  "jobs": [
    {
      "job_id": "<string>",
      "type": "<string>",
      "status": "<string>",
      "url": "<string>",
      "created_at": "<string>",
      "updated_at": "<string>",
      "worker_id": "<string>"
    }
  ]
}

Overview

Returns a paginated list of all jobs submitted by your client account, with filtering and sorting options.

Query Parameters

page
integer
default:"1"
Page number (starting from 1). Example: ?page=2
page_size
integer
default:"50"
Number of jobs per page (max: 100). Example: ?page_size=25
status_filter
string
Filter by job status. Options: queued, processing, completed, failed, cancelled. Example: ?status_filter=completed
type_filter
string
Filter by job type. Options: spiderSite, spiderMaps. Example: ?type_filter=spiderSite
sort_by
string
default:"created_at"
Field to sort by. Options: created_at, updated_at, status. Example: ?sort_by=updated_at
sort_order
string
default:"desc"
Sort order. Options: asc, desc. Example: ?sort_order=asc
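The parameters above combine freely in a single request. As a sketch of how they fit together, the hypothetical helper below (build_list_params is illustrative, not part of the API) assembles the documented query string, clamping page_size to the server maximum of 100:

```python
from urllib.parse import urlencode

def build_list_params(page=1, page_size=50, status_filter=None,
                      type_filter=None, sort_by="created_at",
                      sort_order="desc"):
    """Assemble the documented query parameters, clamping page_size to 100."""
    params = {
        "page": page,
        "page_size": min(page_size, 100),  # server-side maximum is 100
        "sort_by": sort_by,
        "sort_order": sort_order,
    }
    # Omit optional filters entirely when unset, rather than sending blanks
    if status_filter:
        params["status_filter"] = status_filter
    if type_filter:
        params["type_filter"] = type_filter
    return params

# Completed spiderSite jobs, most recently updated first:
query = urlencode(build_list_params(status_filter="completed",
                                    type_filter="spiderSite",
                                    sort_by="updated_at"))
```

The resulting dict can also be passed directly to requests.get(..., params=...), as in the pagination example further down.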

Response

total
integer
Total number of jobs matching the filter
page
integer
Current page number
page_size
integer
Number of items per page
total_pages
integer
Total number of pages available
jobs
array
Array of job objects

Job Object

job_id
string
Unique job identifier (UUID)
type
string
Job type (spiderSite or spiderMaps)
status
string
Current job status
url
string
The URL that was scraped
created_at
string
ISO 8601 timestamp when job was created
updated_at
string
ISO 8601 timestamp of last update
worker_id
string
ID of the worker that processed the job (if assigned)
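For clients that want typed access to these fields, one possible sketch is a small dataclass mirroring the schema above. The Job class itself is not part of the API; note the worker_id field may be absent until a worker is assigned, and the ISO 8601 timestamps use a trailing "Z" that older datetime.fromisoformat versions require converting to an explicit offset:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Job:
    """Typed view of one job object; field names mirror the API schema."""
    job_id: str
    type: str
    status: str
    url: str
    created_at: datetime
    updated_at: datetime
    worker_id: Optional[str]  # may be missing until a worker is assigned

    @classmethod
    def from_dict(cls, d: dict) -> "Job":
        # Replace the trailing "Z" with "+00:00" so fromisoformat accepts it
        parse = lambda s: datetime.fromisoformat(s.replace("Z", "+00:00"))
        return cls(
            job_id=d["job_id"],
            type=d["type"],
            status=d["status"],
            url=d["url"],
            created_at=parse(d["created_at"]),
            updated_at=parse(d["updated_at"]),
            worker_id=d.get("worker_id"),
        )
```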

Example Request

curl https://spideriq.di-atomic.com/api/v1/jobs/list \
  -H "Authorization: Bearer <your_token>"

Example Response

{
  "total": 1234,
  "page": 1,
  "page_size": 50,
  "total_pages": 25,
  "jobs": [
    {
      "job_id": "550e8400-e29b-41d4-a716-446655440000",
      "type": "spiderSite",
      "status": "completed",
      "url": "https://example.com",
      "created_at": "2025-10-27T10:00:00Z",
      "updated_at": "2025-10-27T10:02:45Z",
      "worker_id": "spider-site-main-1"
    },
    {
      "job_id": "660e8400-e29b-41d4-a716-446655440001",
      "type": "spiderMaps",
      "status": "completed",
      "url": "https://maps.google.com/...",
      "created_at": "2025-10-27T09:55:00Z",
      "updated_at": "2025-10-27T09:56:30Z",
      "worker_id": "spider-maps-main-1"
    },
    {
      "job_id": "770e8400-e29b-41d4-a716-446655440002",
      "type": "spiderSite",
      "status": "processing",
      "url": "https://blog.example.com",
      "created_at": "2025-10-27T10:05:00Z",
      "updated_at": "2025-10-27T10:05:30Z",
      "worker_id": "spider-site-main-2"
    }
  ]
}

Pagination Example

import requests

def get_all_jobs(auth_token, status_filter=None, type_filter=None):
    """Fetch all jobs across multiple pages"""
    url = "https://spideriq.di-atomic.com/api/v1/jobs/list"
    headers = {"Authorization": f"Bearer {auth_token}"}

    all_jobs = []
    page = 1
    page_size = 100  # Use maximum page size

    while True:
        params = {
            "page": page,
            "page_size": page_size
        }

        if status_filter:
            params["status_filter"] = status_filter
        if type_filter:
            params["type_filter"] = type_filter

        response = requests.get(url, headers=headers, params=params, timeout=30)
        response.raise_for_status()  # surface auth and server errors early
        data = response.json()

        all_jobs.extend(data["jobs"])

        # Check if we've reached the last page
        if page >= data["total_pages"]:
            break

        page += 1

    return all_jobs

# Usage
jobs = get_all_jobs(
    "<your_token>",
    status_filter="completed",
    type_filter="spiderSite"
)
print(f"Found {len(jobs)} jobs")

Use Cases

Get Recent Completed Jobs

curl "https://spideriq.di-atomic.com/api/v1/jobs/list?status_filter=completed&sort_by=updated_at&sort_order=desc&page_size=10" \
  -H "Authorization: Bearer <your_token>"

Get Failed Jobs for Debugging

curl "https://spideriq.di-atomic.com/api/v1/jobs/list?status_filter=failed&sort_by=updated_at&sort_order=desc" \
  -H "Authorization: Bearer <your_token>"

Get All SpiderMaps Jobs

curl "https://spideriq.di-atomic.com/api/v1/jobs/list?type_filter=spiderMaps" \
  -H "Authorization: Bearer <your_token>"

Get Jobs Currently Processing

curl "https://spideriq.di-atomic.com/api/v1/jobs/list?status_filter=processing" \
  -H "Authorization: Bearer <your_token>"

Notes

Default sorting: Jobs are sorted by created_at in descending order (newest first) by default.
Performance: Use the maximum page_size=100 for fewer API calls when fetching large datasets.
Rate limits apply: Each request counts toward your 100 requests/minute limit. When iterating through many pages, implement rate limiting in your code.
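One way to stay under that budget while paging is to space successive requests. A minimal sketch, assuming the 100 requests/minute limit above and a caller-supplied fetch_page function (a placeholder for whatever actually calls the API):

```python
import time

def fetch_all_pages(fetch_page, min_interval=0.6):
    """Yield pages, keeping calls at least min_interval seconds apart.

    0.6 s between calls keeps a single client at or under
    100 requests/minute; fetch_page(page) must return the parsed
    response dict, including its total_pages field.
    """
    page = 1
    last_call = 0.0
    while True:
        # Sleep only for whatever remains of the spacing interval
        wait = min_interval - (time.monotonic() - last_call)
        if wait > 0:
            time.sleep(wait)
        last_call = time.monotonic()
        data = fetch_page(page)
        yield data
        if page >= data.get("total_pages", 0):
            break
        page += 1
```

A shared or multi-threaded client would need a process-wide throttle instead; this per-loop spacing only bounds one sequential caller.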