Overview
SpiderMaps scrapes business listings from Google Maps using search queries, the same way you would search on Google Maps itself. A single query can return up to 100-120 businesses, making it well suited to bulk lead generation, market research, and building comprehensive business directories.
Primary Method: Search queries like “Restaurant Berlin, Germany” or “Coffee shops 10711, Germany”
Alternative Method: Individual business URLs (for specific businesses you already know)
Each business listing includes:
Basic Information
Business name
Full address
Phone number
Website URL
Google Place ID
Ratings & Reviews
Google rating (1-5 stars)
Total review count
Business categories
Price level ($ to $$$$)
Operational Data
Business hours (by day)
Business status (open/closed)
Popular times (if available)
Location Data
Latitude/longitude coordinates
Google Maps link
Photo URLs (optional)
Understanding Search Queries
Search queries follow this simple pattern:
[Category/Keyword] + [Location]
Examples:
"Restaurant Berlin, Germany"
"Coffee shops 10711, Germany" (with postal code)
"Non-Profit Organization in Randers, Denmark"
"Italian restaurant Madrid, España"
"Hotels near Times Square, New York"
Why Search Queries?
Up to 120 Results Per Query
Each search query can return 100-120 business listings, making it far more efficient than scraping individual business URLs.
Comparison:
❌ Individual URLs: 1 request = 1 business
✅ Search Query: 1 request = 100-120 businesses
Strategic Query Design
Small Cities (< 100,000 people)
For smaller cities, use Keyword + City name:
queries = [
    "Restaurant Randers, Denmark",
    "Hotel Randers, Denmark",
    "Cafe Randers, Denmark"
]
Small cities typically have fewer than 100-120 businesses per category, so a single query will capture all results.
Large Cities (> 100,000 people)
For large cities like New York, Berlin, or London, use Keyword + Postal Code or Keyword + Neighborhood:
# Berlin postal codes
queries = [
    "Restaurant 10711, Germany",  # Wilmersdorf
    "Restaurant 10115, Germany",  # Mitte
    "Restaurant 10247, Germany",  # Friedrichshain
    "Restaurant 10178, Germany",  # Mitte (Alexanderplatz)
]
Why Postal Codes Matter: NYC has 25,000+ restaurants. Using just “Restaurant New York” returns only 100-120 results, missing over 99% of businesses. Breaking the city down by postal codes ensures complete coverage.
Multi-Language Support
Use the local language for better results:
# German
"Restaurants in Berlin, Deutschland"
# Spanish
"Restaurantes en Madrid, España"
# French
"Restaurants à Paris, France"
# Danish
"Restauranter i København, Danmark"
Specify the lang parameter to match:
{
    "search_query": "Restaurantes en Madrid, España",
    "lang": "es"  # Spanish
}
Finding the Right Keywords
1. Google Business Categories
Use official Google Maps categories for best results:
Restaurant
Italian restaurant
Chinese restaurant
Fast food restaurant
Cafe
Bakery
Bar
Pizza restaurant
Lawyer
Dentist
Hair salon
Real estate agency
Insurance agency
Accounting firm
Marketing agency
Clothing store
Electronics store
Grocery store
Pharmacy
Bookstore
Furniture store
Hospital
Gym
Yoga studio
Spa
Physical therapist
Chiropractor
Hotel
Event venue
Coworking space
Non-profit organization
Government office
2. Category Subcategories
Combine a main category with a subcategory for more targeted results:
queries = [
    "Italian restaurant Berlin",      # Specific cuisine
    "Boutique hotel Paris",           # Specific hotel type
    "Organic grocery store Munich",   # Specific store type
    "Corporate law firm Frankfurt",   # Specific practice area
]
3. Custom Search Terms
You can also use descriptive search terms:
queries = [
    "vegan restaurants Berlin",
    "24 hour pharmacy Munich",
    "pet friendly hotels Barcelona",
    "disability rights organization Denmark",
]
Basic Usage
Submit a Search Query Job
import requests

url = "https://spideriq.di-atomic.com/api/v1/jobs/spiderMaps/submit"

headers = {
    "Authorization": "Bearer <your_token>",
    "Content-Type": "application/json"
}

data = {
    "payload": {
        "search_query": "Restaurant Berlin, Germany",
        "max_results": 100,
        "lang": "en"
    }
}

response = requests.post(url, headers=headers, json=data)
job = response.json()

print(f"Job submitted: {job['job_id']}")
Response:
{
  "success": true,
  "job_id": "660e8400-e29b-41d4-a716-446655440001",
  "type": "spiderMaps",
  "status": "queued",
  "message": "SpiderMaps job queued successfully"
}
Retrieve Results
import requests
import time

job_id = "660e8400-e29b-41d4-a716-446655440001"
headers = {"Authorization": "Bearer <your_token>"}

while True:
    response = requests.get(
        f"https://spideriq.di-atomic.com/api/v1/jobs/{job_id}/results",
        headers=headers
    )

    if response.status_code == 200:
        # Job completed!
        result = response.json()
        businesses = result['data']['businesses']
        print(f"Found {len(businesses)} businesses")

        for biz in businesses[:5]:  # Show first 5
            print(f" - {biz['name']}")
            print(f"   {biz['address']}")
            print(f"   Rating: {biz.get('rating', 'N/A')} ⭐ ({biz.get('reviews_count', 0)} reviews)")
            print(f"   Phone: {biz.get('phone', 'N/A')}")
            print()
        break
    elif response.status_code == 202:
        # Still processing
        print("Waiting for results...")
        time.sleep(3)
    else:
        print(f"Error: {response.json()}")
        break
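Later examples in this guide use submit_job(data) and get_results(job_id) as shorthand. Here is a minimal sketch of those helpers, assembled from the submit and polling calls shown above (the poll interval and timeout values are assumptions):
import time
import requests

API_BASE = "https://spideriq.di-atomic.com/api/v1"
headers = {"Authorization": "Bearer <your_token>"}

def submit_job(data):
    """Submit a SpiderMaps job and return its job_id."""
    response = requests.post(f"{API_BASE}/jobs/spiderMaps/submit", headers=headers, json=data)
    response.raise_for_status()
    return response.json()['job_id']

def get_results(job_id, poll_interval=5, timeout=300):
    """Poll the results endpoint until the job completes (200) or the timeout is hit."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        response = requests.get(f"{API_BASE}/jobs/{job_id}/results", headers=headers)
        if response.status_code == 200:
            return response.json()
        if response.status_code == 202:  # still processing
            time.sleep(poll_interval)
            continue
        response.raise_for_status()
    raise TimeoutError(f"Job {job_id} did not complete within {timeout} seconds")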
Results Structure
When your job completes, you’ll receive an array of businesses:
{
  "success": true,
  "job_id": "660e8400-e29b-41d4-a716-446655440001",
  "type": "spiderMaps",
  "status": "completed",
  "processing_time_seconds": 45.2,
  "data": {
    "query": "Restaurant Berlin, Germany",
    "results_count": 100,
    "businesses": [
      {
        "name": "Restaurant Zur letzten Instanz",
        "place_id": "ChIJN1t_tDeuEmsRUsoyG83frY4",
        "rating": 4.3,
        "reviews_count": 2847,
        "address": "Waisenstraße 14-16, 10179 Berlin, Germany",
        "phone": "+49 30 2425528",
        "website": "https://www.zurletzteninstanz.de/",
        "categories": ["German restaurant", "Traditional restaurant"],
        "coordinates": {
          "latitude": 52.5170365,
          "longitude": 13.4174634
        },
        "link": "https://www.google.com/maps/place/...",
        "business_status": "OPERATIONAL",
        "price_range": "$$",
        "working_hours": {
          "Monday": "12:00 PM - 11:00 PM",
          "Tuesday": "12:00 PM - 11:00 PM",
          "Wednesday": "12:00 PM - 11:00 PM",
          "Thursday": "12:00 PM - 11:00 PM",
          "Friday": "12:00 PM - 11:30 PM",
          "Saturday": "12:00 PM - 11:30 PM",
          "Sunday": "12:00 PM - 11:00 PM"
        }
      },
      // ... 99 more businesses
    ],
    "metadata": {
      "max_results": 100,
      "extract_reviews": false,
      "extract_photos": false,
      "language": "en"
    }
  }
}
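Optional fields such as coordinates, working_hours, or phone can be missing on some listings, so guard your lookups. A short sketch over the structure above:
for biz in result['data']['businesses']:
    # Use .get() with fallbacks so listings with missing fields don't raise errors
    coords = biz.get('coordinates') or {}
    hours = biz.get('working_hours') or {}

    print(biz['name'])
    print(f"  Location: {coords.get('latitude')}, {coords.get('longitude')}")
    print(f"  Monday hours: {hours.get('Monday', 'unknown')}")
    print(f"  Categories: {', '.join(biz.get('categories', []))}")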
Bulk Scraping Strategy
Complete City Coverage
For comprehensive coverage of large cities, break down by postal codes:
import requests

# Berlin postal codes (example subset)
berlin_postal_codes = [
    "10115", "10117", "10119", "10178", "10179",  # Mitte
    "10243", "10245", "10247", "10249",           # Friedrichshain
    "10551", "10553", "10555", "10557",           # Tiergarten
    "10711", "10713", "10715", "10717",           # Wilmersdorf
]

category = "Restaurant"
headers = {"Authorization": "Bearer <your_token>"}
submit_url = "https://spideriq.di-atomic.com/api/v1/jobs/spiderMaps/submit"

job_ids = []

for postal_code in berlin_postal_codes:
    query = f"{category} {postal_code}, Germany"
    data = {
        "payload": {
            "search_query": query,
            "max_results": 100,
            "lang": "de"
        }
    }

    response = requests.post(submit_url, headers=headers, json=data)
    job_id = response.json()['job_id']
    job_ids.append((postal_code, job_id))
    print(f"✓ Submitted: {query} (Job ID: {job_id})")

print(f"\nTotal jobs submitted: {len(job_ids)}")
print(f"Expected businesses: {len(job_ids) * 100} (assuming 100 per zone)")
Coverage Calculation
Coverage Estimation:
Single query: 100-120 businesses
Small city (< 100k people): 1-5 queries for complete coverage
Large city (> 100k people): 10-50+ queries (by postal code)
Example:
Berlin has ~95 postal codes
“Restaurant” query per postal code = 95 queries
95 queries × 100 businesses = ~9,500 restaurants
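The same estimate in code (both numbers are rough assumptions; actual counts vary by zone):
# Rough coverage estimate for Berlin restaurants
postal_codes = 95             # approximate number of Berlin postal codes
avg_results_per_query = 100   # assumed average results per zone

estimated_restaurants = postal_codes * avg_results_per_query
print(f"{postal_codes} queries -> ~{estimated_restaurants:,} restaurants")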
Common Use Cases
1. Lead Generation by Category & Location
Extract businesses for B2B outreach:
# Target: Marketing agencies in Munich
query = "Marketing agency Munich, Germany"
data = {
"payload" : {
"search_query" : query,
"max_results" : 100 ,
"lang" : "de"
}
}
# Submit and retrieve
job_id = submit_job(data)
results = get_results(job_id)
# Extract contact info for CRM
for biz in results[ 'data' ][ 'businesses' ]:
lead = {
'company_name' : biz[ 'name' ],
'phone' : biz.get( 'phone' ),
'website' : biz.get( 'website' ),
'address' : biz.get( 'address' ),
'rating' : biz.get( 'rating' ),
'google_maps_link' : biz[ 'link' ]
}
# Add to your CRM
add_to_crm(lead)
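add_to_crm is a placeholder for your own integration. As a stand-in, here is a sketch that appends each lead to a CSV file:
import csv

def add_to_crm(lead, path='munich_marketing_agencies.csv'):
    """Stand-in for a real CRM integration: append the lead to a CSV file."""
    with open(path, 'a', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=lead.keys())
        if f.tell() == 0:  # empty file: write the header row first
            writer.writeheader()
        writer.writerow(lead)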
2. Market Research for Specific Industries
Analyze competition in target markets:
import json

# Research hotel market in major European cities
cities = [
    ("Hotel Barcelona, Spain", "es"),
    ("Hotel Paris, France", "fr"),
    ("Hotel Berlin, Germany", "de"),
    ("Hotel Amsterdam, Netherlands", "nl"),
    ("Hotel Rome, Italy", "it")
]

market_data = {}

for query, lang in cities:
    city_name = query.split()[1].rstrip(',')
    data = {
        "payload": {
            "search_query": query,
            "max_results": 100,
            "lang": lang
        }
    }

    # Submit and collect
    job_id = submit_job(data)
    results = get_results(job_id)
    businesses = results['data']['businesses']

    # Analyze
    market_data[city_name] = {
        'total_hotels': len(businesses),
        'avg_rating': sum(b.get('rating', 0) for b in businesses) / len(businesses) if businesses else 0,
        'price_distribution': count_price_ranges(businesses),
        'top_rated': sorted(businesses, key=lambda x: x.get('rating', 0), reverse=True)[:10]
    }

print(json.dumps(market_data, indent=2))
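count_price_ranges is not defined above; one possible implementation tallies the price_range field:
from collections import Counter

def count_price_ranges(businesses):
    """Tally listings by price range, e.g. {'$$': 42, '$$$': 17, 'unknown': 5}."""
    return dict(Counter(b.get('price_range') or 'unknown' for b in businesses))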
3. Building Comprehensive Directories
Create location-based business directories:
# Build restaurant directory for tourist districts
districts = [
    "Restaurant Kreuzberg, Berlin",
    "Restaurant Mitte, Berlin",
    "Restaurant Prenzlauer Berg, Berlin",
    "Restaurant Charlottenburg, Berlin"
]

directory = []

for district_query in districts:
    # District name is everything between the keyword and ", Berlin"
    # (handles multi-word districts like "Prenzlauer Berg")
    district_name = district_query.replace("Restaurant ", "").replace(", Berlin", "")
    data = {
        "payload": {
            "search_query": district_query,
            "max_results": 100,
            "lang": "de",
            "extract_photos": True  # Include photos for directory
        }
    }

    job_id = submit_job(data)
    results = get_results(job_id)

    for biz in results['data']['businesses']:
        directory.append({
            'name': biz['name'],
            'district': district_name,
            'address': biz.get('address'),
            'phone': biz.get('phone'),
            'website': biz.get('website'),
            'rating': biz.get('rating'),
            'price_range': biz.get('price_range'),
            'categories': biz.get('categories', []),
            'photo_url': (biz.get('photos') or [None])[0]  # First photo, if any
        })

# Export to CSV or database
export_to_csv(directory, 'berlin_restaurants.csv')
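export_to_csv is left to you; a minimal version using only the standard library:
import csv

def export_to_csv(rows, path):
    """Write a list of dicts to CSV, using the keys of the first row as columns.
    List values (e.g. categories) are written using their string representation."""
    if not rows:
        return
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)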
4. Competitor Analysis by Region
Monitor competitor locations and ratings:
# Track competitor coffee chain locations
competitor_name = "Starbucks"
cities_to_monitor = [
    ("Starbucks Berlin, Germany", "de"),
    ("Starbucks Munich, Germany", "de"),
    ("Starbucks Hamburg, Germany", "de")
]

competitor_report = {}

for query, lang in cities_to_monitor:
    city = query.split()[1].rstrip(',')
    data = {
        "payload": {
            "search_query": query,
            "max_results": 100,
            "lang": lang
        }
    }

    job_id = submit_job(data)
    results = get_results(job_id)
    businesses = results['data']['businesses']

    competitor_report[city] = {
        'location_count': len(businesses),
        'average_rating': sum(b.get('rating', 0) for b in businesses) / len(businesses) if businesses else 0,
        'locations': [
            {
                'address': b.get('address'),
                'rating': b.get('rating'),
                'reviews': b.get('reviews_count')
            }
            for b in businesses
        ]
    }

print("Competitor Analysis Report:")
for city, data in competitor_report.items():
    print(f"\n{city}:")
    print(f"  Locations: {data['location_count']}")
    print(f"  Avg Rating: {data['average_rating']:.2f} ⭐")
Advanced Strategies
Combining Multiple Keywords
Cast a wider net by combining related keywords:
keywords = [
    "Italian restaurant",
    "Pizza restaurant",
    "Pasta restaurant",
    "Trattoria"
]

location = "Berlin, Germany"
all_results = []

for keyword in keywords:
    query = f"{keyword} {location}"
    data = {
        "payload": {
            "search_query": query,
            "max_results": 100,
            "lang": "de"
        }
    }

    job_id = submit_job(data)
    results = get_results(job_id)
    all_results.extend(results['data']['businesses'])

# Deduplicate by place_id
unique_businesses = {biz['place_id']: biz for biz in all_results}

print(f"Found {len(unique_businesses)} unique Italian restaurants")
Language Optimization
Use local language for better, more complete results:
# Compare English vs local language results
queries = [
( "Restaurant Copenhagen, Denmark" , "en" ),
( "Restauranter København, Danmark" , "da" ) # Danish
]
for query, lang in queries:
data = {
"payload" : {
"search_query" : query,
"max_results" : 100 ,
"lang" : lang
}
}
job_id = submit_job(data)
results = get_results(job_id)
print ( f " { query } ( { lang } ): { len (results[ 'data' ][ 'businesses' ]) } results" )
# Danish query typically returns MORE results
Language Best Practice: Always use the local language when possible. For example:
Copenhagen: Use Danish ("da")
Berlin: Use German ("de")
Barcelona: Use Spanish or Catalan ("es", "ca")
Reviews & Photos Extraction
Extract additional data for richer insights:
data = {
    "payload": {
        "search_query": "Hotel Paris, France",
        "max_results": 50,
        "lang": "fr",
        "extract_reviews": True,  # Include customer reviews
        "extract_photos": True    # Include photo URLs
    }
}

# Processing time increases with these options:
# Base query: ~30-60 seconds
# With reviews: +20-40 seconds
# With photos: +10-20 seconds
Processing Time: Enabling extract_reviews and extract_photos significantly increases processing time:
Base query: 30-60 seconds
With reviews: 50-100 seconds
With both: 60-120 seconds
Only enable when you need this data.
Individual Business URL Method
For specific businesses you already know:
# When you have a specific Google Maps URL
data = {
"payload" : {
"url" : "https://www.google.com/maps/place/Googleplex/@37.4220656,-122.0840897" ,
"max_results" : 1
}
}
# Or use Place ID directly
data = {
"payload" : {
"url" : "ChIJN1t_tDeuEmsRUsoyG83frY4" , # Place ID
"max_results" : 1
}
}
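URL and Place ID payloads are submitted the same way as search queries (assuming the same spiderMaps submit endpoint accepts them, as described in Basic Usage). A short sketch:
import requests

response = requests.post(
    "https://spideriq.di-atomic.com/api/v1/jobs/spiderMaps/submit",
    headers={"Authorization": "Bearer <your_token>", "Content-Type": "application/json"},
    json=data  # one of the URL / Place ID payloads above
)
print(response.json()['job_id'])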
When to Use URLs:
You already have specific business URLs
Verifying/updating existing business data
Single business lookups
Prefer search queries for:
Bulk scraping
Lead generation
Market research
Building directories
Best Practices
Start Small, Scale Up
Test with max_results: 20 first to verify your query, then increase to 100 for production runs.
Use Postal Codes for Large Cities
Break down cities with more than 100,000 people by postal code for complete coverage.
Respect Rate Limits
SpiderIQ allows 100 requests per minute. For bulk scraping, batch your submissions:

# Submit in batches of 10
for i in range(0, len(queries), 10):
    batch = queries[i:i + 10]
    for query in batch:
        submit_job(query)

    # Wait 6 seconds between batches
    if i + 10 < len(queries):
        time.sleep(6)
Cache Results by Place ID
Store Place IDs in your database to avoid duplicate scraping:

existing_place_ids = load_from_database()

for biz in results['data']['businesses']:
    if biz['place_id'] not in existing_place_ids:
        save_to_database(biz)
        existing_place_ids.add(biz['place_id'])
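load_from_database and save_to_database depend on your storage layer. A minimal sketch using sqlite3, keyed on place_id:
import json
import sqlite3

conn = sqlite3.connect('spidermaps_cache.db')
conn.execute("CREATE TABLE IF NOT EXISTS businesses (place_id TEXT PRIMARY KEY, data TEXT)")

def load_from_database():
    """Return the set of place_ids already scraped."""
    return {row[0] for row in conn.execute("SELECT place_id FROM businesses")}

def save_to_database(biz):
    """Store the full listing as JSON, skipping duplicates."""
    conn.execute(
        "INSERT OR IGNORE INTO businesses (place_id, data) VALUES (?, ?)",
        (biz['place_id'], json.dumps(biz))
    )
    conn.commit()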
Use Local Language
Always specify the local language (lang parameter) for:
More complete results
Better category matching
Local business names
Terms of Service
Ensure your use case complies with:
Google Maps Terms of Service
SpiderIQ Acceptable Use Policy
Local data protection laws (GDPR, CCPA, etc.)
Do not use for spam, unauthorized marketing, or malicious purposes.
Complete Workflow Example
Here’s a complete workflow for bulk lead generation:
import requests
import time
import csv
from concurrent.futures import ThreadPoolExecutor

# Configuration
API_BASE = "https://spideriq.di-atomic.com/api/v1"
AUTH_TOKEN = "<your_token>"
headers = {"Authorization": f"Bearer {AUTH_TOKEN}"}

# Step 1: Define your target queries
queries = [
    ("Marketing agency Berlin, Germany", "de"),
    ("Marketing agency Munich, Germany", "de"),
    ("Marketing agency Hamburg, Germany", "de"),
    ("Marketing agency Frankfurt, Germany", "de"),
]

print(f"🎯 Target: {len(queries)} cities for marketing agency leads\n")

# Step 2: Submit all jobs
job_mapping = []

for query, lang in queries:
    data = {
        "payload": {
            "search_query": query,
            "max_results": 100,
            "lang": lang
        }
    }

    response = requests.post(
        f"{API_BASE}/jobs/spiderMaps/submit",
        headers=headers,
        json=data
    )

    job_id = response.json()['job_id']
    job_mapping.append((query, job_id))
    print(f"✓ Submitted: {query} (Job ID: {job_id})")

print(f"\n⏳ Waiting for {len(job_mapping)} jobs to complete...\n")

# Step 3: Poll for results (parallel)
def get_job_results(query_and_job):
    query, job_id = query_and_job
    max_wait = 120
    start_time = time.time()

    while time.time() - start_time < max_wait:
        response = requests.get(
            f"{API_BASE}/jobs/{job_id}/results",
            headers=headers
        )

        if response.status_code == 200:
            result = response.json()
            businesses = result['data']['businesses']
            print(f"✓ {query}: {len(businesses)} businesses retrieved")
            return (query, businesses)
        elif response.status_code == 202:
            time.sleep(3)
        else:
            print(f"✗ {query}: Error {response.status_code}")
            return (query, [])

    print(f"⏱️ {query}: Timeout")
    return (query, [])

# Fetch results in parallel (max 5 concurrent)
with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(get_job_results, job_mapping))

# Step 4: Process and export
all_leads = []

for query, businesses in results:
    city = query.split()[2].rstrip(',')

    for biz in businesses:
        lead = {
            'company_name': biz['name'],
            'city': city,
            'address': biz.get('address', ''),
            'phone': biz.get('phone', ''),
            'website': biz.get('website', ''),
            'rating': biz.get('rating', ''),
            'reviews': biz.get('reviews_count', 0),
            'categories': ', '.join(biz.get('categories', [])),
            'google_maps': biz['link']
        }
        all_leads.append(lead)

# Step 5: Export to CSV
output_file = 'marketing_agencies_germany.csv'

with open(output_file, 'w', newline='', encoding='utf-8') as f:
    if all_leads:
        writer = csv.DictWriter(f, fieldnames=all_leads[0].keys())
        writer.writeheader()
        writer.writerows(all_leads)

print("\n✅ Complete!")
print(f"📊 Total leads extracted: {len(all_leads)}")
print(f"💾 Exported to: {output_file}")
print(f"📈 Average leads per city: {len(all_leads) / len(queries):.0f}")
Processing Times
Typical processing times per query:
Basic query (max 20 results): 20-30 seconds
Standard query (max 100 results): 45-75 seconds
With reviews extraction: 60-100 seconds
With photos extraction: 50-90 seconds
With both reviews & photos: 80-120 seconds
Optimal Polling Interval
# Recommended polling strategy
time.sleep(5)   # First check after 5 seconds
time.sleep(5)   # Then check every 5 seconds

# For queries with reviews/photos, poll less frequently
time.sleep(10)  # Check every 10 seconds
Parallel Processing
Process multiple results concurrently:
from concurrent.futures import ThreadPoolExecutor

def fetch_result(job_id):
    # Your polling logic here
    return get_result(job_id)

# Process up to 10 jobs in parallel
with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(fetch_result, job_ids))
Next Steps