How to Validate Emails in Python: API Tutorial with Code
Complete Python tutorial for email validation. Working code examples using API integration, batch processing, and error handling.
Published February 1, 2026
You have a CSV with 10,000 email addresses. Your boss wants to know which ones are valid before sending a campaign. Regex isn't enough. You need real validation.
This tutorial shows you how to build a Python email validator that checks syntax, domain validity, SMTP responses, and disposable email detection.
We'll start simple and add complexity. By the end, you'll have production-ready code.
What We're Building
A Python script that:
- Validates single emails via API
- Processes CSV files in batches
- Handles rate limits and errors gracefully
- Exports results with detailed status codes
- Tracks validation costs
Time to complete: 20-30 minutes
Requirements: Python 3.7+, requests library, API key
Setup
Install dependencies:
pip install requests pandas python-dotenv
Create a .env file for your API key:
EMAIL_VALIDATION_API_KEY=your_key_here
Never hardcode API keys. Use environment variables. Before diving into code, make sure you understand the difference between email validation and verification.
Method 1: Simple Single Email Validation
Let's start with the basics. Validate one email.
import requests
import os
from dotenv import load_dotenv

load_dotenv()

def validate_email(email):
    """Validate a single email address."""
    api_key = os.getenv('EMAIL_VALIDATION_API_KEY')
    url = 'https://api.emails-wipes.com/v1/validate'
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json'
    }
    payload = {'email': email}

    try:
        response = requests.post(url, json=payload, headers=headers, timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException as e:
        return {'error': str(e), 'email': email}

# Test it
result = validate_email('[email protected]')
print(result)
This returns a JSON response:
{
  "email": "[email protected]",
  "status": "valid",
  "is_disposable": false,
  "is_role_based": false,
  "domain": "gmail.com",
  "smtp_check": "ok",
  "mx_records": true
}
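If you only want addresses that are safe to send to, you can branch on these fields directly. A small sketch, using the field names from the sample response above (your provider's exact schema may differ):

```python
def is_safe_to_send(result):
    """Keep only deliverable, personal addresses, based on the response fields above."""
    return (
        result.get('status') == 'valid'
        and not result.get('is_disposable')
        and not result.get('is_role_based')
    )
```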
Simple. But not production-ready. Let's improve it.
Method 2: Batch Processing with CSV
You rarely validate one email. You validate thousands. Let's process a CSV. For best practices on this, see our guide on how to validate emails in bulk.
import pandas as pd
import time

def validate_csv(input_file, output_file):
    """Validate emails from CSV and export results."""
    # Read input CSV
    df = pd.read_csv(input_file)

    # Assuming email column is named 'email'
    if 'email' not in df.columns:
        raise ValueError("CSV must have an 'email' column")

    results = []

    for index, row in df.iterrows():
        email = row['email']

        # Validate
        result = validate_email(email)

        # Add result to list
        results.append({
            'email': email,
            'status': result.get('status', 'error'),
            'is_disposable': result.get('is_disposable', None),
            'is_role_based': result.get('is_role_based', None),
            'error': result.get('error', None)
        })

        # Rate limiting: sleep 0.1 seconds between requests
        time.sleep(0.1)

        # Progress indicator
        if (index + 1) % 100 == 0:
            print(f"Processed {index + 1}/{len(df)} emails")

    # Create results DataFrame
    results_df = pd.DataFrame(results)

    # Merge with original data
    output_df = pd.merge(df, results_df, on='email', how='left')

    # Export
    output_df.to_csv(output_file, index=False)

    print("\nValidation complete!")
    print(f"Results saved to {output_file}")

    # Summary stats
    print("\nSummary:")
    print(results_df['status'].value_counts())

# Usage
validate_csv('emails.csv', 'validated_emails.csv')
This works but has problems:
- Slow for large lists (one API call per email)
- No retry logic if API fails
- Doesn't handle duplicate emails
- Basic rate limiting
Let's fix these.
Method 3: Production-Ready Validator
This version includes error handling, retries, deduplication, and proper rate limiting.
import requests
import pandas as pd
import time
import os
from dotenv import load_dotenv
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

load_dotenv()

class EmailValidator:
    def __init__(self, api_key=None):
        self.api_key = api_key or os.getenv('EMAIL_VALIDATION_API_KEY')
        self.base_url = 'https://api.emails-wipes.com/v1'
        self.session = self._create_session()
        self.validation_count = 0

    def _create_session(self):
        """Create requests session with retry logic."""
        session = requests.Session()
        retry_strategy = Retry(
            total=3,
            backoff_factor=1,
            status_forcelist=[429, 500, 502, 503, 504]
        )
        adapter = HTTPAdapter(max_retries=retry_strategy)
        session.mount("http://", adapter)
        session.mount("https://", adapter)
        return session

    def validate_single(self, email):
        """Validate a single email address."""
        headers = {
            'Authorization': f'Bearer {self.api_key}',
            'Content-Type': 'application/json'
        }
        payload = {'email': email.strip().lower()}

        try:
            response = self.session.post(
                f'{self.base_url}/validate',
                json=payload,
                headers=headers,
                timeout=10
            )
            response.raise_for_status()
            self.validation_count += 1
            return response.json()
        except requests.exceptions.HTTPError as e:
            status_code = e.response.status_code
            if status_code == 429:
                return {'error': 'rate_limit', 'email': email}
            return {'error': f'http_{status_code}', 'email': email}
        except requests.exceptions.RequestException:
            return {'error': 'network_error', 'email': email}

    def validate_batch(self, emails, rate_limit=10):
        """
        Validate multiple emails with rate limiting.

        Args:
            emails: List of email addresses
            rate_limit: Max requests per second (default: 10)
        """
        results = []
        delay = 1.0 / rate_limit  # Time between requests

        for i, email in enumerate(emails):
            result = self.validate_single(email)
            results.append(result)

            # Progress
            if (i + 1) % 100 == 0:
                print(f"Validated {i + 1}/{len(emails)} emails")

            # Rate limiting
            time.sleep(delay)

        return results

    def validate_csv(self, input_file, output_file, email_column='email'):
        """
        Validate emails from CSV file.

        Args:
            input_file: Path to input CSV
            output_file: Path to output CSV
            email_column: Name of email column (default: 'email')
        """
        print(f"Reading {input_file}...")
        df = pd.read_csv(input_file)

        if email_column not in df.columns:
            raise ValueError(f"Column '{email_column}' not found in CSV")

        # Normalize up front so the merge below matches the API's
        # lowercased email field
        df[email_column] = df[email_column].str.strip().str.lower()

        # Deduplicate emails
        original_count = len(df)
        df_dedup = df.drop_duplicates(subset=[email_column])
        dedup_count = len(df_dedup)

        if original_count != dedup_count:
            print(f"Removed {original_count - dedup_count} duplicate emails")

        # Extract unique emails
        emails = df_dedup[email_column].tolist()
        print(f"Validating {len(emails)} unique emails...")

        # Validate
        results = self.validate_batch(emails)

        # Create results DataFrame
        results_df = pd.DataFrame(results)

        # Merge with original data (deduplicated)
        output_df = pd.merge(
            df_dedup,
            results_df,
            left_on=email_column,
            right_on='email',
            how='left'
        )

        # Export
        output_df.to_csv(output_file, index=False)

        # Statistics
        self._print_summary(results_df)
        print(f"\nResults saved to: {output_file}")
        print(f"Total validations performed: {self.validation_count}")

        return output_df

    def _print_summary(self, results_df):
        """Print validation summary statistics."""
        print("\n" + "=" * 50)
        print("VALIDATION SUMMARY")
        print("=" * 50)

        # Status breakdown
        print("\nStatus Breakdown:")
        print(results_df['status'].value_counts())

        # Calculate percentages
        total = len(results_df)
        valid = len(results_df[results_df['status'] == 'valid'])
        invalid = len(results_df[results_df['status'] == 'invalid'])

        print(f"\nValid: {valid}/{total} ({valid/total*100:.1f}%)")
        print(f"Invalid: {invalid}/{total} ({invalid/total*100:.1f}%)")

        # Disposable emails
        if 'is_disposable' in results_df.columns:
            disposable = results_df['is_disposable'].sum()
            print(f"Disposable: {disposable} ({disposable/total*100:.1f}%)")

        # Role-based emails (learn more: /blog/role-based-email-addresses-guide.html)
        if 'is_role_based' in results_df.columns:
            role_based = results_df['is_role_based'].sum()
            print(f"Role-based: {role_based} ({role_based/total*100:.1f}%)")

# Usage example
if __name__ == "__main__":
    validator = EmailValidator()

    # Validate single email
    result = validator.validate_single('[email protected]')
    print(result)

    # Validate CSV
    validator.validate_csv(
        input_file='contacts.csv',
        output_file='contacts_validated.csv',
        email_column='email'
    )
This is production-ready. It handles:
- Retries: Automatic retry on network errors or rate limits
- Deduplication: Doesn't waste API calls on duplicate emails
- Rate limiting: Configurable requests per second
- Progress tracking: Shows validation progress
- Error handling: Gracefully handles API failures
- Summary stats: Shows results breakdown
Advanced: Filtering and Segmentation
Once validated, you can filter by criteria:
def filter_results(input_file, output_file, criteria):
    """
    Filter validated emails by criteria.

    Args:
        input_file: Validated CSV file
        output_file: Filtered output file
        criteria: Dict of filtering rules
    """
    df = pd.read_csv(input_file)

    # Example filters
    filtered = df.copy()

    if criteria.get('valid_only'):
        filtered = filtered[filtered['status'] == 'valid']

    if criteria.get('exclude_disposable'):
        filtered = filtered[filtered['is_disposable'] == False]

    if criteria.get('exclude_role_based'):
        filtered = filtered[filtered['is_role_based'] == False]

    filtered.to_csv(output_file, index=False)
    print(f"Filtered {len(filtered)}/{len(df)} emails")
    return filtered

# Usage
filter_results(
    'contacts_validated.csv',
    'contacts_clean.csv',
    criteria={
        'valid_only': True,
        'exclude_disposable': True,
        'exclude_role_based': False
    }
)
Handling Large Files (100K+ Emails)
For very large files, process in chunks to avoid memory issues. If you're also validating email formats with regex before API calls, check out our email regex patterns guide.
def validate_large_csv(input_file, output_file, chunk_size=1000):
    """Validate large CSV files in chunks."""
    validator = EmailValidator()

    # Process in chunks
    chunks = pd.read_csv(input_file, chunksize=chunk_size)
    first_chunk = True

    for i, chunk in enumerate(chunks):
        print(f"\nProcessing chunk {i+1}...")

        emails = chunk['email'].tolist()
        results = validator.validate_batch(emails)
        results_df = pd.DataFrame(results)

        output_chunk = pd.merge(chunk, results_df, on='email', how='left')

        # Write to file (append after first chunk)
        mode = 'w' if first_chunk else 'a'
        header = first_chunk
        output_chunk.to_csv(output_file, mode=mode, header=header, index=False)

        first_chunk = False

    print(f"\nValidation complete! Results in {output_file}")
Cost Tracking
Track validation costs in real-time:
class CostTracker:
    def __init__(self, cost_per_validation=0.005):
        self.cost_per_validation = cost_per_validation
        self.total_validations = 0

    def add_validations(self, count):
        self.total_validations += count

    def get_total_cost(self):
        return self.total_validations * self.cost_per_validation

    def print_summary(self):
        print("\nCost Summary:")
        print(f"Total validations: {self.total_validations:,}")
        print(f"Cost per validation: ${self.cost_per_validation}")
        print(f"Total cost: ${self.get_total_cost():.2f}")

# Integrate with validator
tracker = CostTracker(cost_per_validation=0.005)
validator = EmailValidator()

results = validator.validate_batch(emails)
tracker.add_validations(len(results))
tracker.print_summary()
Error Handling Best Practices
Things will go wrong. Handle them gracefully:
def safe_validate(email, retries=3):
    """Validate with comprehensive error handling and a bounded retry count."""
    try:
        result = validator.validate_single(email)

        # Check for API errors
        if 'error' in result:
            error_type = result['error']

            if error_type == 'rate_limit' and retries > 0:
                print("Rate limit hit. Waiting 60 seconds...")
                time.sleep(60)
                return safe_validate(email, retries - 1)  # Retry
            elif error_type.startswith('http_'):
                print(f"HTTP error for {email}: {error_type}")

            return {'email': email, 'status': 'error', 'error': error_type}

        return result
    except Exception as e:
        print(f"Unexpected error validating {email}: {str(e)}")
        return {'email': email, 'status': 'error', 'error': 'unexpected'}
Testing Your Integration
Before running on production data, test with these emails:
test_emails = [
    '[email protected]',              # Should pass
    '[email protected]',  # Should fail (domain doesn't exist)
    '[email protected]',        # Disposable email
    '[email protected]',           # Role-based email
    'not-an-email',                # Syntax error
    'user@',                       # Incomplete
]

for email in test_emails:
    result = validator.validate_single(email)
    print(f"{email:30} → {result.get('status')}")
Expected output:
[email protected] โ valid
[email protected] โ invalid
[email protected] โ valid (but is_disposable=true)
[email protected] โ valid (but is_role_based=true)
not-an-email โ invalid
user@ โ invalid
Common Gotchas
Gotcha #1: Forgetting to strip whitespace
Always email.strip() before validation. Leading/trailing spaces cause syntax errors.
Gotcha #2: Case sensitivity
Normalize to lowercase: email.lower(). Emails are case-insensitive but different cases can cause duplicate issues.
Gotcha #3: Not handling rate limits
APIs have limits. Implement exponential backoff when you hit 429 errors.
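The `Retry` adapter in Method 3 handles this automatically; for code that calls the API directly, here is a manual sketch of the same idea (the delay values are illustrative):

```python
import time

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def validate_with_backoff(validate_fn, email, max_attempts=5):
    """Call `validate_fn` and retry with growing delays while it reports a rate limit."""
    for attempt in range(max_attempts):
        result = validate_fn(email)
        if result.get('error') != 'rate_limit':
            return result
        time.sleep(backoff_delay(attempt))
    return {'email': email, 'status': 'error', 'error': 'rate_limit'}
```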
Gotcha #4: Ignoring catch-all domains
Some domains accept all addresses. These return "unknown" status. Decide how to handle them (send with caution or skip).
Gotcha #5: Validating the same email twice
Deduplicate before validating to save API calls and money.
Next Steps
You now have a production-ready email validator in Python. Some enhancements to consider:
- Database integration: Store validation results in PostgreSQL or MySQL
- Scheduling: Use cron or celery to validate lists automatically
- Web interface: Build a Flask/Django front-end for non-technical users
- Batch API: Some validation services offer bulk endpoints that are faster
- Caching: Cache results for emails you've already validated
The complete code from this tutorial is available as a GitHub Gist (link in comments).
Now go validate some emails!
Get Your API Key
Sign up for free and get 100 daily validations. No credit card required.