Content Moderation with Python & FastAPI
This example demonstrates a complete content moderation pipeline using FastAPI and SQLAlchemy: users create posts, others report them, and after 3 reports the content automatically goes to Outharm's AI moderator. If it passes but gets 6 reports total, it escalates to human review.
You'll learn how to integrate both automated and manual moderation, handle webhook callbacks securely, and manage the complete moderation lifecycle from report thresholds to final decisions.
⚠️ Prerequisites
You'll need an Outharm account with an API token and a moderation schema configured, a PostgreSQL database, and basic Python/FastAPI knowledge.
Project Setup
The highlighted `requirements.txt` contains moderation-specific dependencies. FastAPI provides the web framework, SQLAlchemy handles database operations, and httpx makes HTTP requests to Outharm's API. `uvicorn` serves the FastAPI application with async support for handling moderation requests efficiently, and `psycopg2-binary` connects to PostgreSQL, where we'll store posts and track moderation status.
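For reference, a `requirements.txt` along these lines covers that stack (versions are unpinned here for brevity; pin them in a real project — `alembic` is included for the migrations we run later):

```text
fastapi
uvicorn
sqlalchemy
psycopg2-binary
httpx
python-dotenv
alembic
```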
🚨 Security Notice
The `python-dotenv` dependency manages environment variables securely. Never commit credentials to version control; always use environment variables for API tokens and database connections.
Run `pip install -r requirements.txt` to install these dependencies, then we'll configure the moderation API credentials.
Moderation API Configuration
The highlighted `.env` file contains the credentials that connect your app to Outharm's moderation API. Each variable serves a specific purpose in the moderation workflow: `OUTHARM_API_URL` points to the moderation endpoints, `OUTHARM_TOKEN` authenticates your API requests, and `OUTHARM_SCHEMA_ID` tells the AI which content fields to analyze (in our case: title and content).
🔒 Critical: Webhook Security
`OUTHARM_WEBHOOK_SECRET` is your defense against fake moderation results. Without proper validation, attackers could delete legitimate content by sending fake "harmful" webhooks or bypass moderation entirely with fake "safe" results.
Get your API token from Outharm Console → Access Tokens. Create a schema defining your content structure (we'll send title and content fields). Configure webhooks in Console → Manual with your server URL.
The database connection string should point to your PostgreSQL instance. We'll use this for storing posts, reports, and tracking moderation status.
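A minimal `.env` sketch with placeholder values (the exact API URL comes from your Outharm console; everything below is an assumption to be replaced):

```text
OUTHARM_API_URL=https://api.outharm.example/v1
OUTHARM_TOKEN=replace-with-your-api-token
OUTHARM_SCHEMA_ID=replace-with-your-schema-id
OUTHARM_WEBHOOK_SECRET=replace-with-your-webhook-secret
DATABASE_URL=postgresql://user:password@localhost:5432/moderation
```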
Database Models for Moderation
The highlighted SQLAlchemy models contain two classes: `Post` and `Report`. The key moderation fields are `submission_id` (links to Outharm) and `manual_moderation_status` (tracks human review state).
Post Model - Moderation Fields
`submission_id` stores the ID returned by Outharm's API. This is crucial for two things: correlating webhook callbacks back to the right post, and escalating content to manual review.
`manual_moderation_status` tracks human review state:
- `"moderating"` - currently under human review
- `"harmful"` - confirmed harmful by a human
- `"not-harmful"` - approved by a human
- `None` - no manual review needed
Report Model - Threshold Logic
The `UniqueConstraint('post_id', 'reported_by')` prevents duplicate reports from the same user. This ensures accurate threshold counts for triggering moderation.
We use SQLAlchemy's `db.query(Report).filter().count()` to check thresholds: 3 reports trigger AI moderation, 6 reports escalate to human review.
💡 Why Count Queries?
We count reports dynamically instead of storing a counter. This prevents race conditions when multiple users report the same post simultaneously and keeps individual reports for audit purposes.
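Here's a minimal sketch of the two models, assuming a single-file app and the field names used throughout this guide:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Text, UniqueConstraint
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Post(Base):
    __tablename__ = "posts"

    id = Column(Integer, primary_key=True)
    title = Column(String(200), nullable=False)
    content = Column(Text, nullable=False)
    author_id = Column(Integer, nullable=False)
    # Set once the post is sent to Outharm; correlates webhooks and escalations.
    submission_id = Column(String, nullable=True)
    # "moderating", "harmful", "not-harmful", or None (no manual review needed).
    manual_moderation_status = Column(String, nullable=True)

    # Cascade delete removes a post's reports along with the post.
    reports = relationship("Report", back_populates="post",
                           cascade="all, delete-orphan")

class Report(Base):
    __tablename__ = "reports"

    id = Column(Integer, primary_key=True)
    post_id = Column(Integer, ForeignKey("posts.id"), nullable=False)
    reported_by = Column(Integer, nullable=False)

    post = relationship("Post", back_populates="reports")

    # One report per user per post keeps threshold counts honest.
    __table_args__ = (UniqueConstraint("post_id", "reported_by"),)
```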
After creating these models, run `alembic init alembic` and `alembic revision --autogenerate` to set up your database migrations, then `alembic upgrade head` to apply them.
FastAPI Application Setup
The highlighted code shows the FastAPI initialization with the essential imports: FastAPI for the web framework, SQLAlchemy for database operations, and httpx for HTTP requests to Outharm.
We create the database engine and tables automatically on startup. The dependency injection pattern with `Depends(get_db)` provides database sessions for our moderation endpoints.
Pydantic models validate request data, ensuring our moderation endpoints receive properly structured post and report data.
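A sketch of that setup, assuming the models above live in the same `main.py`:

```python
import os

from dotenv import load_dotenv
from fastapi import Depends, FastAPI
from pydantic import BaseModel
from sqlalchemy import create_engine
from sqlalchemy.orm import Session, sessionmaker

load_dotenv()  # read credentials from .env into the environment

engine = create_engine(os.environ["DATABASE_URL"])
SessionLocal = sessionmaker(bind=engine, autoflush=False)
Base.metadata.create_all(bind=engine)  # create tables on startup

app = FastAPI()

def get_db():
    # One session per request, always closed afterwards.
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

class PostCreate(BaseModel):
    title: str
    content: str
    author_id: int

class ReportCreate(BaseModel):
    reported_by: int
```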
Post Creation
The highlighted endpoint creates posts with `title`, `content`, and `author_id`. These are the fields we'll later send to Outharm for moderation analysis.
Notice that `submission_id` and `manual_moderation_status` start as `None`. They get populated when the post enters moderation (after receiving reports).
Posts aren't moderated when created - only when they receive their third report. This keeps posting fast while still catching harmful content through community reporting.
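A sketch of the endpoint under those assumptions:

```python
@app.post("/posts")
def create_post(body: PostCreate, db: Session = Depends(get_db)):
    # submission_id and manual_moderation_status default to None;
    # they're only populated once the post enters moderation.
    post = Post(title=body.title, content=body.content, author_id=body.author_id)
    db.add(post)
    db.commit()
    db.refresh(post)
    return {"id": post.id, "title": post.title}
```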
Report System & Moderation Triggers
The highlighted code shows the core moderation logic. When someone reports a post, we create a report record and count total reports for that post using `db.query(Report).filter().count()`.
Threshold Logic
3 reports: calls `send_to_moderation(post)` to send content to Outharm's AI moderator.
6 reports: if the post has a `submission_id` (meaning it passed AI moderation), escalates to human review with `send_to_manual_moderation()`.
Duplicate Prevention
The unique constraint prevents the same user from reporting a post multiple times. When violated, SQLAlchemy raises an `IntegrityError`, which we catch so we can return a user-friendly message.
💡 Why This Flow Works
Content gets progressively more scrutiny as reports accumulate. AI catches obvious violations quickly, while edge cases that generate more complaints get human attention.
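Putting the thresholds and duplicate handling together, a sketch of the report endpoint (the two helper functions are defined in the next sections):

```python
from fastapi import HTTPException
from sqlalchemy.exc import IntegrityError

@app.post("/posts/{post_id}/report")
async def report_post(post_id: int, body: ReportCreate,
                      db: Session = Depends(get_db)):
    post = db.query(Post).filter(Post.id == post_id).first()
    if post is None:
        raise HTTPException(status_code=404, detail="Post not found")

    try:
        db.add(Report(post_id=post_id, reported_by=body.reported_by))
        db.commit()
    except IntegrityError:
        db.rollback()  # unique constraint: one report per user per post
        raise HTTPException(status_code=409,
                            detail="You already reported this post")

    # Count dynamically instead of keeping a racy stored counter.
    report_count = db.query(Report).filter(Report.post_id == post_id).count()

    if report_count == 3:
        await send_to_moderation(post, db)         # AI moderation
    elif report_count == 6 and post.submission_id:
        await send_to_manual_moderation(post, db)  # human review

    return {"reports": report_count}
```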
Automated Moderation with Outharm
The highlighted `send_to_moderation` function handles AI moderation. Let's break down each part:
Schema Validation
First, we check that `OUTHARM_SCHEMA_ID` exists. Without it, we can't tell Outharm which content fields to analyze. The function logs a warning and returns early if it's missing.
Async HTTP Request
Using `httpx.AsyncClient()`, we make non-blocking HTTP requests to Outharm's API. The moderation payload maps our post data to Outharm's expected format:
- `schema_id` - which moderation schema to use
- `content.title` - array containing the post title
- `content.content` - array containing the post body
Critical: Submission ID Storage
We store `result["submission_id"]` in our database. This ID is essential for two things: escalating to manual review later and correlating webhook callbacks back to the right post.
Immediate Action on Harmful Content
If `result["is_harmful"]` is true, we delete the post immediately using `db.delete()`. No waiting for webhooks: harmful content gets removed fast to limit exposure.
⚡ Speed is Critical
Automated moderation gives instant results. Harmful content is deleted within seconds of the API call, minimizing the time it's visible to users.
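A sketch of the function. The payload shape follows this guide, but the `/submissions` path and Bearer-token header are assumptions; check Outharm's API reference for the exact endpoint and auth scheme:

```python
import httpx

async def send_to_moderation(post: Post, db: Session):
    schema_id = os.getenv("OUTHARM_SCHEMA_ID")
    if not schema_id:
        print("Warning: OUTHARM_SCHEMA_ID not set, skipping moderation")
        return

    payload = {
        "schema_id": schema_id,
        # Field names match the schema configured in the Outharm console.
        "content": {"title": [post.title], "content": [post.content]},
    }
    headers = {"Authorization": f"Bearer {os.environ['OUTHARM_TOKEN']}"}

    async with httpx.AsyncClient() as client:
        resp = await client.post(f"{os.environ['OUTHARM_API_URL']}/submissions",
                                 json=payload, headers=headers)
        resp.raise_for_status()
        result = resp.json()

    # Store the submission ID: it links webhooks and escalations to this post.
    post.submission_id = result["submission_id"]

    if result["is_harmful"]:
        db.delete(post)  # cascade removes the reports too
    db.commit()
```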
Manual Escalation
The highlighted `send_to_manual_moderation` function escalates content to human reviewers. This happens when content passes AI moderation but gets 6 reports total.
Using Existing Submission
Instead of creating a new submission, we escalate the existing one using `POST /submissions/[submissionId]/manual`. This is more efficient because human reviewers can see the AI's decision and context.
Async HTTP Handling
Using `httpx.AsyncClient()` ensures the escalation request doesn't block other API operations. The function handles errors gracefully and logs the escalation status.
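A sketch using the escalation endpoint described above; marking the post as `"moderating"` while it awaits a verdict is our own bookkeeping, not something Outharm requires:

```python
async def send_to_manual_moderation(post: Post, db: Session):
    url = (f"{os.environ['OUTHARM_API_URL']}"
           f"/submissions/{post.submission_id}/manual")
    headers = {"Authorization": f"Bearer {os.environ['OUTHARM_TOKEN']}"}

    async with httpx.AsyncClient() as client:
        try:
            resp = await client.post(url, headers=headers)
            resp.raise_for_status()
        except httpx.HTTPError as exc:
            # Log and move on; the post stays visible until a human verdict.
            print(f"Escalation failed for post {post.id}: {exc}")
            return

    post.manual_moderation_status = "moderating"  # awaiting human review
    db.commit()
```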
Human review results come back via webhooks, usually within 24-48 hours. The webhook handler processes these results and takes final action.
Webhook Security
The highlighted webhook endpoint receives moderation results from Outharm. Security validation happens before any database operations to prevent fake moderation results.
Signature Verification
We check the webhook signature against `OUTHARM_WEBHOOK_SECRET`. If they don't match, we raise `HTTPException(401)` and ignore the request.
🚨 Why Security Matters
Without validation, attackers could send fake webhooks to delete legitimate posts with forged "harmful" results or bypass moderation by marking harmful content as "safe". This simple check prevents most attacks.
Event Filtering
We only process `moderation.manual.completed` events. This prevents errors from unexpected event types and ensures our handler only runs when human review is finished.
The `data` object contains the moderation result, including `submission_id` and `is_harmful`, which we'll use to take action on the right post.
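A sketch of the endpoint. The `X-Outharm-Signature` header name is an assumption, and we compare the value directly against the secret as described above; consult Outharm's webhook docs in case the signature is actually an HMAC of the request body. The handler itself is defined in the next section:

```python
import hmac
from fastapi import Request

@app.post("/webhooks/outharm")
async def outharm_webhook(request: Request, db: Session = Depends(get_db)):
    signature = request.headers.get("X-Outharm-Signature", "")
    # Constant-time comparison to avoid leaking the secret via timing.
    if not hmac.compare_digest(signature, os.environ["OUTHARM_WEBHOOK_SECRET"]):
        raise HTTPException(status_code=401, detail="Invalid signature")

    event = await request.json()
    if event.get("type") != "moderation.manual.completed":
        return {"ok": True}  # ignore event types we don't handle

    handle_manual_moderation_result(event["data"], db)
    return {"ok": True}
```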
Processing Moderation Results
The highlighted `handle_manual_moderation_result` function processes human review results. It uses the submission ID to find the right post and takes action based on the verdict.
Finding the Post
`db.query(Post).filter().first()` finds the post that matches the webhook's submission ID. This links the moderation result back to the right content.
Handling Harmful Content
If `data["is_harmful"]` is true, we delete the post with `db.delete(post)`. SQLAlchemy's cascade delete automatically removes associated reports too.
Handling Safe Content
For safe content, we update `manual_moderation_status` to `"not-harmful"`. This marks the post as human-approved and can help inform future moderation decisions.
💡 Why This Works
Storing submission IDs during initial moderation creates a reliable link between webhook events and your posts. No guesswork or complex matching needed.
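A sketch of the handler, matching the webhook payload fields used above:

```python
def handle_manual_moderation_result(data: dict, db: Session):
    post = db.query(Post).filter(
        Post.submission_id == data["submission_id"]).first()
    if post is None:
        return  # already deleted, or an unknown submission

    if data["is_harmful"]:
        db.delete(post)  # cascade delete removes associated reports
    else:
        post.manual_moderation_status = "not-harmful"  # human-approved
    db.commit()
```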
Complete API & Testing
The highlighted code completes our API with a posts listing endpoint and server startup. The `GET /posts` endpoint includes report counts and moderation status for full pipeline visibility.
Posts Listing with Moderation Data
The query retrieves all posts and dynamically calculates report counts using the relationship. This shows the complete moderation state for each post.
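A sketch of the listing endpoint and server entry point:

```python
@app.get("/posts")
def list_posts(db: Session = Depends(get_db)):
    posts = db.query(Post).all()
    return [
        {
            "id": p.id,
            "title": p.title,
            "report_count": len(p.reports),  # counted via the relationship
            "manual_moderation_status": p.manual_moderation_status,
        }
        for p in posts
    ]

if __name__ == "__main__":
    import uvicorn
    uvicorn.run("main:app", reload=True)
```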
Testing Your Moderation System
Start your server with `uvicorn main:app --reload` and test the flow:
- Create posts via `POST /posts`
- Report the same post 3 times to trigger AI moderation
- Check the Outharm console for the submission
- Report 6 times total to trigger human review
- Use ngrok for webhook testing: `ngrok http 8000`
🎉 You're Done!
Your FastAPI moderation system handles reports intelligently, uses AI for fast decisions, escalates edge cases to humans, and prevents race conditions. It's ready for production scaling.
🚀 Ready to Get Started?
Congratulations! You've built a complete content moderation system with Python. Here are some recommended next steps to get the most out of Outharm:
Related Documentation
- Schemas & Components - Structure content for better analysis
- Categories - Configure what content types to detect
- API Authentication - Secure your moderation endpoints
- Console Guide - Manage projects and review decisions