Content Moderation with Python & FastAPI

This example demonstrates a complete content moderation pipeline using FastAPI and SQLAlchemy: users create posts, others report them, and after 3 reports the content automatically goes to Outharm's AI moderator. If it passes but gets 6 reports total, it escalates to human review.

You'll learn how to integrate both automated and manual moderation, handle webhook callbacks securely, and manage the complete moderation lifecycle from report thresholds to final decisions.

⚠️ Prerequisites

You'll need an Outharm account with an API token and a moderation schema configured, a running PostgreSQL database, and basic Python/FastAPI knowledge.

Project Setup

The highlighted requirements.txt contains moderation-specific dependencies. FastAPI provides the web framework, SQLAlchemy handles database operations, and httpx makes HTTP requests to Outharm's API.

uvicorn serves the FastAPI application with async support for handling moderation requests efficiently. psycopg2-binary connects to PostgreSQL, where we'll store posts and track moderation status.

🚨 Security Notice

The python-dotenv dependency manages environment variables securely. Never commit credentials to version control - always use environment variables for API tokens and database connections.

Run pip install -r requirements.txt to get these dependencies, then we'll configure the moderation API credentials.

Moderation API Configuration

The highlighted .env file contains the credentials that connect your app to Outharm's moderation API. Each variable serves a specific purpose in the moderation workflow.

OUTHARM_API_URL points to the moderation endpoints. OUTHARM_TOKEN authenticates your API requests. OUTHARM_SCHEMA_ID tells the AI what content fields to analyze (in our case: title and content).

🔒 Critical: Webhook Security

OUTHARM_WEBHOOK_SECRET is your defense against fake moderation results. Without proper validation, attackers could delete legitimate content by sending fake "harmful" webhooks, or bypass moderation entirely with fake "safe" results.

Get your API token from Outharm Console → Access Tokens. Create a schema defining your content structure (we'll send title and content fields). Configure webhooks in Console → Manual with your server URL.

The database connection string should point to your PostgreSQL instance. We'll use this for storing posts, reports, and tracking moderation status.
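As a minimal sketch, loading these settings with python-dotenv might look like this (variable names match the ones above; treating OUTHARM_SCHEMA_ID as optional is an assumption, since the moderation function checks for it before running):

import os
from dotenv import load_dotenv

load_dotenv()  # read .env from the project root into the environment

OUTHARM_API_URL = os.environ["OUTHARM_API_URL"]  # base URL from your Outharm console
OUTHARM_TOKEN = os.environ["OUTHARM_TOKEN"]  # API token; never commit this
OUTHARM_SCHEMA_ID = os.getenv("OUTHARM_SCHEMA_ID")  # checked before sending to moderation
OUTHARM_WEBHOOK_SECRET = os.environ["OUTHARM_WEBHOOK_SECRET"]  # verifies webhook callbacks
DATABASE_URL = os.environ["DATABASE_URL"]  # e.g. postgresql://user:pass@localhost/dbname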

Database Models for Moderation

The highlighted SQLAlchemy models contain two classes: Post and Report. The key moderation fields are submission_id (links to Outharm) and manual_moderation_status (tracks human review state).

Post Model - Moderation Fields

submission_id stores the ID returned by Outharm's API. This is crucial for two things: correlating webhook callbacks back to the right post, and escalating content to manual review.

manual_moderation_status tracks human review state:

  • "moderating" - Currently under human review
  • "harmful" - Confirmed harmful by human
  • "not-harmful" - Approved by human
  • null - No manual review needed
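A sketch of the Post model with these fields, assuming a standard declarative Base (your actual column types may differ):

from sqlalchemy import Column, Integer, String, Text
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Post(Base):
    __tablename__ = "posts"

    id = Column(Integer, primary_key=True)
    title = Column(String(200), nullable=False)
    content = Column(Text, nullable=False)
    author_id = Column(Integer, nullable=False)
    # Set when the post is sent to Outharm; correlates webhooks back to this row
    submission_id = Column(String, nullable=True)
    # "moderating" | "harmful" | "not-harmful" | None
    manual_moderation_status = Column(String, nullable=True)
    # Deleting a post also deletes its reports (see the Report model below)
    reports = relationship("Report", cascade="all, delete-orphan", backref="post")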

Report Model - Threshold Logic

The UniqueConstraint('post_id', 'reported_by') prevents duplicate reports from the same user. This ensures accurate threshold counts for triggering moderation.
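The matching Report sketch, reusing the Base from above:

from sqlalchemy import Column, ForeignKey, Integer, UniqueConstraint

class Report(Base):
    __tablename__ = "reports"
    # One report per (post, user) pair keeps threshold counts honest
    __table_args__ = (UniqueConstraint("post_id", "reported_by"),)

    id = Column(Integer, primary_key=True)
    post_id = Column(Integer, ForeignKey("posts.id"), nullable=False)
    reported_by = Column(Integer, nullable=False)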

We use SQLAlchemy's db.query(Report).filter().count() to check thresholds: 3 reports triggers AI moderation, 6 reports escalates to humans.

💡 Why Count Queries?

We count reports dynamically instead of storing a counter. This prevents race conditions when multiple users report the same post simultaneously and keeps individual reports for audit purposes.

After creating these models, run alembic init alembic and alembic revision --autogenerate to generate your database migrations, then alembic upgrade head to apply them.

FastAPI Application Setup

The highlighted code shows the FastAPI initialization with the essential imports: FastAPI for the web framework, SQLAlchemy for database operations, and httpx for HTTP requests to Outharm.

We create the database engine and tables automatically on startup. The dependency injection pattern with Depends(get_db) provides database sessions for our moderation endpoints.

Pydantic models validate request data, ensuring our moderation endpoints receive properly structured post and report data.
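Assuming the models and DATABASE_URL from the earlier sketches, the setup might look roughly like this (PostCreate and ReportCreate are illustrative names):

from fastapi import Depends, FastAPI
from pydantic import BaseModel
from sqlalchemy import create_engine
from sqlalchemy.orm import Session, sessionmaker

engine = create_engine(DATABASE_URL)
Base.metadata.create_all(bind=engine)  # create tables on startup
SessionLocal = sessionmaker(bind=engine, autoflush=False)

app = FastAPI()

def get_db():
    # One session per request, always closed afterwards
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

class PostCreate(BaseModel):
    title: str
    content: str
    author_id: int

class ReportCreate(BaseModel):
    reported_by: int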

Post Creation

The highlighted endpoint creates posts with title, content, and author_id. These are the fields we'll later send to Outharm for moderation analysis.

Notice that submission_id and manual_moderation_status start as None. They get populated when the post enters moderation (after receiving reports).

Posts aren't moderated when created - only when they receive their third report. This keeps posting fast while still catching harmful content through community reporting.
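A minimal sketch of the endpoint, using the PostCreate model and get_db dependency from above:

@app.post("/posts")
def create_post(payload: PostCreate, db: Session = Depends(get_db)):
    post = Post(
        title=payload.title,
        content=payload.content,
        author_id=payload.author_id,
        # submission_id and manual_moderation_status default to None
    )
    db.add(post)
    db.commit()
    db.refresh(post)
    return {"id": post.id, "title": post.title}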

Report System & Moderation Triggers

The highlighted code shows the core moderation logic. When someone reports a post, we create a report record and count total reports for that post using db.query(Report).filter().count().

Threshold Logic

3 reports: Calls send_to_moderation(post) to send content to Outharm's AI moderator.

6 reports: If the post has a submission_id (meaning it passed AI moderation), escalates to human review with send_to_manual_moderation().

Duplicate Prevention

The unique constraint prevents the same user from reporting a post multiple times. When it's violated, SQLAlchemy raises an IntegrityError, which we catch and turn into a user-friendly message.

💡 Why This Flow Works

Content gets progressively more scrutiny as reports accumulate. AI catches obvious violations quickly, while edge cases that generate more complaints get human attention.
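Putting those pieces together, the report endpoint might look like this sketch (send_to_moderation and send_to_manual_moderation are covered in the next sections; here the database session is passed along explicitly):

from fastapi import HTTPException
from sqlalchemy.exc import IntegrityError

@app.post("/posts/{post_id}/report")
async def report_post(post_id: int, payload: ReportCreate, db: Session = Depends(get_db)):
    post = db.query(Post).filter(Post.id == post_id).first()
    if post is None:
        raise HTTPException(status_code=404, detail="Post not found")

    db.add(Report(post_id=post_id, reported_by=payload.reported_by))
    try:
        db.commit()
    except IntegrityError:
        db.rollback()  # unique constraint hit: this user already reported this post
        raise HTTPException(status_code=409, detail="You already reported this post")

    report_count = db.query(Report).filter(Report.post_id == post_id).count()
    if report_count == 3:
        await send_to_moderation(post, db)  # AI moderation threshold
    elif report_count == 6 and post.submission_id:
        await send_to_manual_moderation(post, db)  # human review threshold
    return {"reports": report_count}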

Automated Moderation with Outharm

The highlighted send_to_moderation function handles AI moderation. Let's break down each part:

Schema Validation

First, we check if OUTHARM_SCHEMA_ID exists. Without it, we can't tell Outharm what content fields to analyze. The function logs a warning and returns early if missing.

Async HTTP Request

Using httpx.AsyncClient(), we make non-blocking HTTP requests to Outharm's API. The moderation payload maps our post data to Outharm's expected format:

  • schema_id - Which moderation schema to use
  • content.title - Array containing the post title
  • content.content - Array containing the post body

Critical: Submission ID Storage

We store result["submission_id"] in our database. This ID is essential for two things: escalating to manual review later and correlating webhook callbacks back to the right post.

Immediate Action on Harmful Content

If result["is_harmful"] is true, we delete the post immediately using db.delete(). No waiting for webhooks - harmful content gets removed fast to limit exposure.

⚡ Speed is Critical

Automated moderation gives instant results. Harmful content is deleted within seconds of the API call, minimizing the time it's visible to users.
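A sketch of send_to_moderation under the configuration above. The /submissions path and bearer-token auth are assumptions; check the exact request and response shapes against Outharm's API reference:

import httpx

async def send_to_moderation(post: Post, db: Session):
    if not OUTHARM_SCHEMA_ID:
        print("OUTHARM_SCHEMA_ID not set; skipping moderation")
        return

    payload = {
        "schema_id": OUTHARM_SCHEMA_ID,
        "content": {"title": [post.title], "content": [post.content]},
    }
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{OUTHARM_API_URL}/submissions",  # assumed endpoint path
            json=payload,
            headers={"Authorization": f"Bearer {OUTHARM_TOKEN}"},  # assumed auth scheme
        )
    response.raise_for_status()
    result = response.json()

    post.submission_id = result["submission_id"]
    if result["is_harmful"]:
        db.delete(post)  # remove harmful content immediately, before any webhook
    db.commit()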

Manual Escalation

The highlighted send_to_manual_moderation function escalates content to human reviewers. This happens when content passes AI moderation but gets 6 reports total.

Using Existing Submission

Instead of creating a new submission, we escalate the existing one using POST /submissions/[submissionId]/manual. This is more efficient, and it lets human reviewers see the AI's decision and context.

Async HTTP Handling

Using httpx.AsyncClient() ensures the escalation request doesn't block other API operations. The function handles errors gracefully and logs the escalation status.

Human review results come back via webhooks, usually within 24-48 hours. The webhook handler processes these results and takes final action.
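A sketch of the escalation call, reusing the stored submission_id (the endpoint path comes from the section above; error handling is kept minimal):

async def send_to_manual_moderation(post: Post, db: Session):
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{OUTHARM_API_URL}/submissions/{post.submission_id}/manual",
            headers={"Authorization": f"Bearer {OUTHARM_TOKEN}"},  # assumed auth scheme
        )
    if response.status_code == 200:
        post.manual_moderation_status = "moderating"  # human review in progress
        db.commit()
    else:
        print(f"Escalation failed for post {post.id}: HTTP {response.status_code}")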

Webhook Security

The highlighted webhook endpoint receives moderation results from Outharm. Security validation happens before any database operations to prevent fake moderation results.

Signature Verification

We check the webhook signature against OUTHARM_WEBHOOK_SECRET. If they don't match, we return HTTPException(401) and ignore the request.

🚨 Why Security Matters

Without validation, attackers could send fake webhooks to delete legitimate posts with forged "harmful" results or bypass moderation by marking harmful content as "safe". This simple check prevents most attacks.

Event Filtering

We only process moderation.manual.completed events. This prevents errors from unexpected event types and ensures our handler only runs when human review is finished.

The data object contains the moderation result, including submission_id and is_harmful, which we'll use to take action on the right post.
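A sketch of the webhook endpoint. The header name and HMAC-SHA256 scheme below are assumptions for illustration; use whatever signature format Outharm's webhook documentation specifies:

import hashlib
import hmac
import json
from fastapi import Request

@app.post("/webhooks/outharm")
async def outharm_webhook(request: Request, db: Session = Depends(get_db)):
    body = await request.body()
    # Assumed scheme: hex-encoded HMAC-SHA256 of the raw body in a signature header
    signature = request.headers.get("X-Outharm-Signature", "")
    expected = hmac.new(OUTHARM_WEBHOOK_SECRET.encode(), body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise HTTPException(status_code=401, detail="Invalid signature")

    event = json.loads(body)
    if event.get("type") != "moderation.manual.completed":
        return {"status": "ignored"}  # only act on finished human reviews

    handle_manual_moderation_result(event["data"], db)
    return {"status": "ok"}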

Processing Moderation Results

The highlighted handle_manual_moderation_result function processes human review results. It uses the submission ID to find the right post and takes action based on the verdict.

Finding the Post

db.query(Post).filter().first() finds the post that matches the webhook's submission ID. This links the moderation result back to the right content.

Handling Harmful Content

If data["is_harmful"] is true, we delete the post with db.delete(post). SQLAlchemy's cascade delete automatically removes associated reports too.

Handling Safe Content

For safe content, we update manual_moderation_status to "not-harmful". This marks the post as human-approved and can help inform future moderation decisions.

💡 Why This Works

Storing submission IDs during initial moderation creates a reliable link between webhook events and your posts. No guesswork or complex matching needed.
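A sketch of the handler, following the steps above:

def handle_manual_moderation_result(data: dict, db: Session):
    post = db.query(Post).filter(Post.submission_id == data["submission_id"]).first()
    if post is None:
        return  # post may already be gone; nothing to do

    if data["is_harmful"]:
        db.delete(post)  # cascade removes its reports as well
    else:
        post.manual_moderation_status = "not-harmful"  # human-approved
    db.commit()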

Complete API & Testing

The highlighted code completes our API with a posts listing endpoint and server startup. The GET /posts endpoint includes report counts and moderation status for full pipeline visibility.

Posts Listing with Moderation Data

The query retrieves all posts and dynamically calculates report counts using the relationship. This shows the complete moderation state for each post.
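A sketch of the listing endpoint; here the report counts come from a count query per post, though len(post.reports) via the relationship would work too:

@app.get("/posts")
def list_posts(db: Session = Depends(get_db)):
    return [
        {
            "id": p.id,
            "title": p.title,
            "report_count": db.query(Report).filter(Report.post_id == p.id).count(),
            "manual_moderation_status": p.manual_moderation_status,
        }
        for p in db.query(Post).all()
    ]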

Testing Your Moderation System

Start your server with uvicorn main:app --reload and test the flow:

  1. Create posts via POST /posts
  2. Report the post from 3 different users to trigger AI moderation
  3. Check Outharm console for the submission
  4. Reach 6 reports total to trigger human review
  5. Use ngrok for webhook testing: ngrok http 8000

🎉 You're Done!

Your FastAPI moderation system handles reports intelligently, uses AI for fast decisions, escalates edge cases to humans, and prevents race conditions. It's ready for production scaling.

🚀 Ready to Get Started?

Congratulations! You've built a complete content moderation system with Python. To get the most out of Outharm, explore the related documentation for recommended next steps.

For reference, the complete requirements.txt:

fastapi==0.104.1
uvicorn[standard]==0.24.0
sqlalchemy==2.0.23
alembic==1.13.1
psycopg2-binary==2.9.9
pydantic==2.5.0
python-dotenv==1.0.0
httpx==0.25.2
python-multipart==0.0.6