🏷️

Categories

Define which types of harmful content your moderation should detect

Categories are the types of harmful content Outharm can detect. You choose which categories matter to your platform, and only the categories you select are analyzed. This applies to both automated AI moderation and manual human review.

How Categories Work

Categories act as filters for what content gets flagged as harmful. Outharm tries to detect only the categories you've enabled; disabled categories are ignored entirely during moderation.

This selective approach means you can customize moderation to your platform's needs. A family-friendly app might enable all categories, while a mature gaming community might only flag extreme content like violence or harassment.

Category Selection

Enable/Disable Model

Categories use a simple on/off model. Each category is either enabled (it will be detected) or disabled (it will be ignored). There are no per-category thresholds or sensitivity levels; you only choose which categories your platform cares about.
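As a rough mental model, the on/off behavior can be pictured as a set of enabled category names. The sketch below is illustrative only: the `ENABLED` set and the `is_checked` and `flagged` helpers are hypothetical names, not part of Outharm, and the actual selection is made in the Console.

```python
# Hypothetical illustration of the on/off model; not Outharm's real configuration format.
ENABLED = frozenset({"sexual", "hate", "harassment"})


def is_checked(category: str, enabled: frozenset = ENABLED) -> bool:
    """A category is either enabled or not; there is no score or threshold."""
    return category in enabled


def flagged(detected: list, enabled: frozenset = ENABLED) -> list:
    """Keep only categories that are enabled, preserving the input order."""
    return [name for name in detected if name in enabled]


# "profanity" is not enabled, so it is dropped.
print(flagged(["hate", "profanity", "sexual"]))  # ['hate', 'sexual']
```

Membership in a set mirrors the absence of sensitivity levels: a category is either in the selection or it is not.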

Universal Application

Your category selection applies to both automated moderation (AI-powered instant decisions) and manual moderation (human review queue). This ensures consistent standards across your entire content moderation workflow.

Available Categories

Outharm provides a comprehensive set of categories covering both image and text content. Categories are organized by content type:

Mixed Categories (Text + Image)

🔞

Sexual

sexual

Text, Image

Content containing nudity, pornography, or sexually explicit material.

⚡

Fascism

fascism

Text, Image

Fascist symbols, Nazi propaganda, or authoritarian extremist political content.

🍺

Alcohol

alcohol

Text, Image

Content promoting alcohol consumption, drinking culture, or alcoholic beverages.

🚬

Smoking

smoking

Text, Image

Tobacco products, smoking, vaping, or content promoting substance use.

Image-Only Categories

🩸

Blood

blood

Image

Graphic content showing blood, injuries, gore, or disturbing medical imagery.

🔪

Knife

knife

Image

Images of knives, blades, sharp weapons, or cutting implements.

🔫

Gun

gun

Image

Firearms, weapons, ammunition, or gun-related promotional content.

🤲

Insulting Gesture

insulting_gesture

Image

Hand gestures, body language, or visual symbols intended to insult or offend.

Text-Only Categories

🩹

Self-Harm

self-harm

Text

Content promoting, depicting, or encouraging self-injury, suicide, or self-destructive behavior.

⚔️

Violence

violence

Text

Violent content, fighting, aggressive behavior, or depictions of physical harm.

💀

Hate

hate

Text

Hate speech, discrimination, slurs, or content targeting specific groups based on identity.

⚡

Threatening

threatening

Text

Direct threats, intimidation, menacing language, or content intended to frighten.

👊

Harassment

harassment

Text

Targeted abuse, personal attacks, stalking, or persistent unwanted communication.

👥

Bullying

bullying

Text

Systematic intimidation, abuse, exclusion, or repeated harmful behavior toward individuals.

💣

Terrorism

terrorism

Text

Content promoting, supporting, or glorifying terrorist activities, organizations, or attacks.

⚡

Extremism

extremism

Text

Radical ideologies, violent extremist content, or radicalization materials beyond fascism.

🚫

CSAM

csam

Text

Any reference to child sexual abuse material or exploitation of minors (zero tolerance policy).

💋

Sexual Solicitation

sexual_solicitation

Text

Requests, offers, or advertisements for sexual acts, services, or inappropriate contact.

🏥

Medical Misinformation

medical_misinformation

Text

False, misleading, or dangerous health claims, fake cures, or harmful medical advice.

🗳️

Political Misinformation

political_misinformation

Text

Deceptive content about elections, voting, political processes, or democratic institutions.

💰

Scams

scams

Text

Fraudulent schemes, fake offers, phishing attempts, or deceptive money-making promises.

💳

Fraud

fraud

Text

Financial deception, identity theft attempts, fake services, or fraudulent business practices.

📍

Doxxing

doxxing

Text

Sharing private personal information without consent, including addresses, phone numbers, or documents.

🤬

Profanity

profanity

Text

Vulgar language, obscene words, offensive expressions, or inappropriate verbal content.
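For client-side use, the internal names and content types listed above can be collected into a small lookup table. This is a sketch rather than an official SDK structure: the `CATEGORIES` dictionary and the `categories_for` helper are hypothetical, while the identifiers and their content types are copied from the list above.

```python
# Internal category name -> content types it applies to (taken from the list above).
# The dictionary layout and helper are illustrative, not part of Outharm.
TEXT, IMAGE = "text", "image"

CATEGORIES = {
    # Mixed categories (text + image)
    "sexual": {TEXT, IMAGE},
    "fascism": {TEXT, IMAGE},
    "alcohol": {TEXT, IMAGE},
    "smoking": {TEXT, IMAGE},
    # Image-only categories
    "blood": {IMAGE},
    "knife": {IMAGE},
    "gun": {IMAGE},
    "insulting_gesture": {IMAGE},
    # Text-only categories
    "self-harm": {TEXT},
    "violence": {TEXT},
    "hate": {TEXT},
    "threatening": {TEXT},
    "harassment": {TEXT},
    "bullying": {TEXT},
    "terrorism": {TEXT},
    "extremism": {TEXT},
    "csam": {TEXT},
    "sexual_solicitation": {TEXT},
    "medical_misinformation": {TEXT},
    "political_misinformation": {TEXT},
    "scams": {TEXT},
    "fraud": {TEXT},
    "doxxing": {TEXT},
    "profanity": {TEXT},
}


def categories_for(content_type: str) -> list:
    """Return the sorted internal names that apply to the given content type."""
    return sorted(name for name, types in CATEGORIES.items() if content_type in types)


print(categories_for(IMAGE))
```

A table like this makes it easy to check, for example, which categories are relevant to an image-only platform before toggling them in the Console.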

Configuring Categories

Category configuration is done through the Console interface:

1. Navigate to Categories: open your project in Console and go to Categories.
2. Filter by Type: use the All, Text, or Image filters to focus on specific content types.
3. Toggle Categories: click category cards to enable or disable them. Use bulk actions to select or deselect all.
4. Save Changes: click Save to apply your category selection to all future moderation.

API Response Format

When content matches your enabled categories, the API returns the detected categories using their internal names (the lowercase identifiers listed under each display name above). For example, if "Sexual" content is detected, it appears as sexual in the response.

Category Names in Responses

API responses always use the internal category names (like sexual, violence, medical_misinformation) rather than display names. This ensures consistent parsing and avoids issues with spacing or special characters.
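Because responses carry internal names, a client that shows results to moderators will usually map them back to display names. The sketch below assumes a response body with a `categories` array of internal names; that field name and the overall shape are assumptions made for illustration, so check the API reference for the real format. `DISPLAY_NAMES` and `display_labels` are hypothetical helpers.

```python
import json

# Internal name -> display name, for a few of the categories listed above.
DISPLAY_NAMES = {
    "sexual": "Sexual",
    "violence": "Violence",
    "medical_misinformation": "Medical Misinformation",
}


def display_labels(raw_response: str) -> list:
    """Translate internal category names into display names.

    Assumes a JSON body like {"categories": ["sexual", ...]}; the
    "categories" field is an assumption, not a documented field.
    Unknown names are passed through unchanged.
    """
    payload = json.loads(raw_response)
    detected = payload.get("categories", [])
    return [DISPLAY_NAMES.get(name, name) for name in detected]


sample = '{"categories": ["sexual", "medical_misinformation"]}'
print(display_labels(sample))  # ['Sexual', 'Medical Misinformation']
```

Falling back to the internal name for unknown entries keeps the client working if new categories are introduced later.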

Best Practices

✅

Choose Relevant Categories

Only enable categories that matter to your platform. A professional network might focus on harassment and misinformation, while a gaming platform might prioritize violence and hate speech.

✅

Start Broad, Then Narrow

Begin with more categories enabled and disable ones that don't fit your content. It's easier to reduce false positives than to catch missed harmful content.

✅

Consider Content Types

If your platform is text-only, you can safely ignore image categories. If it's image-focused, prioritize visual categories over text-based ones.

✅

Review Regularly

Periodically review your category selection as your platform evolves. New features or user behaviors might require enabling additional categories.

Related Documentation

• Quick Start Guide - Get up and running quickly
• Schemas & Components - Structure your content for moderation
• Manual Moderation - Human review workflows
• Console Walkthrough - Learn to use the management interface