Forum & Community Moderation
Forums and community platforms have different moderation needs than real-time chat. Content is longer, more nuanced, and often stays visible indefinitely. Sieve’s General mode with forum_post context is designed for this.
Recommended Setup
For forums and community platforms, use General mode with the `forum_post` context:

```shell
curl -X POST https://api.getsieve.dev/v1/moderate/text \
  -H "Authorization: Bearer mod_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "The full text of the forum post...",
    "context": "forum_post",
    "username": "poster_username"
  }'
```

The `forum_post` context applies a 1.2x threshold multiplier, making moderation more lenient to account for the additional context and nuance in longer content. For comments on posts, use the `comment` context (1.0x, standard thresholds).
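To make the multiplier concrete: it scales each per-category threshold before scores are compared against it. A minimal sketch (the 1.2x factor is from above; clamping to 1.0 is an illustrative assumption, not documented API behavior):

```javascript
// Effective threshold after applying a context multiplier (clamped to 1.0).
function effectiveThreshold(base, multiplier) {
  return Math.min(base * multiplier, 1.0);
}

// For a base toxicity threshold of 0.7:
effectiveThreshold(0.7, 1.2); // forum_post context; roughly 0.84
effectiveThreshold(0.7, 1.0); // comment context; unchanged at 0.7
```

A higher effective threshold means a score must be more extreme before the same content is blocked, which is why longer-form posts are judged more leniently than chat messages.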
Pre-Publish vs. Post-Publish Moderation
Pre-Publish

Moderate content before it becomes visible. No violating content ever appears on your platform, but users experience a brief delay when posting.
```javascript
async function submitPost(postData) {
  // Moderate before publishing
  const modResult = await fetch('https://api.getsieve.dev/v1/moderate/text', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer mod_live_your_key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      text: postData.body,
      context: 'forum_post',
      username: postData.author,
    }),
  }).then(r => r.json());

  if (modResult.action === 'block') {
    return { error: 'Your post was blocked for violating community guidelines.' };
  }

  if (modResult.action === 'flag') {
    // Hold for human review
    await db.posts.create({ ...postData, status: 'pending_review', moderation: modResult });
    return { message: 'Your post has been submitted for review.' };
  }

  // Clean content, publish immediately
  await db.posts.create({ ...postData, status: 'published' });
  return { message: 'Post published!' };
}
```

Best for: platforms with younger audiences, regulated industries, or low tolerance for violating content ever appearing publicly.
Post-Publish

Publish content immediately and moderate asynchronously. Users get instant feedback, but violating content may be briefly visible.
```javascript
async function submitPost(postData) {
  // Publish immediately
  const post = await db.posts.create({ ...postData, status: 'published' });

  // Moderate asynchronously
  const modResult = await fetch('https://api.getsieve.dev/v1/moderate/text', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer mod_live_your_key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      text: postData.body,
      context: 'forum_post',
      username: postData.author,
    }),
  }).then(r => r.json());

  if (modResult.action === 'block') {
    await db.posts.update(post.id, { status: 'hidden' });
    await notifyUser(postData.author, 'Your post was removed for violating guidelines.');
  } else if (modResult.action === 'flag') {
    await db.posts.update(post.id, { status: 'under_review' });
    await moderationQueue.add(post.id, modResult);
  }
}
```

Best for: high-traffic platforms where latency matters and brief exposure to violations is acceptable.
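One caveat with the post-publish flow: if the moderation call fails, the post stays published and unchecked. A sketch of one mitigation, retrying with exponential backoff before falling back to a manual queue (the retry policy here is an assumption, not an SDK feature):

```javascript
// Retry an async operation with exponential backoff before giving up.
async function withRetry(fn, attempts = 3, baseDelayMs = 500) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      if (i === attempts - 1) throw err; // out of attempts; let the caller decide
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
}

// Usage: if moderation still fails after retries, route the post to human
// review instead of silently leaving it published.
// withRetry(() => moderatePost(post)).catch(() => moderationQueue.add(post.id));
```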
Handling Flagged Content
Flagged content (action: flag) is borderline — it might violate your guidelines, but the AI isn’t confident enough to block. You have several options:
- Queue flagged content for human moderators. The post stays hidden until a moderator approves or rejects it.
- Automatically hide flagged content but let the author see it. If no reports come in within a time window, auto-approve.
- Publish the content but display a community guidelines reminder to the author. Log the flag for trend analysis.
- Hold for review only for specific categories (e.g., always hold hate_speech flags, but publish toxicity flags with a warning).
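The second option (hide, then auto-approve after a quiet window) reduces to a small decision function plus a scheduled job. A sketch; the 15-minute window and the any-report-escalates rule are illustrative assumptions:

```javascript
// Decide a shadow-hidden post's fate when a scheduled job checks on it.
// Assumed policy: any user report escalates to human review; otherwise the
// post auto-approves once the review window has elapsed.
function resolveShadowHold(reportCount, elapsedMs, windowMs = 15 * 60 * 1000) {
  if (reportCount > 0) return 'under_review';
  if (elapsedMs >= windowMs) return 'published';
  return 'hidden'; // window still open, keep waiting
}
```

A cron job or delayed queue message would call this for each held post and update its status accordingly.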
```javascript
// Conditional handling based on flag category
function handleFlaggedPost(post, modResult) {
  const highRiskCategories = ['hate_speech', 'violence', 'self_harm', 'sexual'];

  if (highRiskCategories.includes(modResult.primary_category)) {
    // Hold high-risk flags for review
    return holdForReview(post, modResult);
  }

  // Publish low-risk flags with a warning
  return publishWithWarning(post, modResult);
}
```

Custom Thresholds per Category
Different forums need different sensitivity levels. A children’s educational forum needs stricter thresholds than an adults-only discussion board.
Lower thresholds catch more content. Recommended for platforms with users under 18.

```json
{
  "text": "post content",
  "context": "forum_post",
  "thresholds": {
    "toxicity": 0.5,
    "harassment": 0.5,
    "hate_speech": 0.5,
    "sexual": 0.4,
    "violence": 0.5,
    "self_harm": 0.3,
    "spam": 0.7
  }
}
```

Default thresholds work well for general-purpose forums and communities.
```json
{
  "text": "post content",
  "context": "forum_post",
  "thresholds": {
    "toxicity": 0.7,
    "harassment": 0.7,
    "hate_speech": 0.7,
    "sexual": 0.7,
    "violence": 0.7,
    "self_harm": 0.5,
    "spam": 0.8
  }
}
```

Higher thresholds suit platforms where mature content is expected. Sexual and violence thresholds are raised, but hate speech and self-harm stay strict.
```json
{
  "text": "post content",
  "context": "forum_post",
  "thresholds": {
    "toxicity": 0.85,
    "harassment": 0.8,
    "hate_speech": 0.7,
    "sexual": 0.9,
    "violence": 0.85,
    "self_harm": 0.5,
    "spam": 0.8
  }
}
```

Moderating Different Content Types
Use different contexts for different parts of your platform:
```javascript
// Forum post body
await moderate({ text: post.body, context: 'forum_post', username: post.author });

// Comment on a post
await moderate({ text: comment.body, context: 'comment', username: comment.author });

// Username registration
await moderate({ text: newUser.displayName, context: 'username' });
```

This ensures each content type gets the appropriate threshold multiplier. See Contexts for details.
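The `moderate` helper used above isn't defined in this guide. A minimal sketch wrapping the REST endpoint, following the request shape from the earlier curl example (the error handling and the `buildModerationRequest` helper are assumptions, not an official SDK):

```javascript
// Hypothetical helper: build the request body, omitting username when absent
// (e.g., when checking a display name at registration).
function buildModerationRequest({ text, context, username }) {
  const body = { text, context };
  if (username) body.username = username;
  return body;
}

// Hypothetical wrapper around Sieve's text moderation endpoint.
async function moderate(params) {
  const res = await fetch('https://api.getsieve.dev/v1/moderate/text', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer mod_live_your_key',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(buildModerationRequest(params)),
  });
  if (!res.ok) throw new Error(`Moderation request failed: ${res.status}`);
  return res.json(); // e.g. { action, primary_category, ... }
}
```

Centralizing the call like this keeps the context choice (and any future threshold overrides) in one place per content type.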