Pipeline Modes
Sieve offers three pipeline modes, each optimized for different use cases, latency requirements, and budgets. Your pipeline mode determines which AI model evaluates content that passes Tier 0 (local preprocessing).
General
The default mode. Uses OpenAI’s free omni-moderation endpoint for AI-tier evaluation.
Best for: Forums, comment sections, user profiles, standard UGC platforms.
- Covers toxicity, hate speech, self-harm, sexual content, and violence out of the box
- Zero cost per AI call — the OpenAI moderation endpoint is free
- Does not detect scams, spam, RMT (real-money trading), or gaming-specific abuse well
- Typical AI-tier latency: 100-200ms
curl -X POST https://api.getsieve.dev/v1/moderate/text \
  -H "Authorization: Bearer mod_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "This is a comment on a blog post",
    "context": "comment"
  }'

Gaming
Uses gpt-4o-mini with a gaming-aware system prompt. Purpose-built for game studios and gaming communities.
Best for: In-game chat, gaming forums, Discord-style communities, live game lobbies.
- Detects RMT/gold selling, account trading, boosting spam, and scam links
- Understands gaming banter vs. genuine threats (“I’ll kill you” in a PvP context vs. a real threat)
- Recognizes leetspeak and obfuscation patterns common in gaming
- Cost: ~$0.00006 per AI-tier call
- Typical AI-tier latency: 200-400ms
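To illustrate the kind of obfuscation the Gaming mode is tuned for, here is a minimal leetspeak normalizer. This is an illustrative sketch only, not Sieve’s actual implementation; real obfuscation handling is considerably more involved.

```python
# Illustrative only -- not Sieve's actual normalizer. Leetspeak swaps
# digits and symbols for letters ("s3ll1ng g0ld"); mapping them back
# makes simple keyword checks far more effective.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})

def normalize_leet(text: str) -> str:
    """Lowercase the text and undo common character substitutions."""
    return text.lower().translate(LEET_MAP)

print(normalize_leet("s3ll1ng g0ld ch34p"))  # -> selling gold cheap
```

Note the limits of a naive mapping: a string like "10k" would also be rewritten, which is one reason Sieve pairs pattern normalization with an AI tier rather than relying on rules alone.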
curl -X POST https://api.getsieve.dev/v1/moderate/text \
  -H "Authorization: Bearer mod_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "selling 10k gold $5 paypal dm me",
    "context": "chat"
  }'

Edge

Uses Llama Guard 3 8B running on Cloudflare Workers AI. The model runs at the edge — no external API call leaves the Cloudflare network.
Best for: Latency-sensitive applications, high-volume chat, applications where data residency matters.
- Lowest blended latency: ~140ms (including Tier 0 preprocessing)
- Runs entirely on Cloudflare’s edge network — no data sent to third-party APIs
- Cost: ~$0.0001 per AI-tier call
- Good general coverage but less nuanced than GPT-based modes for edge cases
curl -X POST https://api.getsieve.dev/v1/moderate/text \
  -H "Authorization: Bearer mod_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "real-time game chat message",
    "context": "chat"
  }'

Comparison
| Feature | General | Gaming | Edge |
|---|---|---|---|
| AI Model | OpenAI omni-moderation | gpt-4o-mini | Llama Guard 3 8B |
| Cost per AI call | Free | ~$0.00006 | ~$0.0001 |
| Blended latency | ~180ms | ~300ms | ~140ms |
| Spam/RMT detection | Poor | Excellent | Good |
| Gaming context | No | Yes | No |
| Data stays on edge | No | No | Yes |
| Best for | Forums, comments | Game chat, lobbies | Low-latency, high-volume |
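The per-call costs in the table make budgeting a back-of-the-envelope exercise. A quick sketch (rates taken from the table above; the helper name is ours, not part of the Sieve API):

```python
# Approximate cost per AI-tier call, from the comparison table above.
# These cover only the AI tier: content handled at Tier 0 (local
# preprocessing) never incurs a per-call cost.
COST_PER_AI_CALL = {
    "general": 0.0,      # OpenAI omni-moderation endpoint is free
    "gaming": 0.00006,   # gpt-4o-mini
    "edge": 0.0001,      # Llama Guard 3 8B on Workers AI
}

def monthly_ai_cost(mode: str, ai_calls_per_month: int) -> float:
    """Estimated monthly spend on the AI tier alone."""
    return COST_PER_AI_CALL[mode] * ai_calls_per_month

# At 1M AI-tier calls/month: general $0, gaming ~$60, edge ~$100.
```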
Setting Your Mode
Your pipeline mode is an account-level setting. During the beta, set it when you create your API key or contact support to change it.
Sign up for Sieve
Create your account at getsieve.dev/signup.
Choose your mode
Select your pipeline mode during onboarding. Pick based on your primary use case.
Get your API key
Your API key will be configured for your chosen mode. All requests using that key route through the selected pipeline.
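The curl examples on this page translate directly to any HTTP client. A minimal Python sketch using only the standard library (the `SIEVE_API_KEY` environment variable, the helper names, and the assumption that the response body is JSON are ours; consult the API reference for the actual response fields):

```python
import json
import os
import urllib.request

API_URL = "https://api.getsieve.dev/v1/moderate/text"

def build_payload(text: str, context: str = "comment") -> dict:
    """Request body matching the curl examples on this page."""
    return {"text": text, "context": context}

def moderate_text(text: str, context: str = "comment") -> dict:
    """POST a moderation request; returns the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(text, context)).encode("utf-8"),
        headers={
            # Key format follows the examples above ("mod_live_...").
            "Authorization": f"Bearer {os.environ['SIEVE_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Because the mode is bound to the key, this same client code works unchanged whichever pipeline mode your account uses.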