Pipeline Modes
Sieve offers three pipeline modes, each optimized for different use cases, latency requirements, and budgets. Your pipeline mode determines which AI model evaluates content that passes Tier 0 (local preprocessing).
General
The default mode. Uses OpenAI’s free omni-moderation endpoint for AI-tier evaluation.
Best for: Forums, comment sections, user profiles, standard UGC platforms.
- Covers toxicity, hate speech, self-harm, sexual content, and violence out of the box
- Zero cost per AI call — the OpenAI moderation endpoint is free
- Does not detect scams, spam, RMT (real-money trading), or gaming-specific abuse well
- Typical AI-tier latency: 100-200ms
curl -X POST https://api.getsieve.dev/v1/moderate/text \
  -H "Authorization: Bearer mod_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "This is a comment on a blog post",
    "context": "comment"
  }'

Gaming
Uses gpt-4o-mini with a gaming-aware system prompt. Purpose-built for game studios and gaming communities.
Best for: In-game chat, gaming forums, Discord-style communities, live game lobbies.
- Detects RMT/gold selling, account trading, boosting spam, and scam links
- Understands gaming banter vs. genuine threats (“I’ll kill you” in a PvP context vs. a real threat)
- Recognizes leetspeak and obfuscation patterns common in gaming
- Cost: ~$0.00006 per AI-tier call
- Typical AI-tier latency: 200-400ms
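To illustrate the kind of obfuscation the Gaming mode is tuned for, here is a minimal leetspeak normalizer. This is an illustrative sketch only, not Sieve’s actual implementation; real obfuscation handling is considerably more involved.

```python
# Illustrative only -- not Sieve's actual normalizer. Leetspeak swaps
# digits and symbols for letters ("s3ll1ng g0ld"); mapping them back
# makes simple keyword checks far more effective.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})

def normalize_leet(text: str) -> str:
    """Lowercase the text and undo common character substitutions."""
    return text.lower().translate(LEET_MAP)

print(normalize_leet("s3ll1ng g0ld ch34p"))  # -> selling gold cheap
```

Note the limits of a naive mapping: a string like "10k" would also be rewritten, which is one reason Sieve pairs pattern normalization with an AI tier rather than relying on rules alone.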
curl -X POST https://api.getsieve.dev/v1/moderate/text \
  -H "Authorization: Bearer mod_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "selling 10k gold $5 paypal dm me",
    "context": "chat"
  }'

Edge

Uses Llama Guard 3 8B running on Cloudflare Workers AI. The model runs at the edge — no external API call leaves the Cloudflare network.
Best for: Latency-sensitive applications, high-volume chat, applications where data residency matters.
- Lowest blended latency: ~140ms (including Tier 0 preprocessing)
- Runs entirely on Cloudflare’s edge network — no data sent to third-party APIs
- Cost: ~$0.0001 per AI-tier call
- Good general coverage but less nuanced than GPT-based modes for edge cases
curl -X POST https://api.getsieve.dev/v1/moderate/text \
  -H "Authorization: Bearer mod_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "real-time game chat message",
    "context": "chat"
  }'

Comparison
| Feature | General | Gaming | Edge |
|---|---|---|---|
| AI Model | OpenAI omni-moderation | gpt-4o-mini | Llama Guard 3 8B |
| Cost per AI call | Free | ~$0.00006 | ~$0.0001 |
| Blended latency | ~180ms | ~300ms | ~140ms |
| Spam/RMT detection | Poor | Excellent | Good |
| Gaming context | No | Yes | No |
| Data stays on edge | No | No | Yes |
| Best for | Forums, comments | Game chat, lobbies | Low-latency, high-volume |
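The per-call costs in the table make budgeting a back-of-the-envelope exercise. A quick sketch (rates taken from the table above; the helper name is ours, not part of the Sieve API):

```python
# Approximate cost per AI-tier call, from the comparison table above.
# These cover only the AI tier: content handled at Tier 0 (local
# preprocessing) never incurs a per-call cost.
COST_PER_AI_CALL = {
    "general": 0.0,      # OpenAI omni-moderation endpoint is free
    "gaming": 0.00006,   # gpt-4o-mini
    "edge": 0.0001,      # Llama Guard 3 8B on Workers AI
}

def monthly_ai_cost(mode: str, ai_calls_per_month: int) -> float:
    """Estimated monthly spend on the AI tier alone."""
    return COST_PER_AI_CALL[mode] * ai_calls_per_month

# At 1M AI-tier calls/month: general $0, gaming ~$60, edge ~$100.
```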
Setting Your Mode
Your pipeline mode is an account-level setting. During the beta, set it when you create your API key or contact support to change it.
Sign up for Sieve
Create your account at getsieve.dev/signup.
Choose your mode
Select your pipeline mode during onboarding. Pick based on your primary use case.
Get your API key
Your API key will be configured for your chosen mode. All requests using that key route through the selected pipeline.
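The curl examples on this page translate directly to any HTTP client. A minimal Python sketch using only the standard library (the `SIEVE_API_KEY` environment variable, the helper names, and the assumption that the response body is JSON are ours; consult the API reference for the actual response fields):

```python
import json
import os
import urllib.request

API_URL = "https://api.getsieve.dev/v1/moderate/text"

def build_payload(text: str, context: str = "comment") -> dict:
    """Request body matching the curl examples on this page."""
    return {"text": text, "context": context}

def moderate_text(text: str, context: str = "comment") -> dict:
    """POST a moderation request; returns the parsed JSON response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(text, context)).encode("utf-8"),
        headers={
            # Key format follows the examples above ("mod_live_...").
            "Authorization": f"Bearer {os.environ['SIEVE_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Because the mode is bound to the key, this same client code works unchanged whichever pipeline mode your account uses.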