Moderation and Safety
How ChurchWiseAI protects your chatbot from abuse and handles crisis situations automatically
ChurchWiseAI includes a built-in moderation system that protects your chatbot from abuse, spam, and harmful content — while ensuring people in genuine crisis get the help they need. The system works automatically in the background. As an admin, you can monitor what is happening from the Moderation Dashboard.
Where to Find It
Go to your admin dashboard → Settings tab → scroll to the Chatbot Settings section → Moderation Dashboard.
The Moderation Dashboard is visible only to users with the Admin or Office Admin role.
How Moderation Works
Every message sent to your chatbot is checked for safety. The system detects five types of violations, summarized below; a simplified sketch of how each type is handled follows the table:
| Type | What It Detects | What Happens |
|---|---|---|
| Crisis | Self-harm, suicidal language | Immediate crisis resources (988 Lifeline, 911). Admin notified. |
| Mild Abuse | Profanity, rude language | Warning message. Violation logged. |
| Severe Abuse | Extreme hostility, threats | Session paused. Violation logged. |
| Spam | Repeated nonsense, flooding | Rate limited. Violation logged. |
| Predatory | Grooming behavior, exploitation | Session ended. Admin notified. |
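If it helps to picture this mapping, here is a rough sketch in TypeScript. It is not the actual ChurchWiseAI implementation; the type names and fields are illustrative and come only from the table above.

```typescript
// Illustrative sketch only: the real classifier runs inside ChurchWiseAI and
// its names are not exposed. Each violation type maps to the response in the table.
type ViolationType = "crisis" | "mild_abuse" | "severe_abuse" | "spam" | "predatory";

interface ModerationResponse {
  reply: string;          // what the chatbot tells the user
  logViolation: boolean;  // every violation is logged
  notifyAdmin: boolean;   // crisis and predatory cases also notify the admin
  pauseSession?: boolean; // severe abuse pauses the session
  endSession?: boolean;   // predatory behavior ends the session
  rateLimit?: boolean;    // spam triggers rate limiting
}

const RESPONSES: Record<ViolationType, ModerationResponse> = {
  crisis:       { reply: "Crisis resources (988 Lifeline, 911)", logViolation: true, notifyAdmin: true },
  mild_abuse:   { reply: "Warning message",                      logViolation: true, notifyAdmin: false },
  severe_abuse: { reply: "Session paused",                       logViolation: true, notifyAdmin: false, pauseSession: true },
  spam:         { reply: "Rate limited",                         logViolation: true, notifyAdmin: false, rateLimit: true },
  predatory:    { reply: "Session ended",                        logViolation: true, notifyAdmin: true,  endSession: true },
};
```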
Automatic Escalation
The moderation system escalates automatically based on how many violations a user accumulates:
- 2 violations → 5-minute cooldown. The user sees a message asking them to pause and try again shortly. Crisis resources (988, 911) are always included.
- 4 violations → 24-hour temporary block. The user cannot chat for 24 hours. They are directed to call the church office.
- 7 violations → permanent block. The session is permanently blocked. The user is directed to contact the church directly.
This graduated approach gives people the benefit of the doubt while protecting your chatbot and congregation.
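If you want a concrete picture of those thresholds, here is a minimal sketch, assuming violation counts are tracked per user. The `escalate` function and its field names are hypothetical and simply mirror the rules above.

```typescript
// Hypothetical sketch of the escalation rules above; not the actual implementation.
type Restriction =
  | { kind: "cooldown"; expiresAt: Date }        // 5-minute cooldown
  | { kind: "temp_block"; expiresAt: Date }      // 24-hour temporary block
  | { kind: "permanent_block"; expiresAt: null } // never expires
  | null;                                        // fewer than 2 violations: no restriction

function escalate(violationCount: number, now: Date): Restriction {
  if (violationCount >= 7) return { kind: "permanent_block", expiresAt: null };
  if (violationCount >= 4) return { kind: "temp_block", expiresAt: new Date(now.getTime() + 24 * 60 * 60 * 1000) };
  if (violationCount >= 2) return { kind: "cooldown", expiresAt: new Date(now.getTime() + 5 * 60 * 1000) };
  return null;
}
```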
Crisis Handling
When the system detects a crisis (self-harm, suicidal ideation), it responds immediately:
- The chatbot provides the 988 Suicide & Crisis Lifeline number and 911
- The chatbot expresses care and concern — it does not just dump a phone number
- The violation is logged as a Crisis type
- The admin is notified so your pastoral team can follow up if contact information is available
Crisis detection always takes priority over other moderation rules.
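As a rough sketch of that priority rule (the individual check names are made up, and the order of the non-crisis checks is an assumption, not documented behavior):

```typescript
// Sketch only: crisis is checked before everything else, so a message that is
// both hostile and suicidal is handled as a crisis. The other check order is assumed.
type ViolationType = "crisis" | "predatory" | "severe_abuse" | "spam" | "mild_abuse";

interface Checks {
  crisis: boolean;      // self-harm or suicidal language detected
  predatory: boolean;
  severeAbuse: boolean;
  spam: boolean;
  mildAbuse: boolean;
}

function classify(c: Checks): ViolationType | null {
  if (c.crisis) return "crisis"; // always takes priority
  if (c.predatory) return "predatory";
  if (c.severeAbuse) return "severe_abuse";
  if (c.spam) return "spam";
  if (c.mildAbuse) return "mild_abuse";
  return null; // no violation detected
}
```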
The Moderation Dashboard
The dashboard shows four stat cards at the top:
- Total Violations — All-time count of detected violations
- Crisis Interventions — How many times crisis resources were provided
- Active Blocks — How many users are currently blocked
- Today's Violations — Violations detected in the last 24 hours
Below the stats, two tabs show details:
Violations Tab
A list of all detected violations, newest first. Each entry shows:
- Violation type (color-coded badge)
- The original message that triggered the violation
- Action taken by the system
- Timestamp
Restrictions Tab
A list of all user restrictions (cooldowns, temporary blocks, permanent blocks). Each entry shows:
- Restriction type
- Reason for the restriction
- When it expires (or "Never" for permanent blocks)
- When it was created
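For readers who like to see data shapes, the entries in the two tabs can be pictured roughly as the records below. These are hypothetical shapes based only on the fields listed above, not ChurchWiseAI's actual schema.

```typescript
// Hypothetical record shapes for the two tabs; field names are illustrative only.
interface ViolationEntry {
  type: "crisis" | "mild_abuse" | "severe_abuse" | "spam" | "predatory"; // color-coded badge
  message: string;     // the original message that triggered the violation
  actionTaken: string; // e.g. "Warning message", "Session paused"
  createdAt: string;   // timestamp
}

interface RestrictionEntry {
  type: "cooldown" | "temporary_block" | "permanent_block";
  reason: string;           // why the restriction was applied
  expiresAt: string | null; // null is shown as "Never" for permanent blocks
  createdAt: string;        // when the restriction was created
}
```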
What You Should Do
The moderation system is fully automatic — you do not need to take action for it to work. But here are some things to keep in mind:
- Check the dashboard periodically. If you see crisis interventions, consider whether pastoral follow-up is possible.
- Review flagged violations. Most violations are straightforward, but occasionally you may want context on what happened.
- Trust the escalation. The graduated system (cooldown → temp block → permanent block) is designed to be fair while keeping your chatbot safe.
Related Docs
- Dashboard Overview — Tour of the admin dashboard
- Data Privacy and Confidentiality — How sensitive data is protected
- Voice Agent Safety — Safety features for the voice agent