XecGuard: The Security Guardrail Built for AI Chatbot / LLM
XecGuard is a suite of AI security Guardrail APIs developed by CyCraft, specifically designed to defend against various Prompt Attacks targeting Large Language Models (LLMs). It can detect and analyze malicious conversational content in real-time, providing comprehensive LLM Safety protection. XecGuard is particularly well-suited for AI Agents and various AI application scenarios. It effectively prevents malicious external inputs, such as Prompt Injection, Prompt Extraction, Harmful Content, or aggressive prompts, preventing the model from being manipulated into generating errors, deviating from its tasks, or even executing harmful behaviors. This further enhances the overall security and reliability of the system.
Core Protection Mechanisms and Features
XecGuard features lightweight semantic models designed specifically for security scenarios. Unlike traditional protection mechanisms that rely on keyword filtering, it is capable of understanding "attack intentions" and "semantic variations." Its core advantages include:
- Multi-Model Fine-Tuning Technology: Instead of relying on keyword filtering, XecGuard is powered by the collaboration of multiple small models. These small models have been fine-tuned to be highly specialized for specific security tasks.
- High Performance and Low Latency: Utilizing a small-model architecture rather than a single large model results in lower computational costs and faster processing speeds.
- High Accuracy: Each model is fine-tuned for a specific security task, performing semantic recognition and voting to ensure highly accurate detection.
- Multi-Layered Threat Defense: Effectively blocks Prompt Injection, Prompt Extraction, and Harmful Content.
- Malicious Semantic Recognition: Capable of understanding deep "attack intentions" and "semantic variations," it possesses the ability to recognize malicious semantics within User Prompts. Even if attackers use Instruction Obfuscation, Role-Playing, or Emotional appeal, the system can accurately uncover the underlying malicious logic.
- Context Instruction-Following: Ensures the AI strictly adheres to the task scope defined by the System Prompt, preventing it from being guided into unauthorized task contexts by users.
- Multilingual and Localization Advantages: Supports multi-language analysis, including Chinese, English, and Japanese. In addition to accurately identifying semantic attacks in Chinese and Japanese, it strengthens the detection of local Taiwanese PII formats (such as ID numbers, addresses, and financial account numbers), meeting both cross-border and local compliance requirements.

Eight Core Functional Modules
XecGuard provides comprehensive AI protection capabilities, encompassing the following key modules:
- General Prompt Attack Protection: Detects and prevents prompt injection, extraction, and evasion attempts that aim to override model behavior, expose protected instructions, or bypass safeguards, including those using obfuscation.
- System Prompt Enforcement: Context-Aware Task Enforcement for Enterprise AI keeps AI systems strictly within their assigned enterprise tasks, enabling true context-aware safety that goes beyond traditional content filtering.
- Content Bias Protection: Detects and mitigates outputs that exhibit prejudice, harassment, or harmful stereotypes, ensuring generated content remains non-discriminatory and respectful of protected attributes, health status, and socio-economic markers.
- Harmful Content Protection: Detects and prevents toxic or harmful AI outputs by analyzing semantic intent beyond simple keyword filtering.
- PII & Sensitive Data Protection: Privacy and data leakage prevention for enterprise AI, ensuring personal or sensitive data is not exposed by the AI.
- Malicious Skills Protection: Detects whether AI Agent Skills and related file content contain malicious or harmful content targeting the system, protecting Agent AI from the impact of malicious Skills.
- Custom Policy Enforcement: Enables organizations to define custom Guardrail Rules in natural language based on their business or AI/Chatbot task requirements, and enforce organization-specific AI governance rules.
- Context Grounding Validation: Used to detect AI hallucinations by ensuring responses are strictly grounded in user-provided documents and approved RAG context.
API Integration and System Deployment
- Standardized and Simple API Integration: Adopts a standard RESTful API architecture. The calling conventions are compatible with mainstream LLM services on the market (such as OpenAI). Developers only need to include their dedicated Management Token/Service Token (API Key) in the HTTP Header to quickly complete authentication and integration. For details, please refer to the User Guide.
- Diverse Licensing Plans: Offers flexible plans across three tier (Lite, Standard, and Advanced). These cover needs ranging from POC validation to large-scale enterprise deployments, allowing businesses to flexibly allocate the most suitable security protection based on their application scale. Plans from Standard and above utilize dedicated GPU servers, ensuring isolated computing resources and sustained high performance.
- Data Privacy Guarantee: All normal request content is neither recorded nor retained. Only when malicious attacks (such as Prompt Attacks) are detected will the relevant content be temporarily stored in a de-identified format. This data is strictly used for security research, model optimization, and AI safety initiatives. We will continuously contribute our research findings back to the open-source community to promote the enhancement of security capabilities across the entire ecosystem. Through this positive feedback loop, we aim to collaborate with the open-source community to build a safer and more trustworthy AI development environment.
Serving as the "security firewall" for enterprise AI applications, XecGuard ensures that AI systems operate within a secure, compliant, and controllable environment. It is an indispensable security cornerstone in the enterprise's process of adopting Generative AI.