Risk Signals Library

A continuously expanding library of AI threat detection and risk intelligence tools that empower your enterprise to deploy generative models safely, securely, and in full compliance.

PII

Discerning special entities such as Social Security numbers, dates of birth, addresses, email addresses, etc.
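As a rough illustration of how a PII signal like this can work, the sketch below uses regex-based entity matching. The pattern names, patterns, and coverage are illustrative assumptions, not the product's actual detectors, which would typically combine patterns with ML-based entity recognition.

```python
import re

# Illustrative PII patterns (assumed, not the product's real detectors).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "date_of_birth": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}

def detect_pii(text):
    """Return (entity_type, matched_text) pairs found in text."""
    hits = []
    for entity, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((entity, match.group()))
    return hits
```

For example, `detect_pii("My SSN is 123-45-6789")` flags one `ssn` entity; clean text yields an empty list, which a policy layer could then use to redact, alert, or block.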

PHI

Discerning special entities such as medical history, treatment information, insurance details, etc.

PCI

Discerning special entities such as credit card numbers, expiration dates, CVVs, etc.
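One standard building block for a PCI signal is the Luhn checksum, which separates plausible card numbers from random digit strings. The sketch below is a minimal, generic implementation of that public algorithm, not the product's actual validation logic.

```python
def luhn_valid(number):
    """Luhn checksum test for a candidate card number (non-digits ignored)."""
    digits = [int(d) for d in number if d.isdigit()]
    checksum = 0
    parity = len(digits) % 2
    for i, d in enumerate(digits):
        if i % 2 == parity:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0
```

A detector would typically pair this check with length and issuer-prefix rules to keep false positives low.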

Secrets

Detecting passwords, API keys, cryptographic keys, etc.
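Secrets detectors commonly combine known-prefix patterns with an entropy heuristic for random-looking strings. The sketch below illustrates that general approach; the prefix pattern and entropy threshold are assumptions for illustration only.

```python
import math
import re

# Assumed key-like prefixes (illustrative, not an exhaustive or official list).
KEY_PATTERN = re.compile(r"\b(sk|AKIA|ghp)_?[A-Za-z0-9]{16,}\b")

def shannon_entropy(s):
    """Bits of Shannon entropy per character of s."""
    probs = [s.count(c) / len(s) for c in set(s)]
    return -sum(p * math.log2(p) for p in probs)

def looks_like_secret(token, entropy_threshold=4.0):
    """Flag tokens that match a key pattern or look high-entropy."""
    if KEY_PATTERN.search(token):
        return True
    return len(token) >= 20 and shannon_entropy(token) > entropy_threshold
```

The entropy check catches credentials with no recognizable prefix, at the cost of occasionally flagging hashes or encoded data, so real deployments tune the threshold per environment.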

Toxicity

Identifying and mitigating harmful or inappropriate language

Illegality

Preventing content that may violate laws

Blocklists

Restricts specific words, phrases, or topics from being processed or generated by the AI
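A blocklist signal can be sketched as case-insensitive phrase matching with word boundaries, as below. The terms are placeholders, and a production system would likely add normalization and fuzzy matching to resist trivial evasion.

```python
import re

# Placeholder blocklist terms (illustrative only).
BLOCKLIST = ["forbidden phrase", "internal codename"]

def violates_blocklist(text):
    """Return the blocklist terms found in text, matched on word boundaries."""
    lowered = text.lower()
    return [
        term for term in BLOCKLIST
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    ]
```

A non-empty result can be used either to block the request outright or to log it for review, depending on policy.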

System Instruct Class

Ensures that the model's responses and actions directly follow the instructions provided by users

Relevance

Ensures the content generated is pertinent to the context of the interaction

Intent

Understanding and aligning with the user's purpose

Code Present

Discerning whether executable code is present in the input or output

Code Requested

Ensures that executable content is only included when explicitly requested by the user

Input Manipulation

Tactics like prompt injection, instruction override, or direct command injection are identified and neutralized

Output Manipulation

Stops the leaking of prompts that could reveal sensitive information or internal system data

Goal Alignment

Ensures the AI's actions remain aligned with intended objectives and user directives

Code Vulnerability

Detects syntactic and semantic attacks such as prompt injection, jailbreaking, prompt leaking, or role impersonation

Text-to-SQL

Ensures accuracy and relevance in tasks that require precise language-to-code translation

Instruct-to-Action

Harmonizes the user's stated objectives and intents with the actual actions performed by the AI

Data Loss Protect

Analyzes content against the defined data loss ground truth and alerts on or enforces policy violations

Intent-to-Instruct

Ensures that the AI correctly interprets and follows the intended instruction of a prompt while minimizing the risk of misalignment, harmful outputs, or unintended consequences