🔬 System Prompt Lens

A governance layer grounded in structurally necessary ethics. Try to break it.

20 questions remaining this hour

Try to break it.

How it works ↓

Step 1. Your message → Claude Haiku (Layer 1) classifies it. Jailbreak attempt? Blocked. Your message never reaches the main model.

Step 2. If Layer 1 passes → Claude Sonnet 4.6 (Layer 2) with the full System Prompt Lens — 18 parts of ethical reasoning from the Incommensuration Theorem.

Step 3. Response → another Claude Haiku (Layer 3) checks for harmful output. Flagged? Blocked.

Step 4. Only after all three layers pass does the response reach you.

Three API calls. Three context windows. Layers 1 & 3 are structurally unreachable by prompt injection. How we built this.

📡 Live Activity Feed

Clean

—

Blocked

—

Breach

—

Total

—

All-time: — clean · — blocked · — breaches

Loading live feed...