Overview

Guardrails let you decide which code Sphinx may generate and execute, giving you fine-grained control over the behavior and security of AI-generated code.

Guardrail modes

In “Always Allow” mode, Sphinx can run code that uses a given artifact without requesting permission. Use this for trusted operations and frequently used libraries.
In “Ask For Permission” mode, Sphinx must get your approval before running code that uses a given artifact. Recommended for sensitive operations or unfamiliar tools.
In the most restrictive mode, Sphinx avoids generating code that uses the artifact when possible and asks for approval if it must use it. Best for restricted or dangerous operations.
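The three modes can be pictured as a small lookup from artifact to policy. The sketch below is illustrative only and is not Sphinx's API: the `GuardrailMode` enum, the `AVOID` member name, the helper functions, and the example artifact names are all assumptions made for this example.

```python
from enum import Enum


class GuardrailMode(Enum):
    """Hypothetical model of the three guardrail modes described above."""

    ALWAYS_ALLOW = "always_allow"
    ASK_FOR_PERMISSION = "ask_for_permission"
    AVOID = "avoid"  # assumed name for the most restrictive mode


def needs_approval(mode: GuardrailMode) -> bool:
    """Return True if code using the artifact must be approved before it runs."""
    return mode is not GuardrailMode.ALWAYS_ALLOW


def prefers_alternative(mode: GuardrailMode) -> bool:
    """Return True if the agent should try to avoid the artifact altogether."""
    return mode is GuardrailMode.AVOID


# Hypothetical per-artifact configuration; the artifact names are examples only.
guardrails = {
    "pandas": GuardrailMode.ALWAYS_ALLOW,
    "requests": GuardrailMode.ASK_FOR_PERMISSION,
    "subprocess": GuardrailMode.AVOID,
}
```

Under these assumptions, only the most restrictive mode both steers generation away from the artifact and still requires approval when the artifact cannot be avoided.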

Configuring guardrails

1. Open the dashboard

Navigate to the dashboard and sign in to your account.

2. Access guardrails settings

Click the Guardrails tab in the navigation menu.

3. Select guardrail mode

Click the mode you want to apply to the specific artifact or operation.

4. Save your changes

Click Apply changes to activate your new guardrail configuration.

How it works

When a guardrail is triggered, Sphinx requests approval before executing the code. If you do not grant approval, the agent attempts to make progress using alternative approaches when possible.
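The approval-then-fallback behavior described above could be sketched roughly as follows. This is a simplified model, not Sphinx's implementation: the `run_with_guardrail` function, the `approve` callback, and the example plan are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional, Sequence


def run_with_guardrail(
    candidates: Sequence[tuple[str, bool]],
    approve: Callable[[str], bool],
) -> Optional[str]:
    """Pick the first approach that is allowed to run.

    candidates: (description, triggers_guardrail) pairs in preference order.
    approve: callback that asks the user; returns True when approval is granted.
    """
    for description, triggers_guardrail in candidates:
        # Approaches that trigger no guardrail run directly;
        # guarded ones run only if the user approves.
        if not triggers_guardrail or approve(description):
            return description
    return None  # no approach could proceed


# Hypothetical plan: the preferred approach is guarded, the fallback is not.
plan = [
    ("download data with requests", True),
    ("read cached local CSV", False),
]
chosen = run_with_guardrail(plan, approve=lambda description: False)
print(chosen)  # read cached local CSV
```

When approval is denied for the first approach, the sketch moves on to the next candidate, mirroring the agent's attempt to make progress another way. It returns `None` when nothing is permitted.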
Start with “Ask For Permission” mode for new artifacts until you’re confident in their behavior, then switch to “Always Allow” for trusted operations to improve workflow efficiency.