Guardrails

Overview

Guardrails let you decide what code Sphinx may generate and execute, giving you fine-grained control over the behavior and security of AI-generated code.

Guardrail modes

Always Allow

Sphinx can always run code with a given artifact without requesting permission. Use this for trusted operations and frequently used libraries.

Ask For Permission

Sphinx must get your approval before running code with a given artifact. Recommended for sensitive operations or unfamiliar tools.

Ask and Avoid

Sphinx will avoid generating code with the artifact when possible and ask for approval if it must use it. Best for restricted or dangerous operations.
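To make the difference between the three modes concrete, here is a minimal Python sketch of the decision each mode implies. It is illustrative only and is not Sphinx's API: the names `GuardrailMode`, `needs_approval`, and `should_avoid` are hypothetical, chosen to mirror the mode descriptions above.

```python
from enum import Enum


class GuardrailMode(Enum):
    """Hypothetical model of the three guardrail modes described above."""
    ALWAYS_ALLOW = "always_allow"
    ASK_FOR_PERMISSION = "ask_for_permission"
    ASK_AND_AVOID = "ask_and_avoid"


def needs_approval(mode: GuardrailMode) -> bool:
    """Return True if the user must approve before code using the artifact runs."""
    # Only "Always Allow" skips the approval request.
    return mode is not GuardrailMode.ALWAYS_ALLOW


def should_avoid(mode: GuardrailMode) -> bool:
    """Return True if the agent should prefer code that does not use the artifact."""
    # Only "Ask and Avoid" steers generation away from the artifact.
    return mode is GuardrailMode.ASK_AND_AVOID
```

Under this model, "Ask and Avoid" is the only mode for which both functions return `True`: the agent first looks for a way around the artifact, and asks for approval only if it cannot find one.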

Configuring guardrails

Open the dashboard

Navigate to the dashboard and sign in to your account.

Access guardrails settings

Click on the Guardrails tab in the navigation menu.

Select guardrail mode

Select the mode you want to apply to the specific artifact or operation.

Save your changes

Click Apply changes to activate your new guardrail configuration.

How it works

When a guardrail is triggered, Sphinx requests your approval before executing the code. If you do not grant approval, the agent tries to make progress using an alternative approach where possible.
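The approval flow described above can be sketched roughly as follows. This is an assumption-laden illustration, not Sphinx's implementation: `run_with_guardrail` and all of its parameters are hypothetical names, and the callbacks stand in for whatever prompt and execution mechanisms the agent actually uses.

```python
from typing import Callable, Optional


def run_with_guardrail(
    requires_approval: bool,
    ask_user: Callable[[], bool],
    run_primary: Callable[[], str],
    run_alternative: Optional[Callable[[], str]] = None,
) -> Optional[str]:
    """Hypothetical approval flow for a single guarded operation.

    requires_approval: whether the guardrail was triggered for this code.
    ask_user: prompts the user and returns True if approval is granted.
    run_primary: executes the code that uses the guarded artifact.
    run_alternative: an optional approach that avoids the artifact.
    """
    # No guardrail triggered, or the user approved: run the original code.
    if not requires_approval or ask_user():
        return run_primary()
    # Approval denied: fall back to an alternative approach if one exists.
    if run_alternative is not None:
        return run_alternative()
    # No alternative available, so no progress can be made on this step.
    return None
```

For example, calling `run_with_guardrail(True, lambda: False, lambda: "primary", lambda: "alternative")` models a denied request and returns the result of the alternative approach.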

Start with “Ask For Permission” mode for new artifacts until you’re confident in their behavior, then switch to “Always Allow” for trusted operations to improve workflow efficiency.
