Overview
The Sphinx Library (sphinxai) is a Python library that provides your notebook code with direct access to AI capabilities and secure resources. When Sphinx generates code in your notebooks, it can use this library to:
- Call LLMs for text generation and analysis
- Generate text embeddings for similarity search and clustering
- Process images with vision-capable models
- Retrieve connection credentials for databases like Snowflake and Databricks
- Access user secrets stored securely in Sphinx
The Sphinx Library is automatically available when running code in Sphinx-managed notebooks. No installation required.
Why Use the Sphinx Library?
The library solves several key problems for data scientists:
- Model Abstraction: Use size tiers (S, M, L) instead of specific model names, so your code stays independent of model versions
- Simplified Authentication: Access LLMs and embeddings without managing API keys in your code
- Batch Processing: Built-in concurrent processing with rate limiting for batch operations
- Secure Credentials: Retrieve database credentials and secrets without hardcoding sensitive values
- Provider Flexibility: Switch between providers (OpenAI, Anthropic, Google) or bring your own API keys
Model Size Tiers
Instead of specifying exact model names, the library uses abstract size tiers:
| Tier | Chat Models | Embedding Models | Best For |
|---|---|---|---|
| S (Small) | Fast, cost-effective | Smaller dimensions | Simple tasks, high throughput |
| M (Medium) | Balanced performance | — | General-purpose tasks |
| L (Large) | Highest quality | Larger dimensions | Complex reasoning, nuanced analysis |
Functions Reference
Chat Completion
llm()
Call an LLM with a text prompt.
Parameters:
- The text prompt to send to the LLM.
- Model size tier: "S" (small/fast), "M" (medium), or "L" (large/capable).
- Timeout in seconds for the request.
Returns: The LLM's response text.
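A minimal usage sketch. The keyword names `prompt`, `size`, and `timeout` are assumptions (the source does not give parameter names), and a stub stands in for the real function so the sketch runs outside a Sphinx-managed notebook:

```python
# Hypothetical sphinxai.llm() usage; `size` and `timeout` keyword names
# are assumptions. The stub below mimics the documented behavior
# (returns the model's response text) for illustration only.
def llm(prompt, size="M", timeout=60):
    return f"[{size}-tier response to: {prompt}]"

# In a Sphinx notebook you would simply call llm() directly:
summary = llm("Summarize the key trends in this sales data.", size="S", timeout=30)
print(summary)
```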
batch_llm()
Process multiple prompts concurrently with automatic rate limiting.
Parameters:
- List of prompts to process.
- Model size tier for all requests.
- Maximum number of concurrent requests (rate limiting).
- Timeout in seconds for each individual request.
Returns: List of responses in the same order as input prompts. Failed requests return error messages.
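A batch sketch under the same caveat: parameter names (`prompts`, `size`, `max_concurrency`, `timeout`) are assumptions, and a thread-pool stub mirrors the documented contract — concurrent requests, results returned in input order:

```python
# Hypothetical sphinxai.batch_llm() usage; keyword names are assumptions.
# The stub uses a thread pool to mirror the documented behavior:
# bounded concurrency, results in the same order as the inputs.
from concurrent.futures import ThreadPoolExecutor

def batch_llm(prompts, size="M", max_concurrency=5, timeout=60):
    def one(prompt):
        return f"[{size}] {prompt}"
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # map() preserves input order even though work runs concurrently
        return list(pool.map(one, prompts))

reviews = ["Classify: 'great product'", "Classify: 'terrible support'"]
labels = batch_llm(reviews, size="S", max_concurrency=2)
print(labels)
```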
Vision
batch_vision_llm()
Process images with questions using vision-capable models.
Parameters:
- List of base64-encoded image strings (without the data: URL prefix).
- List of questions, one per image. Must match the length of images.
- Model size tier.
- Maximum concurrent requests.
- Timeout per request in seconds.
- MIME type of the images (e.g., "image/png", "image/jpeg").
- Image detail level: "low", "high", or "auto".
Text Embeddings
embed_text()
Generate a vector embedding for a single text.
Parameters:
- The text to embed.
- Model size tier: "S" (small/fast) or "L" (large/high-quality). Note: "M" is not available for embeddings.
- Timeout in seconds.
Returns: The embedding vector as a list of floats.
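A sketch of the similarity-search use case mentioned in the overview. The `size` and `timeout` keyword names are assumptions, and a deterministic stub replaces the real embedding call (which returns a list of floats):

```python
# Hypothetical sphinxai.embed_text() usage; keyword names are assumptions.
# The stub produces a fake but deterministic 4-dimensional vector so the
# cosine-similarity pattern below is runnable outside Sphinx.
import hashlib
import math

def embed_text(text, size="S", timeout=60):
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:4]]

def cosine(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

v1 = embed_text("refund policy", size="L")
v2 = embed_text("return policy", size="L")
print(f"similarity: {cosine(v1, v2):.3f}")
```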
batch_embed_text()
Generate embeddings for multiple texts concurrently.
Parameters:
- List of texts to embed.
- Model size tier: "S" or "L".
- Maximum concurrent requests.
- Timeout per request in seconds.
Returns: List of embedding vectors in the same order as input texts. Failed requests return empty lists.
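Since failed requests come back as empty lists, downstream code should filter them out before use. A stubbed sketch of that pattern (parameter names are assumptions):

```python
# Hypothetical sphinxai.batch_embed_text() usage; keyword names are
# assumptions. The stub mirrors the documented contract: results in
# input order, with an empty list marking a failed request.
def batch_embed_text(texts, size="S", max_concurrency=5, timeout=60):
    out = []
    for t in texts:
        try:
            out.append([float(len(t)), float(sum(map(ord, t)) % 7)])
        except Exception:
            out.append([])  # failed requests return empty lists
    return out

vectors = batch_embed_text(["alpha", "beta", "gamma"], size="L")
# Drop failures before feeding vectors into clustering or search:
usable = [v for v in vectors if v]
print(len(usable))
```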
Connection Credentials
get_connection_credentials()
Retrieve credentials for configured data integrations.
- Snowflake
- Databricks
Parameters:
- The name of the integration: "snowflake" or "databricks".
- Timeout in seconds.
Connection credentials are configured in the Sphinx Dashboard under Integrations. See Integrations for setup instructions.
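A sketch of the intended pattern — fetch credentials at runtime instead of hardcoding them. The shape of the returned value is an assumption (the source does not specify it); the stub models it as a dict:

```python
# Hypothetical sphinxai.get_connection_credentials() usage. The returned
# structure is an assumption, stubbed as a dict; real calls hit Sphinx's
# integration store.
def get_connection_credentials(integration, timeout=30):
    return {"account": "acme-xy12345", "user": "notebook_user", "password": "***"}

creds = get_connection_credentials("snowflake")
# A real notebook would pass these to a client library, e.g.
# snowflake.connector.connect(account=creds["account"], ...)
print(sorted(creds))
```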
Secrets
get_user_secret_value()
Retrieve a secret value from the Sphinx secrets store.
Parameters:
- The name of the secret to retrieve.
- Timeout in seconds.
Returns: The secret value as a string.
Secrets are configured in the Sphinx Dashboard under Secrets. See Secrets for setup instructions.
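A sketch of keeping a key out of notebook source. The secret name `WEATHER_API_KEY` is invented for illustration, and an in-memory dict stubs the secrets store:

```python
# Hypothetical sphinxai.get_user_secret_value() usage. The secret name
# "WEATHER_API_KEY" is invented; the dict stubs Sphinx's secrets store.
_FAKE_STORE = {"WEATHER_API_KEY": "sk-demo-123"}

def get_user_secret_value(name, timeout=30):
    return _FAKE_STORE[name]

api_key = get_user_secret_value("WEATHER_API_KEY")
# Use the retrieved value in a request header rather than hardcoding it:
headers = {"Authorization": f"Bearer {api_key}"}
print("Authorization" in headers)
```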
Configuration
These functions allow you to bring your own API keys or switch providers dynamically.
set_llm_config()
Configure the LLM provider programmatically.
Parameters:
- Provider name: "sphinx", "openai", "anthropic", or "google".
- API key for the provider.
- Optional custom base URL for the provider's API.
- Optional mapping of size tiers to model names. Unspecified sizes use provider defaults.
set_embedding_config()
Configure the embedding provider programmatically.
Parameters:
- Provider name: "sphinx", "openai", or "google". Note: Anthropic does not support embeddings.
- API key for the provider.
- Optional custom base URL.
- Optional mapping of size tiers to model names.
get_llm_config() / get_embedding_config()
Inspect the current configuration.
reset_config() / reset_llm_config() / reset_embedding_config()
Reset configuration to environment variable defaults.
Supported Providers
| Provider | Chat | Embeddings | Default Models |
|---|---|---|---|
| sphinx (default) | ✓ | ✓ | GPT-4.1 family, text-embedding-3 |
| openai | ✓ | ✓ | GPT-4.1 family, text-embedding-3 |
| anthropic | ✓ | ✗ | Claude Haiku/Sonnet |
| google | ✓ | ✓ | Gemini 2.5 family |