Overview
Sphinx can search the web for real-time information when your questions require current data that may not be in the model’s training data. This capability is powered by Parallel AI, providing accurate, up-to-date search results directly in your notebook workflow.
When Sphinx Uses Web Search
Sphinx automatically searches the web when you need:
- Current documentation: Latest API references, library syntax, or updated guides
- Recent information: News, announcements, or changes after the model’s knowledge cutoff
- Verification: Confirming current behavior of libraries, APIs, or services
- Resource discovery: Finding datasets, APIs, tools, or documentation you don’t have URLs for
How It Works
When Sphinx decides a web search is needed, it:
- Formulates a concise query: Extracts the key terms from your question (max 6 words for optimal results)
- Searches via Parallel AI: Retrieves up to 5 relevant results with excerpts
- Analyzes the results: Reads the titles, URLs, and excerpts to find relevant information
- Incorporates findings: Uses the search results to inform its response or code generation
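The first two steps can be illustrated with a rough sketch. This is not Sphinx's actual implementation: the stop-word list and the `formulate_query` and `top_results` helpers are hypothetical, and only the two caps (6 query words, 5 results) come from the description above.

```python
import re

MAX_QUERY_WORDS = 6  # query length cap described above
MAX_RESULTS = 5      # result cap described above

# Hypothetical stop-word list; the real key-term extraction is not documented.
STOP_WORDS = {
    "how", "do", "i", "the", "a", "an", "in", "to", "of",
    "is", "what", "for", "use", "with",
}


def formulate_query(question: str) -> str:
    """Keep at most MAX_QUERY_WORDS key terms from a question."""
    # Allow dots and underscores so names like pandas.read_parquet survive.
    terms = [w.strip(".") for w in re.findall(r"[A-Za-z0-9_.]+", question)]
    key_terms = [t for t in terms if t and t.lower() not in STOP_WORDS]
    return " ".join(key_terms[:MAX_QUERY_WORDS])


def top_results(results: list[dict]) -> list[dict]:
    """Keep only the first MAX_RESULTS results."""
    return results[:MAX_RESULTS]


query = formulate_query(
    "How do I use the new pyarrow engine in pandas.read_parquet?"
)
print(query)
```

A real system would likely choose key terms with the model itself rather than a fixed word list; the sketch only shows how a long question collapses into a short query.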
What You See
When Sphinx performs a web search, you’ll see a message in the chat indicating the search, such as:
🔍 Searched the web for: “pandas read_parquet engine parameter”
The search results include:
| Field | Description |
|---|---|
| Title | The page title from the search result |
| URL | Direct link to the source |
| Excerpt | A relevant snippet from the page (up to 250 characters) |
| Page Age | When the page was last updated (if available) |
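One way to picture a single result is as a small record with these four fields. The `SearchResult` class below is a hypothetical model for illustration, not a documented Sphinx or Parallel AI type; the field names and the example URL are assumptions, while the 250-character cap and the optional page age come from the table.

```python
from dataclasses import dataclass
from typing import Optional

MAX_EXCERPT_CHARS = 250  # excerpt cap from the table above


@dataclass
class SearchResult:
    """Hypothetical record mirroring the fields in the table."""

    title: str
    url: str
    excerpt: str
    page_age: Optional[str] = None  # None when no update date is available

    def __post_init__(self) -> None:
        # Trim the excerpt so it never exceeds the stated cap.
        self.excerpt = self.excerpt[:MAX_EXCERPT_CHARS]


result = SearchResult(
    title="read_parquet reference",
    url="https://example.com/docs/read_parquet",  # placeholder URL
    excerpt="x" * 400,
)
print(len(result.excerpt), result.page_age)
```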
Example Use Cases
Finding Current Documentation
Example prompt
“How do I use the new pyarrow engine in pandas.read_parquet?”
Discovering APIs and Datasets
Example prompt
“Where can I find a free stock market API for historical data?”
Verifying Library Behavior
Example prompt
“What’s the current default behavior of scikit-learn’s train_test_split?”
Limitations
- Maximum 5 results per search query
- Excerpts limited to 250 characters — Sphinx sees summaries, not full pages
- Maximum 10 domains for include/exclude filters
- 30-second timeout for search requests
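These limits can be read as constraints on a search request. The sketch below is hypothetical: `SearchRequest` and `validate` are illustrative names, not part of any documented API, and it assumes the 10-domain cap applies to the include and exclude lists separately, which the text above does not specify.

```python
from dataclasses import dataclass, field

MAX_RESULTS = 5          # results per search query
MAX_FILTER_DOMAINS = 10  # domains per include/exclude filter (assumed per list)
TIMEOUT_SECONDS = 30     # search request timeout


@dataclass
class SearchRequest:
    """Hypothetical request shape used only for this illustration."""

    query: str
    include_domains: list[str] = field(default_factory=list)
    exclude_domains: list[str] = field(default_factory=list)
    max_results: int = MAX_RESULTS
    timeout_seconds: int = TIMEOUT_SECONDS


def validate(request: SearchRequest) -> list[str]:
    """Return a list of limit violations; an empty list means the request fits."""
    problems = []
    if request.max_results > MAX_RESULTS:
        problems.append(f"max_results exceeds {MAX_RESULTS}")
    if len(request.include_domains) > MAX_FILTER_DOMAINS:
        problems.append(f"include_domains exceeds {MAX_FILTER_DOMAINS}")
    if len(request.exclude_domains) > MAX_FILTER_DOMAINS:
        problems.append(f"exclude_domains exceeds {MAX_FILTER_DOMAINS}")
    if request.timeout_seconds > TIMEOUT_SECONDS:
        problems.append(f"timeout_seconds exceeds {TIMEOUT_SECONDS}")
    return problems


too_many = [f"site{i}.example" for i in range(11)]
print(validate(SearchRequest(query="parquet engine", include_domains=too_many)))
```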
Best Practices
Be specific in your questions
Instead of asking “how do I read files in pandas?”, ask “how do I read a parquet file with custom column selection in pandas?” More specific questions lead to better search queries and more relevant results.
Mention when you need current information
If you know something has changed recently, say so. A prompt such as “What’s the new syntax for X in the latest version?” tells Sphinx that up-to-date information is critical.
Provide URLs when you have them
If you already have a URL to documentation or a resource, include it in your message. Sphinx will fetch the content directly rather than searching for it, which is faster and more accurate.
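The fetch-versus-search decision can be pictured as a simple check for URLs in the message. How Sphinx actually makes this decision is not documented here; the `choose_action` helper and its URL pattern below are assumptions made for illustration.

```python
import re

# Simple pattern for http(s) URLs; deliberately loose, for illustration only.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")


def choose_action(message: str) -> tuple[str, list[str]]:
    """Return ("fetch", urls) if the message contains URLs, else ("search", [])."""
    # Drop trailing sentence punctuation that the pattern may have captured.
    urls = [u.rstrip(".,;:!?") for u in URL_PATTERN.findall(message)]
    if urls:
        return "fetch", urls
    return "search", []


print(choose_action("Summarize https://example.com/docs/api."))
print(choose_action("How do I read a parquet file?"))
```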