
Overview

Sphinx can search the web for real-time information when your questions require current data that may not be in the model’s training data. This capability is powered by Parallel AI, providing accurate, up-to-date search results directly in your notebook workflow.

Sphinx automatically searches the web when you need:
  • Current documentation — Latest API references, library syntax, or updated guides
  • Recent information — News, announcements, or changes after the model’s knowledge cutoff
  • Verification — Confirming current behavior of libraries, APIs, or services
  • Resource discovery — Finding datasets, APIs, tools, or documentation you don’t have URLs for
You don’t need to explicitly ask Sphinx to search the web. It automatically determines when web search would be helpful based on your question.

How It Works

When Sphinx decides a web search is needed, it:
  1. Formulates a concise query — Extracts the key terms from your question (max 6 words for optimal results)
  2. Searches via Parallel AI — Retrieves up to 5 relevant results with excerpts
  3. Analyzes the results — Reads the titles, URLs, and excerpts to find relevant information
  4. Incorporates findings — Uses the search results to inform its response or code generation
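
The steps above can be sketched in plain Python. This is an illustration only, not how Sphinx or Parallel AI implement it: `formulate_query`, `run_search`, `fake_backend`, and the stop-word list are hypothetical names invented for this sketch. Only the two limits (6 query words, 5 results) come from this page.

```python
from typing import Callable, Dict, List

MAX_QUERY_WORDS = 6  # query length limit stated on this page
MAX_RESULTS = 5      # result limit stated on this page

# Hypothetical stop-word list; the real query formulation is not documented here.
STOP_WORDS = {"how", "do", "i", "the", "a", "an", "in", "to", "of", "is", "what", "for", "use"}


def formulate_query(question: str) -> str:
    """Keep the first few non-stop-word terms of a question (illustrative heuristic)."""
    words = [w.strip("?.,!\"'") for w in question.split()]
    key_terms = [w for w in words if w and w.lower() not in STOP_WORDS]
    return " ".join(key_terms[:MAX_QUERY_WORDS])


def run_search(question: str, backend: Callable[[str], List[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Build a query, call a search backend, and keep at most MAX_RESULTS results."""
    query = formulate_query(question)
    return backend(query)[:MAX_RESULTS]


def fake_backend(query: str) -> List[Dict[str, str]]:
    # Stand-in for a real search service; returns 8 dummy results.
    return [
        {"title": f"Result {i} for {query}", "url": f"https://example.com/{i}", "excerpt": "..."}
        for i in range(8)
    ]


results = run_search("How do I use the pyarrow engine in pandas.read_parquet?", fake_backend)
print(len(results))
```

Passing the backend in as a callable keeps the sketch independent of any particular search service, so the query and result limits can be checked without network access.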

What You See

When Sphinx performs a web search, you’ll see a message in the chat indicating the search, such as:
🔍 Searched the web for: “pandas read_parquet engine parameter”
The search results include:
Field | Description
Title | The page title from the search result
URL | Direct link to the source
Excerpt | A relevant snippet from the page (up to 250 characters)
Page Age | When the page was last updated (if available)
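
If you want to reason about these fields in your own code, one possible way to model a single result is shown below. The class name `SearchResult` and the attribute names are assumptions made for this sketch. This page does not document the actual structure Sphinx uses internally, only the four fields and the 250-character excerpt limit.

```python
from dataclasses import dataclass
from typing import Optional

MAX_EXCERPT_CHARS = 250  # excerpt limit stated on this page


@dataclass
class SearchResult:
    """Hypothetical container for one search result."""

    title: str
    url: str
    excerpt: str
    page_age: Optional[str] = None  # only set when the source reports it

    def __post_init__(self) -> None:
        # Enforce the excerpt cap so downstream code can rely on it.
        self.excerpt = self.excerpt[:MAX_EXCERPT_CHARS]


result = SearchResult(
    title="pandas.read_parquet",
    url="https://pandas.pydata.org/docs/reference/api/pandas.read_parquet.html",
    excerpt="x" * 400,  # deliberately longer than the cap
)
print(len(result.excerpt), result.page_age)
```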

Example Use Cases

Finding Current Documentation

Example prompt

“How do I use the new pyarrow engine in pandas.read_parquet?”
Sphinx searches for the latest pandas documentation to find the correct syntax and parameters, even if the feature was added after the model’s training data.

Discovering APIs and Datasets

Example prompt

“Where can I find a free stock market API for historical data?”
Sphinx searches for available APIs, comparing options and providing links to documentation.

Verifying Library Behavior

Example prompt

“What’s the current default behavior of scikit-learn’s train_test_split?”
Sphinx searches official documentation to confirm current defaults rather than relying on potentially outdated training data.

Limitations

  • Maximum 5 results per search query
  • Excerpts limited to 250 characters — Sphinx sees summaries, not full pages
  • Maximum 10 domains for include/exclude filters
  • 30-second timeout for search requests
Web search finds and locates information but doesn’t fetch full web pages. If you provide a specific URL and want Sphinx to read its contents, Sphinx will generate code using libraries like requests or BeautifulSoup to fetch and parse the page instead.
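
To give a sense of what fetch-and-parse code can look like, here is a small sketch. It deliberately uses only the Python standard library (`urllib.request` and `html.parser`) as a stand-in for requests and BeautifulSoup, so it runs without extra installs. The names `TextExtractor`, `html_to_text`, and `fetch_page_text` are hypothetical, and the code Sphinx actually generates may look different.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def fetch_page_text(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its visible text (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))


# Offline demonstration on a literal HTML string.
sample = (
    "<html><head><style>p{}</style></head>"
    "<body><h1>Docs</h1><p>Read me.</p><script>x=1</script></body></html>"
)
print(html_to_text(sample))
```

Keeping the parsing step separate from the download makes it easy to try the extraction on a saved HTML string before pointing `fetch_page_text` at a live URL.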

Best Practices

  • Ask specific questions. Instead of asking “how do I read files in pandas?”, ask “how do I read a parquet file with custom column selection in pandas?” More specific questions lead to better search queries and more relevant results.
  • Mention recent changes. If you know something has changed recently, say so: “What’s the new syntax for X in the latest version?” tells Sphinx that up-to-date information is critical.
  • Provide URLs you already have. If you have a link to documentation or a resource, include it in your message. Sphinx will generate code to fetch and parse that page rather than searching for it, which is faster and more accurate.

Privacy

Web searches are processed through Parallel AI’s search infrastructure. Search queries contain only the terms Sphinx determines are necessary to find relevant information — your notebook data and code are not included in search queries. We also have a full Zero Data Retention (ZDR) agreement with Parallel AI.