Limiting File Search Results

In this lesson, you'll learn how to limit the number of file search results the model (gpt-4o-mini in the example below) draws on from a vector store by setting max_num_results.

Prerequisites

Make sure you have:

  • A vector store already created and populated with files
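  • The vector_store_id for that store
  • An OpenAI API key in the OPENAI_API_KEY environment variable (OpenAI() reads it by default)
  • The OpenAI Python SDK installed: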
pip install openai

The Code

from openai import OpenAI

# Reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    input="What services does Nexovious provide?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_68328938caac819194fefb25d125d497"],
        # Pass at most one search result to the model
        "max_num_results": 1
    }]
)

# Extract only the assistant's textual reply
for item in response.output:
    if getattr(item, "type", None) == "message":
        for content in item.content:
            if getattr(content, "type", None) == "output_text":
                print(content.text)
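
The extraction loop above can be wrapped in a small helper so it is easy to reuse. This is a minimal sketch: the name extract_output_text is hypothetical and not part of the OpenAI SDK, and the function relies only on the type, content, and text attributes that the loop above already reads. The stand-in object at the end exists only so the helper can be tried without an API call.

```python
from types import SimpleNamespace


def extract_output_text(response):
    """Return the assistant's text parts from a response, joined by newlines."""
    parts = []
    for item in response.output:
        # Skip non-message items such as the file search tool call.
        if getattr(item, "type", None) != "message":
            continue
        for content in item.content:
            if getattr(content, "type", None) == "output_text":
                parts.append(content.text)
    return "\n".join(parts)


# Stand-in object shaped like a response, used only to try the helper offline.
fake_response = SimpleNamespace(output=[
    SimpleNamespace(type="file_search_call"),
    SimpleNamespace(type="message", content=[
        SimpleNamespace(type="output_text", text="Example answer."),
    ]),
])
print(extract_output_text(fake_response))
```

With a real response, the call would simply be print(extract_output_text(response)).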

Explanation

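  • max_num_results: Caps how many of the top-ranked file search results are passed to the model for grounding. Here it is 1, so only the best match is used.
  • A lower value typically reduces token usage and latency, but the answer may miss relevant material that ranked lower.
  • The same parsing logic is used to extract only the assistant’s reply text: it keeps message items from response.output and prints their output_text parts, skipping the file search call itself.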
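
To make the setting easier to vary, the tool entry can be built by a small function. The sketch below is an illustration, not part of the OpenAI SDK: build_file_search_tool is a hypothetical helper, and its range check (1 to 50 inclusive) is an assumption based on the limit stated in OpenAI's API reference at the time of writing.

```python
def build_file_search_tool(vector_store_ids, max_num_results=None):
    """Build a file_search tool entry for client.responses.create(tools=[...])."""
    if not vector_store_ids:
        raise ValueError("at least one vector store ID is required")
    tool = {"type": "file_search", "vector_store_ids": list(vector_store_ids)}
    if max_num_results is not None:
        # Assumed limit (1 to 50 inclusive); adjust if the API documents another range.
        if not 1 <= max_num_results <= 50:
            raise ValueError("max_num_results must be between 1 and 50")
        tool["max_num_results"] = max_num_results
    return tool


# Same configuration as the lesson's request, built through the helper.
tool = build_file_search_tool(
    ["vs_68328938caac819194fefb25d125d497"], max_num_results=1
)
print(tool)
```

The result can be passed as tools=[tool] in the request shown earlier. Leaving max_num_results unset omits the key, so the API's default applies.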

Use Case

This is particularly useful when:

  • You want only the top-ranked match to inform the answer
  • Reducing latency is important
  • You want concise answers that do not aggregate many sources
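
Whether a lower value actually reduces latency for your data is worth measuring. The following is a rough sketch of one way to compare settings: compare_latency and fake_ask are hypothetical names, and fake_ask only sleeps to stand in for a real request, so the numbers it prints say nothing about the API itself.

```python
import time


def compare_latency(ask, settings):
    """Time ask(max_num_results) for each setting; return {setting: seconds}."""
    timings = {}
    for n in settings:
        start = time.perf_counter()
        ask(n)  # the caller decides what a request looks like
        timings[n] = time.perf_counter() - start
    return timings


# Offline stand-in: sleeps longer for larger settings instead of calling the API.
def fake_ask(max_num_results):
    time.sleep(0.001 * max_num_results)
    return "answer"


timings = compare_latency(fake_ask, [1, 5, 10])
for n, seconds in timings.items():
    print(f"max_num_results={n}: {seconds:.4f}s")
```

In real use, ask would wrap the client.responses.create call from this lesson and pass its argument as max_num_results. Single measurements are noisy, so repeat each setting several times before drawing conclusions.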