Limiting File Search Results

In this lesson, you'll learn how to limit the number of file search results the model retrieves from a vector store by setting max_num_results.

Prerequisites

Make sure you have:

A vector store already created and populated with files
The vector_store_id for that store
The OpenAI Python SDK, installed with:

pip install openai

The Code

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    input="What services does Nexovious provide?",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_68328938caac819194fefb25d125d497"],
        "max_num_results": 1
    }]
)

# Extract only the assistant's textual reply
for item in response.output:
    if getattr(item, "type", None) == "message":
        for content in item.content:
            if getattr(content, "type", None) == "output_text":
                print(content.text)
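
The extraction loop above can be wrapped in a small reusable function. The following is a minimal sketch, not part of the SDK: extract_output_text is a hypothetical helper name, and fake_response is a stand-in object shaped like the response above so the logic can be checked without calling the API.

```python
from types import SimpleNamespace


def extract_output_text(response) -> str:
    """Join the text of every output_text part found in message items."""
    parts = []
    for item in getattr(response, "output", None) or []:
        if getattr(item, "type", None) != "message":
            continue  # skip non-message items such as the file search call
        for content in getattr(item, "content", None) or []:
            if getattr(content, "type", None) == "output_text":
                parts.append(content.text)
    return "\n".join(parts)


# Stand-in object shaped like the response above, for offline checks only.
fake_response = SimpleNamespace(output=[
    SimpleNamespace(type="file_search_call", id="fs_example"),
    SimpleNamespace(type="message", content=[
        SimpleNamespace(type="output_text", text="Example answer."),
    ]),
])

print(extract_output_text(fake_response))
```

With a real response you would call extract_output_text(response) instead. Recent versions of the SDK also appear to expose a response.output_text convenience property that serves the same purpose, so check your installed version before writing your own helper.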

Explanation

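max_num_results: Caps how many top-ranked search results are retrieved from the vector store and passed to the model for grounding. Here it is set to 1, so only the single best match is used.
A lower limit means less retrieved text for the model to process, which tends to reduce latency and token usage, at the risk of leaving out relevant context.
The loop at the end walks response.output and prints only the output_text parts of message items, skipping other items such as the file search call itself.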
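
If you set the limit in several places, it can help to build the tool entry in one function. This is a minimal sketch: build_file_search_tool is a hypothetical helper, the vector store ID is a placeholder, and the 1 to 50 range check is an assumption based on the API documentation at the time of writing, so verify it against the current reference.

```python
def build_file_search_tool(vector_store_id: str, max_num_results: int = 1) -> dict:
    """Return a file_search tool entry for the tools list (hypothetical helper)."""
    # Assumption: the API accepts values from 1 to 50; adjust if the docs differ.
    if isinstance(max_num_results, bool) or not isinstance(max_num_results, int):
        raise TypeError("max_num_results must be an integer")
    if not 1 <= max_num_results <= 50:
        raise ValueError("max_num_results must be between 1 and 50")
    return {
        "type": "file_search",
        "vector_store_ids": [vector_store_id],
        "max_num_results": max_num_results,
    }


# Placeholder ID; replace it with the ID of your own vector store.
tool = build_file_search_tool("vs_your_vector_store_id", max_num_results=1)
print(tool)
```

The returned dictionary has the same shape as the entry in the tools list above, so it can be passed as tools=[tool] in client.responses.create.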

Use Case

This is particularly useful when:

You want the answer grounded only in the best-matching content
Reducing latency is important
You want concise answers without aggregating too many sources
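
To confirm that the limit is doing what you expect, you can look at which results the search actually returned. The sketch below assumes the request was made with include=["file_search_call.results"], which, as far as the API documentation describes, populates a results list on the file_search_call output item. list_retrieved_results is a hypothetical helper, and the stand-in data is invented for illustration.

```python
from types import SimpleNamespace


def list_retrieved_results(response):
    """Return (filename, score) pairs from any file_search_call items."""
    found = []
    for item in getattr(response, "output", None) or []:
        if getattr(item, "type", None) != "file_search_call":
            continue
        # results may be None when the response was not asked to include them
        for result in getattr(item, "results", None) or []:
            found.append((result.filename, result.score))
    return found


# Offline stand-in; the file name and score are made-up illustrative values.
fake_response = SimpleNamespace(output=[
    SimpleNamespace(type="file_search_call", results=[
        SimpleNamespace(filename="services.pdf", score=0.91),
    ]),
    SimpleNamespace(type="message", content=[]),
])

retrieved = list_retrieved_results(fake_response)
print(retrieved)
```

With max_num_results set to 1, a real response should yield at most one pair here. Raising the limit and comparing the lists is a simple way to judge how much extra context each setting brings in.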
