Integrating OpenAI with Pinecone for Semantic Search

In this final lesson, you’ll learn how to combine OpenAI’s embedding models with Pinecone to build a working semantic search pipeline.

Prerequisites

Install the required packages (the script below assumes recent versions of both SDKs):

pip install openai pinecone-client

Ensure the following environment variables are set:

  • OPENAI_API_KEY
  • PINECONE_API_KEY
  • PINECONE_REGION
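
For example, in a POSIX shell you could export them before running the script (the values below are placeholders, not real keys):

export OPENAI_API_KEY="sk-..."
export PINECONE_API_KEY="your-pinecone-api-key"
export PINECONE_REGION="us-east-1"

If PINECONE_REGION is unset, the script falls back to us-east-1.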

The Code

import os
import time
import logging
from openai import OpenAI
from pinecone import Pinecone, ServerlessSpec

logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
logger = logging.getLogger(__name__)

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
PINECONE_REGION = os.getenv("PINECONE_REGION", "us-east-1")
INDEX_NAME = "quickstart-py"

if not OPENAI_API_KEY or not PINECONE_API_KEY:
    logger.error("Both OPENAI_API_KEY and PINECONE_API_KEY must be set")
    raise SystemExit(1)

openai_client = OpenAI(api_key=OPENAI_API_KEY)
pc = Pinecone(api_key=PINECONE_API_KEY)

# Create the index if it doesn't exist yet
existing = pc.list_indexes().names()
if INDEX_NAME not in existing:
    pc.create_index(
        name=INDEX_NAME,
        dimension=1536,  # text-embedding-ada-002 returns 1536-dimensional vectors
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region=PINECONE_REGION)
    )
    # A freshly created serverless index takes a moment to become queryable
    while not pc.describe_index(INDEX_NAME).status["ready"]:
        time.sleep(1)

index = pc.Index(INDEX_NAME)

# Embed sample documents
documents = {
    "doc1": "Pinecone is a vector database for similarity search.",
    "doc2": "OpenAI provides embedding models like text-embedding-ada-002.",
    "doc3": "You can combine Pinecone and OpenAI to build a semantic search tool."
}

embed_resp = openai_client.embeddings.create(
    model="text-embedding-ada-002",
    input=list(documents.values())
)

# The response preserves input order, so ids/texts pair up with their embeddings
vectors = [
    (doc_id, emb.embedding, {"text": text})
    for (doc_id, text), emb in zip(documents.items(), embed_resp.data)
]

index.upsert(vectors=vectors)

# Embed the query and search
query = "What can I use to build semantic search?"
query_embed = openai_client.embeddings.create(
    model="text-embedding-ada-002",
    input=[query]
).data[0].embedding

results = index.query(
    vector=query_embed,
    top_k=2,
    include_metadata=True
)

matches = results.matches
context = "\n".join(f"- {m.metadata['text']}" for m in matches)

# Ask GPT using the retrieved context
chat_input = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": f"Context:\n{context}\n\nQ: {query}"}
]

response = openai_client.responses.create(
    model="gpt-4o-mini",
    input=chat_input,
    stream=False
)

answer = response.output_text.strip()
print("\n=== Answer ===")
print(answer)

Explanation

  • Uses text-embedding-ada-002 to embed documents and queries.
  • Upserts the embeddings into a Pinecone index.
  • Retrieves top-k semantically similar documents.
  • Sends them to gpt-4o-mini as context for accurate, grounded answers.
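
For reuse, the retrieval and generation steps can be wrapped in a single function. Here is a minimal sketch, reusing the openai_client and index objects defined above (the function name answer_question and its top_k default are illustrative, not part of the original script):

def answer_question(question: str, top_k: int = 2) -> str:
    """Embed the question, retrieve similar documents, and ask the model."""
    # Embed the incoming question with the same model used for the documents
    q_vec = openai_client.embeddings.create(
        model="text-embedding-ada-002",
        input=[question]
    ).data[0].embedding

    # Retrieve the top-k most similar documents and build a context string
    hits = index.query(vector=q_vec, top_k=top_k, include_metadata=True)
    context = "\n".join(f"- {m.metadata['text']}" for m in hits.matches)

    # Ask the model, grounding it in the retrieved context
    resp = openai_client.responses.create(
        model="gpt-4o-mini",
        input=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": f"Context:\n{context}\n\nQ: {question}"}
        ]
    )
    return resp.output_text.strip()

print(answer_question("Which database stores the embeddings?"))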

Use Case

Great for:

  • Building question-answering systems on private datasets
  • Implementing AI-powered document search
  • Creating intelligent chat agents with memory