OpenAI
Let's load the OpenAIEmbeddings class.
Setup
First, install langchain-openai and set the required environment variables:
%pip install -qU langchain-openai
import getpass
import os
os.environ["OPENAI_API_KEY"] = getpass.getpass()
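The cell above prompts for the key every time it runs. A minimal sketch of a variant that prompts only when OPENAI_API_KEY is not already set follows; the helper name ensure_openai_api_key and its prompt parameter are hypothetical and are not part of langchain-openai.

```python
import getpass
import os


def ensure_openai_api_key(prompt=getpass.getpass):
    """Return OPENAI_API_KEY, prompting for it only if it is missing or empty.

    `prompt` is a hypothetical hook (defaults to getpass.getpass) so the
    function can be exercised without an interactive terminal.
    """
    if not os.environ.get("OPENAI_API_KEY"):
        os.environ["OPENAI_API_KEY"] = prompt("Enter your OpenAI API key: ")
    return os.environ["OPENAI_API_KEY"]
```

Calling ensure_openai_api_key() before constructing OpenAIEmbeddings leaves an existing key untouched, which can be convenient when re-running a notebook.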
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
text = "This is a test document."
Usage
Embed query
query_result = embeddings.embed_query(text)
Warning: model not found. Using cl100k_base encoding.
query_result[:5]
[-0.014380056377383358,
-0.027191711627651764,
-0.020042716111860304,
0.057301379620345545,
-0.022267658631828974]
Embed documents
doc_result = embeddings.embed_documents([text])
Warning: model not found. Using cl100k_base encoding.
doc_result[0][:5]
[-0.014380056377383358,
-0.027191711627651764,
-0.020042716111860304,
0.057301379620345545,
-0.022267658631828974]
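embed_query returns a list of floats and embed_documents returns one such list per input text, so the results can be compared directly. As an illustration of how these vectors are typically used, here is a minimal sketch of cosine similarity in plain Python; the cosine_similarity helper is an illustrative assumption, not a function provided by langchain-openai.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

With the variables from the cells above, cosine_similarity(query_result, doc_result[0]) would score the query against the first document; since both were computed from the same text here, the score should be close to 1.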
Specify dimensions
With the text-embedding-3 class of models, you can specify the size of the embeddings you want returned. For example, by default text-embedding-3-large returns embeddings of dimension 3072:
len(doc_result[0])
3072
But by passing in dimensions=1024, we can reduce the size of our embeddings to 1024:
embeddings_1024 = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=1024)
len(embeddings_1024.embed_documents([text])[0])
Warning: model not found. Using cl100k_base encoding.
1024
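If full-size vectors are already stored, OpenAI's embeddings guide describes shortening them by truncating and re-normalizing to unit length. The sketch below shows that idea in plain Python; the truncate_and_normalize helper is hypothetical, and whether its output exactly matches what the API returns for a given dimensions value is an assumption that should be checked against real responses.

```python
import math


def truncate_and_normalize(vector, dimensions):
    """Keep the first `dimensions` values and rescale them to unit L2 norm."""
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]
```

For example, truncate_and_normalize(doc_result[0], 1024) would produce a 1024-dimensional unit vector from the 3072-dimensional embedding computed earlier.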