langchain-google-genai
LangChain integration for Google Gemini models using the generative-ai SDK
This package enables seamless access to Google Gemini's chat, vision, embeddings, and retrieval-augmented generation (RAG) features within the LangChain ecosystem.
Overview
This package provides LangChain support for Google Gemini models (via the official Google Generative AI SDK). It supports:
- Text and vision-based chat models
- Embeddings for semantic search
- Multimodal inputs and outputs
- Retrieval-Augmented Generation (RAG)
- Thought tracing with reasoning tokens
Installation
pip install -U langchain-google-genai
Quickstart
Set up your environment variable with your Gemini API key:
export GOOGLE_API_KEY=your-api-key
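The key can also be supplied from Python at runtime; a minimal sketch using the standard library's getpass so the key isn't hard-coded:
import getpass
import os

# Prompt for the Gemini API key if it isn't already set in the environment
if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google API key: ")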
Then use the ChatGoogleGenerativeAI interface:
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-pro")
response = llm.invoke("Sing a ballad of LangChain.")
print(response.content)
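Because ChatGoogleGenerativeAI is a standard LangChain runnable, the usual streaming and batch methods also work; a minimal streaming sketch:
# Stream the reply chunk by chunk instead of waiting for the full message
for chunk in llm.stream("Sing a ballad of LangChain."):
    print(chunk.content, end="", flush=True)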
Chat Models
The main interface for Gemini chat models is ChatGoogleGenerativeAI.
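A minimal conversation sketch; the temperature and max_output_tokens settings shown here are optional tuning parameters added for illustration:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-pro",
    temperature=0.7,        # sampling randomness
    max_output_tokens=256,  # cap the length of the reply
)

messages = [
    SystemMessage(content="You are a helpful assistant that answers concisely."),
    HumanMessage(content="What is LangChain?"),
]
print(llm.invoke(messages).content)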
Multimodal Inputs
Gemini vision models accept image inputs alongside text within a single message.
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-pro-vision")
message = HumanMessage(
    content=[
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": "https://picsum.photos/seed/picsum/200/300"},
    ]
)
response = llm.invoke([message])
print(response.content)
The image_url value can be:
- A public image URL
- A Google Cloud Storage path (gcs://...)
- A base64-encoded image (e.g., data:image/png;base64,...)
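For instance, a local file can be sent as a base64 data URL using only the standard library (a sketch; cat.png is a placeholder path, and llm is the vision model from the example above):
import base64

# Encode a local image file as a data URL
with open("cat.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

message = HumanMessage(
    content=[
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": f"data:image/png;base64,{image_b64}"},
    ]
)
response = llm.invoke([message])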
Multimodal Outputs
The Gemini 2.0 Flash Experimental model supports both text and inline image outputs.
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="models/gemini-2.0-flash-exp-image-generation")
response = llm.invoke(
    "Generate an image of a cat and say meow",
    generation_config=dict(response_modalities=["TEXT", "IMAGE"]),
)
# The first content item is the generated image as a data URL; keep only the base64 payload
image_base64 = response.content[0].get("image_url").get("url").split(",")[-1]
# The second content item is the accompanying text
meow_text = response.content[1]
print(meow_text)
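The extracted payload is plain base64, so the image can be saved with the standard library:
import base64

# Decode the base64 payload from above and write it to disk
with open("cat.png", "wb") as f:
    f.write(base64.b64decode(image_base64))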
Audio Output
Gemini TTS preview models can return spoken audio:
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="models/gemini-2.5-flash-preview-tts")
response = llm.invoke(
    "Please say The quick brown fox jumps over the lazy dog",
    generation_config=dict(response_modalities=["AUDIO"]),
)
# Binary audio data returned by the model
wav_data = response.additional_kwargs.get("audio")
with open("output.wav", "wb") as f:
    f.write(wav_data)
Multimodal Outputs in Chains
You can use Gemini models in a LangChain chain:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI, Modality
llm = ChatGoogleGenerativeAI(
    model="models/gemini-2.0-flash-exp-image-generation",
    response_modalities=[Modality.TEXT, Modality.IMAGE],
)
prompt = ChatPromptTemplate.from_messages([
    ("human", "Generate an image of {animal} and tell me the sound it makes.")
])
chain = {"animal": RunnablePassthrough()} | prompt | llm
response = chain.invoke("cat")
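The chain returns the same mixed content list as a direct invoke, so the parts can be separated afterwards (a sketch assuming the content layout shown in the previous section):
# Separate image and text parts of the mixed response
for part in response.content:
    if isinstance(part, dict) and part.get("type") == "image_url":
        image_data_url = part["image_url"]["url"]
    else:
        print(part)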
Thinking Support
Gemini 2.5 Flash Preview supports internal reasoning ("thoughts"); the thinking_budget parameter caps the number of tokens the model may spend on them.
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(
    model="models/gemini-2.5-flash-preview-04-17",
    thinking_budget=1024,
)
response = llm.invoke("How many O's are in Google? How did you verify your answer?")
reasoning_tokens = response.usage_metadata["output_token_details"]["reasoning"]
print("Response:", response.content)
print("Reasoning tokens used:", reasoning_tokens)
Embeddings
You can use Gemini embeddings in LangChain:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
embeddings = GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-001")
vector = embeddings.embed_query("hello, world!")
print(vector)
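embed_documents embeds several texts in one call, which is enough for simple semantic search; a sketch with a plain-Python cosine score:
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

docs = ["LangChain integrates Gemini models.", "Cats say meow."]
doc_vectors = embeddings.embed_documents(docs)
query_vector = embeddings.embed_query("Which package integrates Gemini?")

for doc, vec in zip(docs, doc_vectors):
    print(f"{cosine(query_vector, vec):.3f}  {doc}")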
Semantic Retrieval (RAG)
Use Gemini with RAG to retrieve relevant documents from your knowledge base.
from langchain_google_genai.vectorstores import GoogleVectorStore
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.document_loaders import DirectoryLoader
# Create a corpus (collection of documents)
corpus_store = GoogleVectorStore.create_corpus(display_name="My Corpus")
# Create a document under that corpus
document_store = GoogleVectorStore.create_document(
    corpus_id=corpus_store.corpus_id, display_name="My Document"
)
# Load and upload documents
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=0)
for file in DirectoryLoader(path="data/").load():
    chunks = text_splitter.split_documents([file])
    document_store.add_documents(chunks)
# Query the document corpus
aqa = corpus_store.as_aqa()
response = aqa.invoke("What is the meaning of life?")
print("Answer:", response.answer)
print("Passages:", response.attributed_passages)
print("Answerable probability:", response.answerable_probability)
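Like other LangChain vector stores, the corpus can also be exposed as a retriever for use in arbitrary chains (a sketch; the k value is illustrative):
# Wrap the corpus as a standard LangChain retriever
retriever = corpus_store.as_retriever(search_kwargs={"k": 3})
for doc in retriever.invoke("What is the meaning of life?"):
    print(doc.page_content)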