NvidiaChatGenerator
This Generator enables chat completion using NVIDIA-hosted models.
| | |
| --- | --- |
| Most common position in a pipeline | After a ChatPromptBuilder |
| Mandatory init variables | api_key: API key for the NVIDIA NIM. Can be set with the NVIDIA_API_KEY env var. |
| Mandatory run variables | messages: A list of ChatMessage objects |
| Output variables | replies: A list of ChatMessage objects |
| API reference | NVIDIA API |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/nvidia |
Overview
NvidiaChatGenerator enables chat completions using NVIDIA generative models via the NVIDIA API. It is compatible with the ChatMessage format for both input and output, ensuring seamless integration in chat-based pipelines.
You can use LLMs self-hosted with NVIDIA NIM or models hosted on the NVIDIA API Catalog. The default model for this component is meta/llama-3.1-8b-instruct.
To use this integration, you must have an NVIDIA API key. You can provide it with the NVIDIA_API_KEY environment variable or by using a Secret.
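For example, on Linux or macOS you can export the environment variable in your shell before starting your application (the key value shown is a placeholder):

```shell
export NVIDIA_API_KEY="<your NVIDIA API key>"
```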
Tool support
NvidiaChatGenerator supports function calling through the tools parameter, which accepts flexible tool configurations:
- A list of Tool objects: Pass individual tools as a list
- A single Toolset: Pass an entire Toolset directly
- Mixed Tools and Toolsets: Combine multiple Toolsets with standalone tools in a single list
This allows you to organize related tools into logical groups while also including standalone tools as needed.
```python
from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

# Create individual tools
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)

# Group related tools into a Toolset (add_tool, subtract_tool, and
# multiply_tool are Tool objects defined elsewhere)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])

# Pass a mix of Toolset and Tool objects to the generator
generator = NvidiaChatGenerator(
    tools=[math_toolset, weather_tool, news_tool]
)
```
For more details on working with tools, refer to the Tool and Toolset documentation.
Streaming
This generator supports streaming responses from the LLM. To enable streaming, pass a callable to the streaming_callback parameter during initialization.
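As a minimal sketch, a streaming callback is just a callable that receives each chunk as it arrives; the `print_chunk` name below is illustrative, and each chunk exposes its incremental text on `.content` (as `haystack.dataclasses.StreamingChunk` does). Haystack also ships a ready-made `print_streaming_chunk` helper in `haystack.components.generators.utils`.

```python
# A minimal streaming callback: Haystack invokes it once per streamed chunk,
# and we echo each chunk's incremental text without a trailing newline.
def print_chunk(chunk) -> None:
    print(chunk.content, end="", flush=True)

# Hypothetical usage (requires NVIDIA_API_KEY to be set):
# generator = NvidiaChatGenerator(streaming_callback=print_chunk)
```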
Usage
To start using NvidiaChatGenerator, install the nvidia-haystack package.
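The package is available on PyPI and can be installed with pip:

```shell
pip install nvidia-haystack
```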
You can use NvidiaChatGenerator with all the LLMs available in the NVIDIA API Catalog or with a model deployed using NVIDIA NIM. For more information, refer to the NVIDIA NIM for LLMs Playbook.
On its own
To use LLMs from the NVIDIA API Catalog, specify the api_url if needed (the default is https://integrate.api.nvidia.com/v1) and your API key. You can get your API key from the NVIDIA API Catalog.
```python
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

generator = NvidiaChatGenerator(
    model="meta/llama-3.1-8b-instruct",
    api_key=Secret.from_env_var("NVIDIA_API_KEY"),
)

messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
result = generator.run(messages=messages)
print(result["replies"])
print(result["replies"][0].meta)
```
With multimodal inputs:
```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

llm = NvidiaChatGenerator(
    model="meta/llama-3.2-11b-vision-instruct",
    api_key=Secret.from_env_var("NVIDIA_API_KEY"),
)

image = ImageContent.from_file_path("apple.jpg")
user_message = ChatMessage.from_user(content_parts=[
    "What does the image show? Max 5 words.",
    image,
])

response = llm.run([user_message])["replies"][0].text
print(response)
# Red apple on straw.
```
In a pipeline
```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

pipe = Pipeline()
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component(
    "llm",
    NvidiaChatGenerator(
        model="meta/llama-3.1-8b-instruct",
        api_key=Secret.from_env_var("NVIDIA_API_KEY"),
    ),
)
pipe.connect("prompt_builder", "llm")

country = "Germany"
system_message = ChatMessage.from_system(
    "You are an assistant giving out valuable information to language learners."
)
messages = [system_message, ChatMessage.from_user("What's the official language of {{ country }}?")]

res = pipe.run(
    data={
        "prompt_builder": {
            "template_variables": {"country": country},
            "template": messages,
        }
    }
)
print(res)
```