NvidiaChatGenerator
This Generator enables chat completion using NVIDIA-hosted models.
| | |
| --- | --- |
| Most common position in a pipeline | After a ChatPromptBuilder |
| Mandatory init variables | api_key: API key for the NVIDIA NIM. Can be set with the NVIDIA_API_KEY env var. |
| Mandatory run variables | messages: A list of ChatMessage objects |
| Output variables | replies: A list of ChatMessage objects |
| API reference | NVIDIA API |
| GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/nvidia |
Overview
NvidiaChatGenerator enables chat completions using NVIDIA generative models via the NVIDIA API. It is compatible with the ChatMessage format for both input and output, ensuring seamless integration in chat-based pipelines.
You can use LLMs self-hosted with NVIDIA NIM or models hosted on the NVIDIA API Catalog. The default model for this component is meta/llama-3.1-8b-instruct.
To use this integration, you must have an NVIDIA API key. You can provide it with the NVIDIA_API_KEY environment variable or by using a Secret.
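For example, on Linux or macOS you can export the environment variable in your shell before starting your application (the key value shown is a placeholder):

```shell
export NVIDIA_API_KEY="<your NVIDIA API key>"
```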
Tool support
NvidiaChatGenerator supports function calling through the tools parameter, which accepts flexible tool configurations:
- A list of Tool objects: Pass individual tools as a list
- A single Toolset: Pass an entire Toolset directly
- Mixed Tools and Toolsets: Combine multiple Toolsets with standalone tools in a single list
This allows you to organize related tools into logical groups while also including standalone tools as needed.
```python
from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

# Create individual tools
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)

# Group related tools into a Toolset (add_tool, subtract_tool, and
# multiply_tool are Tool objects defined elsewhere)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])

# Pass a mix of Toolset and Tool objects to the generator
generator = NvidiaChatGenerator(
    tools=[math_toolset, weather_tool, news_tool]
)
```
For more details on working with tools, refer to the Tool and Toolset documentation.
Streaming
This generator supports streaming responses from the LLM. To enable streaming, pass a callable to the streaming_callback parameter during initialization.
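As a minimal sketch, a streaming callback is just a callable that receives each chunk as it arrives; the `print_chunk` name below is illustrative, and each chunk exposes its incremental text on `.content` (as `haystack.dataclasses.StreamingChunk` does). Haystack also ships a ready-made `print_streaming_chunk` helper in `haystack.components.generators.utils`.

```python
# A minimal streaming callback: Haystack invokes it once per streamed chunk,
# and we echo each chunk's incremental text without a trailing newline.
def print_chunk(chunk) -> None:
    print(chunk.content, end="", flush=True)

# Hypothetical usage (requires NVIDIA_API_KEY to be set):
# generator = NvidiaChatGenerator(streaming_callback=print_chunk)
```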
Usage
To start using NvidiaChatGenerator, install the nvidia-haystack package.
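The package is available on PyPI and can be installed with pip:

```shell
pip install nvidia-haystack
```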
You can use NvidiaChatGenerator with all the LLMs available in the NVIDIA API Catalog or with a model deployed using NVIDIA NIM. For more information, refer to the NVIDIA NIM for LLMs Playbook.
On its own
To use LLMs from the NVIDIA API Catalog, specify the api_url if needed (the default is https://integrate.api.nvidia.com/v1) and your API key. You can get your API key from the NVIDIA API Catalog.
```python
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

generator = NvidiaChatGenerator(
    model="meta/llama-3.1-8b-instruct",
    api_key=Secret.from_env_var("NVIDIA_API_KEY"),
)

messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
result = generator.run(messages=messages)
print(result["replies"])
print(result["replies"][0].meta)
```
With multimodal inputs:
```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

llm = NvidiaChatGenerator(
    model="meta/llama-3.2-11b-vision-instruct",
    api_key=Secret.from_env_var("NVIDIA_API_KEY"),
)

image = ImageContent.from_file_path("apple.jpg")
user_message = ChatMessage.from_user(content_parts=[
    "What does the image show? Max 5 words.",
    image,
])

response = llm.run([user_message])["replies"][0].text
print(response)
# Red apple on straw.
```
In a pipeline
```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

pipe = Pipeline()
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component(
    "llm",
    NvidiaChatGenerator(
        model="meta/llama-3.1-8b-instruct",
        api_key=Secret.from_env_var("NVIDIA_API_KEY"),
    ),
)
pipe.connect("prompt_builder", "llm")

country = "Germany"
system_message = ChatMessage.from_system(
    "You are an assistant giving out valuable information to language learners."
)
messages = [system_message, ChatMessage.from_user("What's the official language of {{ country }}?")]

res = pipe.run(
    data={
        "prompt_builder": {
            "template_variables": {"country": country},
            "template": messages,
        }
    }
)
print(res)
```