🍃 Overview of the MongoDB integration

MongoDB integrates operational databases and vector search into a single, unified, and fully managed platform. This allows users to easily build semantic search and AI-powered applications via popular frameworks, leveraging a MongoDB native interface and large language models (LLMs).

The Gradient and MongoDB integration enables users to implement retrieval-augmented generation (RAG) on top of their private custom models in the Gradient platform while using MongoDB as their backend storage.

Gradient as your LLM and embeddings provider

Since Gradient provides simple web APIs, you can easily fine-tune models and generate completions and embeddings all from one platform. To get started, you’ll have to create an account on Gradient to generate an access token.

You will also need your workspace ID that is automatically created when you first sign up.

Gradient provides each user with $5 of workspace credit, so you can get started on the platform immediately. From there, you can easily create LLM and embedding modules using LlamaIndex to make requests to the Gradient model and embeddings APIs.

import os
from llama_index.llms import GradientBaseModelLLM
from llama_index.embeddings import GradientEmbedding

DEFAULT_LLM_MODEL_SLUG = "nous-hermes2"
DEFAULT_EMBEDDING_MODEL_SLUG = "bge-large"

llm = GradientBaseModelLLM(
    base_model_slug=DEFAULT_LLM_MODEL_SLUG,
    access_token=os.environ["GRADIENT_ACCESS_TOKEN"],
    workspace_id=os.environ["GRADIENT_WORKSPACE_ID"],
)
embed_model = GradientEmbedding(
    gradient_model_slug=DEFAULT_EMBEDDING_MODEL_SLUG,
    gradient_access_token=os.environ["GRADIENT_ACCESS_TOKEN"],
    gradient_workspace_id=os.environ["GRADIENT_WORKSPACE_ID"],
)

Atlas as a storage backend for RAG with LlamaIndex

Atlas Vector Search is a comprehensive managed solution that streamlines the indexing of high-dimensional vector data in MongoDB. It enables quick and efficient vector similarity searches, reducing operational overhead and simplifying the AI lifecycle with a single data model. This service allows you to either utilize MongoDB Atlas as a standalone vector database for a fresh project or enhance your current MongoDB Atlas collections with vector search capabilities.

To get started, create an Atlas account and set up a cluster.

You’ll first have to create a vector search index in the Atlas UI before you can use the collection to store embeddings and document chunks. Since we’ll be using the bge-large-v1.5 embeddings model, we’ll also need to specify our embedding dimension (1024) in the index definition.
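For reference, an index definition along these lines should work. This is a sketch: it assumes the default `embedding` field name used by LlamaIndex’s MongoDBAtlasVectorSearch and cosine similarity; adjust both to match your setup.

```json
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "type": "knnVector",
        "dimensions": 1024,
        "similarity": "cosine"
      }
    }
  }
}
```

The `dimensions` value must match the output size of your embeddings model, so if you swap bge-large for another model, update it accordingly.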

Once you’ve completed that step, create a pymongo client and pass it to LlamaIndex’s MongoDBAtlasVectorSearch, which you can then use as the backend storage for a VectorStoreIndex.

Now you can run RAG-augmented completions against Gradient models directly from LlamaIndex.

import os
import pymongo
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.storage.storage_context import StorageContext
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

# Get your MongoDB connection string from the Atlas UI
mongodb_client = pymongo.MongoClient(os.environ["MONGO_URI"])
store = MongoDBAtlasVectorSearch(mongodb_client)

# Use the Gradient LLM and embedding model configured above
service_context = ServiceContext.from_defaults(
    chunk_size=1024,
    llm=llm,
    embed_model=embed_model,
)
storage_context = StorageContext.from_defaults(vector_store=store)

# Load documents, embed them, and store the chunks in Atlas
documents = SimpleDirectoryReader("../australian_animals/data").load_data()
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context,
)

query_engine = index.as_query_engine()
response = query_engine.query("How many thumbs do koalas have?")
print(response)

Example

You can see a more detailed example of using MongoDB with Gradient here.