Embeddings Quickstart
Overview
Use the Gradient Embeddings API to generate embeddings that you can use to build an extended knowledge base for your LLM.
Set up your environment
If you have not used the Gradient API before, please follow the steps in CLI Quickstart or SDK Quickstart to set up your environment.
Prepare your dataset
Your data should be in a JSONL file, where each line has the format:
{ "input": "<your-string>" }
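As a minimal sketch of producing a file in this format, the helpers below write one {"input": ...} object per line and read the file back to check it. The names write_jsonl and read_jsonl are hypothetical and are not part of the Gradient CLI or SDKs; only the Python standard library is used.

```python
import json
from pathlib import Path


def write_jsonl(texts, path):
    """Write one {"input": ...} JSON object per line; return the number of lines written."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as f:
        for text in texts:
            # json.dumps escapes quotes and newlines, so each record stays on one line.
            f.write(json.dumps({"input": text}, ensure_ascii=False) + "\n")
            count += 1
    return count


def read_jsonl(path):
    """Read the file back and check that every line is an object with a string "input" field."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if not isinstance(record, dict) or not isinstance(record.get("input"), str):
                raise ValueError(f"line {line_number}: expected an object with a string 'input' field")
            records.append(record)
    return records
```

Serializing with json.dumps rather than string formatting avoids broken lines when an input contains quotes or line breaks.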
Generate embeddings with the CLI
To list the available embeddings models, run:
$ gradient embeddings list
View the available embeddings models here.
To generate embeddings for your data, select an embeddings model and run gradient embeddings generate <model-slug> <json-filepath>:
$ gradient embeddings generate bge-large ~/sample.jsonl
Generate embeddings with the SDKs
Here is an example of how to generate embeddings using the Python SDK:
from dotenv import load_dotenv

# Load environment variables (for example, API credentials) from a local .env file.
load_dotenv()

from gradientai import Gradient


def main() -> None:
    gradient = Gradient()

    # Look up the embeddings model by its slug.
    embeddings_model = gradient.get_embeddings_model(slug="bge-large")

    generate_embeddings_response = embeddings_model.generate_embeddings(
        inputs=[
            "Multimodal brain MRI is the preferred method to evaluate for acute ischemic infarct and ideally should be obtained within 24 hours of symptom onset, and in most centers will follow a NCCT",
            "CTA has a higher sensitivity and positive predictive value than magnetic resonance angiography (MRA) for detection of intracranial stenosis and occlusion and is recommended over time-of-flight (without contrast) MRA",
            "Echocardiographic strain imaging has the advantage of detecting early cardiac involvement, even before thickened walls or symptoms are apparent",
        ],
    )

    # Each item in the response holds the embedding vector for one input.
    for embedding in generate_embeddings_response.embeddings:
        print(f"generated embedding: {embedding.embedding}")

    gradient.close()


if __name__ == "__main__":
    main()
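Once you have embeddings, a common way to use them in a knowledge base is to compare vectors by cosine similarity. The sketch below is illustrative only: it assumes each embedding is a plain list of floats, and the helper names cosine_similarity and rank_by_similarity are hypothetical, not part of the Gradient SDK.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
```

For example, you could embed a question with the same model, then call rank_by_similarity with the question vector and the document vectors to find the closest passages. For large collections, a vector database or a numerical library would be a better fit than this pure-Python loop.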
Here is an example of how to generate embeddings using the TypeScript SDK:
import { Gradient } from "@gradientai/nodejs-sdk";

const main = async () => {
  const gradient = new Gradient({});

  const embeddingsModel = await gradient.getEmbeddingsModel({
    slug: "bge-large",
  });

  const { embeddings } = await embeddingsModel.generateEmbeddings({
    inputs: [
      "Multimodal brain MRI is the preferred method to evaluate for acute ischemic infarct and ideally should be obtained within 24 hours of symptom onset, and in most centers will follow a NCCT",
      "CTA has a higher sensitivity and positive predictive value than magnetic resonance angiography (MRA) for detection of intracranial stenosis and occlusion and is recommended over time-of-flight (without contrast) MRA",
      "Echocardiographic strain imaging has the advantage of detecting early cardiac involvement, even before thickened walls or symptoms are apparent",
    ],
  });

  for (const { embedding } of embeddings) {
    console.log(`created embedding: ${JSON.stringify(embedding, null, 2)}`);
  }
};

main()
  .catch((e) => {
    console.error(e);
    process.exit(1);
  })
  .finally(() => process.exit());