Entity Extraction

Overview

The Gradient Accelerator Block for entity extraction transforms unstructured text into structured JSON objects.

Typical business uses include extracting clinical codes from medical notes, managing KYC compliance, pulling key information out of legal documents, and automating data entry.

With this Accelerator Block, users can automate labor-intensive tasks and focus on more strategic work.

Create an account and workspace

If you haven't already, go to gradient.ai, sign up, and create your account. Once you have verified your email, log in, click "Create New Workspace", and give the workspace a name.

You can see your workspaces at any time by going to https://auth.gradient.ai/select-workspace.

Use the Accelerator Blocks playground

You can try out the Gradient Accelerator Block for entity extraction in the playground UI.

  1. Log in to Gradient and select the workspace you want to use. Select the “Accelerator Blocks” tab from the left sidebar.

  2. Navigate to the “Entity Extraction” block.

  3. Insert the text or excerpt into the input.

  4. Identify the entities that you want to extract from the text. Give each entity a descriptive but concise name so that the LLM can correctly identify what to extract.

    1. You can set the type of the entity, which is strictly defined.
    2. You can mark an entity as “required” (meaning a value will always be returned for it even if there is no match found) or “not required” (meaning a value will not be returned if a match is not found).
  5. Click “Submit”. The extracted fields are displayed below.

Use the SDK

The SDK is best suited for integrating entity extraction into your application or business workflow.

Follow our SDK Quickstart to get set up. Then pass the document you want to extract from, together with the schema of entities to extract:

from gradientai import ExtractParamsSchemaValueType, Gradient

# Create the client; credentials are configured as described in the SDK Quickstart.
gradient = Gradient()

# Unstructured text to extract entities from.
document = (
    "When Apple released the Apple Watch in 2015, it was business as "
    "usual for a company whose iPhone updates had become cultural "
    "touchstones. Before the watch went on sale, Apple gave early "
    "versions of it to celebrities like Beyoncé, featured it in fashion "
    "publications like Vogue and streamed a splashy event on the "
    "internet trumpeting its features."
)
# Entities to extract, keyed by entity name. Only "company" is marked as required.
schema_ = {
    "company": {
        "type": ExtractParamsSchemaValueType.STRING,
        "required": True,
    },
    "product": {
        "type": ExtractParamsSchemaValueType.STRING,
    },
    "magazine": {
        "type": ExtractParamsSchemaValueType.STRING,
    },
    "year": {
        "type": ExtractParamsSchemaValueType.NUMBER,
    },
}

# Run the extraction on the document using the schema above.
result = gradient.extract(
    document=document,
    schema_=schema_,
)
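
The "required" and "not required" behavior described for the playground can also be checked in your own code after an extraction. The sketch below is a hypothetical helper, not part of the Gradient SDK: check_extraction and PYTHON_TYPES are illustrative names, plain strings stand in for ExtractParamsSchemaValueType so the example runs without the SDK installed, and it assumes the extracted fields are available as a plain dictionary. The example values are hand-written, not real output from the service.

```python
from typing import Any, Dict, List

# Stand-ins for the SDK's schema value types (an assumption made for this
# sketch so that it runs without the SDK installed).
PYTHON_TYPES = {
    "STRING": str,
    "NUMBER": (int, float),
}


def check_extraction(
    schema: Dict[str, Dict[str, Any]], fields: Dict[str, Any]
) -> List[str]:
    """Return a list of problems found when comparing fields to the schema."""
    problems = []
    for name, spec in schema.items():
        if name not in fields or fields[name] is None:
            # Only required entities are expected to always carry a value.
            if spec.get("required", False):
                problems.append(f"missing required entity: {name}")
            continue
        expected = PYTHON_TYPES[spec["type"]]
        value = fields[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


schema = {
    "company": {"type": "STRING", "required": True},
    "product": {"type": "STRING"},
    "magazine": {"type": "STRING"},
    "year": {"type": "NUMBER"},
}

# Hand-written example values, not real output from the service.
fields = {"company": "Apple", "product": "Apple Watch", "year": 2015}

print(check_extraction(schema, fields))  # prints []
```

A check like this is useful when extracted fields feed a downstream system such as a data-entry workflow, where a missing required value or an unexpected type should be caught before the record is saved.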