Audio Transcription

Overview

Audio files make up a large portion of the data companies produce, from content creation to call recordings to meetings. Unfortunately, audio is difficult to work with directly; a text transcript is much easier. With Gradient’s audio transcription API, you can transcribe audio files into text for use with your LLMs.

Create an account and workspace

If you haven't already, go to gradient.ai, click sign up, and create your account. Once you have verified your email, log in to the account, click "Create New Workspace", and give the workspace a name.

You can see your workspaces at any time by going to https://auth.gradient.ai/select-workspace.

Upload and transcribe your audio file

You can use either the web UI or the SDK to transcribe your audio files. Gradient currently supports m4a, mp3, and mp4 audio files.
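
Because only a few formats are accepted, it can help to check a file's extension before submitting it. The following is a minimal sketch, not part of the Gradient SDK: the names SUPPORTED_EXTENSIONS and is_supported_audio are hypothetical, and the check looks only at the file name, not at the actual audio encoding.

```python
from pathlib import Path

# Formats this guide lists as currently supported (assumption: matched by extension only).
SUPPORTED_EXTENSIONS = {".m4a", ".mp3", ".mp4"}


def is_supported_audio(filepath: str) -> bool:
    """Return True if the file name ends with a supported extension (case-insensitive)."""
    return Path(filepath).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["audio.m4a", "CALL.MP3", "notes.wav"]:
        print(name, is_supported_audio(name))
```

A file that fails this check, such as a wav recording, would need to be converted to one of the supported formats first.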

Using the Gradient UI

The web UI turns an uploaded audio file into a full text transcript in four steps.

  1. Log in to Gradient and select the workspace you want to use. Then select the “Accelerator Blocks” tab from the left sidebar.

  2. Navigate to the “Audio Transcriptions” block.

  3. Upload the audio file by dragging it in or using the "Choose File" button.

  4. Click “Submit”. The transcription will be ready in a few moments.


Using the SDK

The SDK is best suited for transcribing audio within your application or business workflow.

See our SDK Quickstart to get set up. From there, use the code snippet below to transcribe an audio file (audio.m4a in the example) and save the result as text (audio.txt in the example):

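from gradientai import Gradient

# Assumes credentials are already configured as described in the SDK Quickstart.
gradient = Gradient()

# Path to the audio file to transcribe (m4a, mp3, or mp4).
filepath = "audio.m4a"
result = gradient.transcribe_audio(filepath=filepath)

# The transcript is returned under the "text" key.
text_of_audio = result["text"]
print(text_of_audio)

# Save the transcript to a text file.
with open("audio.txt", "w", encoding="utf-8") as file:
    file.write(text_of_audio)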
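
For a business workflow with many recordings, the single-file snippet above could be wrapped in a loop. The sketch below is one possible approach under stated assumptions: transcribe_directory is a hypothetical helper, not an SDK function, and it takes the transcription call as a parameter so that it only relies on the behavior shown above (a callable that accepts filepath and returns a mapping with a "text" key).

```python
from pathlib import Path
from typing import Any, Callable, Dict, List

# Formats this guide lists as currently supported.
SUPPORTED_EXTENSIONS = {".m4a", ".mp3", ".mp4"}


def transcribe_directory(
    directory: str,
    transcribe: Callable[..., Dict[str, Any]],
) -> List[Path]:
    """Transcribe every supported audio file in `directory`.

    `transcribe` is expected to behave like gradient.transcribe_audio in the
    snippet above: called as transcribe(filepath=...), returning {"text": ...}.
    Each transcript is written next to its audio file with a .txt extension.
    """
    written = []
    # sorted() builds the full list first, so the new .txt files are not revisited.
    for audio_path in sorted(Path(directory).iterdir()):
        if not audio_path.is_file():
            continue
        if audio_path.suffix.lower() not in SUPPORTED_EXTENSIONS:
            continue  # skip unsupported formats such as .wav
        result = transcribe(filepath=str(audio_path))
        text_path = audio_path.with_suffix(".txt")
        text_path.write_text(result["text"], encoding="utf-8")
        written.append(text_path)
    return written


# Hypothetical usage with the SDK client from the snippet above:
#   gradient = Gradient()
#   transcribe_directory("recordings", gradient.transcribe_audio)
```

Passing the transcription function in, rather than creating the client inside the helper, keeps the loop easy to test with a stub and leaves credential setup to the Quickstart.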