Transcription
Note: AudioShake is currently migrating customers to the new Tasks API.
The legacy documentation below is provided for reference only.
Given an audio file, AudioShake will transcribe the audio into readable text. Transcription can also be combined with alignment in a single call; see the instructions for that combined task.
Upload a File
You can make a POST request to the /upload/audio-link endpoint. When using this endpoint, send a JSON payload that includes a link to the audio file. Alternatively, you can use the more general /upload endpoint to upload audio files directly. Both endpoints return a JSON payload containing an asset object with an "id".
Further documentation on uploading files.
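As an illustration, here is a minimal sketch of a link-based upload request. The exact request body is defined in the upload documentation; the "link" field used here is an assumption, not confirmed by this page:

# Upload an audio file by link (the "link" field name is an assumption; see the upload docs).
curl -X 'POST' \
  'https://groovy.audioshake.ai/upload/audio-link' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer EXAMPLE' \
  -H 'Content-Type: application/json' \
  -d '{
    "link": "https://example.com/audio/my-file.mp3"
  }'
# The response contains an asset object; keep its "id" for the job request below.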
Create a Job
Once you have the asset ID, you can create a job request to generate a transcription using the /job endpoint. The endpoint accepts a JSON request with the asset ID, an optional callback URL, and a metadata object containing a format and name that map to the desired output. In the following example, we'll explicitly use transcription:
curl -X 'POST' \
  'https://groovy.audioshake.ai/job' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer EXAMPLE' \
  -H 'Content-Type: application/json' \
  -d '{
    "metadata": {
      "format": "json",
      "name": "transcription",
      "language": "en"
    },
    "callbackUrl": "https://example.com/webhook/transcription",
    "assetId": "abc123"
  }'
| Input | Description |
| --- | --- |
| format | Format of the output file |
| name | The data model for the job; always "transcription" for the purposes of this page |
| language | The language of the input audio; see Supported Languages for the full list |
| callbackUrl | An optional webhook URL where updates about the job status will be sent |
| assetId | The ID of the asset obtained by uploading the file |
Upon successful initiation, the server replies with an HTTP status code of 200. The response includes a 'job' object containing details about the job that has been started. You can use the provided job ID in two ways:
- Wait for a callback from AudioShake, which will be sent when the job completes or its status changes.
- Periodically poll the status of the job using the job ID.
{
  job: {
    id: "clmgnr10z001oeumub3a9hifc",
    clientId: "<YOUR CLIENT ID>",
    requestId: "clmgnr10z001oeumub3a9hifc",
    metadata: {
      format: "json",
      name: "transcription",
      language: "en"
    },
    callbackUrl: "<OPTIONAL- YOUR CALLBACK URL>",
    status: "created",
  },
}
| Field | Description |
| --- | --- |
| id | The unique ID assigned to the job; use it to retrieve information about the job later |
| clientId | Your client ID |
| requestId | The ID of the API request |
| format | The given format of the output file |
| name | The given data model |
| language | The given language of the input file |
| status | The status of the job: "created", "processing", "error", or "completed" |
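If you are scripting against the API, you can capture the job ID from the create response for later polling. A minimal shell sketch using jq, assuming the response shape shown above:

# Create the job and keep only the job ID (requires jq).
JOB_ID=$(curl -s -X 'POST' \
  'https://groovy.audioshake.ai/job' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer EXAMPLE' \
  -H 'Content-Type: application/json' \
  -d '{
    "metadata": { "format": "json", "name": "transcription", "language": "en" },
    "assetId": "abc123"
  }' | jq -r '.job.id')
echo "Created job: $JOB_ID"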
Get Job
After you have created a job, you can retrieve the results by making a GET request to the /job/{job_id} endpoint, where job_id is the ID of the job you created. Here is an example curl command that retrieves the results of a job:
curl -X 'GET' \
  'https://groovy.audioshake.ai/job/JOB_ID' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer EXAMPLE'
The server will reply with an HTTP status code of 200 and a response containing a "job" object. To retrieve the output, parse the outputAssets list; for a single requested format it will contain an asset named "transcription.<format>" (see the polling and download sketch after the example below).
{
  job: {
    id: "clmgnr10z001oeumub3a9hifc",
    clientId: "<YOUR CLIENT ID>",
    createdAt: "2023-08-01T20:37:06.656Z",
    startedAt: "2023-08-01T20:37:07.839Z",
    updatedAt: "2023-08-01T20:37:28.392Z",
    requestId: "clksrfob8000bf7of6xkscy8m",
    licenseId: "<YOUR LICENSE ID>",
    metadata: {
      format: "json",
      name: "transcription",
      language: "en"
    },
    callbackUrl: "<OPTIONAL- YOUR CALLBACK URL>",
    status: "completed",
    statusInfo: {},
    sourceAsset: {
      name: "<Your-File.mp3>",
      id: "clmgn7nkg50527f0mulbhwi6z8",
      fileType: "audio/mpeg",
      format: "mp3",
      link: "<LINK TO FILE>",
    },
    outputAssets: [
      {
        name: "transcription.json",
        id: "clksrg52j2319f7of26wleinn",
        fileType: "application/json",
        format: "json",
        link: "<LINK TO GENERATED FILE>",
      },
    ],
  },
}
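Putting the steps together, here is a minimal polling-and-download sketch using curl and jq, based on the response shape above. The sleep interval and the error handling are illustrative, not prescriptive:

# Poll the job until it reaches a terminal status.
STATUS="created"
while [ "$STATUS" != "completed" ] && [ "$STATUS" != "error" ]; do
  sleep 10
  STATUS=$(curl -s -X 'GET' \
    "https://groovy.audioshake.ai/job/$JOB_ID" \
    -H 'accept: application/json' \
    -H 'Authorization: Bearer EXAMPLE' | jq -r '.job.status')
  echo "Job status: $STATUS"
done

# On completion, pick the "transcription.json" output asset and download it.
LINK=$(curl -s -X 'GET' \
  "https://groovy.audioshake.ai/job/$JOB_ID" \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer EXAMPLE' \
  | jq -r '.job.outputAssets[] | select(.name == "transcription.json") | .link')
curl -s -o transcription.json "$LINK"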