FAQ - AudioShake Developers

Getting started

What is a Task?

A Task is a processing job. You provide a media source and one or more model targets — AudioShake processes them asynchronously and returns results for each target. See Create Task.

What is a target?

A target specifies which model to run and what output format to produce. You can include up to 20 targets in one Task to generate several outputs from the same file:

{
  "assetId": "your_asset_id",
  "targets": [
    { "model": "vocals", "formats": ["wav"] },
    { "model": "instrumental", "formats": ["wav"] },
    { "model": "transcription", "formats": ["json"] }
  ]
}
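As a minimal sketch, the request body above could be assembled in Python like this. The helper name `build_task_payload` is hypothetical and not part of any AudioShake SDK; it only builds the JSON shape shown above and enforces the 20-target limit before anything is sent.

```python
import json

MAX_TARGETS = 20  # per-Task target limit stated in this FAQ


def build_task_payload(asset_id, targets):
    """Build a Task request body (hypothetical helper, not an SDK call).

    `targets` is a list of (model, formats) pairs, e.g. ("vocals", ["wav"]).
    """
    if not targets:
        raise ValueError("a Task needs at least one target")
    if len(targets) > MAX_TARGETS:
        raise ValueError(f"a Task accepts at most {MAX_TARGETS} targets")
    return {
        "assetId": asset_id,
        "targets": [{"model": m, "formats": list(f)} for m, f in targets],
    }


payload = build_task_payload(
    "your_asset_id",
    [("vocals", ["wav"]), ("instrumental", ["wav"]), ("transcription", ["json"])],
)
print(json.dumps(payload, indent=2))
```

Checking the limit client-side avoids a round trip for a request that would be rejected anyway.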

Should I use a URL or upload a file?

Use url when your media is publicly accessible over HTTPS.
Use assetId when your media is local: upload it first, then reference the returned ID.

We recommend uploading files when possible for the most reliable processing.
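The choice between the two fields can be sketched as a small helper. `media_source` is a hypothetical name, and treating `url` and `assetId` as mutually exclusive is an assumption; the FAQ only says which one to use in which situation.

```python
from urllib.parse import urlparse


def media_source(url=None, asset_id=None):
    """Return the source part of a Task body (hypothetical helper).

    Assumption: exactly one of `url` or `assetId` is sent per Task.
    """
    if (url is None) == (asset_id is None):
        raise ValueError("provide exactly one of url or asset_id")
    if url is not None:
        # The FAQ requires publicly accessible HTTPS URLs.
        if urlparse(url).scheme != "https":
            raise ValueError("url must use HTTPS")
        return {"url": url}
    return {"assetId": asset_id}


print(media_source(asset_id="your_asset_id"))
```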

What models are available?

AudioShake offers models for instrument stem separation, speech processing, post-production, copyright compliance, and lyric transcription. See the Models page for the full list with descriptions and pricing.

I already have a lyrics transcript. Can I just align it?

Yes. Use the alignment model and provide your existing transcript via transcriptUrl or transcriptAssetId. If no transcript is provided, the model will transcribe automatically.

{
  "assetId": "your_asset_id",
  "targets": [
    {
      "model": "alignment",
      "formats": ["json"],
      "transcriptUrl": "https://example.com/lyrics.txt"
    }
  ]
}

Processing

How do I know when a Task is complete?

Two options:

Poll — call Get Task by ID until each target’s status is completed or error. See Check Task Status.
Webhooks — receive a POST request each time a target completes. See Webhooks.
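The polling option can be sketched as a loop that stops once every target reports completed or error. This is a sketch under assumptions: `fetch_task` is a callable you supply that performs the Get Task by ID request and returns the parsed response, and the response is assumed to carry a `targets` list whose items each have a `status` field. The interval and timeout values are illustrative, not recommendations from AudioShake.

```python
import time

TERMINAL_STATUSES = {"completed", "error"}  # the two end states named above


def wait_for_task(fetch_task, task_id, interval=2.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until every target is completed or error, then return the Task.

    `fetch_task(task_id)` is a caller-supplied function (hypothetical) that
    wraps the Get Task by ID call and returns the response as a dict.
    """
    deadline = clock() + timeout
    while True:
        task = fetch_task(task_id)
        statuses = [target["status"] for target in task["targets"]]
        if all(status in TERMINAL_STATUSES for status in statuses):
            return task
        if clock() >= deadline:
            raise TimeoutError(f"Task {task_id} not finished after {timeout}s")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. Polling more slowly also keeps you further from the rate limit.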

How long does processing take?

Processing time depends on media length, the number of targets, and current queue load. Most Tasks complete within seconds to a few minutes.

Can I process video files?

Yes. AudioShake accepts MP4 and MOV video files. Only the audio stream is processed — video content is ignored. See Formats for all supported input types.

What happens if a target fails?

Individual targets can fail independently. Check the status and error fields on each target in the Task response. Other targets in the same Task are not affected.
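A minimal sketch of handling partial failure: split the targets by status and report each error. The response shape used here (a `targets` list with `model`, `status`, and `error` keys) is an assumption based on the field names mentioned above, and the sample Task is made-up data for illustration.

```python
def split_targets(task):
    """Return (completed, failed) target lists from a Task response dict."""
    completed = [t for t in task["targets"] if t.get("status") == "completed"]
    failed = [t for t in task["targets"] if t.get("status") == "error"]
    return completed, failed


# Made-up response for illustration only.
task = {
    "targets": [
        {"model": "vocals", "status": "completed"},
        {"model": "transcription", "status": "error", "error": "example error message"},
    ]
}

completed, failed = split_targets(task)
for target in failed:
    print(f"{target['model']} failed: {target.get('error')}")
```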

Files and formats

What formats can I upload?

Audio: WAV, AIFF, FLAC, MP3, AAC. Video: MP4, MOV. Maximum file size is 2 GB. See Formats for the full list.
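A pre-upload check along these lines can catch unsupported files early. It is a sketch with two assumptions: the extension set covers only the formats named above (the Formats page may list more), and "2 GB" is read conservatively as 2 × 10^9 bytes.

```python
from pathlib import Path

# Only the formats named in this FAQ; the Formats page may list more.
ALLOWED_EXTENSIONS = {".wav", ".aiff", ".flac", ".mp3", ".aac", ".mp4", ".mov"}
# Assumption: 2 GB taken as 2 * 10**9 bytes, the stricter reading.
MAX_BYTES = 2 * 10**9


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks OK."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems


print(check_upload("mix.WAV", 48_000_000))
```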

What output formats are available?

Audio: wav, mp3, flac, aiff. Video: mp4. Text: json, srt, txt. See Formats for which formats apply to which models.

How long do uploaded Assets and output files last?

Uploaded Assets expire after 72 hours. Output download links expire after one hour — re-fetch the Task to get fresh links.
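If you cache Task responses, the two lifetimes can be tracked as below. This is a sketch: it assumes the one-hour window for download links starts when the Task is fetched and the 72-hour window starts at upload, and the five-minute safety margin is an arbitrary choice.

```python
from datetime import datetime, timedelta, timezone

ASSET_TTL = timedelta(hours=72)  # uploaded Assets
LINK_TTL = timedelta(hours=1)    # output download links


def asset_expired(uploaded_at, now=None):
    """True once an uploaded Asset is 72 hours old or more."""
    now = now or datetime.now(timezone.utc)
    return now - uploaded_at >= ASSET_TTL


def links_stale(fetched_at, now=None, margin=timedelta(minutes=5)):
    """True when cached links are within `margin` of the one-hour expiry.

    Assumption: the hour is counted from when the Task was fetched.
    """
    now = now or datetime.now(timezone.utc)
    return now - fetched_at >= LINK_TTL - margin
```

When `links_stale` returns true, re-fetch the Task rather than retrying the old link.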

Billing

How are credits calculated?

Credits are charged per minute of source audio, per target model. Duration is rounded up to the nearest minute. See Billing & Credits for examples and Models for per-model rates.
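As a worked sketch of that rule: round the source duration up to whole minutes, then multiply by the sum of the per-minute rates of the requested models. The rates below are placeholders, not real prices; see Models for actual per-model rates.

```python
import math


def estimate_credits(duration_seconds, rates_per_minute):
    """Estimate credits for one Task.

    `rates_per_minute` maps each target model to its per-minute rate.
    """
    billable_minutes = math.ceil(duration_seconds / 60)  # rounded up
    return billable_minutes * sum(rates_per_minute.values())


# Placeholder rates for illustration only.
rates = {"vocals": 1.0, "instrumental": 1.0}

# A 3 min 20 s file (200 s) bills as 4 minutes for each of the two targets.
print(estimate_credits(200, rates))  # 8.0
```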

Do failed Tasks consume credits?

Tasks that fail before processing begins are not charged. If a Task fails mid-processing, contact support@audioshake.ai with the Task ID.

Are bulk discounts available?

Yes. Contact info@audioshake.ai for volume pricing, custom SLAs, and enterprise plans.

Support

How do I contact support?

Email support@audioshake.ai. Include your Task ID and the full error message for the fastest resolution.

Is there a rate limit?

Yes — 60 requests per second. See Rate Limits.
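One way to stay under the limit is to space requests out client-side. The class below is a sketch, not an AudioShake utility. It keeps calls at least 1/60 of a second apart, which is stricter than a 60-per-second window because it never allows bursts. It is also not thread-safe.

```python
import time


class Throttle:
    """Keep successive calls at least 1 / rate_per_second seconds apart."""

    def __init__(self, rate_per_second=60, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0  # earliest time the next call may proceed

    def wait(self):
        """Block until the next request is allowed, then reserve a slot."""
        now = self.clock()
        delay = self.next_allowed - now
        if delay > 0:
            self.sleep(delay)
            now += delay
        self.next_allowed = now + self.min_interval


throttle = Throttle(rate_per_second=60)
for _ in range(3):
    throttle.wait()
    # send one API request here
```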
