Dialogue Separation - AudioShake Developers

Isolate dialogue from the music and effects bed in any audio or video content. Use the separated outputs for dubbing prep, podcast cleanup, broadcast post-production, or feeding clean speech into downstream AI models.

Create a Task

This example extracts both the dialogue and the background (music + effects) stems:

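import requests

API_KEY = "your_api_key"
HEADERS = {"Content-Type": "application/json", "x-api-key": API_KEY}

# Request one output per target: the isolated dialogue and the music + effects bed.
response = requests.post(
    "https://api.audioshake.ai/tasks",
    headers=HEADERS,
    json={
        "assetId": "your_asset_id",
        "targets": [
            {"model": "dialogue", "formats": ["wav"]},
            {"model": "music_fx", "formats": ["wav"]},
        ],
    },
    timeout=30,
)
# Fail on an HTTP error here rather than on a missing "id" key below.
response.raise_for_status()

task_id = response.json()["id"]
print(f"Task created: {task_id}")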

Check Task status to monitor progress and download results, or use webhooks to be notified when each target completes.
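As a rough illustration of the polling option, the sketch below repeatedly fetches a task until every target has stopped processing. Several details are assumptions and not taken from this page: that a task can be read back with `GET /tasks/{task_id}`, that the response contains a `targets` list with a `status` field per target, and that `completed`, `failed`, and `error` are the terminal status values. The helper names `fetch_task` and `wait_for_task` are hypothetical. Check the Task status reference for the actual endpoint and response shape before relying on it.

```python
import time
from typing import Callable

API_BASE = "https://api.audioshake.ai"  # same host as the task-creation example

# ASSUMPTION: status strings that mean a target has stopped processing.
DONE_STATES = ("completed", "failed", "error")


def fetch_task(task_id: str, api_key: str) -> dict:
    """Fetch a task as JSON.

    ASSUMPTION: the task can be read back with GET /tasks/{task_id}.
    """
    import requests  # imported lazily so wait_for_task works with any fetcher

    response = requests.get(
        f"{API_BASE}/tasks/{task_id}",
        headers={"x-api-key": api_key},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


def wait_for_task(
    fetch: Callable[[], dict],
    interval: float = 5.0,
    max_attempts: int = 60,
) -> dict:
    """Call fetch() until every target reports a status in DONE_STATES."""
    for _ in range(max_attempts):
        task = fetch()
        # ASSUMPTION: the task JSON has a "targets" list with a "status" per target.
        targets = task.get("targets", [])
        if targets and all(t.get("status") in DONE_STATES for t in targets):
            return task
        time.sleep(interval)
    raise TimeoutError(f"task not finished after {max_attempts} attempts")
```

With the `task_id` and `API_KEY` from the example above, a call might look like `task = wait_for_task(lambda: fetch_task(task_id, API_KEY))`. Passing the fetcher in as a callable keeps the loop independent of the HTTP client, which also makes it easy to exercise with canned responses.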

Outputs

Target	Role
`dialogue`	Clean speech — use as reference or feed into translation/dubbing
`music_fx`	Music + effects bed — layer localized voiceover on top
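To make the "layer localized voiceover on top" role concrete, here is a minimal mixing sketch using only the Python standard library. It assumes both files are 16-bit PCM WAV with the same channel count and sample rate, which this page does not guarantee for the downloaded stems, so verify the format first. The function names and the `voice_gain` parameter are hypothetical. It simply sums samples and clips them; a production mix would normally be done in a DAW or a dedicated audio library.

```python
import array
import sys
import wave


def mix_pcm16(bed, voice, voice_gain=1.0):
    """Sum two 16-bit sample sequences, padding the shorter one with silence."""
    length = max(len(bed), len(voice))
    mixed = array.array("h")
    for i in range(length):
        b = bed[i] if i < len(bed) else 0
        v = voice[i] if i < len(voice) else 0
        # Clip to the int16 range so loud passages do not wrap around.
        mixed.append(max(-32768, min(32767, int(round(b + v * voice_gain)))))
    return mixed


def read_pcm16(path):
    """Read a WAV file. ASSUMPTION: the file holds 16-bit PCM samples."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        params = wav.getparams()
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return params, samples


def mix_wav_files(bed_path, voice_path, out_path, voice_gain=1.0):
    """Write bed + voiceover to out_path (interleaved channels are summed sample by sample)."""
    bed_params, bed = read_pcm16(bed_path)
    voice_params, voice = read_pcm16(voice_path)
    if (bed_params.nchannels, bed_params.framerate) != (
        voice_params.nchannels,
        voice_params.framerate,
    ):
        raise ValueError("channel count and sample rate must match")
    mixed = mix_pcm16(bed, voice, voice_gain)
    if sys.byteorder == "big":
        mixed.byteswap()
    with wave.open(out_path, "wb") as out:
        out.setnchannels(bed_params.nchannels)
        out.setsampwidth(2)
        out.setframerate(bed_params.framerate)
        out.writeframes(mixed.tobytes())
```

For example, `mix_wav_files("music_fx.wav", "voiceover_es.wav", "mix_es.wav", voice_gain=0.9)` would lay a hypothetical Spanish voiceover over the downloaded `music_fx` stem; the file names here are placeholders.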

Use cases

Prepare content for localization and foreign-language dubbing
Clean up podcast audio by isolating the host’s voice
Extract clean dialogue for speech-to-text or AI training data
Separate effects and ambience for sound design workflows

Stem Separation

Separate music into individual instruments instead.

Music Detection

Find where music appears before separating.
