Separate a recording with multiple speakers into one stem per speaker — even when speakers overlap. Use the outputs for transcription, speaker-specific editing, or feeding clean single-speaker audio into downstream AI models.

Create a Task

import requests

API_KEY = "your_api_key"
HEADERS = {"Content-Type": "application/json", "x-api-key": API_KEY}

# Create a separation task targeting the multi_voice model.
response = requests.post(
    "https://api.audioshake.ai/tasks",
    headers=HEADERS,
    json={
        "assetId": "your_asset_id",
        "targets": [
            {"model": "multi_voice", "formats": ["wav"]}
        ]
    },
)
response.raise_for_status()

task_id = response.json()["id"]
print(f"Task created: {task_id}")

Check Task status to monitor progress and download results, or use webhooks to be notified when each target completes. The model outputs one audio file per detected speaker. Even when speakers overlap, each stem contains only the isolated voice of a single speaker.
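If you poll rather than use webhooks, the loop is simple: fetch the task until it reaches a terminal status, then download the stems. A minimal sketch of that loop is below; the terminal status names (`completed`, `failed`) and the idea of fetching via a `GET /tasks/{task_id}` request are assumptions for illustration — check the Task status reference for the exact fields your account returns.

```python
import time

def wait_for_task(fetch_status, interval=5.0, timeout=600.0):
    """Poll until the task reports a terminal status.

    fetch_status: a zero-argument callable returning the task JSON,
    e.g. one GET request to the task endpoint (hypothetical here).
    Status names "completed"/"failed" are assumed, not confirmed.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        task = fetch_status()
        if task.get("status") in ("completed", "failed"):
            return task
        time.sleep(interval)
    raise TimeoutError("task did not reach a terminal status in time")
```

Injecting the fetch as a callable keeps the polling logic testable; in practice you would pass something like `lambda: requests.get(f"https://api.audioshake.ai/tasks/{task_id}", headers=HEADERS).json()`.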

Use cases

  • Clean per-speaker audio for transcription and diarization
  • Isolate individual voices in meetings, interviews, or panel discussions
  • Prepare training data for speech AI models
  • Enable speaker-specific editing in podcast post-production

Speech Denoising

Clean up individual speaker stems after separation.

Dialogue Separation

Separate all speech from music and effects instead.