Extract word- and line-level timestamped lyrics from a song. Use the output for karaoke rendering, subtitle generation, music search, or analytics.

Create a Task

import requests

API_KEY = "your_api_key"
HEADERS = {"Content-Type": "application/json", "x-api-key": API_KEY}

response = requests.post(
    "https://api.audioshake.ai/tasks",
    headers=HEADERS,
    json={
        "assetId": "your_asset_id",
        "targets": [
            {"model": "transcription", "formats": ["json"]}
        ]
    }
)

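# Stop on an HTTP error before trying to read the task ID from the body
response.raise_for_status()

task_id = response.json()["id"]
print(f"Task created: {task_id}")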
Check Task status to monitor progress and download results, or use webhooks to be notified when each target completes.
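Monitoring progress can be sketched as a small polling loop. This is only an illustration: the page does not show the status response, so the `targets` list, the per-target `status` field, and the `completed`/`failed` values are assumptions, as is the helper name `wait_for_targets`. The fetch step is passed in as a callable so the loop makes no claim about the status endpoint itself.

```python
import time
from typing import Callable

# Assumed terminal status values; check the Task status reference for the real ones.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_targets(
    fetch_task: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_task() until every target reports a terminal status.

    fetch_task is any zero-argument callable returning the task as a dict.
    Assumed shape: {"targets": [{"status": "..."}, ...]}.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task()
        statuses = [target.get("status") for target in task.get("targets", [])]
        # Finish only when at least one target exists and all are terminal.
        if statuses and all(status in TERMINAL_STATUSES for status in statuses):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("Task did not finish before the timeout")
        sleep(interval)


# Hypothetical usage, assuming the task can be read back from a per-task URL:
#   fetch = lambda: requests.get(f"https://api.audioshake.ai/tasks/{task_id}",
#                                headers=HEADERS).json()
#   task = wait_for_targets(fetch)
```

Passing the fetch function in keeps the loop independent of the HTTP client. For long-running jobs, webhooks avoid polling altogether.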
Use clean, full-length source audio for best results. Heavily compressed or low-bitrate files reduce transcription accuracy.
Need word-level timing for karaoke or subtitles? Use the alignment model instead — it provides precise per-word timestamps. See Models for details.

Use cases

  • Render karaoke-style highlighted lyrics in real time
  • Convert to SRT or VTT subtitle formats
  • Build searchable lyric databases with per-line timestamps
  • Power lyric-synced visual experiences
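As one illustration of the subtitle use case, the sketch below turns line-level timestamps into SRT text. The input shape is an assumption: a list of dicts with `text`, `start`, and `end` keys, with times in seconds. The actual transcription JSON may use different field names, so map it into this shape first. The function names are hypothetical.

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def lines_to_srt(lines: list[dict]) -> str:
    """Build SRT text from lines shaped like {"text": str, "start": float, "end": float}.

    The field names are assumed for illustration, not taken from the API output.
    """
    blocks = []
    for index, line in enumerate(lines, start=1):
        start = to_srt_timestamp(line["start"])
        end = to_srt_timestamp(line["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{line['text']}\n")
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [
        {"text": "First lyric line", "start": 0.0, "end": 2.5},
        {"text": "Second lyric line", "start": 2.5, "end": 5.0},
    ]
    print(lines_to_srt(sample))
```

VTT differs mainly in its `WEBVTT` header line and in using a period instead of a comma before the milliseconds, so the same structure can be adapted.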

Build Karaoke Tracks

Combine lyric transcription with stem separation.

Models

See all available transcription and alignment models.