AudioShakeSeparator

The core processing class. Takes raw audio buffers in, returns separated stem buffers. Multiple instances can run in parallel; each operates in a single thread.

Constructor

AudioShakeSeparator(
    const char*  clientID,
    const char*  clientSecret,
    void*        model,
    unsigned int modelSizeBytes,
    unsigned int inputSamplerate,
    unsigned int flags = useFastestBackend | inputFloat |
                         inputNonInterleaved | outputFloat |
                         outputNonInterleaved | chunkNormal
);

Parameter        Description
clientID         Your AudioShake Client ID
clientSecret     Your AudioShake Client Secret
model            Pointer to the encrypted model data loaded into memory
modelSizeBytes   Size of the model data in bytes
inputSamplerate  Sample rate of the input audio in Hz (e.g. 44100, 48000). The input is resampled automatically to match the output sample rate.
flags            Configuration flags; see Flags
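
Since the constructor takes the model as a pointer plus a byte count, the file has to be read into memory first. The following is a minimal sketch of one way to do that with the standard library. The helper name loadModelFile and the path "model.bin" are hypothetical, and the call site in the trailing comment assumes the buffer must stay alive as long as the separator, which this reference does not state.

```cpp
#include <cstddef>
#include <fstream>
#include <limits>
#include <vector>

// Reads an entire file into `bytes`. Returns false if the file cannot be read
// or is too large for the constructor's `unsigned int modelSizeBytes`.
bool loadModelFile(const char* path, std::vector<char>& bytes) {
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) return false;

    const std::streamoff size = file.tellg();
    if (size < 0) return false;
    if (static_cast<unsigned long long>(size) >
        std::numeric_limits<unsigned int>::max()) return false;

    bytes.resize(static_cast<std::size_t>(size));
    file.seekg(0, std::ios::beg);
    if (size > 0 && !file.read(bytes.data(), size)) return false;
    return true;
}

// Hypothetical call site (requires the SDK header):
//   std::vector<char> model;
//   if (loadModelFile("model.bin", model)) {
//       AudioShakeSeparator separator(clientID, clientSecret, model.data(),
//                                     static_cast<unsigned int>(model.size()),
//                                     44100);
//       if (separator.getInitializationError() != NULL) { /* report, stop */ }
//   }
```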

Methods

const char* getInitializationError();  // NULL on success, error string on failure
const char* getBackendName();          // Active backend (e.g. "LiteRT GPU", "Metal", "CPU")
const char* getVersion();              // SDK and model version string
unsigned char getNumberOfStems();
const char*   getStemName(unsigned char stemIndex);
unsigned int getNumberOfChannels();    // Channels per output stem
unsigned int getOutputSamplerate();    // May differ from input
unsigned int getFramesNeeded();        // Frames required per process() call
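
As an illustration of how the query methods fit together, the sketch below checks getInitializationError() first and then collects the stem names. It is written as a template so it compiles without the SDK. FakeSeparator is a stand-in, and the stem names it returns are invented for the example.

```cpp
#include <string>
#include <vector>

// Stand-in exposing the documented query methods; the values are made up.
struct FakeSeparator {
    const char* getInitializationError() const { return nullptr; }
    unsigned char getNumberOfStems() const { return 2; }
    const char* getStemName(unsigned char stemIndex) const {
        return stemIndex == 0 ? "vocals" : "instrumental";
    }
};

// Collects the stem names from any type that exposes the documented methods.
// Returns an empty list if the separator failed to initialize.
template <class Separator>
std::vector<std::string> listStemNames(Separator& separator) {
    std::vector<std::string> names;
    if (separator.getInitializationError() != nullptr) return names;

    const unsigned int count = separator.getNumberOfStems();
    for (unsigned int i = 0; i < count; ++i) {
        const char* name = separator.getStemName(static_cast<unsigned char>(i));
        names.push_back(name != nullptr ? name : "");
    }
    return names;
}
```

With the real class, the same function would be called as listStemNames(separator) right after construction.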

Processing

int process(void** inputChannels, unsigned int numInputFrames, void*** outputChannels);
  • Returns number of output frames produced
  • Returns 0 if more input is needed before output is available
  • Returns -1 on error
  • Pass NULL for inputChannels to flush remaining audio (e.g. end of file)
  • Mono input is valid — the separator duplicates it across channels
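
The return-value rules above suggest a feed-then-flush loop, sketched here for non-interleaved float stereo input. Several points are assumptions rather than documented behavior: that the last call may carry fewer frames than getFramesNeeded(), that repeated flush calls eventually return 0, and that the caller already holds a suitable outputChannels argument (buffer ownership is not described in this reference, so the sketch passes it through untouched). FakeStreamingSeparator is a stand-in that only mimics the return codes.

```cpp
#include <algorithm>

// Stand-in: emits 8 frames once 8 are buffered, and flushes whatever is left.
struct FakeStreamingSeparator {
    unsigned int pending = 0;
    unsigned int getFramesNeeded() const { return 4; }
    int process(void** inputChannels, unsigned int numInputFrames,
                void*** /*outputChannels*/) {
        if (inputChannels == nullptr) {           // flush request
            const int flushed = static_cast<int>(pending);
            pending = 0;
            return flushed;
        }
        pending += numInputFrames;
        if (pending < 8) return 0;                // more input needed
        pending -= 8;
        return 8;
    }
};

// Feeds all input, then flushes. Returns the total number of output frames,
// or -1 on error. onOutput(frames) runs after each call that produced output.
template <class Separator, class OnOutput>
long separateAll(Separator& separator, float* left, float* right,
                 unsigned int totalFrames, void*** outputChannels,
                 OnOutput onOutput) {
    long produced = 0;
    unsigned int position = 0;
    while (position < totalFrames) {
        const unsigned int wanted = separator.getFramesNeeded();
        if (wanted == 0) return -1;               // defensive: avoid stalling
        const unsigned int frames = std::min(wanted, totalFrames - position);
        void* inputChannels[2] = { left + position, right + position };
        const int result = separator.process(inputChannels, frames, outputChannels);
        if (result < 0) return -1;
        if (result > 0) { onOutput(result); produced += result; }
        position += frames;
    }
    for (;;) {                                    // flush remaining audio
        const int result = separator.process(nullptr, 0, outputChannels);
        if (result < 0) return -1;
        if (result == 0) break;
        onOutput(result);
        produced += result;
    }
    return produced;
}
```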

Reuse

void prepareForNewContent(unsigned int inputSamplerate, unsigned int flags);
Reset internal state to process a new audio source without reconstructing the separator.

SourceSeparationTask

High-level wrapper that manages file reading, writing, and progress callbacks.

Constructor

SourceSeparationTask(
    const char*              clientID,
    const char*              clientSecret,
    SourceSeparationInput*   input,
    SourceSeparationOutput*  output,
    const PathStr&           modelPath,
    void*                    model = nullptr,
    unsigned int             modelSizeBytes = 0,
    unsigned int             additionalFlags = 0
);

Processing

bool run(ProgressCallback cb = nullptr, void* clientData = nullptr);  // Run to completion
bool processOneIteration(ProgressCallback cb = nullptr, void* clientData = nullptr);  // Process one chunk

Status

double getProgress();          // 0.0–1.0
bool   isFinished();
const char* getErrorMessage(); // NULL on success
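
processOneIteration() allows a caller to interleave separation with other work, for example to support cancellation. The sketch below shows one possible stepping loop. It does not rely on the exact meaning of the bool return value, which is not spelled out here. Instead it decides success from isFinished() and getErrorMessage(). FakeTask and the shouldStop parameter are hypothetical.

```cpp
// Stand-in exposing the documented task methods; finishes after three chunks.
struct FakeTask {
    int chunksLeft = 3;
    bool processOneIteration() {
        if (chunksLeft > 0) --chunksLeft;
        return true;
    }
    bool isFinished() const { return chunksLeft == 0; }
    const char* getErrorMessage() const { return nullptr; }
};

// Steps the task one chunk at a time until it finishes, an iteration returns
// false, or shouldStop() returns true. Returns true only if the task finished
// without an error message.
template <class Task, class ShouldStop>
bool runStepwise(Task& task, ShouldStop shouldStop) {
    while (!task.isFinished()) {
        if (shouldStop()) return false;           // caller asked to cancel
        if (!task.processOneIteration()) break;   // let the status decide below
    }
    return task.isFinished() && task.getErrorMessage() == nullptr;
}
```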

Progress callback

typedef void (*ProgressCallback)(SourceSeparationTask* task, double progress, void* clientData);
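
A callback matching this typedef might use clientData to reach caller-owned state, as in the sketch below. The forward declaration and typedef are repeated only so the example compiles without the SDK header. ProgressState and onProgress are hypothetical names.

```cpp
// Repeated from the reference so the sketch is self-contained.
class SourceSeparationTask;
typedef void (*ProgressCallback)(SourceSeparationTask* task, double progress,
                                 void* clientData);

// Hypothetical caller-owned state, handed to the task through clientData.
struct ProgressState {
    int lastPercent = -1;
    int updates = 0;
};

// Records progress as a whole percentage and counts how often it changed,
// so a UI could redraw only when the displayed value differs.
void onProgress(SourceSeparationTask* /*task*/, double progress, void* clientData) {
    ProgressState* state = static_cast<ProgressState*>(clientData);
    if (state == nullptr) return;
    const int percent = static_cast<int>(progress * 100.0);
    if (percent == state->lastPercent) return;
    state->lastPercent = percent;
    ++state->updates;
}

// Hypothetical call site:
//   ProgressState state;
//   bool ok = task.run(onProgress, &state);
```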

Input classes

AudioFileReader — reads from a file on disk (WAV, MP3, etc.):
AudioFileReader(const char* filePath);
RingBufferInput — accepts audio frames from a ring buffer for streaming:
RingBufferInput(size_t size, bool isStereoInterleaved, int sampleRate);

Output classes

WAVOutput — writes each stem as a separate WAV file:
WAVOutput(const char* outputPath);
RingBufferOutput — exposes stems via a ring buffer:
RingBufferOutput(size_t size, bool isStereoInterleaved);

Flags

Combine with |. Pass to AudioShakeSeparator or as additionalFlags to SourceSeparationTask.

Backend

Flag               Description
useFastestBackend  GPU if available, CPU otherwise. Default.
useCPUBackendOnly  Force CPU-only processing

Input format

Flag                    Description
inputFloat              32-bit float. Default.
inputInt16              16-bit signed integer
inputNonInterleaved     Separate channel buffers. Default.
inputInterleavedStereo  Interleaved stereo
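
To make the difference between the layouts concrete, the sketch below converts interleaved 16-bit stereo (L R L R ...) into two separate float channel buffers, which is the default input format. It is only relevant for code that holds one layout and needs the other. The function name and the 1/32768 scaling convention are assumptions, not something this reference specifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits interleaved 16-bit stereo into two float channel buffers scaled to
// the range [-1.0, 1.0).
void deinterleaveInt16Stereo(const std::int16_t* interleaved, std::size_t frames,
                             std::vector<float>& left, std::vector<float>& right) {
    left.resize(frames);
    right.resize(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        left[i]  = interleaved[2 * i]     / 32768.0f;  // even samples: left
        right[i] = interleaved[2 * i + 1] / 32768.0f;  // odd samples: right
    }
}
```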

Output format

Flag                     Description
outputFloat              32-bit float. Default.
outputInt16              16-bit signed integer
outputNonInterleaved     Separate channel buffers. Default.
outputInterleavedStereo  Interleaved stereo

Chunk size

Durations are approximate and vary by model. Smaller chunks reduce latency but increase compute cost.

Flag         Approx. duration  Compute cost
chunkNormal  ~3 sec            1x. Default.
chunk2X      ~1.5 sec          2x
chunk4X      ~0.75 sec         4x
chunk8X      ~375 ms           8x
chunk16X     ~185 ms           16x
chunk32X     ~92 ms            32x
chunk64X     ~46 ms            64x
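
The table follows a simple pattern: each step halves the chunk duration and doubles the compute cost. The sketch below uses that pattern to pick the cheapest chunk setting for a latency budget. It works with plain factors (1 for chunkNormal, 2 for chunk2X, up to 64) rather than the SDK's flag constants, whose numeric values are not given here. The 3-second base is the approximate figure from the table, so results are estimates only.

```cpp
// Approximate chunk duration in seconds for a chunk factor of 1, 2, 4, ... 64.
// The factor must be nonzero.
double approxChunkSeconds(unsigned int factor) {
    return 3.0 / factor;
}

// Returns the smallest factor (and therefore the lowest compute cost) whose
// approximate duration fits within maxSeconds, or 0 if even 64 is too long.
unsigned int pickChunkFactor(double maxSeconds) {
    for (unsigned int factor = 1; factor <= 64; factor *= 2) {
        if (approxChunkSeconds(factor) <= maxSeconds) return factor;
    }
    return 0;
}
```

For a budget of half a second, pickChunkFactor(0.5) returns 8, which corresponds to chunk8X at roughly 375 ms and 8x compute cost.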

Residual

Flag                       Description
addResidual                Add a residual stem containing the audio that remains after subtracting all extracted stems
residualIsInputMinusStem0  Residual = input minus stem 0
residualIsInputMinusStem1  Residual = input minus stem 1
residualIsInputMinusStem2  Residual = input minus stem 2
residualIsInputMinusStem3  Residual = input minus stem 3
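
The arithmetic these flags describe can be illustrated with a per-sample subtraction on one channel. This is a sketch of the idea only, not the SDK's internal implementation. It assumes the input and the stems are time-aligned and share a sample rate, which is worth checking because the output sample rate may differ from the input. The function name is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// residual[i] = input[i] - (sum of stem[i] over the given stems).
// Passing all stems corresponds to the addResidual description; passing a
// single stem corresponds to the residualIsInputMinusStemN variants.
std::vector<float> computeResidual(const std::vector<float>& input,
                                   const std::vector<std::vector<float>>& stems) {
    std::vector<float> residual(input);
    for (const std::vector<float>& stem : stems) {
        const std::size_t count = std::min(stem.size(), residual.size());
        for (std::size_t i = 0; i < count; ++i) residual[i] -= stem[i];
    }
    return residual;
}
```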