
extract_audio

Extract audio track from a video file.
{ "type": "extract_audio", "params": { "format": "mp3", "bitrate": "192k" } }

normalize_audio

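Normalize audio loudness to a target level in LUFS, measured per EBU R128. The example below targets -14 LUFS, a level commonly used for streaming; the EBU R128 broadcast target is -23 LUFS.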
{ "type": "normalize_audio", "params": { "target_lufs": -14 } }

noise_remove

Remove background noise from audio.
{ "type": "noise_remove" }

fade_audio

Add fade in/out effects to audio.
{ "type": "fade_audio", "params": { "fade_in": 2.0, "fade_out": 3.0 } }

mix_audio

Mix multiple audio tracks together. Use an array of URLs as input.
{
  "input": ["https://example.com/voice.mp3", "https://example.com/music.mp3"],
  "operations": [{ "type": "mix_audio", "params": { "volumes": [1.0, 0.3] } }]
}
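A request body like the one above can be assembled programmatically. The helper below is a minimal sketch: `build_mix_request` is a hypothetical name, and the check that `volumes` has one entry per input, matched by position, is an assumption inferred from the example rather than a documented rule.

```python
import json


def build_mix_request(inputs: list[str], volumes: list[float]) -> dict:
    """Build a mix_audio request body shaped like the example above."""
    if len(inputs) < 2:
        raise ValueError("mixing needs at least two input URLs")
    # Assumption: one volume per input, matched by position.
    if len(volumes) != len(inputs):
        raise ValueError("expected one volume per input")
    return {
        "input": inputs,
        "operations": [{"type": "mix_audio", "params": {"volumes": volumes}}],
    }


if __name__ == "__main__":
    body = build_mix_request(
        ["https://example.com/voice.mp3", "https://example.com/music.mp3"],
        [1.0, 0.3],
    )
    print(json.dumps(body, indent=2))
```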

pitch_shift

Shift audio pitch up or down.
{ "type": "pitch_shift", "params": { "semitones": 2 } }

loudness_analyze

Analyze audio loudness levels (LUFS, peak, dynamic range).
{ "type": "loudness_analyze" }
Returns loudness metrics in the output without modifying the file.

transcribe

Transcribe speech to text using OpenAI Whisper.
{ "type": "transcribe", "params": { "language": "en" } }
Param     Type    Description
language  string  Language code (en, es, fr, etc.) or "auto"
format    string  Output format: text, srt, vtt, json
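The parameters listed above can be validated before a request is sent. The helper below is a sketch under assumptions: `transcribe_op` is a hypothetical name, the allowed formats are taken from the parameter list, and defaulting `language` to "auto" is a choice made for this example, not a documented default.

```python
from typing import Optional

# Output formats named in the parameter list above.
ALLOWED_FORMATS = {"text", "srt", "vtt", "json"}


def transcribe_op(language: str = "auto", fmt: Optional[str] = None) -> dict:
    """Build a transcribe operation, rejecting formats the list does not name."""
    params = {"language": language}
    if fmt is not None:
        if fmt not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported format: {fmt!r}")
        params["format"] = fmt
    return {"type": "transcribe", "params": params}


if __name__ == "__main__":
    print(transcribe_op("en"))
    print(transcribe_op("auto", fmt="srt"))
```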

waveform

Generate a visual waveform image from audio.
{ "type": "waveform", "params": { "width": 1920, "height": 200, "color": "0x00ff00" } }