How to Choose the Right Request Parameters for Your Audio in Deepgram
Choosing the correct request parameters for your audio depends on several factors:
- Audio format and encoding
- Transcription features (punctuation, numerals)
- Business use case
- Speaker organization and number of channels
Audio Format and Encoding
For batch transcription, the audio format is inferred from the file headers, so it doesn’t need to be specified in your query parameters.
For streaming transcription, specify the audio format whenever you send raw, headerless audio packets: identify the audio codec and pass it in the encoding query parameter.
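As a minimal sketch of the streaming case, the snippet below builds a WebSocket URL that declares the format of raw audio. The host name, the "linear16" value, and the sample_rate parameter are assumptions made for illustration (only the encoding parameter is named in this article), so check them against the supported-formats documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed streaming endpoint; confirm the host and path for your account.
STREAMING_BASE_URL = "wss://api.deepgram.com/v1/listen"


def build_streaming_url(encoding: str, sample_rate: int) -> str:
    """Return a WebSocket URL that describes raw, headerless audio."""
    params = {
        "encoding": encoding,        # codec of the raw packets, e.g. "linear16" (assumed value)
        "sample_rate": sample_rate,  # assumed companion parameter for raw audio, in Hz
    }
    return f"{STREAMING_BASE_URL}?{urlencode(params)}"


streaming_url = build_streaming_url("linear16", 16000)
print(streaming_url)
# wss://api.deepgram.com/v1/listen?encoding=linear16&sample_rate=16000
```

For batch requests with a containerized file (one that has headers), this step can be skipped because the format is inferred.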
Refer to these links for more details on audio content types and supported formats:
- Audio Content Types (Mozilla): https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types
- Deepgram Supported Audio Formats: https://developers.deepgram.com/docs/supported-audio-formats
Transcription Features
To enable specific transcription features, append them as query parameters to your HTTP or WebSocket request URL. For example:
/v1/listen?numerals=true&punctuate=true&multichannel=true&search=yes
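Rather than concatenating the query string by hand, it can be generated from a set of feature flags. The helper below is a hypothetical sketch (build_listen_path is not part of any Deepgram SDK) that reproduces the example URL above using only the Python standard library.

```python
from urllib.parse import urlencode


def build_listen_path(**features) -> str:
    """Append feature flags to /v1/listen as query parameters."""
    # Python booleans are serialized as lowercase "true"/"false"
    # to match the style of the example URL.
    params = {
        name: str(value).lower() if isinstance(value, bool) else value
        for name, value in features.items()
    }
    return f"/v1/listen?{urlencode(params)}"


path = build_listen_path(numerals=True, punctuate=True, multichannel=True, search="yes")
print(path)
# /v1/listen?numerals=true&punctuate=true&multichannel=true&search=yes
```

Using urlencode also takes care of escaping values that contain spaces or other reserved characters, such as a multi-word search term.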
Business Use Case
Your business use case influences the query parameters, especially the model selection. Test different models, such as general or phonecall, against your own audio to see which gives the best results.
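One way to set up such a comparison is to generate one request URL per candidate model while keeping every other parameter fixed. The sketch below assumes the model is selected with a model query parameter; the helper name urls_for_models is hypothetical, and sending the requests and scoring the transcripts is left out.

```python
from urllib.parse import urlencode

# The two models named in this article; extend the list with others you want to try.
CANDIDATE_MODELS = ["general", "phonecall"]


def urls_for_models(models, **shared_params):
    """Return {model: request path}, varying only the (assumed) model parameter."""
    return {
        model: "/v1/listen?" + urlencode({"model": model, **shared_params})
        for model in models
    }


for model, url in urls_for_models(CANDIDATE_MODELS, punctuate="true").items():
    print(model, url)
```

Holding the other parameters constant means any difference in transcript quality can be attributed to the model alone.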
Speaker Organization
When your audio has multiple speakers, use diarization (diarize=true) or multichannel processing (multichannel=true). Diarization separates speakers who share a single mono channel, while multichannel applies when each speaker is recorded on their own channel. If speakers are spread across multiple channels without a clean one-speaker-per-channel split, further analysis is needed to choose the best option.
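The rule above can be sketched as a small decision helper. This is a simplified heuristic, not Deepgram's own logic: it assumes that multichannel audio carries exactly one speaker per channel, and it only reads the channel count from WAV input using Python's standard wave module. Both function names are hypothetical.

```python
import wave


def speaker_params(channel_count: int) -> dict:
    """Pick diarize or multichannel from the number of audio channels."""
    if channel_count < 1:
        raise ValueError("channel_count must be at least 1")
    if channel_count == 1:
        # All speakers share one mono channel, so separate them by voice.
        return {"diarize": "true"}
    # Assumption: one speaker per channel. Mixed layouts need manual review.
    return {"multichannel": "true"}


def channel_count_of_wav(source) -> int:
    """Read the channel count from a WAV file path or binary file object."""
    with wave.open(source, "rb") as wav_file:
        return wav_file.getnchannels()


print(speaker_params(1))  # {'diarize': 'true'}
print(speaker_params(2))  # {'multichannel': 'true'}
```

For a file on disk, speaker_params(channel_count_of_wav("call.wav")) would yield the parameters to merge into the request, where "call.wav" is a placeholder file name.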
By considering these factors, you can select the right parameters for your audio transcription with Deepgram.