How to Choose the Right Request Parameters for Your Audio in Deepgram

Choosing the correct request parameters for your audio depends on several factors:

Audio format and encoding
Transcription features (punctuation, numerals)
Business use case
Speaker organization and number of channels

Audio Format and Encoding

For batch transcription, the audio format is inferred from the file headers, so it doesn’t need to be specified in your query parameters.

For streaming transcription, you must specify the audio format whenever you send raw, headerless audio packets, because there is no header to infer it from. Find out which codec your audio uses and pass it in the encoding query parameter.
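As a rough illustration of the streaming case, the sketch below builds a WebSocket URL that declares the format of raw audio. The endpoint address, the linear16 value, the sample_rate parameter, and the helper name streaming_url are assumptions made for this example; check the supported formats page linked below for the values that match your audio.

```python
from urllib.parse import urlencode

# Assumed WebSocket endpoint; confirm it against the Deepgram documentation.
STREAMING_URL = "wss://api.deepgram.com/v1/listen"


def streaming_url(encoding, sample_rate):
    """Build a streaming URL that declares the format of raw, headerless audio.

    Hypothetical helper: the parameter values must match the audio you send.
    """
    query = urlencode({"encoding": encoding, "sample_rate": sample_rate})
    return f"{STREAMING_URL}?{query}"


# Example values only (assumed): 16-bit linear PCM sampled at 16 kHz.
raw_pcm_url = streaming_url("linear16", 16000)
print(raw_pcm_url)
```

Audio sent inside a container with headers would not need these parameters, since the format can be read from the headers.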

Refer to these links for more details on audio content types and supported formats:

Audio Content Types (Mozilla): https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types
Deepgram Supported Audio Formats: https://developers.deepgram.com/docs/supported-audio-formats

Transcription Features

To use specific transcription features, append them as query parameters to your HTTP or WebSocket request URL. For example:

/v1/listen?numerals=true&punctuate=true&multichannel=true&search=yes
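If you assemble the URL in code, a small helper can keep the query string consistent. The following is a minimal sketch that reproduces the example above; the base URL and the helper name build_listen_url are assumptions for illustration, not part of any Deepgram SDK.

```python
from urllib.parse import urlencode

# Assumed base URL for the /v1/listen endpoint shown above.
BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(base_url, **features):
    """Return base_url with the given features appended as query parameters."""
    # Render Python booleans as lowercase "true"/"false", as in the example URL.
    params = {
        name: str(value).lower() if isinstance(value, bool) else value
        for name, value in features.items()
    }
    return f"{base_url}?{urlencode(params)}"


url = build_listen_url(
    BASE_URL, numerals=True, punctuate=True, multichannel=True, search="yes"
)
print(url)
```

Using urlencode also takes care of escaping values, such as search terms, that contain spaces or special characters.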

Business Use Case

Your business use case influences which query parameters you choose, especially the model. Test different models, such as general or phonecall, on a sample of your own audio and compare the results.

Speaker Organization

For multiple speakers, use diarization (diarize=true) or specify multichannel audio (multichannel=true). Diarization is intended for mono audio, where all speakers share one channel, while multichannel is intended for audio in which each speaker is on a separate channel. If your audio has several channels and more than one speaker on some of them, further analysis is needed to choose the best option.
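The rule of thumb above can be written down as a small decision helper. This is only a sketch: the function name speaker_params and its arguments are hypothetical, and it assumes you already know how many channels the audio has and how many speakers each channel carries.

```python
def speaker_params(channels, max_speakers_per_channel):
    """Suggest speaker-related query parameters for a given audio layout.

    Hypothetical helper. Returns a dict of query parameters, or None when
    the layout is mixed and needs further analysis.
    """
    if channels == 1:
        # Mono audio: diarization separates speakers sharing the channel.
        return {"diarize": "true"} if max_speakers_per_channel > 1 else {}
    if max_speakers_per_channel == 1:
        # One speaker per channel: transcribe each channel independently.
        return {"multichannel": "true"}
    # Several channels with shared speakers: no single obvious choice.
    return None


print(speaker_params(1, 2))  # mono recording with two speakers
print(speaker_params(2, 1))  # stereo call, one speaker per channel
print(speaker_params(2, 3))  # mixed layout
```

The None result mirrors the mixed case described above, where testing on real audio is the only reliable way to decide.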

By considering these factors, you can select the right parameters for your audio transcription with Deepgram.
