Community

Deepgram Endpointing Parameter in Speech-to-Text API

The Deepgram Speech-to-Text API offers a feature known as "endpointing" that is crucial for understanding when transcriptions should be finalized and returned after a period of silence. This article explains how to configure the endpointing parameter to optimize the flow of transcription in your applications.

Understanding Endpointing

Endpointing is a feature that determines the amount of silence required to decide that a user has paused or finished speaking. By default, Deepgram's endpointing is set to finalize the transcript after detecting 10 milliseconds of silence (endpointing=10). This enables a quick response after the completion of speech.

Configuring Endpointing

You can configure the endpointing parameter by setting it to an integer that represents the millisecond value of desired silence. For example, if you wish to wait 500 milliseconds (0.5 seconds) before finalizing the transcript, you would set the endpointing parameter to 500:

endpointing=500

This setup is useful if you expect users to pause for longer periods but still want to capture their complete thoughts before endpointing.

Disabling Endpointing

If your application's needs are such that endpointing should be disabled—meaning that transcriptions are returned at a cadence determined solely by Deepgram's chunking algorithms—you can set the parameter to false:

endpointing=false

This ensures continuous transcription without waiting for defined pauses.

Use Cases and Considerations

Short Responses: If your application anticipates brief responses, keeping the default or a lower value is beneficial for quick transcription turnaround.
Long Pauses: For situations where users may pause for emphasis or due to thought processes, increasing the endpointing value can ensure full sentence capture.
Streamlined Input: Disabling endpointing might be necessary for real-time continuous speech settings such as lectures or live events.

Conclusion

Adjust the endpointing parameter to suit the nature of your audio input to achieve an optimal balance between speed and accuracy in transcriptions. If issues persist or the system behavior seems inconsistent, reach out to your Deepgram support representative (if you have one) or visit our community for assistance: Deepgram Community.

References

Deepgram API Documentation on Endpointing