To make Deepgram's Voice Agent speak Portuguese using Cartesia's TTS, you need to select Cartesia as the speech provider and set the output language explicitly. This article outlines the necessary configuration steps and addresses potential issues when setting up the Cartesia TTS provider with a specific language.
To use Cartesia as your TTS provider within Deepgram's Voice Agent, set the provider and voice ID in the speak section of the agent configuration. Here is an example of what your configuration might look like:
{
  "type": "SettingsConfiguration",
  "audio": {
    "input": {
      "encoding": "mulaw",
      "sample_rate": 8000
    },
    "output": {
      "encoding": "mulaw",
      "sample_rate": 8000,
      "container": "none"
    }
  },
  "agent": {
    "listen": {
      "model": "nova-2"
    },
    "think": {
      "provider": {
        "type": "open_ai"
      },
      "model": "gpt-4o-mini"
    },
    "speak": {
      "provider": "cartesia",
      "voice_id": "<your_voice_id>"
    }
  }
}
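If you assemble this configuration in application code rather than a static file, a small builder can catch a leftover placeholder before the message is sent. The following is a minimal sketch, not Deepgram's SDK: the helper names `build_settings` and `to_message` are hypothetical, and the field values simply mirror the example above.

```python
import json


def build_settings(voice_id: str) -> dict:
    """Return a settings message that uses Cartesia for speech.

    Hypothetical helper: the structure mirrors the example configuration
    in this article, and nothing here calls a Deepgram or Cartesia API.
    """
    # Reject an empty value or an unreplaced "<your_voice_id>" placeholder.
    if not voice_id or voice_id.startswith("<"):
        raise ValueError("voice_id must be a real Cartesia voice ID")
    return {
        "type": "SettingsConfiguration",
        "audio": {
            "input": {"encoding": "mulaw", "sample_rate": 8000},
            "output": {
                "encoding": "mulaw",
                "sample_rate": 8000,
                "container": "none",
            },
        },
        "agent": {
            "listen": {"model": "nova-2"},
            "think": {
                "provider": {"type": "open_ai"},
                "model": "gpt-4o-mini",
            },
            "speak": {"provider": "cartesia", "voice_id": voice_id},
        },
    }


def to_message(settings: dict) -> str:
    """Serialize the settings to the JSON text sent to the agent."""
    return json.dumps(settings)
```

Calling `to_message(build_settings("your-real-voice-id"))` yields the JSON string to send as the first message of the session; how it is sent depends on the WebSocket client you use.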
When using Cartesia, it's important to specify the language explicitly; otherwise the output may default to accented English.
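Recent Deepgram updates may allow you to specify the language directly in the Voice Agent Settings configuration, which would simplify TTS setup with third-party providers like Cartesia. Check the current Voice Agent documentation to confirm whether this option is available for your account.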
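To illustrate what setting the language in the settings could look like, here is a hedged sketch that adds a language code to the speak block. The field name `language` and its placement under `agent.speak` are assumptions made for illustration; they are not confirmed by this article, so verify the exact key against the current Voice Agent settings reference. The helper `with_speak_language` is hypothetical. `pt` is the ISO 639-1 code for Portuguese.

```python
import copy


def with_speak_language(settings: dict, language: str,
                        field: str = "language") -> dict:
    """Return a copy of `settings` with a language code on agent.speak.

    ASSUMPTION: the key name ("language") and its location under
    agent.speak are hypothetical. Pass a different `field` if the
    official settings reference uses another name.
    """
    updated = copy.deepcopy(settings)  # leave the caller's dict untouched
    speak = updated.setdefault("agent", {}).setdefault("speak", {})
    speak[field] = language
    return updated


base = {
    "agent": {
        "speak": {"provider": "cartesia", "voice_id": "<your_voice_id>"}
    }
}
portuguese = with_speak_language(base, "pt")
```

Keeping the key name as a parameter means only one argument changes if the documented field turns out to be named differently.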
Keep your configuration files up to date and consult the Cartesia documentation for advanced features and settings. If issues persist, reach out to your Deepgram support representative or visit our community on Discord for assistance.