Sesame AI POC
Proof Of Concept
This demo lets you interact with an AI in two modes: voice-to-voice (speak a question and hear the answer) and text-to-speech (type a question and hear the answer).
Why each tool was added:
- Whisper (OpenAI): Converts spoken input to text. It is needed because Sesame AI currently supports only text-to-speech.
- LLaMA 3 (AWS): Acts as the brain of the demo, generating a text response to each question.
- Sesame AI (Hugging Face): Converts the AI's response back to expressive speech.
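
The three-stage flow above can be sketched as follows. This is a minimal illustration, not the demo's actual code: the names `run_voice_turn`, `TurnResult`, `transcribe`, `generate_reply`, and `synthesize` are hypothetical, and each stage is passed in as a plain callable so that no specific Whisper, LLaMA 3, or Sesame AI API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TurnResult:
    question: str  # text sent to the language model
    reply: str     # text the language model returned
    audio: bytes   # synthesized speech for the reply


def run_voice_turn(
    transcribe: Callable[[bytes], str],      # speech-to-text stage (Whisper's role)
    generate_reply: Callable[[str], str],    # response stage (LLaMA 3's role)
    synthesize: Callable[[str], bytes],      # text-to-speech stage (Sesame AI's role)
    audio_in: Optional[bytes] = None,
    text_in: Optional[str] = None,
) -> TurnResult:
    """Run one question/answer turn from either typed text or recorded audio."""
    if text_in is not None and text_in.strip():
        # Text-to-speech mode: the typed question skips transcription.
        question = text_in.strip()
    elif audio_in:
        # Voice-to-voice mode: transcribe the recording first.
        question = transcribe(audio_in).strip()
    else:
        raise ValueError("Provide either audio_in or text_in.")

    if not question:
        raise ValueError("No question could be extracted from the input.")

    reply = generate_reply(question)
    return TurnResult(question=question, reply=reply, audio=synthesize(reply))


if __name__ == "__main__":
    # Stub stages stand in for the real models (assumptions for illustration).
    result = run_voice_turn(
        transcribe=lambda audio: audio.decode("utf-8"),
        generate_reply=lambda q: f"You asked: {q}",
        synthesize=lambda text: text.encode("utf-8"),
        text_in="How much water should I drink daily?",
    )
    print(result.reply)
```

Passing the stages in as callables keeps the sketch independent of any one provider; in the real demo each callable would wrap the corresponding model call.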
Example questions you can ask:
- What are the healthiest oils to cook with?
- How much water should I drink daily?
- What are good snacks for weight loss?
Created by Kara Granados
NOTE: This demo is intended for testing purposes. Responses are slower than real time because the demo runs on free-tier Hugging Face resources. A production deployment would use dedicated infrastructure to achieve real-time performance.
Additional Info: The CSM (Conversational Speech Model) used for voice output is a large model, so loading it and generating audio can take extra time, especially on first use or after a period of inactivity.
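
One common way to confine a model's load cost to the first request is to load it lazily and cache the result. The sketch below is an assumption about how that could look, not a description of this demo's code: `_load_model` is a hypothetical stand-in that only simulates a slow load, and `get_model` is an illustrative name.

```python
import time
from functools import lru_cache


def _load_model() -> object:
    """Hypothetical stand-in for loading the speech model (simulated delay)."""
    time.sleep(0.05)
    return object()


@lru_cache(maxsize=1)
def get_model() -> object:
    """Load the model on the first call, then return the cached instance."""
    return _load_model()


if __name__ == "__main__":
    start = time.perf_counter()
    get_model()  # first call pays the load cost
    first = time.perf_counter() - start

    start = time.perf_counter()
    get_model()  # later calls reuse the cached model
    second = time.perf_counter() - start

    print(f"first call: {first:.3f}s, second call: {second:.6f}s")
```

The cache lives only as long as the process does, so if the hosting environment restarts the app after it has been idle, the next request pays the load cost again.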