Sesame AI POC

Proof of Concept
This demo lets you interact with an AI in two modes: voice-to-voice (spoken question, spoken answer) and text-to-speech (typed question, spoken answer).
Why each tool was added:

  • Whisper (OpenAI): Converts spoken input to text. It is needed because Sesame AI currently supports only text-to-speech.
  • LLaMA 3 (AWS): Acts as the brain that generates a response to your question.
  • Sesame AI (Hugging Face): Converts the AI's response back to expressive speech.
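
The three tools above form a simple chain: speech to text, text to response, response to speech. The following is a minimal sketch of that flow, not the demo's actual code. The class name VoicePipeline, its method names, and the three stand-in stage functions are all hypothetical; in a real setup each stage would wrap the corresponding model (Whisper, LLaMA 3, Sesame CSM).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains three stages: speech-to-text, text generation, text-to-speech.

    Each stage is passed in as a plain callable, so this sketch makes no
    assumption about how the real models are loaded or invoked.
    """
    transcribe: Callable[[bytes], str]   # stand-in for a Whisper wrapper (hypothetical)
    generate: Callable[[str], str]       # stand-in for a LLaMA 3 client (hypothetical)
    synthesize: Callable[[str], bytes]   # stand-in for a Sesame CSM wrapper (hypothetical)

    def ask_text(self, question: str) -> bytes:
        """Text-to-speech path: typed question in, spoken answer out."""
        answer = self.generate(question)
        return self.synthesize(answer)

    def ask_voice(self, audio: bytes) -> bytes:
        """Voice-to-voice path: transcribe first, then reuse the text path."""
        question = self.transcribe(audio)
        return self.ask_text(question)


# Fake stages so the sketch runs without any model or network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate(question: str) -> str:
    return f"Answer to: {question}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_transcribe, fake_generate, fake_synthesize)
spoken = pipeline.ask_text("How much water should I drink daily?")
```

Because the voice path only adds a transcription step in front of the text path, both modes share the same generation and synthesis stages.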

Example questions you can ask:

  • What are the healthiest oils to cook with?
  • How much water should I drink daily?
  • What are good snacks for weight loss?

Created by Kara Granados

NOTE: This demo is intended for testing purposes. Response times are long because it runs on free-tier resources on Hugging Face. A production environment would use dedicated infrastructure to ensure real-time performance.

Additional Info: The CSM (Conversational Speech Model) used for voice output is a large model and may take additional time to load and generate audio responses, especially during the first use or after inactivity.
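
One common reason the first request is slower than later ones is that a large model is loaded lazily and then kept in memory. The sketch below illustrates that pattern with a cached loader. Whether this demo loads CSM this way is an assumption, and get_speech_model, speak, and the placeholder model object are hypothetical names used only for illustration.

```python
from functools import lru_cache

load_count = 0  # counts how many times the expensive load actually runs


@lru_cache(maxsize=1)
def get_speech_model() -> dict:
    """Load the model on the first call only; later calls reuse the cached object."""
    global load_count
    load_count += 1
    # Placeholder for a slow model load. The real loader is not shown in the source.
    return {"name": "csm-placeholder"}


def speak(text: str) -> str:
    """Return a tagged string standing in for generated audio."""
    model = get_speech_model()
    return f"[{model['name']}] {text}"


first = speak("hello")   # pays the load cost
second = speak("again")  # reuses the cached model
```

If the hosting process is restarted after a period of inactivity, the cache starts empty and the next request pays the load cost again.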