Build a Voice Agent from scratch - Enkamind workshop, Chennai
Sat 22 Aug 2026 · Chennai

Ship a working voice AI agent in one afternoon.

In 3 hours you build, run, and talk to your own voice agent - speech in, LLM reasoning, voice out - and leave with the code, the architecture, and a clear sense of what it takes to get latency low enough to feel human. Built for engineers and founders who want to stop reading about voice AI and actually ship one.

🗓 Sat 22 Aug 2026 · 🕘 3 hours, 10 AM - 1 PM · 📍 Chennai, in-person · 💺 10 seats · ₹1,999, one-time
▎ What you'll build

You leave with the real thing - not notes.

  • A working real-time voice agent you can speak to and that answers back in a natural voice
  • A full STT to LLM to TTS pipeline wired up with streaming, not batch
  • Working interruption / barge-in so you can cut the agent off mid-sentence
  • At least one tool / function call mid-call
  • A runnable repo you take home plus a clear path to put it on a phone number via Twilio/SIP
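
To give a feel for what "streaming, not batch" means, here is a minimal sketch rather than the workshop's actual code. It assumes the LLM streams tokens and that TTS can be called one sentence at a time, so the first audio is ready before the full reply exists. `fake_llm_tokens` and `fake_tts` are hypothetical stand-ins, not real provider APIs.

```python
import asyncio
from typing import AsyncIterator, List

SENTENCE_END = (".", "?", "!")


async def fake_llm_tokens(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming LLM: yields tokens one at a time."""
    for token in ["Sure", ".", " Your", " table", " is", " booked", "."]:
        await asyncio.sleep(0)  # hand control back, as a network stream would
        yield token


async def sentences(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Group tokens into sentences so TTS can start before the LLM finishes."""
    buffer = ""
    async for token in tokens:
        buffer += token
        if buffer.rstrip().endswith(SENTENCE_END):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():  # flush any trailing text without punctuation
        yield buffer.strip()


async def fake_tts(sentence: str) -> bytes:
    """Hypothetical stand-in for a TTS call: returns placeholder 'audio' bytes."""
    return sentence.encode("utf-8")


async def respond(transcript: str) -> List[bytes]:
    """One turn: transcript in, audio chunks out, produced sentence by sentence."""
    audio_chunks: List[bytes] = []
    async for sentence in sentences(fake_llm_tokens(transcript)):
        audio_chunks.append(await fake_tts(sentence))
    return audio_chunks


chunks = asyncio.run(respond("book a table"))
print(chunks)
```

In a batch design the TTS call would wait for the whole reply. Here the first sentence goes to TTS as soon as its closing punctuation arrives.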
▎ The 3 hours, block by block

Hands-on the whole way.

Block 1

Anatomy of a voice agent

  • The cascading stack: STT (Deepgram/Whisper) to LLM to TTS (ElevenLabs/Cartesia), and the speech-to-speech alternative (OpenAI Realtime)
  • Where latency hides - time-to-first-token, time-to-first-audio
  • Get accounts, keys, and a starter repo running
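
A rough sketch of how time-to-first-token can be measured, assuming the model or TTS response arrives as a Python iterable of chunks. `slow_stream` is a hypothetical stand-in for a real streaming response, and the sleeps are placeholders, not measurements of any provider.

```python
import time
from typing import Iterable, Iterator, Tuple


def time_to_first(stream: Iterable[str]) -> Tuple[float, float, int]:
    """Consume a stream and return (seconds to first chunk, total seconds, chunk count)."""
    start = time.perf_counter()
    first = None
    count = 0
    for _ in stream:
        if first is None:
            first = time.perf_counter() - start  # latency the user actually feels
        count += 1
    total = time.perf_counter() - start
    # An empty stream has no first chunk; fall back to the total time.
    return (first if first is not None else total), total, count


def slow_stream() -> Iterator[str]:
    """Hypothetical streaming response with artificial delays."""
    time.sleep(0.05)
    yield "Hello"
    time.sleep(0.05)
    yield " there"


first, total, n = time_to_first(slow_stream())
print(f"first chunk after {first:.3f}s, done after {total:.3f}s, {n} chunks")
```

The same wrapper works for time-to-first-audio if the stream yields audio chunks instead of text.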
Block 2

Build the pipeline

  • Stand up streaming STT to LLM to TTS with Pipecat or LiveKit Agents
  • Get turn-taking and interruption working with VAD and endpointing
  • Add one function / tool call
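
Endpointing and barge-in can be sketched as below. This is a simplified illustration, not how Pipecat or LiveKit Agents implement it. It assumes a VAD emits one speech probability per fixed-size audio frame; the 30 ms frame, 0.5 threshold, and 300 ms silence window are assumed values, and both function names are hypothetical.

```python
from typing import Iterable, Optional


def find_end_of_turn(
    speech_probs: Iterable[float],
    frame_ms: int = 30,
    threshold: float = 0.5,
    silence_ms: int = 300,
) -> Optional[int]:
    """Return the frame index where the user's turn is judged over, else None.

    The turn ends once speech has been heard and is followed by at least
    `silence_ms` of consecutive frames below `threshold`.
    """
    needed = max(1, silence_ms // frame_ms)  # consecutive silent frames required
    heard_speech = False
    silent_run = 0
    for i, prob in enumerate(speech_probs):
        if prob >= threshold:
            heard_speech = True
            silent_run = 0  # any speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= needed:
                return i
    return None


def should_interrupt(agent_speaking: bool, speech_prob: float, threshold: float = 0.5) -> bool:
    """Barge-in rule: user speech while the agent is talking cuts the agent off."""
    return agent_speaking and speech_prob >= threshold


# Speech, a short pause, more speech, then 10 silent frames (300 ms at 30 ms/frame).
probs = [0.1, 0.9, 0.8, 0.2, 0.9] + [0.1] * 10
end_index = find_end_of_turn(probs)
print(end_index)
```

A shorter silence window makes the agent respond faster but raises the chance of cutting in during a natural pause, which is the core turn-taking trade-off.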
Block 3

Make it real and ship-aware

  • Walk through, conceptually, how to put the agent on a phone number via Twilio/SIP
  • Build vs buy - frameworks like Pipecat/LiveKit vs turnkey platforms like Vapi/Retell
  • Debugging, measuring latency, what breaks in production, Q&A
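
When measuring latency, tail percentiles usually say more than an average, because a few slow turns are what make an agent feel broken. Below is a small sketch using the nearest-rank method. The sample values are made-up placeholders, not measurements, and `percentile` and `summarize` are hypothetical helper names.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize per-turn response latencies in milliseconds."""
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "max": max(samples_ms),
    }


# Placeholder per-turn latencies in ms (illustrative only, not real results).
turn_latencies_ms = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0, 1000.0]
summary = summarize(turn_latencies_ms)
print(summary)
```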

Who it's for

  • Software engineers who can read and write code and want hands-on voice AI, not slides
  • Founders evaluating whether to build a voice product in-house or buy
  • Anyone shipping conversational or phone-based AI who needs to understand real latency and turn-taking trade-offs

What to bring

  • A laptop with Python 3.10+ (or Node), a terminal, and a code editor
  • A working microphone and headphones - headphones prevent echo during testing
  • API keys created beforehand - an LLM key, an STT key (Deepgram), a TTS key (ElevenLabs or Cartesia); most have free trial credit
  • Optional - a Twilio account for the phone-number portion
▎ By the end

What's true when you walk out.

  • You have a voice agent running on your own machine that you can hold a real back-and-forth conversation with
  • You understand cascading vs speech-to-speech architectures and can reason about which to use
  • You can name where latency comes from and the concrete levers to control it
▎ Tools you'll touch
Pipecat · LiveKit Agents · OpenAI Realtime API · Deepgram · Whisper · ElevenLabs · Cartesia · Silero VAD · Twilio/SIP · Python
▎ Who teaches

Your instructor

Workshop instructor

I've spent 22+ years building software for enterprises - full-stack apps, backend systems, and lately RAG pipelines and agentic AI solutions. I've shipped the hard stuff for big companies. These workshops are that experience, distilled into one hands-on room so you can ship your own.

▎ Questions

Before you sign up.

Do I need machine learning experience?

No. If you can clone a repo, run a command, and read code, you can keep up. We use hosted models via APIs - no model training required.

Will the agent actually work or is it a toy demo?

It is a real, runnable pipeline with streaming and interruption - the same architecture production agents use. It will not be hardened for scale in 3 hours, but it is an honest foundation and you keep the code.

Why build from scratch when platforms like Vapi exist?

To understand what is actually happening so you can make an informed build-vs-buy call. We cover where turnkey platforms win and where building wins.

Build a Voice Agent from scratch

Sat 22 Aug 2026 · 3 hours · 10 AM - 1 PM · Chennai · 10 seats. Drop your email and we'll tell you the moment booking opens.