Introduction
VoiceLayer is a push-to-talk voice transcription SDK that adds voice input to any web application in two lines of code. Focus an input, hold a key, speak — your words appear instantly.
What is VoiceLayer?
VoiceLayer wraps browser microphone access, streams audio over a WebSocket to Deepgram's Nova-3 model, and injects the live transcript directly into whatever input the user is focused on. It works on any <input>, <textarea>, or contenteditable element — no configuration needed.
On mobile, a tap-to-speak pill appears automatically. On desktop, users hold the spacebar (or click the pill). Transcription is streaming — words appear as they're spoken, not after a delay.
How it works
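The injection step described above can be sketched in a few lines. This is an illustration only, not VoiceLayer's actual source. It assumes the backend forwards both interim (still revisable) and final transcript results, which is how streaming recognizers commonly behave. The `TranscriptEvent` and `FieldState` shapes and the `applyTranscript` function are hypothetical names introduced for this example.

```typescript
// Hypothetical shapes: VoiceLayer's real message format and internals may differ.
interface TranscriptEvent {
  text: string;     // words recognized so far for the current utterance
  isFinal: boolean; // true once the recognizer will no longer revise this text
}

interface FieldState {
  value: string;         // current contents of the focused input
  caret: number;         // index where the current utterance starts
  interimLength: number; // length of the uncommitted text sitting at the caret
}

// Apply one streaming transcript event to the field.
// Interim text is overwritten by the next event; final text is committed
// and the caret moves past it so the next utterance starts after it.
function applyTranscript(state: FieldState, event: TranscriptEvent): FieldState {
  const before = state.value.slice(0, state.caret);
  const after = state.value.slice(state.caret + state.interimLength);
  const value = before + event.text + after;

  if (event.isFinal) {
    return { value, caret: state.caret + event.text.length, interimLength: 0 };
  }
  return { value, caret: state.caret, interimLength: event.text.length };
}

// Example: the user holds the key and says "hello world" into a field
// that already contains "Dear team, " with the caret at the end.
const exampleEvents: TranscriptEvent[] = [
  { text: "hello", isFinal: false },
  { text: "hello world", isFinal: false },
  { text: "Hello world. ", isFinal: true },
];

const exampleResult = exampleEvents.reduce(applyTranscript, {
  value: "Dear team, ",
  caret: 11,
  interimLength: 0,
});

console.log(exampleResult.value);
```

Keeping this logic as a pure function over plain state makes it easy to test without a microphone or network connection. In a browser, the resulting `value` and `caret` would then be written back to the focused <input>, <textarea>, or contenteditable element.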
Key features
- Works on any input — drop the script tag and every <input> and <textarea> on the page gets voice input automatically
- Streaming transcription — words appear as you speak using Deepgram Nova-3, not after you stop
- Mobile + desktop — tap-to-speak on mobile, hold-to-speak on desktop, both work simultaneously
- Framework agnostic — works with React, Next.js, Vue, plain HTML — anything that runs in a browser
- Open source — MIT licensed, self-hostable, auditable
- 2KB SDK — tiny footprint, no dependencies in the browser bundle
Open source
The SDK and server are MIT licensed and available on GitHub. You can self-host the entire stack with your own Deepgram key, or use our hosted backend for zero-config setup.
Next steps
- Quickstart — add voice input to your app in 2 minutes
- SDK Reference — full API documentation
- Self-Hosting — run your own VoiceLayer backend