## Overview
EchoLingua AI is a sophisticated web application engineered for real-time simultaneous interpretation. By leveraging the low-latency capabilities of Google's Gemini Live API, the application bridges language barriers instantly while offering a dedicated writing lab for granular text critique.
The user experience is built upon a "Thumb UI" philosophy, anchoring critical controls to the bottom of the viewport for optimal one-handed mobile interaction.
## Core Features

### Dual-Voice Interpreter
- Simultaneous Live API Interpretation
- Bi-Directional Flow (no toggle needed)
- Raw PCM Audio Processing (16 kHz in / 24 kHz out)
### Writing & Pronunciation Lab
- Schema-Enforced Granular Analysis
- IPA Transcriptions & Error Logic
- Neural Text-to-Speech Playback
## Technical Stack
| Category | Technology | Details |
|---|---|---|
| Frontend | React 19 | Built with TypeScript for type safety. |
| Styling | Tailwind CSS | Utility-first styling framework. |
| SDK | Google GenAI | @google/genai integration. |
| Audio | Web Audio API | `AudioContext` & `ScriptProcessorNode` for the PCM stream. |
## Architecture Pipeline

### Input Processing
Microphone audio is captured via `getUserMedia`, downsampled to 16 kHz, and converted to raw 16-bit integer PCM. The resulting chunks are streamed to the Live API over a WebSocket connection.
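The downsample-and-convert step can be sketched as two pure helpers. The names (`downsample`, `floatTo16BitPCM`) and the naive decimation (dropping samples without a low-pass filter) are illustrative simplifications, not the app's actual implementation:

```typescript
// Naive decimation from the capture rate (e.g. 48 kHz) down to the 16 kHz
// the Live API expects. A production path would low-pass filter first to
// avoid aliasing; this sketch simply picks every Nth sample.
function downsample(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  const ratio = fromRate / toRate;
  const out = new Float32Array(Math.floor(input.length / ratio));
  for (let i = 0; i < out.length; i++) {
    out[i] = input[Math.floor(i * ratio)];
  }
  return out;
}

// Map Web Audio's [-1, 1] floats onto signed 16-bit integers for the wire.
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i]));
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}
```

Each `Int16Array` chunk can then be sent over the socket as binary or base64, depending on the transport framing.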
### Output Rendering
The model returns base64-encoded 24 kHz PCM data. The frontend decodes it into a `Float32Array` and schedules playback through an `AudioBufferSourceNode`, queuing chunks back-to-back for gapless audio.
A simplified sketch of the pipeline (`convertToPCM`, `decodeBase64`, and `playAudio` stand in for the real helpers):

```js
// Send side: each microphone chunk is converted and streamed immediately.
const processAudio = (chunk) => {
  // Downsample to 16 kHz and convert to 16-bit PCM
  const pcmData = convertToPCM(chunk);
  // Stream via WebSocket
  socket.send(pcmData);
};

// Receive side: responses arrive asynchronously as socket messages.
socket.addEventListener('message', ({ data }) => {
  playAudio(decodeBase64(data.audio));
});
```
## Installation
```bash
# Clone the repository
git clone https://github.com/dovvnloading/EchoLingua.git
cd echolingua-ai

# Install dependencies
npm install

# Start development server
npm start
```
## Environment Configuration
Create a .env file in the root directory:
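For example (the variable name here is an assumption; use whatever name your client initialization actually reads):

```env
# Hypothetical variable name — match the key your GoogleGenAI client reads
API_KEY=your_gemini_api_key_here
```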
## Usage Guide

### Interpreter Mode
1. Navigate to the Interpreter tab.
2. Designate the two active languages.
3. Activate the microphone to open the WebSocket session.
4. Speak freely; the system auto-detects which language is being spoken.
5. Deactivate the microphone to end the session.
### Writing Lab
1. Navigate to the Writing Lab tab.
2. Select the target language from the dropdown.
3. Enter text into the drafting area.
4. Select Review to run the Gemini Flash analysis.
5. Click the speaker icon for neural TTS playback.
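The schema-enforced analysis behind the Review step can be sketched as a response schema handed to the model's structured-output configuration. All field names below are hypothetical, not the app's actual contract:

```typescript
// Hypothetical shape of a Writing Lab review response. Enforcing a schema
// like this makes the model return machine-parseable critique instead of prose.
const reviewSchema = {
  type: 'object',
  properties: {
    corrections: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          original: { type: 'string' },    // the flagged span from the draft
          suggestion: { type: 'string' },  // proposed rewrite
          explanation: { type: 'string' }, // why it was flagged
          ipa: { type: 'string' },         // IPA transcription for pronunciation
        },
      },
    },
    overallFeedback: { type: 'string' },   // summary comment on the draft
  },
};
```

The UI can then render each `corrections` entry as a discrete annotation rather than parsing free-form model output.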