A custom web application that helps the client analyze their audio mixes by extracting key audio metrics and translating them into actionable mixing feedback.
Role: Lead Developer
Client: Private Client (Freelance)
Duration: 2 months
Collaborators: Xochilt Cojal (UI/UX Design), Hany Miller (Audio Engineer Consultant)
Tech Stack: Python, FastAPI, Librosa, PyLoudnorm, Pydub, NumPy, TypeScript, React, Groq API, SlowAPI
The client needed a tool to speed up his mixing analysis workflow. While commercial solutions exist, they either provide too much raw data without interpretation, or generate generic feedback that doesn't align with professional mixing standards.
The client wanted a custom solution: an internal tool that would serve as a first-pass analysis assistant, helping identify mix issues faster than manual evaluation.

I built MixScope as a full-stack web application with a Python backend for audio processing and a React frontend for visualization and interaction.
The backend accepts MP3, WAV, OGG, and FLAC files up to 120MB. Using Pydub, files are converted to WAV format in-memory to ensure consistent processing; the system then extracts key features using Librosa and PyLoudnorm.

Raw audio metrics are sent to the Groq API (GPT-OSS-120B model) with a carefully engineered prompt developed in collaboration with the client. The prompt enforces structured JSON output, professional terminology, genre-aware interpretation, and specific processing recommendations.
The LLM generates comprehensive reports covering loudness/dynamics analysis, spectral balance, stereo imaging, identified strengths/weaknesses, actionable suggestions, and detailed processing recommendations like EQ frequencies and compression ratios.

I implemented async/await patterns in FastAPI, using asyncio.to_thread to run CPU-bound audio analysis in worker threads without blocking the event loop. For reference comparisons, both tracks are analyzed in parallel using asyncio.gather.
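The pattern looks roughly like this (with a sleep standing in for the actual Librosa/PyLoudnorm work):

```python
import asyncio
import time

def heavy_analysis(name: str) -> dict:
    """Stand-in for CPU-bound feature extraction."""
    time.sleep(0.2)  # simulate number crunching
    return {"track": name, "ok": True}

async def analyze_pair(mix: str, reference: str) -> list[dict]:
    # Each analysis runs in a worker thread; gather awaits them concurrently,
    # so two tracks take roughly as long as one and the server stays responsive.
    return await asyncio.gather(
        asyncio.to_thread(heavy_analysis, mix),
        asyncio.to_thread(heavy_analysis, reference),
    )
```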
The backend includes production-ready security: rate limiting (5 requests/minute via SlowAPI), CORS configuration, file size validation, MIME type checking, and comprehensive security headers.

Working with the UI/UX designer, I built a React interface with a 3-step upload workflow, tabbed analysis sections (Overview, Loudness & Dynamics, Stereo/Spectral, Suggestions, Reference Comparison), interactive metric cards with hover tooltips, and frequency spectrum visualization using Recharts with logarithmic scaling and smooth interpolation.

The initial implementation loaded and resampled audio files multiple times for different analysis stages, resulting in 60+ second processing times. I refactored the pipeline to decode and resample each file once, reusing the in-memory audio across every analysis stage.
Combined with async processing via asyncio.to_thread, this reduced average analysis time by approximately 30%, bringing typical tracks under 30 seconds.
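The core idea of the refactor, in simplified form: every metric is derived from the same decoded array instead of re-reading the file (the metrics shown are a reduced example set):

```python
import numpy as np

def analyze_once(samples: np.ndarray) -> dict:
    """Decode once, then derive all metrics from the same array."""
    peak = float(np.max(np.abs(samples)))
    rms = float(np.sqrt(np.mean(samples ** 2)))
    # Crest factor: peak-to-RMS ratio in dB, a basic dynamics indicator.
    crest_db = 20 * np.log10(peak / rms) if rms > 0 else float("inf")
    return {"peak": peak, "rms": rms, "crest_factor_db": crest_db}
```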

Early LLM outputs were inconsistent, sometimes missing key analysis sections or using markdown formatting that broke the JSON parser. I solved this through strict JSON schema definition, explicit anti-markdown instructions, response validation with fallbacks, and iterative prompt refinement based on the client's feedback on real tracks.
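The validation-with-fallback idea can be sketched like this, assuming a hypothetical fallback payload; the real validation also checks for required sections:

```python
import json
import re

FALLBACK = {"error": "analysis unavailable", "sections": {}}

def parse_llm_report(raw: str) -> dict:
    """Strip accidental markdown fences, then parse; fall back rather
    than crash when the model ignores formatting instructions."""
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    try:
        report = json.loads(cleaned)
    except json.JSONDecodeError:
        return dict(FALLBACK)
    return report if isinstance(report, dict) else dict(FALLBACK)
```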

Translating audio engineering concepts into code was challenging. For example, implementing perceptual stereo width analysis required understanding how different frequency bands contribute to perceived width, then weighting them accordingly (Sub: 0.05, Bass: 0.1, Low-mids: 0.25, Mids: 0.4, High-mids: 0.7, Air: 0.9). This required constant collaboration with the client.
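With the band weights above, the aggregation reduces to a weighted average. This sketch glosses over how each band's raw width is measured (e.g. from side/mid energy), showing only the weighting step:

```python
# Band weights from the project: how strongly each frequency band
# contributes to perceived stereo width.
BAND_WEIGHTS = {
    "sub": 0.05, "bass": 0.10, "low_mids": 0.25,
    "mids": 0.40, "high_mids": 0.70, "air": 0.90,
}

def perceptual_width(band_widths: dict[str, float]) -> float:
    """Weighted average of per-band widths (0 = mono, 1 = fully wide)."""
    num = sum(BAND_WEIGHTS[band] * w for band, w in band_widths.items())
    den = sum(BAND_WEIGHTS[band] for band in band_widths)
    return num / den
```

The normalization by the weight sum keeps the result in the same 0-1 range as the inputs, so a wide high end moves the score far more than a wide sub-bass.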

The final system processes typical 3-4 minute tracks in 20-30 seconds, with reference comparisons adding minimal overhead thanks to parallel processing. This met the client's requirement for a tool usable during active mixing sessions.
MixScope now provides accurate loudness measurements (LUFS), detailed stereo imaging analysis with per-band correlation, frequency balance assessment across 6 bands, and dynamic range evaluation, all with specific processing recommendations.
The tool is now an active part of the client's production workflow, providing a second opinion backed by objective measurements that helps catch issues early and validates mixing decisions.
This was a private commission, so the codebase is not publicly available. Contact me for more details about the technical implementation.
Client collaboration is essential. Working closely with the audio engineer throughout development helped me understand which metrics actually matter in professional mixing. Regular testing with real projects ensured practical utility, not just technical accuracy.
Optimization requires trade-offs. Not every audio feature needs extraction. Focusing on metrics that genuinely inform mixing decisions significantly improved performance.
LLM prompt engineering is an art. Getting consistent, high-quality output required as much work as the audio pipeline itself. I learned to be extremely specific about format, use examples, and iteratively refine based on failure cases.
Async processing matters. Moving CPU-bound operations to separate threads with asyncio.to_thread kept the API responsive. For reference comparisons, parallel processing nearly halved total processing time.

While MixScope currently meets the client's workflow needs, potential improvements discussed include:
Training a custom model on a set of professionally mixed tracks could provide even more accurate genre-specific feedback and better capture the nuances of different mixing styles.
Implementing preset profiles for different genres (electronic, rock, classical) could adjust evaluation criteria and recommendations based on style-specific standards.
Adding user accounts and project tracking would allow engineers to see mix evolution over revisions and compare different versions of the same track.
Implementing graphs and visualizations for key metrics like frequency spectrum, stereo field, and loudness evolution would improve data comprehension.