Audio-Language Reasoning Agent
- Engineered a multimodal retrieve-then-reason pipeline that embeds audio and text in a shared CLAP embedding space and retrieves nearest neighbors with FAISS vector search.
- Routed retrieved context to Qwen2.5-3B for zero-shot audio classification, achieving 94% top-1 accuracy on ESC-50.
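
A minimal sketch of the retrieve-then-reason flow described above, not the project's actual code. It assumes unit-normalized embeddings compared by cosine similarity. A brute-force search stands in for the FAISS index, and hand-written toy vectors stand in for CLAP embeddings. The names `TOY_INDEX`, `top_k`, and `build_prompt` are hypothetical, and the final call to the language model is omitted.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical toy "embeddings"; a real pipeline would use CLAP audio/text vectors.
TOY_INDEX: List[Tuple[str, List[float]]] = [
    ("dog bark", [0.9, 0.1, 0.0]),
    ("rain", [0.0, 0.2, 0.9]),
    ("siren", [0.1, 0.9, 0.1]),
    ("dog bark", [0.8, 0.2, 0.1]),
]


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(
    query: Sequence[float],
    index: Sequence[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Brute-force cosine search (assumed stand-in for a FAISS inner-product index)."""
    q = l2_normalize(query)
    scored = [
        (label, sum(a * b for a, b in zip(q, l2_normalize(vec))))
        for label, vec in index
    ]
    # Highest similarity first, then keep the k best matches.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(
    neighbors: Sequence[Tuple[str, float]],
    candidate_labels: Sequence[str],
) -> str:
    """Format retrieved neighbors as text context for a language model."""
    context = "\n".join(
        f"- {label} (similarity {score:.2f})" for label, score in neighbors
    )
    options = ", ".join(candidate_labels)
    return (
        "Retrieved reference clips:\n"
        f"{context}\n"
        f"Choose the single best label for the query clip from: {options}."
    )


if __name__ == "__main__":
    neighbors = top_k([0.85, 0.15, 0.05], TOY_INDEX, k=2)
    print(build_prompt(neighbors, ["dog bark", "rain", "siren"]))
```

In this sketch the language model only sees text (neighbor labels and similarity scores), so the retrieval step carries all of the audio information.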