Generative AI-Powered Framework for Audio Analysis and Conversational Exploration
DOI:
https://doi.org/10.63278/1425
Keywords:
Audio Interpretation, Generative AI, Large Language Models, CNN, Transformer, Spectrogram, Multimodal Fusion, Interactive AI.
Abstract
This paper introduces a hybrid deep learning system for complex audio interpretation and subsequent conversational interaction. It combines Convolutional Neural Networks (CNNs) operating on spectrograms with transformer-based Large Language Models (LLMs). The system takes raw audio signals as input, converts them into spectrograms, extracts high-level features with CNNs, and fuses these features with LLM-produced embeddings to add semantic understanding and support contextual discussion. A multimodal attention technique bridges the gap between audio and language, making meaningful, context-aware responses possible. Target applications include intelligent assistants, education, and intelligent monitoring, among others. The work is accompanied by a GitHub repository. Experimental evaluation shows improved performance over the state of the art in both experiments, with an accuracy of 93.8%, a latency of 420 ms, and high semantic coherence (a BLEU score of 0.74). These results indicate that the proposed system can offer user-friendly, intelligent audio exploration.
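The pipeline described in the abstract (audio signal, spectrogram, feature extraction, attention-based fusion with language embeddings) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, frame sizes, embedding width, the random linear projection standing in for the CNN encoder, the random vectors standing in for LLM embeddings, and the choice of text tokens as attention queries are all assumptions made for illustration.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)
    return np.log1p(magnitude)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_emb, audio_feat, w_q, w_k, w_v):
    """Scaled dot-product attention: text tokens (queries) attend over audio frames."""
    q = text_emb @ w_q
    k = audio_feat @ w_k
    v = audio_feat @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (n_tokens, n_frames)
    return weights @ v, weights

rng = np.random.default_rng(0)

# One second of a 440 Hz tone at an assumed 8 kHz sample rate.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440.0 * t)
spec = log_spectrogram(audio)

# Hypothetical stand-ins: a random projection replaces the CNN encoder,
# and random vectors replace LLM token embeddings.
d_model = 16
w_audio = rng.standard_normal((spec.shape[1], d_model)) * 0.1
audio_feat = spec @ w_audio                      # (n_frames, d_model)
text_emb = rng.standard_normal((5, d_model))     # 5 hypothetical tokens

w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(3))
fused, weights = cross_attention(text_emb, audio_feat, w_q, w_k, w_v)
```

Each row of `weights` is a probability distribution over audio frames for one text token, and `fused` holds one audio-informed vector per token. In a trained system the projections would be learned rather than random.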
License
Copyright (c) 2025 Purshottam J. Assudani, Balakrishnan P, A. Anny Leema, Rajesh K Nasare

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their published articles online (e.g., in institutional repositories or on their website, social networks like ResearchGate or Academia), as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).



