Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
About me
This is a page not in the main menu
Published:
I’ve been looking into getting started with transformers for speech. I’ve been doing some reading and attended a talk, where I learned that most articles use HuBERT as the encoder.
Published:
I was checking out a few repositories for language translation and came across the following set of keywords, which got me more interested in checking these out …
Published:
ASR, or automatic speech recognition, is a technology that aims to convert spoken utterances into a textual representation such as words, syllables, or phonemes. Speech recognition technology involves three models: the lexicon model which understands how words are pronounced, the acoustic model which analyzes speech patterns, and the language model which predicts word sequences. These models work together in decoding to produce accurate transcriptions of spoken language.
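As a rough illustration of how the three models can cooperate during decoding, here is a toy sketch. Everything in it is an assumption made up for the example: the tiny lexicon, the bigram log-probabilities, the per-phoneme match probabilities, and the helper names. Real systems search over lattices of hypotheses rather than a fixed candidate list.

```python
import math

# Hypothetical lexicon model: word -> phoneme sequence.
LEXICON = {
    "write": ["r", "ay", "t"],
    "right": ["r", "ay", "t"],   # homophone of "write"
    "ride": ["r", "ay", "d"],
    "a": ["ah"],
    "letter": ["l", "eh", "t", "er"],
}

# Hypothetical language model: bigram log-probabilities.
BIGRAM_LOGP = {
    ("<s>", "write"): -1.0, ("write", "a"): -0.5,
    ("<s>", "right"): -1.2, ("right", "a"): -4.0,
    ("<s>", "ride"): -2.0, ("ride", "a"): -1.0,
    ("a", "letter"): -1.0,
}
UNSEEN_LOGP = -6.0  # assumed penalty for bigrams not in the table


def to_phonemes(words):
    """Expand a word sequence into phonemes using the lexicon."""
    return [p for w in words for p in LEXICON[w]]


def acoustic_logp(phonemes, observed):
    """Toy acoustic score: reward phonemes that match what was heard."""
    if len(phonemes) != len(observed):
        return -math.inf
    return sum(math.log(0.9) if p == o else math.log(0.02)
               for p, o in zip(phonemes, observed))


def language_logp(words):
    """Sum bigram log-probabilities, starting from the <s> token."""
    history = ["<s>"] + words[:-1]
    return sum(BIGRAM_LOGP.get((prev, w), UNSEEN_LOGP)
               for prev, w in zip(history, words))


def decode(observed, candidates):
    """Pick the candidate with the best combined acoustic + language score."""
    def total(sentence):
        words = sentence.split()
        return acoustic_logp(to_phonemes(words), observed) + language_logp(words)
    return max(candidates, key=total)


OBSERVED = ["r", "ay", "t", "ah", "l", "eh", "t", "er"]
CANDIDATES = ["write a letter", "right a letter", "ride a letter"]
best = decode(OBSERVED, CANDIDATES)
```

In this sketch "write" and "right" get the same acoustic score because the lexicon gives them the same phonemes, so the language model is what separates them.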
Published:
Whisper is a cutting-edge speech recognition model released by OpenAI in September 2022. Its primary purpose is to convert audio files into text with remarkable accuracy, supporting up to 99 languages, including Japanese. The model was trained through a technique called weakly supervised learning, leveraging a vast dataset of 680,000 hours of speech. This approach enabled the model to surpass the accuracy of models trained only on traditional academic datasets.
Short description of portfolio item number 1
Short description of portfolio item number 2
Published in INTERSPEECH 2023, Dublin, Ireland, 2023
Recommended citation: Baghel, S., Ramoji, S., Sidharth, H, R., Singh, P., Jain, S., Roy Chowdhuri, P., Kulkarni, K., Padhi, S., Vijayasenan, D., Ganapathy, S. (2023) The DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments. Proc. INTERSPEECH 2023, 3562-3566, doi: 10.21437/Interspeech.2023-2367
Published in arXiv, 2023
This paper is a preprint.
Recommended citation: Baghel S, Ramoji S, Jain S, Chowdhuri PR, Singh P, Vijayasenan D, Ganapathy S. Summary of the DISPLACE Challenge 2023--DIarization of SPeaker and LAnguage in Conversational Environments. arXiv preprint arXiv:2311.12564. 2023 Nov 21. https://arxiv.org/abs/2311.12564
Published:
I am a huge fan of Python’s platform independence, which allows us to call shell scripts or run Python bindings of different frameworks built on C++, among other things. Recently, I organized a talk with the help of the Mozilla Campus Club BVP, where I provided a basic introduction to logic building and object-oriented programming (OOP) for freshers at BVUCOEP.
Published:
Published:
Slides. An app that includes expense tracking and budget forecasting. A formulated algorithm provides a financial assessment based on the budget portfolio. Stock recommendations are benchmarked against different forecasting models and are made on the basis of previous purchases. Apart from transactions, our app aims to provide a comprehensive view of the wallet, such as daily profits, value trends, portfolio distribution, net worth, etc. We would like to mention the chatbot assistance research provided by team @RASA and @PARLAI by META. This research provides best NLU practices in the business arena.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.