Alnur Ismail - Founder, Advisor, Investor

Illustration representing gaps in model availability

Phoneme-level ASR models

May 2, 2024 • 1 min read

ASR phoneme-recognition

What I worked on

Searched for SpeechBrain models capable of phoneme-level recognition.

What I noticed

asr-transformer-phn-librispeech doesn’t exist.
Current SpeechBrain models mainly target word- or character-level transcriptions.
A workaround is mapping text outputs to phonemes manually using a phoneme dictionary.
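
The dictionary workaround can be sketched roughly as follows. This is a minimal illustration, not a SpeechBrain feature: `PHONEME_DICT`, `UNKNOWN`, and `text_to_phonemes` are hypothetical names, and the three ARPAbet entries are a toy stand-in for a full pronunciation lexicon such as the CMU Pronouncing Dictionary. The input is assumed to be the plain text transcript produced by a word-level ASR model.

```python
import re

# Toy ARPAbet lexicon (CMUdict-style). Assumption: a real setup would load
# a complete pronunciation dictionary instead of these three entries.
PHONEME_DICT = {
    "speech": ["S", "P", "IY1", "CH"],
    "brain": ["B", "R", "EY1", "N"],
    "model": ["M", "AA1", "D", "AH0", "L"],
}

# Placeholder emitted for words missing from the lexicon.
UNKNOWN = "<unk>"


def text_to_phonemes(text, lexicon=PHONEME_DICT):
    """Map an ASR text transcript to a flat list of phoneme symbols."""
    # Lowercase and keep only alphabetic word tokens (apostrophes allowed).
    words = re.findall(r"[a-z']+", text.lower())
    phonemes = []
    for word in words:
        # Out-of-vocabulary words become a single UNKNOWN token.
        phonemes.extend(lexicon.get(word, [UNKNOWN]))
    return phonemes


if __name__ == "__main__":
    print(text_to_phonemes("Speech brain model"))
```

One caveat with this approach: the lookup returns the dictionary's canonical pronunciation of each recognized word, not the sounds that were actually spoken, so it cannot surface mispronunciations or accent variation the way a true phoneme-level model could.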

“Aha” Moment

n/a

What still feels messy

There are no official pre-trained phoneme-level models, and it is unclear which ASR checkpoints expose phoneme tokens.

Next step

Search Hugging Face for phoneme-compatible models or create a small fine-tuning setup.