Phoneme-level ASR models
What I worked on
Searched for SpeechBrain models capable of phoneme-level recognition.
What I noticed
- asr-transformer-phn-librispeech doesn’t exist.
- Current SpeechBrain models mainly target word- or character-level transcriptions.
- A workaround is mapping text outputs to phonemes manually using a phoneme dictionary.
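The dictionary workaround from the last bullet could look roughly like the sketch below. It is only an illustration: the three-word lexicon is hand-written in ARPAbet-style symbols, and the names `PHONEME_DICT`, `UNKNOWN`, and `text_to_phonemes` are made up for this note, not part of SpeechBrain. A real setup would load a full pronunciation dictionary instead.

```python
import re

# Hypothetical mini-lexicon (ARPAbet-style symbols, stress markers omitted).
# Assumption: a real setup would load a full pronunciation dictionary here.
PHONEME_DICT = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
    "speech": ["S", "P", "IY", "CH"],
}

# Placeholder emitted for words that are missing from the lexicon.
UNKNOWN = "<unk>"


def text_to_phonemes(transcript, lexicon=PHONEME_DICT):
    """Map a word-level ASR transcript to a flat list of phoneme symbols."""
    # Lowercase and keep only letter/apostrophe runs, dropping punctuation.
    words = re.findall(r"[a-z']+", transcript.lower())
    phonemes = []
    for word in words:
        # Out-of-vocabulary words become a single <unk> token.
        phonemes.extend(lexicon.get(word, [UNKNOWN]))
    return phonemes


if __name__ == "__main__":
    print(text_to_phonemes("Hello, world!"))
```

Note that this returns the dictionary pronunciation of each recognized word, not the sounds that were actually spoken, so it cannot surface mispronunciations the way a true phoneme-level model could.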
“Aha” Moment
n/a
What still feels messy
There are no official phoneme-level pre-trained models, and it is unclear which ASR checkpoints expose phoneme tokens.
Next step
Search Hugging Face for phoneme-compatible models or create a small fine-tuning setup.