Phoneme-level ASR models

What I worked on

Searched for SpeechBrain models capable of phoneme-level recognition.

What I noticed

  • asr-transformer-phn-librispeech doesn’t exist.
  • Current SpeechBrain models mainly target word- or character-level transcriptions.
  • A workaround is mapping text outputs to phonemes manually using a phoneme dictionary.
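The dictionary workaround could look roughly like the sketch below. It assumes a CMUdict-style lexicon in ARPAbet notation; the three-entry `LEXICON`, the `words_to_phonemes` helper, and the `<unk>` placeholder are all hypothetical names made up for illustration, not part of SpeechBrain.

```python
# Toy pronunciation lexicon (ARPAbet, stress digits omitted).
# Assumption: a real setup would load a full dictionary such as CMUdict.
LEXICON = {
    "the": ["DH", "AH"],
    "cat": ["K", "AE", "T"],
    "sat": ["S", "AE", "T"],
}


def words_to_phonemes(transcript, lexicon, unk="<unk>"):
    """Map a word-level transcript to a flat list of phonemes.

    Words missing from the lexicon are replaced by the `unk` marker.
    """
    phonemes = []
    for word in transcript.lower().split():
        # Strip surrounding punctuation so "sat." matches "sat".
        word = word.strip(".,!?;:\"'")
        if not word:
            continue
        phonemes.extend(lexicon.get(word, [unk]))
    return phonemes


print(words_to_phonemes("The cat sat.", LEXICON))
# ['DH', 'AH', 'K', 'AE', 'T', 'S', 'AE', 'T']
```

Note that this only yields the dictionary pronunciation of whatever words the ASR model produced. It cannot reflect how the speaker actually pronounced them, which is the main reason a real phoneme-level model would still be preferable.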

“Aha” Moment

n/a

What still feels messy

I couldn’t find any official phoneme-level pre-trained models, and it’s unclear which ASR checkpoints expose phoneme tokens.

Next step

Search Hugging Face for phoneme-compatible models, or set up a small fine-tuning experiment.