[Figure: diagram showing SpeechBrain lobes and recipe system]

SpeechBrain Architecture

What I worked on

Explored SpeechBrain’s internal structure and pre-trained model APIs.

What I noticed

  • Recipes are full experiment pipelines (data preparation, training, evaluation).
  • Lobes are modular components (models, layers, feature extractors) reused across recipes.
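  • EncoderASR wraps encoder-only models, which typically transcribe through a CTC output layer; EncoderDecoderASR wraps models that pair the encoder with an autoregressive decoder. Both perform full speech-to-text, and both expose the encoder outputs through encode_batch.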
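
To make the two interfaces concrete, here is a minimal sketch of how I would load either one. The helper load_asr and the table INTERFACE_FOR are my own names, not part of SpeechBrain; the import paths assume SpeechBrain 1.0 or later, with a fallback for older releases.

```python
# Map a model family to the matching SpeechBrain interface class name.
# (Class names only, so this file imports cleanly without SpeechBrain.)
INTERFACE_FOR = {
    "ctc": "EncoderASR",             # encoder followed by a CTC output layer
    "seq2seq": "EncoderDecoderASR",  # encoder plus autoregressive decoder
}


def load_asr(kind, source, savedir):
    """Load a pretrained ASR interface (hypothetical helper).

    kind    -- "ctc" or "seq2seq"
    source  -- model identifier, e.g. a Hugging Face hub repo name
    savedir -- local directory where the model files are cached
    """
    if kind not in INTERFACE_FOR:
        raise ValueError(f"unknown kind {kind!r}; expected one of {sorted(INTERFACE_FOR)}")

    # Imported lazily: SpeechBrain is only needed once a model is requested.
    try:
        from speechbrain.inference import ASR as asr_module  # SpeechBrain >= 1.0
    except ImportError:
        from speechbrain import pretrained as asr_module     # older releases

    interface_cls = getattr(asr_module, INTERFACE_FOR[kind])
    return interface_cls.from_hparams(source=source, savedir=savedir)


# Usage (downloads the model on first call; "example.wav" is a placeholder path):
# asr = load_asr("seq2seq", "speechbrain/asr-crdnn-rnnlm-librispeech",
#                "pretrained_models/asr-crdnn-rnnlm-librispeech")
# print(asr.transcribe_file("example.wav"))
```

Either object can transcribe a file the same way; which class to use depends on how the pretrained model was built, so the model card is the place to check.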

“Aha” Moment

SpeechBrain is designed with modularity in mind: lobes provide composable building blocks, and recipes provide reproducible experiments.

What still feels messy

How to adapt lobes or recipes to produce phoneme-level outputs when no pretrained phoneme model is available.
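
One piece I can already sketch is the decoding side. Assuming a CTC-trained encoder whose output units are phonemes rather than characters or subwords, greedy decoding reduces to taking the per-frame argmax, collapsing repeats, and dropping blanks. The phoneme inventory and frame sequence below are made up for illustration, and this is plain Python rather than SpeechBrain's own decoder (its CTC utilities live under speechbrain.decoders).

```python
# Hypothetical phoneme inventory for illustration; index 0 is the CTC blank.
PHONEMES = ["<blank>", "sil", "h", "eh", "l", "ow"]


def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Collapse per-frame argmax ids into a label sequence.

    Consecutive repeats are merged first, then blanks are dropped, so a
    blank between two identical ids keeps both of them.
    """
    collapsed = []
    previous = None
    for current in frame_ids:
        if current != previous and current != blank_id:
            collapsed.append(current)
        previous = current
    return collapsed


def ids_to_phonemes(ids, inventory=PHONEMES):
    """Map integer ids to phoneme labels."""
    return [inventory[i] for i in ids]


# Made-up per-frame argmax ids, as if taken from the encoder's log-probabilities.
frames = [0, 2, 2, 0, 3, 3, 3, 4, 0, 4, 5, 5, 0]
print(ids_to_phonemes(ctc_greedy_collapse(frames)))
# ['h', 'eh', 'l', 'l', 'ow']
```

The open question is the training side: which lobes to reuse and how to swap the tokenizer for a phoneme label encoder inside a recipe.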

Next step

Experiment with extracting encoder outputs through EncoderASR, and try writing a minimal phoneme recognition recipe.
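
A rough plan for the first half, as a sketch: encode_batch takes a padded batch of waveforms plus relative lengths (each utterance's length divided by the longest one in the batch). The helper names below are mine, and what encode_batch returns depends on the model's hyperparameter file; for some models it may be per-frame token log-probabilities rather than hidden states, so I need to check the shape before treating it as generic features.

```python
def relative_lengths(num_samples):
    """Convert absolute lengths (in samples) to SpeechBrain-style relative lengths."""
    longest = max(num_samples)
    return [n / longest for n in num_samples]


def encode(asr_model, padded_wavs, num_samples):
    """Run the encoder of a pretrained interface on a padded batch.

    asr_model   -- an object with encode_batch(wavs, wav_lens),
                   e.g. a loaded EncoderASR
    padded_wavs -- float tensor of shape [batch, time], zero-padded
    num_samples -- true length of each waveform before padding
    """
    import torch  # imported lazily so relative_lengths works without PyTorch

    wav_lens = torch.tensor(relative_lengths(num_samples))
    return asr_model.encode_batch(padded_wavs, wav_lens)


# Two utterances of 1.0 s and 0.5 s at 16 kHz:
print(relative_lengths([16000, 8000]))
```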