Wav2Vec2 Has 12 Layers

What I worked on

Checked how many transformer layers Wav2Vec2 uses and what the initial embedding layer does. While inspecting the outputs, noticed one more hidden state than there are transformer layers.

What I noticed

  • Wav2Vec2 base has 12 transformer layers
  • The extra hidden state comes from the embedding output, i.e. the input to the first transformer layer, so 12 layers give 13 hidden states
  • nn.Linear weights can be initialized manually
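
For the last point, here is a minimal sketch of setting `nn.Linear` parameters by hand in PyTorch. The layer sizes and the weight values are made up for illustration; they have nothing to do with Wav2Vec2's real weights.

```python
import torch
from torch import nn

# A small linear layer: 4 inputs -> 2 outputs (sizes chosen only for illustration).
layer = nn.Linear(4, 2)

# Overwrite the default random initialization in place.
# torch.no_grad() keeps these assignments out of the autograd graph.
with torch.no_grad():
    layer.weight.copy_(torch.tensor([[1.0, 2.0, 3.0, 4.0],
                                     [0.0, 0.0, 0.0, 1.0]]))
    layer.bias.zero_()

x = torch.ones(1, 4)
y = layer(x)  # each output is the sum of one weight row, since x is all ones
print(y)
```

The helpers in `torch.nn.init` (for example `nn.init.xavier_uniform_(layer.weight)`) are an alternative when a standard scheme is wanted instead of explicit values.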

“Aha” Moment

The model’s first stage isn’t part of the transformer stack: it’s the convolutional feature encoder, whose projected output feeds into it.
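
A quick way to see this, sketched with the Hugging Face `transformers` library (an assumption, since the note doesn't say which implementation was inspected). The model below is built from the default `Wav2Vec2Config`, which matches the base architecture, with random weights so nothing has to be downloaded. The input is random noise standing in for one second of 16 kHz audio.

```python
import torch
from transformers import Wav2Vec2Config, Wav2Vec2Model

config = Wav2Vec2Config()                # defaults correspond to the base architecture
model = Wav2Vec2Model(config).eval()     # random weights; eval() disables masking and dropout

waveform = torch.randn(1, 16000)         # placeholder input, not real speech

with torch.no_grad():
    outputs = model(waveform, output_hidden_states=True)

print(config.num_hidden_layers)          # number of transformer layers
print(len(outputs.hidden_states))        # one more: the embedding output plus each layer's output
print(outputs.extract_features.shape)    # features from the convolutional encoder
print(outputs.hidden_states[0].shape)    # input to the first transformer layer
```

The last two shapes differ in their final dimension because the convolutional features are projected up to the transformer's hidden size before entering the stack.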

What still feels messy

N/A

Next step

Extract representations from each hidden layer to visualize what features each captures.
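
A possible starting point, again assuming Hugging Face `transformers`. It uses a randomly initialized model and noise input so it runs offline, which means the vectors carry no real information yet. Loading a pretrained checkpoint (for example with `Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")`) and real audio would be needed before visualizing anything. Mean pooling over time is just one simple choice of summary.

```python
import torch
from transformers import Wav2Vec2Config, Wav2Vec2Model

torch.manual_seed(0)
model = Wav2Vec2Model(Wav2Vec2Config()).eval()   # placeholder: random weights
waveform = torch.randn(1, 16000)                 # placeholder: 1 s of noise at 16 kHz

with torch.no_grad():
    hidden_states = model(waveform, output_hidden_states=True).hidden_states

# Average each hidden state over the time axis to get one vector per layer.
# Each h has shape (batch, frames, hidden_size); index 0 is the embedding output.
layer_vectors = torch.stack([h.mean(dim=1).squeeze(0) for h in hidden_states])

print(layer_vectors.shape)  # (num_hidden_layers + 1, hidden_size)
```

From here, `layer_vectors` (or the unpooled per-frame states) could be passed to a dimensionality-reduction method for plotting.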