JEPA GTA5 World Generation

What I worked on

Planned how to use JEPA to generate new world frames from a GTA5 driving dataset. Looked into whether JEPA trains by masking (predicting hidden target regions from the visible context) rather than autoregressively, and how to choose which frames to mask as targets.
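One possible way to pick target frames, as a rough sketch: hide one contiguous block of frames in a clip and treat everything else as context. The function name `sample_frame_mask` and the single-block strategy are my own assumptions for illustration, not something fixed by JEPA.

```python
import random

def sample_frame_mask(num_frames, target_len, rng=None):
    """Split frame indices into visible context and one contiguous target block.

    Hypothetical helper: a single contiguous block is just one masking choice.
    """
    if not 0 < target_len < num_frames:
        raise ValueError("target_len must be between 1 and num_frames - 1")
    rng = rng or random.Random()
    # Choose where the masked block starts so it fits inside the clip.
    start = rng.randrange(0, num_frames - target_len + 1)
    targets = list(range(start, start + target_len))
    target_set = set(targets)
    # Every frame that is not masked stays visible as context.
    context = [i for i in range(num_frames) if i not in target_set]
    return context, targets

# Example: a 16-frame clip with 4 masked target frames.
context, targets = sample_frame_mask(num_frames=16, target_len=4, rng=random.Random(0))
```

Masking a contiguous block forces the model to predict across a gap in time instead of interpolating between near-identical neighbouring frames, which seems like the harder and more useful task for driving footage.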

What I noticed

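JEPA learns by predicting masked targets from visible context, similar to BERT's masking setup, rather than by next-step sequence prediction. Unlike BERT, the prediction is made in latent (embedding) space, not in input space.
In principle the latent space could be manipulated to produce new variations, but JEPA as usually described has no pixel decoder, so turning latents into frames would need a separately trained decoder.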

“Aha” Moment

That JEPA learns representations by predicting the target's embedding in latent space, not by reconstructing or predicting pixels.
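A toy sketch of what "predicting in latent space" means: both frames are encoded, a predictor maps the context embedding to a guess for the target embedding, and the loss compares embeddings, never pixels. The linear "encoders", the weights, and the 4-value "frames" below are all made up for illustration; a real model would use deep networks and stop gradients through the target encoder.

```python
def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def latent_loss(predicted, target):
    """Mean squared error between two embedding vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

# Toy setup: 4-value "frames", 2-dimensional embeddings, hypothetical weights.
context_encoder = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]
target_encoder = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]
predictor = [[1.0, 0.0], [0.0, 1.0]]

context_frame = [1.0, 2.0, 3.0, 4.0]
target_frame = [2.0, 2.0, 4.0, 4.0]

z_context = linear(context_encoder, context_frame)  # embedding of visible frame
z_target = linear(target_encoder, target_frame)     # embedding of masked frame
z_pred = linear(predictor, z_context)               # predicted target embedding

# The loss is computed between embeddings; no pixels are reconstructed.
loss = latent_loss(z_pred, z_target)
print(loss)  # 0.5
```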

What still feels messy

How to map latent manipulations to specific visual or motion changes.

Next step

Train a small JEPA variant on a subset of GTA5 frames and check how well it predicts target latents. Judging frame reconstruction quality would additionally need a decoder on top.
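One piece the small variant will probably need, assuming it follows the I-JEPA recipe: the target encoder is not trained by gradient descent but tracks the context encoder through an exponential moving average (EMA). A minimal sketch on flat lists of parameters; `ema_update` and the momentum value are my own placeholders.

```python
def ema_update(target_params, context_params, momentum=0.99):
    """Move each target-encoder parameter a small step toward the context encoder."""
    return [
        momentum * t + (1.0 - momentum) * c
        for t, c in zip(target_params, context_params)
    ]

# Toy parameters; a real model would loop over all weight tensors.
target_params = [0.0, 1.0]
context_params = [1.0, 1.0]

# After one update the first value moves 10% of the way from 0.0 toward 1.0,
# and the second stays at about 1.0 because both encoders already agree.
target_params = ema_update(target_params, context_params, momentum=0.9)
```

A high momentum keeps the targets slow-moving, which is the usual argument for why this setup avoids collapsing to a constant embedding.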
