Transformer Gallery
- Designed and implemented core Transformer variants (the original Transformer, Transformer-XL, Longformer, and Block-Recurrent Transformer), mirroring the foundations of large-scale language models.
- Bootstrapped the codebase and personally wrote 80% of the implementation, designing for modularity to make extension and experimentation straightforward.
- Validated model correctness with language-modeling benchmarks and attention-visualization tools.
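All of the listed architectures build on the same primitive: scaled dot-product attention. As a rough illustration of the kind of modular building block described above (a hedged sketch in NumPy, not the project's actual code; the function names and causal-mask setup are illustrative assumptions):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (seq_len, d) arrays; mask: optional (seq_len, seq_len) boolean,
    True where attention is allowed."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)              # pairwise query-key similarities
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # suppress disallowed positions
    weights = softmax(scores, axis=-1)         # one distribution per query row
    return weights @ v, weights                # weighted sum of values + weights

# Causal (autoregressive) mask: each position attends only to itself and the past,
# as in language-model training.
rng = np.random.default_rng(0)
seq, d = 4, 8
q = rng.standard_normal((seq, d))
k = rng.standard_normal((seq, d))
v = rng.standard_normal((seq, d))
causal = np.tril(np.ones((seq, seq), dtype=bool))
out, w = scaled_dot_product_attention(q, k, v, causal)
```

Variants such as Transformer-XL, Longformer, and Block-Recurrent Transformer differ mainly in how the mask and key/value context are constructed (segment-level recurrence, sliding windows, block recurrence), which is why keeping this primitive modular pays off.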