Wider, Not Deeper: Cambridge, Oxford & ICL Challenge Conventional Transformer Design Approaches | Synced
Source: Synced | AI Technology & Industry Review
In the new paper Wide Attention Is The Way Forward For Transformers, a research team from the University of Cambridge, Imperial College London, and the University of Oxford challenges the commonly held belief that deeper is better for transformer architectures, demonstrating that wider layers result in superior performance on natural language processing tasks.
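The wide-versus-deep trade-off can be made concrete with a rough parameter count: a "wide" transformer keeps a single attention layer but grows the model dimension (e.g. by adding heads of fixed size), while a "deep" one stacks many narrower layers. The sketch below is illustrative only; the layer sizes are hypothetical and the count covers just the attention projections and feed-forward matrices, not the paper's actual configurations.

```python
def transformer_params(d_model: int, d_ff: int, n_layers: int) -> int:
    """Approximate parameter count of a transformer encoder stack.

    Counts only the four attention projection matrices (Q, K, V, output),
    each d_model x d_model, and the two feed-forward matrices
    (d_model x d_ff and d_ff x d_model) per layer. Biases and layer
    norms are ignored for simplicity.
    """
    attn = 4 * d_model * d_model
    ffn = 2 * d_model * d_ff
    return n_layers * (attn + ffn)

# Hypothetical deep-narrow configuration: 6 layers, 8 heads of dim 64 (d_model = 512).
deep = transformer_params(d_model=512, d_ff=2048, n_layers=6)

# Hypothetical wide-shallow configuration: 1 layer, 48 heads of dim 64 (d_model = 3072).
wide = transformer_params(d_model=3072, d_ff=2048, n_layers=1)

print(f"deep-narrow: {deep:,} params")
print(f"wide-shallow: {wide:,} params")
```

The point of the sketch is simply that width and depth trade off against each other in the parameter budget, so a single wide attention layer can be a serious alternative to a deep stack rather than a strictly smaller model.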