Scalable MatMul-free Language Modeling

Not an endorsement, but here is a recent paper that may be of interest.