Transformer-Based Learned Optimization - 42Papers