Focused transformer: Contrastive training for context scaling | S Tworkowski, K Staniszewski, M Pacek, Y Wu, H Michalewski, P Miłoś | Advances in Neural Information Processing Systems 36, 2024 | 106 | 2024 |
Analysing The Impact of Sequence Composition on Language Model Pre-Training | Y Zhao, Y Qu, K Staniszewski, S Tworkowski, W Liu, P Miłoś, Y Wu, ... | arXiv preprint arXiv:2402.13991, 2024 | 6 | 2024 |
Structured Packing in LLM Training Improves Long Context Utilization | K Staniszewski, S Tworkowski, S Jaszczur, H Michalewski, Ł Kuciński, ... | arXiv preprint arXiv:2312.17296, 2023 | 5 | 2023 |
Parity Games of Bounded Tree-Depth | K Staniszewski | arXiv preprint arXiv:2211.02926, 2022 | 1 | 2022 |
Training a Cooperating Team in GFootball Environment using Deep RL | W Domitrz, Z Opała, M Sieniawski, K Staniszewski | | | |