Shwai He
Title
Cited by
Year
SparseAdapter: An easy approach for improving the parameter-efficiency of adapters
S He, L Ding, D Dong, M Zhang, D Tao
EMNLP 2022 Findings, 2022
Cited by 80, 2022
Vega-MT: The JD Explore Academy translation system for WMT22
C Zan, K Peng, L Ding, B Qiu, B Liu, S He, Q Lu, Z Zhang, C Liu, W Liu, ...
Seventh Conference on Machine Translation (WMT22), 2022
Cited by 56, 2022
Superfiltering: Weak-to-strong data filtering for fast instruction-tuning
M Li, Y Zhang, S He, Z Li, H Zhao, J Wang, N Cheng, T Zhou
arXiv preprint arXiv:2402.00530, 2024
Cited by 37, 2024
Reflection-tuning: Data recycling improves LLM instruction-tuning
M Li, L Chen, J Chen, S He, H Huang, J Gu, T Zhou
arXiv preprint arXiv:2310.11716, 2023
Cited by 26*, 2023
Selective reflection-tuning: Student-selected data recycling for LLM instruction-tuning
M Li, L Chen, J Chen, S He, J Gu, T Zhou
arXiv preprint arXiv:2402.10110, 2024
Cited by 20, 2024
Reformatted alignment
RZ Fan, X Li, H Zou, J Li, S He, E Chern, J Hu, P Liu
arXiv preprint arXiv:2402.12219, 2024
Cited by 14, 2024
Mera: Merging pretrained adapters for few-shot learning
S He, RZ Fan, L Ding, L Shen, T Zhou, D Tao
arXiv preprint arXiv:2308.15982, 2023
Cited by 13, 2023
Merging experts into one: Improving computational efficiency of mixture of experts
S He, RZ Fan, L Ding, L Shen, T Zhou, D Tao
EMNLP 2023 Oral, 2023
Cited by 12, 2023
PAD-Net: An Efficient Framework for Dynamic Networks
S He, L Ding, D Dong, B Liu, F Yu, D Tao
ACL 2023, 2023
Cited by 12*, 2023
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework
S He, D Dong, L Ding, A Li
arXiv preprint arXiv:2406.02500, 2024
Cited by 10, 2024
Loki: Low-Rank Keys for Efficient Sparse Attention
P Singhania, S Singh, S He, S Feizi, A Bhatele
arXiv preprint arXiv:2406.02542, 2024
Cited by 8, 2024
Accurate prediction of antibody function and structure using bio-inspired antibody language model
H Jing, Z Gao, S Xu, T Shen, Z Peng, S He, T You, S Ye, W Lin, S Sun
Briefings in Bioinformatics 25 (4), bbae245, 2024
Cited by 7, 2024
What matters in transformers? Not all attention is needed
S He, G Sun, Z Shen, A Li
arXiv preprint arXiv:2406.15786, 2024
Cited by 7, 2024
Sd-conv: Towards the parameter-efficiency of dynamic convolution
S He, C Jiang, D Dong, L Ding
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2023
Cited by 4, 2023
Multi-modal Attention Network for Stock Movements Prediction
S He, S Gu
The AAAI-22 Workshop on Knowledge Discovery from Unstructured Data in …, 2021
Cited by 3, 2021
RESSA: Repair Sparse Vision-Language Models via Sparse Cross-Modality Adaptation
S He, T Chen
arXiv preprint arXiv:2404.02424, 2024
Cited by 1, 2024
NeuralSlice: Neural 3D triangle mesh reconstruction via slicing 4D tetrahedral meshes
C Jiang, J Yang, S He, Y Lai, L Gao
Cited by 1, 2023
Towards counterfactual fairness thorough auxiliary variables
B Tian, Z Wang, S He, W Ye, G Sun, Y Dai, Y Wu, A Li
arXiv preprint arXiv:2412.04767, 2024
2024
Fair Diagnosis: Leveraging Causal Modeling to Mitigate Medical Bias
B Tian, Y He, M Liu, Y Dai, Z Wang, S He, G Sun, Z Shen, W Ye, Y Wu, ...
arXiv preprint arXiv:2412.04739, 2024
2024
Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers
S He, T Ge, G Sun, B Tian, X Wang, A Li, D Yu
arXiv preprint arXiv:2410.13184, 2024
2024
Articles 1–20