Moganet: Multi-order gated aggregation network

S Li, Z Wang, Z Liu, C Tan, H Lin, D Wu… - The Twelfth …, 2023 - openreview.net
By contextualizing the kernel as global as possible, Modern ConvNets have shown great
potential in computer vision tasks. However, recent progress on multi-order game …

5D Seismic data interpolation by continuous representation

D Liu, W Gao, W Xu, J Li, X Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
How to represent a seismic wavefield? Traditionally, while seismic wavefields are
conceptualized continuously, acquisition geometries capture seismic data discretely using 2 …

Surface-vqmae: Vector-quantized masked auto-encoders on molecular surfaces

F Wu, SZ Li - International Conference on Machine Learning, 2024 - proceedings.mlr.press
Molecular surfaces imply fingerprints of interaction patterns between proteins. However, non-
equivalent efforts have been paid to incorporating the abundant protein surface information …

Semireward: A general reward model for semi-supervised learning

S Li, W **, Z Wang, F Wu, Z Liu, C Tan… - arXiv preprint arXiv …, 2023 - arxiv.org
Semi-supervised learning (SSL) has witnessed great progress with various improvements in
the self-training framework with pseudo labeling. The main challenge is how to distinguish …

VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling

S Li, Z Wang, Z Liu, D Wu, C Tan, J Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Similar to natural language models, pre-trained genome language models are proposed to
capture the underlying intricacies within genomes with unsupervised sequence modeling …

Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning

C Tan, J Wei, L Sun, Z Gao, S Li, B Yu, R Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models equipped with retrieval-augmented generation (RAG) represent a
burgeoning field aimed at enhancing answering capabilities by leveraging external …

Cross-Modal Conditioned Reconstruction for Language-guided Medical Image Segmentation

X Huang, H Li, M Cao, L Chen, C You, D An - arXiv preprint arXiv …, 2024 - arxiv.org
Recent developments underscore the potential of textual information in enhancing learning
models for a deeper understanding of medical visual semantics. However, language-guided …

Hi-End-MAE: Hierarchical encoder-driven masked autoencoders are stronger vision learners for medical image segmentation

F Tang, Q Yao, W Ma, C Wu, Z Jiang… - arXiv preprint arXiv …, 2025 - arxiv.org
Medical image segmentation remains a formidable challenge due to the label scarcity. Pre-
training Vision Transformer (ViT) through masked image modeling (MIM) on large-scale …

Interpretable and Generalizable Spatiotemporal Predictive Learning with Disentangled Consistency

J Wei, C Tan, Z Gao, L Sun, B Yu, R Guo… - Joint European Conference …, 2024 - Springer
In recent years, significant strides have been made in the field of spatiotemporal predictive
learning, a discipline that focuses on accurately forecasting future sequences based on …

Hybrid Self-Supervised and Semi-Supervised Framework for Robust Spatio-Temporal Action Detection

T Yan - 2024 IEEE 7th International Conference on Automation …, 2024 - ieeexplore.ieee.org
This paper presents a novel Hybrid Self-Supervised and Semi-Supervised Framework for
Robust Spatio-Temporal Action Detection, which integrates the advantages of self …