A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …
Medical image segmentation review: The success of U-Net
Automatic medical image segmentation is a crucial topic in the medical domain and
successively a critical counterpart in the computer-aided diagnosis paradigm. U-Net is the …
VMamba: Visual state space model
Designing computationally efficient network architectures remains an ongoing necessity in
computer vision. In this paper, we adapt Mamba, a state-space language model, into …
YOLOv9: Learning what you want to learn using programmable gradient information
Today's deep learning methods focus on how to design the objective functions to make the
prediction as close as possible to the target. Meanwhile, an appropriate neural network …
BiFormer: Vision transformer with bi-level routing attention
As the core building block of vision transformers, attention is a powerful tool to capture long-
range dependency. However, such power comes at a cost: it incurs a huge computation …
EfficientViT: Memory efficient vision transformer with cascaded group attention
Vision transformers have shown great success due to their high model capabilities.
However, their remarkable performance is accompanied by heavy computation costs, which …
Large selective kernel network for remote sensing object detection
Recent research on remote sensing object detection has largely focused on improving the
representation of oriented bounding boxes but has overlooked the unique prior knowledge …
VideoMAE V2: Scaling video masked autoencoders with dual masking
Scale is the primary factor for building a powerful foundation model that could well
generalize to a variety of downstream tasks. However, it is still challenging to train video …
Run, don't walk: chasing higher FLOPS for faster neural networks
To design fast neural networks, many works have been focusing on reducing the number of
floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does …
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
The most advanced text-to-image (T2I) models require significant training costs (e.g., millions
of GPU hours), seriously hindering the fundamental innovation for the AIGC community …