Scaling open-vocabulary object detection
Open-vocabulary object detection has benefited greatly from pretrained vision-language
models, but is still limited by the amount of available detection training data. While detection …
Which tokens to use? Investigating token reduction in vision transformers
Since the introduction of the Vision Transformer (ViT), researchers have sought to make ViTs
more efficient by removing redundant information in the processed tokens. While different …
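Token pruning is one of the reduction families this survey covers. As a minimal sketch (not the survey's own method), assume a ViT block exposes the attention each patch token receives from the CLS token; keeping the top-scoring fraction then looks like:

import torch

def prune_tokens(tokens, cls_attn, keep_ratio=0.5):
    # tokens: (B, N, D) patch tokens (CLS excluded); cls_attn: (B, N) attention
    # each patch token receives from the CLS token, averaged over heads.
    # Keep the top-k most attended tokens; the rest are discarded.
    B, N, D = tokens.shape
    k = max(1, int(N * keep_ratio))
    idx = cls_attn.topk(k, dim=1).indices            # (B, k) indices of kept tokens
    idx = idx.unsqueeze(-1).expand(-1, -1, D)        # broadcast to gather features
    return tokens.gather(1, idx)                     # (B, k, D)

# Example: halve 196 patch tokens, with random scores as a stand-in.
tokens = torch.randn(2, 196, 768)
cls_attn = torch.rand(2, 196)
print(prune_tokens(tokens, cls_attn).shape)          # torch.Size([2, 98, 768])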
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
Building on the foundations of language modeling in natural language processing, Next
Token Prediction (NTP) has evolved into a versatile training objective for machine learning …
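The NTP objective itself is compact: at each position the model predicts the following token, so training reduces to a shifted cross-entropy. A minimal sketch assuming generic (B, T, V) logits from a causal language model; the model, shapes, and vocabulary size are illustrative:

import torch
import torch.nn.functional as F

def ntp_loss(logits, input_ids):
    # logits: (B, T, V) from a causal LM; input_ids: (B, T).
    # Position t predicts token t+1, so drop the last logit and first target.
    pred = logits[:, :-1, :].reshape(-1, logits.size(-1))
    target = input_ids[:, 1:].reshape(-1)
    return F.cross_entropy(pred, target)

# Example with random logits over a 1000-token vocabulary.
B, T, V = 2, 16, 1000
loss = ntp_loss(torch.randn(B, T, V), torch.randint(0, V, (B, T)))
print(loss.item())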
Multi-resolution Time-Series Transformer for Long-term Forecasting
The performance of transformers for time-series forecasting has improved significantly.
Recent architectures learn complex temporal patterns by segmenting a time-series into …
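Patch segmentation is straightforward to reproduce, and a multi-resolution variant simply applies several patch lengths to the same series. A sketch using torch.Tensor.unfold; the patch lengths and non-overlapping stride are illustrative assumptions, not the paper's configuration:

import torch

def multi_resolution_patches(series, patch_lens=(8, 16, 32)):
    # series: (B, T) univariate series. For each resolution, slice the series
    # into non-overlapping patches of that length (stride == patch length).
    out = {}
    for p in patch_lens:
        out[p] = series.unfold(dimension=1, size=p, step=p)  # (B, T // p, p)
    return out

series = torch.randn(4, 128)
for p, patches in multi_resolution_patches(series).items():
    print(p, patches.shape)   # e.g. 8 -> torch.Size([4, 16, 8])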
Agglomerative Token Clustering
We present Agglomerative Token Clustering (ATC), a novel token merging method
that consistently outperforms previous token merging and pruning methods across image …
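The agglomerative idea can be sketched with off-the-shelf hierarchical clustering: group token embeddings bottom-up, then collapse each cluster into a single token. The scipy average-linkage call and mean-pooling below are illustrative assumptions; ATC's actual linkage and merging details are in the paper:

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def agglomerative_merge(tokens, n_clusters):
    # tokens: (N, D) token embeddings for one image.
    # Bottom-up clustering (average linkage over cosine distance), then
    # replace each cluster by the mean of its members.
    Z = linkage(tokens, method="average", metric="cosine")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")  # ids 1..n_clusters
    merged = np.stack([tokens[labels == c].mean(axis=0)
                       for c in range(1, labels.max() + 1)])
    return merged

tokens = np.random.randn(196, 768)
print(agglomerative_merge(tokens, 98).shape)    # (98, 768)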
Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers
Recent token reduction methods for Vision Transformers (ViTs) incorporate token merging,
which measures the similarities between token embeddings and combines the most similar …
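The merging step described here, measuring similarities and combining the most similar tokens, can be sketched as bipartite matching on cosine similarity: split tokens into two sets, link each token in one set to its nearest counterpart in the other, and average the r strongest pairs. This is a generic sketch, not the paper's decoupled-embedding variant:

import torch

def merge_similar_tokens(x, r):
    # x: (B, N, D). Alternate tokens into sets A and B, find for each A-token
    # its most similar B-token (cosine), and merge the r best pairs by mean.
    a, b = x[:, ::2, :], x[:, 1::2, :]
    an = a / a.norm(dim=-1, keepdim=True)
    bn = b / b.norm(dim=-1, keepdim=True)
    sim = an @ bn.transpose(1, 2)                    # (B, Na, Nb)
    best_sim, best_idx = sim.max(dim=-1)             # nearest B-token per A-token
    merge_order = best_sim.argsort(dim=-1, descending=True)
    out = []
    for i in range(x.size(0)):                       # per sample, for clarity
        keep_a = merge_order[i, r:]                  # A-tokens left unmerged
        bi = b[i].clone()
        for j in merge_order[i, :r]:                 # fold r A-tokens into B
            t = best_idx[i, j]
            bi[t] = (bi[t] + a[i, j]) / 2
        out.append(torch.cat([a[i, keep_a], bi], dim=0))
    return torch.stack(out)                          # (B, N - r, D)

x = torch.randn(2, 196, 768)
print(merge_similar_tokens(x, 49).shape)             # torch.Size([2, 147, 768])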
Multi-dimension Transformer with Attention-based Filtering for Medical Image Segmentation
The accurate segmentation of medical images is crucial for diagnosing and treating
diseases. Recent studies demonstrate that vision transformer-based methods have …
Accelerating Transformers with Spectrum-Preserving Token Merging
Increasing the throughput of the Transformer architecture, a foundational component used in
numerous state-of-the-art models for vision and language tasks (e.g., GPT, LLaVa), is an …
Token Compensator: Altering Inference Cost of Vision Transformer Without Re-tuning
Token compression expedites the training and inference of Vision Transformers (ViTs) by
reducing the number of redundant tokens, e.g., pruning inattentive tokens or merging …
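The premise, changing inference cost without re-tuning, can be illustrated by making the keep-ratio a runtime argument instead of a training-time constant; the compensator module itself is the paper's contribution and is not reproduced here. The norm-based token score below is a hypothetical stand-in for a learned or attention-derived score:

import torch

def reduce_tokens(x, keep_ratio):
    # Score tokens by feature norm (an illustrative stand-in for attention
    # scores) and keep the top fraction; keep_ratio is chosen at inference.
    scores = x.norm(dim=-1)                          # (B, N)
    k = max(1, int(x.size(1) * keep_ratio))
    idx = scores.topk(k, dim=1).indices.unsqueeze(-1).expand(-1, -1, x.size(-1))
    return x.gather(1, idx)

x = torch.randn(1, 196, 768)
for r in (1.0, 0.7, 0.5, 0.25):                      # one model, many budgets
    print(r, reduce_tokens(x, r).shape[1], "tokens")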
From Similarity to Superiority: Channel Clustering for Time Series Forecasting
Time series forecasting has attracted significant attention in recent decades. Previous
studies have demonstrated that the Channel-Independent (CI) strategy improves forecasting …
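Under the Channel-Independent strategy each variate is forecast as its own univariate series; channel clustering sits between CI and fully channel-dependent modeling by sharing a forecaster only within similar groups. A minimal sketch of grouping channels by historical correlation; the clustering criterion is an illustrative assumption, not necessarily the paper's similarity measure:

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_channels(x, n_clusters):
    # x: (T, C) multivariate series. Cluster channels whose historical
    # correlation is high, so a forecaster can be shared within each group.
    corr = np.corrcoef(x.T)                          # (C, C) channel correlations
    dist = 1.0 - corr                                # correlation -> distance
    Z = linkage(dist[np.triu_indices_from(dist, k=1)], method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

x = np.random.randn(512, 8)
print(cluster_channels(x, 3))                        # cluster id per channel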