Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
Masked feature prediction for self-supervised visual pre-training
Abstract We present Masked Feature Prediction (MaskFeat) for self-supervised pre-training
of video models. Our approach first randomly masks out a portion of the input sequence and …
of video models. Our approach first randomly masks out a portion of the input sequence and …
Mvitv2: Improved multiscale vision transformers for classification and detection
In this paper, we study Multiscale Vision Transformers (MViTv2) as a unified architecture for
image and video classification, as well as object detection. We present an improved version …
image and video classification, as well as object detection. We present an improved version …
Multiscale vision transformers
Abstract We present Multiscale Vision Transformers (MViT) for video and image recognition,
by connecting the seminal idea of multiscale feature hierarchies with transformer models …
by connecting the seminal idea of multiscale feature hierarchies with transformer models …
Recurring the transformer for video action recognition
Existing video understanding approaches, such as 3D convolutional neural networks and
Transformer-Based methods, usually process the videos in a clip-wise manner. Hence huge …
Transformer-Based methods, usually process the videos in a clip-wise manner. Hence huge …
Transformer-based deep learning model and video dataset for unsafe action identification in construction projects
A large proportion of construction accidents are caused by unintentional and unsafe actions
and behaviors. It is of significant difficulties and ineffectiveness to monitor unsafe behaviors …
and behaviors. It is of significant difficulties and ineffectiveness to monitor unsafe behaviors …
A content-driven micro-video recommendation dataset at scale
Micro-videos have recently gained immense popularity, sparking critical research in micro-
video recommendation with significant implications for the entertainment, advertising, and e …
video recommendation with significant implications for the entertainment, advertising, and e …
Torchgeo: deep learning with geospatial data
Remotely sensed geospatial data are critical for applications including precision agriculture,
urban planning, disaster monitoring and response, and climate change research, among …
urban planning, disaster monitoring and response, and climate change research, among …
Spotting temporally precise, fine-grained events in video
We introduce the task of spotting temporally precise, fine-grained events in video (detecting
the precise moment in time events occur). Precise spotting requires models to reason …
the precise moment in time events occur). Precise spotting requires models to reason …
Augly: Data augmentations for robustness
We introduce AugLy, a data augmentation library with a focus on adversarial robustness.
AugLy provides a wide array of augmentations for multiple modalities (audio, image, text, & …
AugLy provides a wide array of augmentations for multiple modalities (audio, image, text, & …
Woods: Benchmarks for out-of-distribution generalization in time series
Machine learning models often fail to generalize well under distributional shifts.
Understanding and overcoming these failures have led to a research field of Out-of …
Understanding and overcoming these failures have led to a research field of Out-of …