Recognize anything: A strong image tagging model
We present the Recognize Anything Model (RAM): a strong foundation model for
image tagging. RAM makes a substantial step for foundation models in computer vision …
Tag2text: Guiding vision-language model via image tagging
This paper presents Tag2Text, a vision language pre-training (VLP) framework, which
introduces image tagging into vision-language models to guide the learning of visual …
Prompt Stealing Attacks Against Text-to-Image Generation Models
Text-to-Image generation models have revolutionized the artwork design process and
enabled anyone to create high-quality images by entering text descriptions called prompts …
CNN and transformer framework for insect pest classification
Y Peng, Y Wang - Ecological Informatics, 2022 - Elsevier
Insect pests pose a significant and increasing threat to agricultural production worldwide.
However, most existing recognition methods are built upon well-known convolutional neural …
Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge
Many real-world image recognition problems, such as diagnostic medical imaging exams,
are “long-tailed”: there are a few common findings followed by many more relatively rare …
Learning to generate semantic layouts for higher text-image correspondence in text-to-image synthesis
Existing text-to-image generation approaches have set high standards for photorealism and
text-image correspondence, largely benefiting from web-scale text-image datasets, which …
Flowtransformer: A transformer framework for flow-based network intrusion detection systems
This paper presents the FlowTransformer framework, a novel approach for implementing
transformer-based Network Intrusion Detection Systems (NIDSs). FlowTransformer …
Multi-label classification with partial annotations using class-aware selective loss
Large-scale multi-label classification datasets are commonly, and perhaps inevitably,
partially annotated. That is, only a small subset of labels are annotated per sample. Different …
Obj2seq: Formatting objects as sequences with class prompt for visual tasks
Visual tasks vary a lot in their output formats and concerned contents, therefore it is hard to
process them with an identical structure. One main obstacle lies in the high-dimensional …
Label-aware global consistency for multi-label learning with single positive labels
In single positive multi-label learning (SPML), only one of multiple positive labels is
observed for each instance. The previous work trains the model by simply treating …