Conversational agents in therapeutic interventions for neurodevelopmental disorders: a survey
Neurodevelopmental Disorders (NDD) are a group of conditions with onset in the
developmental period characterized by deficits in the cognitive and social areas …
Hierarchical graph network for multi-hop question answering
In this paper, we present Hierarchical Graph Network (HGN) for multi-hop question
answering. To aggregate clues from scattered texts across multiple paragraphs, a …
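The snippet is cut off, but the multi-hop idea of letting evidence flow through a hierarchical graph can be illustrated with a minimal message-passing sketch. The node types, edges, and mean-aggregation update below are illustrative assumptions, not HGN's actual architecture:

```python
# Toy hierarchical graph: question <-> paragraphs <-> sentences.
# Two rounds of mean-aggregation message passing let sentence-level
# clues reach the question node in two hops. All choices here
# (node types, edges, mean update) are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
d = 16
nodes = ["question", "para1", "para2", "sent1a", "sent2a"]
feats = {n: rng.standard_normal(d) for n in nodes}

edges = [("question", "para1"), ("question", "para2"),
         ("para1", "sent1a"), ("para2", "sent2a")]
neighbors = {n: [] for n in nodes}
for u, v in edges:
    neighbors[u].append(v)
    neighbors[v].append(u)

def propagate(feats, neighbors):
    """One round of mean-aggregation message passing (self + neighbors)."""
    return {n: np.mean([feats[m] for m in neighbors[n]] + [feats[n]], axis=0)
            for n in feats}

feats = propagate(feats, neighbors)  # sentence clues reach paragraphs
feats = propagate(feats, neighbors)  # and then the question node
print(feats["question"][:4])
```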
Efficient visual tracking with exemplar transformers
The design of more complex and powerful neural network models has significantly
advanced the state-of-the-art in visual object tracking. These advances can be attributed to …
Sparse self-attention transformer for image inpainting
Learning-based image inpainting methods have made remarkable progress in recent years.
Nevertheless, these methods still suffer from issues such as blurring, artifacts, and …
Poolingformer: Long document modeling with pooling attention
In this paper, we introduce a two-level attention schema, Poolingformer, for long document
modeling. Its first level uses a smaller sliding window pattern to aggregate information from …
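The first-level sliding-window pattern the snippet mentions can be sketched directly; the window size and the single-head NumPy form below are assumptions, not the paper's implementation:

```python
# Sliding-window attention: each token attends only to a local band of
# +/- `window` neighbors, so cost grows linearly with sequence length.
# Window size and masking details are illustrative assumptions.
import numpy as np

def sliding_window_attention(q, k, v, window=2):
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    idx = np.arange(n)
    band = np.abs(idx[:, None] - idx[None, :]) <= window  # local band mask
    scores = np.where(band, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))
print(sliding_window_attention(x, x, x).shape)  # (8, 16)
```

Per the title, the second level then applies pooling attention over a wider span, which is what extends the receptive field without quadratic cost.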
Linrec: Linear attention mechanism for long-term sequential recommender systems
Transformer models have achieved remarkable success in sequential recommender
systems (SRSs). However, computing the attention matrix in traditional dot-product attention …
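The quadratic cost the snippet points at comes from materializing the n x n attention matrix; generic kernelized linear attention avoids it by reassociating the matrix product. The elu+1 feature map below is one common choice and an assumption here, not necessarily LinRec's exact mechanism:

```python
# Kernelized linear attention: computing phi(Q) @ (phi(K).T @ V) instead of
# softmax(Q @ K.T) @ V drops the cost from O(n^2 d) to O(n d^2) because the
# n x n attention matrix is never formed. phi = elu + 1 is an assumption.
import numpy as np

def phi(x):
    return np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1: positive features

def linear_attention(q, k, v):
    q, k = phi(q), phi(k)
    kv = k.T @ v                         # (d, d) summary of keys and values
    z = q @ k.sum(axis=0)                # per-query normalizer
    return (q @ kv) / z[:, None]

rng = np.random.default_rng(0)
n, d = 1024, 32
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
print(linear_attention(q, k, v).shape)   # (1024, 32), no 1024 x 1024 matrix
```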
Token pooling in vision transformers for image classification
Pooling is commonly used to improve the computation-accuracy trade-off of convolutional
networks. By aggregating neighboring feature values on the image grid, pooling layers …
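The same aggregation idea carries over to transformer tokens: downsampling the token grid shrinks the sequence that subsequent attention layers must process. The 2x2 stride and mean reduction below are illustrative assumptions, not the paper's exact operator:

```python
# Downsample a ViT token map by 2x2 mean pooling: 4x fewer tokens for the
# following attention layers. Stride and mean reduction are assumptions.
import numpy as np

def pool_tokens(tokens, h, w, stride=2):
    """tokens: (h*w, d) sequence laid out on an h x w grid."""
    d = tokens.shape[1]
    grid = tokens.reshape(h, w, d)
    grid = grid.reshape(h // stride, stride, w // stride, stride, d)
    return grid.mean(axis=(1, 3)).reshape(-1, d)  # average each cell

rng = np.random.default_rng(0)
tokens = rng.standard_normal((14 * 14, 64))  # e.g. a 14 x 14 ViT token map
print(pool_tokens(tokens, 14, 14).shape)     # (49, 64)
```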
Efficient long sequence modeling via state space augmented transformer
Transformer models have achieved superior performance in various natural language
processing tasks. However, the quadratic computational cost of the attention mechanism …
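As background for why a state-space component helps with the quadratic cost, a discrete linear state-space layer processes a sequence in a single O(n) scan over a fixed-size state. The hand-scaled A, B, C below are placeholders, not a trained or stabilized parameterization:

```python
# Discrete linear SSM: x_t = A x_{t-1} + B u_t, y_t = C x_t.
# One pass over the sequence, so cost is linear in length n.
# A, B, C here are illustrative placeholders, not a real parameterization.
import numpy as np

def ssm_scan(u, A, B, C):
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:                      # single O(n) scan
        x = A @ x + B @ u_t
        ys.append(C @ x)
    return np.stack(ys)

rng = np.random.default_rng(0)
n, d_in, d_state = 512, 8, 16
A = 0.9 * np.eye(d_state)              # contractive transition (assumed)
B = 0.1 * rng.standard_normal((d_state, d_in))
C = 0.1 * rng.standard_normal((d_in, d_state))
u = rng.standard_normal((n, d_in))
print(ssm_scan(u, A, B, C).shape)      # (512, 8)
```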
Understanding self-attention mechanism via dynamical system perspective
The self-attention mechanism (SAM) is widely used in various fields of artificial intelligence
and has successfully boosted the performance of different models. However, current …
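For reference, the SAM analyzed here is standard scaled dot-product self-attention; the formula below is the textbook definition (Vaswani et al., 2017), not anything specific to this paper's dynamical-system view:

```latex
% Scaled dot-product self-attention (Vaswani et al., 2017).
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```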
Sparsity in transformers: A systematic literature review
Transformers have become the state-of-the-art architectures for various tasks in Natural
Language Processing (NLP) and Computer Vision (CV); however, their space and …