- Academic Search

Temporal action segmentation: An analysis of modern techniques

G Ding, F Sener, A Yao - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org

Temporal action segmentation (TAS) in videos aims at densely identifying video frames in
minutes-long videos with multiple action classes. As a long-range video understanding task …

Cited by 70 · All 8 versions

[PDF] thecvf.com

MAXIM: Multi-axis MLP for image processing

Z Tu, H Talebi, H Zhang, F Yang… - Proceedings of the …, 2022 - openaccess.thecvf.com

Recent progress on Transformers and multi-layer perceptron (MLP) models provide new
network architectural designs for computer vision tasks. Although these models proved to be …

Cited by 566 · All 10 versions

[PDF] thecvf.com

Multi-stage progressive image restoration

SW Zamir, A Arora, S Khan, M Hayat… - Proceedings of the …, 2021 - openaccess.thecvf.com

Image restoration tasks demand a complex balance between spatial details and high-level
contextualized information while recovering images. In this paper, we propose a novel …

Cited by 1876 · All 10 versions

[PDF] thecvf.com

Assembly101: A large-scale multi-view video dataset for understanding procedural activities

F Sener, D Chatterjee, D Shelepov… - Proceedings of the …, 2022 - openaccess.thecvf.com

Assembly101 is a new procedural activity dataset featuring 4321 videos of people
assembling and disassembling 101 "take-apart" toy vehicles. Participants work without fixed …

Cited by 209 · All 8 versions

[PDF] thecvf.com

HOI4D: A 4D egocentric dataset for category-level human-object interaction

Y Liu, Y Liu, C Jiang, K Lyu, W Wan… - Proceedings of the …, 2022 - openaccess.thecvf.com

We present HOI4D, a large-scale 4D egocentric dataset with rich annotations, to catalyze the
research of category-level human-object interaction. HOI4D consists of 2.4 M RGB-D …

Cited by 144 · All 5 versions

[PDF] thecvf.com

Diffusion action segmentation

D Liu, Q Li, AD Dinh, T Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Temporal action segmentation is crucial for understanding long-form videos. Previous works
on this task commonly adopt an iterative refinement paradigm by using multi-stage models …

Cited by 80 · All 5 versions

[PDF] arxiv.org

Unified fully and timestamp supervised temporal action segmentation via sequence to sequence translation

N Behrmann, SA Golestaneh, Z Kolter, J Gall… - European conference on …, 2022 - Springer

This paper introduces a unified framework for video action segmentation via sequence to
sequence (seq2seq) translation in a fully and timestamp supervised setup. In contrast to …

Cited by 91 · All 4 versions

[PDF] thecvf.com

Bridge-prompt: Towards ordinal action understanding in instructional videos

M Li, L Chen, Y Duan, Z Hu, J Feng… - Proceedings of the …, 2022 - openaccess.thecvf.com

Action recognition models have shown a promising capability to classify human actions in
short video clips. In a real scenario, multiple correlated human actions commonly occur in …

Cited by 76 · All 6 versions

[PDF] thecvf.com

Error detection in egocentric procedural task videos

SP Lee, Z Lu, Z Zhang, M Hoai… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present a new egocentric procedural error dataset containing videos with various types
of errors as well as normal videos and propose a new framework for procedural error …

Cited by 10 · All 2 versions

[PDF] thecvf.com

How Much Temporal Long-Term Context is Needed for Action Segmentation?

E Bahrami, G Francesca, J Gall - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Modeling long-term context in videos is crucial for many fine-grained tasks including
temporal action segmentation. An interesting question that is still open is how much long …

Cited by 30 · All 7 versions
