Quilt-1M: One million image-text pairs for histopathology
Recent accelerations in multi-modal applications have been made possible with the
plethora of image and text data available online. However, the scarcity of analogous data in …
DreamStruct: Understanding slides and user interfaces via synthetic data generation
Enabling machines to understand structured visuals like slides and user interfaces is
essential for making them accessible to people with disabilities. However, achieving such …
Large-scale video retrieval using image queries
Retrieving videos from large repositories using image queries is important for many
applications, such as brand monitoring or content linking. We introduce a new retrieval …
Automatic prediction of presentation style and student engagement from videos
Presentation style is an important dimension to be considered for delivering lectures or
presentations. It affects the quality of the content delivery as well as the engagement of the …
Semantic navigation of PowerPoint-based lecture video for autonote generation
With the increasing popularity of open educational resources in the past few decades, more
and more users watch online videos to gain knowledge. However, most educational videos …
Neural image representations for multi-image fusion and layer separation
We propose a framework for aligning and fusing multiple images into a single view using
neural image representations (NIRs), also known as implicit or coordinate-based neural …
A generic framework for generation of summarized video clips using transfer learning (SumVClip)
Video summarization aims to produce highlights of the original video showing informative
key events. Nowadays, video content is increasing enormously; therefore, to store, browse …
Large‐scale video retrieval via deep local convolutional features
C Zhang, B Hu, Y Suo, Z Zou, Y Ji - Advances in Multimedia, 2020 - Wiley Online Library
In this paper, we study the challenge of image‐to‐video retrieval, which uses the query
image to search relevant frames from a large collection of videos. A novel framework based …
WiSe: slide segmentation in the wild
We address the task of segmenting presentation slides, where the examined page was
captured as a live photo during lectures. Slides are important document types used as visual …
A deep analysis of image based video searching techniques
S Anayat, A Sikandar, SA Rasheed… - International Journal of …, 2020 - researchgate.net
For many applications, such as brand monitoring, it is important to search for a video in a large
database using image as query [1]. Numerous visual search technologies have emerged …