- Academic Search

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org

Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Cited by 77

[HTML] nih.gov

[HTML] Multibench: Multiscale benchmarks for multimodal representation learning

PP Liang, Y Lyu, X Fan, Z Wu, Y Cheng… - Advances in neural …, 2021 - ncbi.nlm.nih.gov

Learning multimodal representations involves integrating information from multiple
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …

Cited by 166

[PDF] acm.org

Webui: A dataset for enhancing visual ui understanding with web semantics

J Wu, S Wang, S Shen, YH Peng, J Nichols… - Proceedings of the …, 2023 - dl.acm.org

Modeling user interfaces (UIs) from visual information allows systems to make inferences
about the functionality and semantics needed to support use cases in accessibility, app …

Cited by 59

[PDF] uclouvain.be

Evaluating a large language model on searching for gui layouts

P Brie, N Burny, A Sluÿters… - Proceedings of the ACM on …, 2023 - dl.acm.org

The field of generative artificial intelligence has seen significant advancements in recent
years with the advent of large language models, which have shown impressive results in …

Cited by 27

[PDF] acm.org

Towards complete icon labeling in mobile applications

J Chen, A Swearngin, J Wu, T Barik, J Nichols… - Proceedings of the …, 2022 - dl.acm.org

Accurately recognizing icon types in mobile applications is integral to many tasks, including
accessibility improvement, UI design search, and conversational agents. Existing research …

Cited by 43

[PDF] acm.org

Never-ending learning of user interfaces

J Wu, R Krosnick, E Schoop, A Swearngin… - Proceedings of the 36th …, 2023 - dl.acm.org

Machine learning models have been trained to predict semantic information about user
interfaces (UIs) to make apps more accessible, easier to test, and to automate. Currently …

Cited by 19

[PDF] springer.com

Data-driven prototyping via natural-language-based GUI retrieval

K Kolthoff, C Bartelt, SP Ponzetto - Automated software engineering, 2023 - Springer

Rapid GUI prototyping has evolved into a widely applied technique in early stages of
software development to facilitate the clarification and refinement of requirements …

Cited by 25

[PDF] arxiv.org

Dreamstruct: Understanding slides and user interfaces via synthetic data generation

YH Peng, F Huq, Y Jiang, J Wu, XY Li… - … on Computer Vision, 2024 - Springer

Enabling machines to understand structured visuals like slides and user interfaces is
essential for making them accessible to people with disabilities. However, achieving such …

Cited by 4

[PDF] acm.org

Predicting and explaining mobile ui tappability with vision modeling and saliency analysis

E Schoop, X Zhou, G Li, Z Chen, B Hartmann… - Proceedings of the 2022 …, 2022 - dl.acm.org

UI designers often correct false affordances and improve the discoverability of features when
users have trouble determining if elements are tappable. We contribute a novel system that …

Cited by 31

[PDF] acm.org

Understanding screen relationships from screenshots of smartphone applications

S Feiz, J Wu, X Zhang, A Swearngin, T Barik… - Proceedings of the 27th …, 2022 - dl.acm.org

All graphical user interfaces are comprised of one or more screens that may be shown to the
user depending on their interactions. Identifying different screens of an app and …

Cited by 32

Enrico: A dataset for topic modeling of mobile UI designs

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions
