Empowering LLM to use smartphone for intelligent task automation

H Wen, Y Li, G Liu, S Zhao, T Yu, TJJ Li, S Jiang… - arXiv preprint arXiv …, 2023 - arxiv.org
Mobile task automation is an attractive technique that aims to enable voice-based hands-free
user interaction with smartphones. However, existing approaches suffer from poor …

Personal LLM agents: Insights and survey about the capability, efficiency and security

Y Li, H Wen, W Wang, X Li, Y Yuan, G Liu, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Since the advent of personal computing devices, intelligent personal assistants (IPAs) have
been one of the key technologies that researchers and engineers have focused on, aiming …

Foundations and recent trends in multimodal mobile agents: A survey

B Wu, Y Li, M Fang, Z Song, Z Zhang, Y Wei… - arXiv preprint arXiv …, 2024 - arxiv.org
Mobile agents are essential for automating tasks in complex and dynamic mobile
environments. As foundation models evolve, the demands for agents that can adapt in real …

Large language model-based agents for software engineering: A survey

J Liu, K Wang, Y Chen, X Peng, Z Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent advance in Large Language Models (LLMs) has shaped a new paradigm of AI
agents, i.e., LLM-based agents. Compared to standalone LLMs, LLM-based agents …

AssistGUI: Task-Oriented PC Graphical User Interface Automation

D Gao, L Ji, Z Bai, M Ouyang, P Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
Graphical User Interface (GUI) automation holds significant promise for assisting
users with complex tasks, thereby boosting human productivity. Existing works leveraging …

Mobile-bench: An evaluation benchmark for LLM-based mobile agents

S Deng, W Xu, H Sun, W Liu, T Tan, J Liu, A Li… - arXiv preprint arXiv …, 2024 - arxiv.org
With the remarkable advancements of large language models (LLMs), LLM-based agents
have become a research hotspot in human-computer interaction. However, there is a …

Large multimodal agents: A survey

J Xie, Z Chen, R Zhang, X Wan, G Li - arXiv preprint arXiv:2402.15116, 2024 - arxiv.org
Large language models (LLMs) have achieved superior performance in powering text-based
AI agents, endowing them with decision-making and reasoning abilities akin to …

Explore, select, derive, and recall: Augmenting LLM with human-like memory for mobile task automation

S Lee, J Choi, J Lee, MH Wasi, H Choi, SY Ko… - arXiv preprint arXiv …, 2023 - arxiv.org
The advent of large language models (LLMs) has opened up new opportunities in the field
of mobile task automation. Their superior language understanding and reasoning …

AssistGUI: Task-oriented desktop graphical user interface automation

D Gao, L Ji, Z Bai, M Ouyang, P Li, D Mao, Q Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
Graphical User Interface (GUI) automation holds significant promise for assisting users with
complex tasks, thereby boosting human productivity. Existing works leveraging Large …

Guardian: A Runtime Framework for LLM-Based UI Exploration

D Ran, H Wang, Z Song, M Wu, Y Cao… - Proceedings of the 33rd …, 2024 - dl.acm.org
Tests for feature-based UI testing have been indispensable for ensuring the quality of mobile
applications (apps for short). The high manual labor costs to create such tests have led to a …