Evaluation of OpenAI o1: Opportunities and challenges of AGI

T Zhong, Z Liu, Y Pan, Y Zhang, Y Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
This comprehensive study evaluates the performance of OpenAI's o1-preview large
language model across a diverse array of complex reasoning tasks, spanning multiple …

Securing large language models: Addressing bias, misinformation, and prompt attacks

B Peng, K Chen, M Li, P Feng, Z Bi, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) demonstrate impressive capabilities across various fields,
yet their increasing use raises critical security concerns. This article reviews recent literature …

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

From COBIT to ISO 42001: Evaluating cybersecurity frameworks for opportunities, risks, and regulatory compliance in commercializing large language models

TR McIntosh, T Susnjak, T Liu, P Watters, D Xu… - Computers & …, 2024 - Elsevier
This study investigated the integration readiness of four predominant cybersecurity
Governance, Risk and Compliance (GRC) frameworks–NIST CSF 2.0, COBIT 2019, ISO …

LocalValueBench: A collaboratively built and extensible benchmark for evaluating localized value alignment and ethical safety in large language models

GI Meadows, NWL Lau, EA Susanto, CL Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
The proliferation of large language models (LLMs) requires robust evaluation of their
alignment with local values and ethical standards, especially as existing benchmarks often …

Automated summarization of multiple document abstracts and contents using large language models

O Langston, B Ashford - Authorea Preprints, 2024 - techrxiv.org
The exponential growth of textual data across various domains necessitates the
development of efficient and accurate summarization techniques to facilitate quick …

Open problems in technical AI governance

A Reuel, B Bucknall, S Casper, T Fist, L Soder… - arXiv preprint arXiv …, 2024 - arxiv.org
AI progress is creating a growing range of risks and opportunities, but it is often unclear how
they should be navigated. In many cases, the barriers and uncertainties faced are at least …

A systematic survey and critical review on evaluating large language models: Challenges, limitations, and recommendations

MTR Laskar, S Alqahtani, MS Bari… - Proceedings of the …, 2024 - aclanthology.org
Large Language Models (LLMs) have recently gained significant attention due to
their remarkable capabilities in performing diverse tasks across various domains. However …

Enhancing compute-optimal inference for problem-solving with optimized large language model

S Hayashi, R Fujimoto, G Okamoto - Authorea Preprints, 2024 - techrxiv.org
The growing computational demands of advanced AI models necessitate innovative
approaches to enhance efficiency while maintaining high performance. Our novel concept …

International Scientific Report on the Safety of Advanced AI (Interim Report)

Y Bengio, S Mindermann, D Privitera… - arXiv preprint arXiv …, 2024 - arxiv.org
This is the interim publication of the first International Scientific Report on the Safety of
Advanced AI. The report synthesises the scientific understanding of general-purpose AI--AI …