Orca: A distributed serving system for Transformer-Based generative models
Large-scale Transformer-based models trained for generation tasks (e.g., GPT-3) have
recently attracted huge interest, emphasizing the need for system support for serving models …
Evaluating large language models for radiology natural language processing
The rise of large language models (LLMs) has marked a pivotal shift in the field of natural
language processing (NLP). LLMs have revolutionized a multitude of domains, and they …
Achieving Peak Performance for Large Language Models: A Systematic Review
In recent years, large language models (LLMs) have achieved remarkable success in
natural language processing (NLP). LLMs require an extreme amount of parameters to …
Transformer uncertainty estimation with hierarchical stochastic attention
Transformers are state-of-the-art in a wide range of NLP tasks and have also been applied
to many real-world products. Understanding the reliability and certainty of transformer …
Influential recommender system
H Zhu, H Ge, X Gu, P Zhao… - 2023 IEEE 39th …, 2023 - ieeexplore.ieee.org
Traditional recommender systems are typically passive in that they try to adapt their
recommendations to the user's historical interests. However, it is highly desirable for …
HPipe: Large Language Model Pipeline Parallelism for Long Context on Heterogeneous Cost-effective Devices
Micro-enterprises and individual developers have emerging demands for long-sequence
analysis with powerful Large Language Models (LLMs). They try to deploy LLMs locally, but only …
TCP: A Tensor Contraction Processor for AI Workloads (Industrial Product)
H Kim, Y Choi, J Park, B Bae, H Jeong… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
We introduce a novel tensor contraction processor (TCP) architecture that offers a paradigm
shift from traditional architectures that rely on fixed-size matrix multiplications. TCP aims at …
iServe: An Intent-based Serving System for LLMs
Large Language Models (LLMs) are becoming ubiquitous across industries, where
applications demand they fulfill diverse user intents. However, developers currently face the …
Dynamic batching for inference system for transformer-based generation tasks
An inference system applies a machine-learning transformer model to a batch of requests
with variable input length or variable target length or variable internal state length by …
Selective batching for inference system for transformer-based generation tasks
YU Gyeongin, G Kim, JS Jeong, S Kim… - US Patent …, 2024 - Google Patents
An inference system applies a machine-learning transformer model to a batch of requests
with variable input length or variable target length or variable internal state length by …