- Academic Search

Textdiffuser-2: Unleashing the power of language models for text rendering

J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei - European Conference on …, 2024 - Springer

The diffusion model has been proven a powerful generative model in recent years, yet it
remains a challenge in generating visual text. Although existing work has endeavored to …

Cited by 44. All 4 versions.

[PDF] arxiv.org

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arxiv preprint arxiv …, 2024 - arxiv.org

Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …

Cited by 59. All 2 versions.

[PDF] arxiv.org

Vita: Towards open-source interactive omni multimodal llm

C Fu, H Lin, Z Long, Y Shen, M Zhao, Y Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org

The remarkable multimodal capabilities and interactive experience of GPT-4o underscore
their necessity in practical applications, yet open-source models rarely excel in both areas …

Cited by 50. All 3 versions.

[PDF] arxiv.org

Multimodal pretraining, adaptation, and generation for recommendation: A survey

Q Liu, J Zhu, Y Yang, Q Dai, Z Du, XM Wu… - Proceedings of the 30th …, 2024 - dl.acm.org

Personalized recommendation serves as a ubiquitous channel for users to discover
information tailored to their interests. However, traditional recommendation models primarily …

Cited by 23. All 5 versions.

[PDF] thecvf.com

Choose what you need: Disentangled representation learning for scene text recognition removal and editing

B Zhang, H Xie, Z Gao, Y Wang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Scene text images contain not only style information (font, background) but also content
information (character, texture). Different scene text tasks need different information but …

Cited by 10. All 7 versions.

[PDF] nowpublishers.com

[PDF] Intelligent Artistic Typography: A Comprehensive Review of Artistic Text Design and Generation

Y Bai, Z Huang, W Gao, S Yang… - APSIPA Transactions on …, 2024 - nowpublishers.com

Artistic text generation aims to amplify the aesthetic qualities of text while maintaining
readability. It can make the text more attractive and better convey its expression, thus …

Cited by 5. All 4 versions.

[PDF] arxiv.org

Efficient diffusion models: A comprehensive survey from principles to practices

Z Ma, Y Zhang, G Jia, L Zhao, Y Ma, M Ma… - arxiv preprint arxiv …, 2024 - arxiv.org

As one of the most popular and sought-after generative models in the recent years, diffusion
models have sparked the interests of many researchers and steadily shown excellent …

Cited by 4. All 2 versions.

[PDF] arxiv.org

Controllable generation with text-to-image diffusion models: A survey

P Cao, F Zhou, Q Song, L Yang - arxiv preprint arxiv:2403.04279, 2024 - arxiv.org

In the rapidly advancing realm of visual generation, diffusion models have revolutionized the
landscape, marking a significant shift in capabilities with their impressive text-guided …

Cited by 29. All 2 versions.

[PDF] arxiv.org

Open-sora plan: Open-source large video generation model

B Lin, Y Ge, X Cheng, Z Li, B Zhu, S Wang, X He… - arxiv preprint arxiv …, 2024 - arxiv.org

We introduce Open-Sora Plan, an open-source project that aims to contribute a large
generation model for generating desired high-resolution videos with long durations based …

Cited by 14. All 2 versions.

[PDF] thecvf.com

Odm: A text-image further alignment pre-training approach for scene text detection and spotting

C Duan, P Fu, S Guo, Q Jiang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

In recent years, text-image joint pre-training techniques have shown promising results in
various tasks. However, in Optical Character Recognition (OCR) tasks, aligning text instances …

Cited by 6. All 7 versions.
