TL;DR
EXAONE 4.5 is LG's first open-weight vision-language model (VLM). It combines a 1.2B-parameter vision encoder with a 32B-parameter language model and targets industrial applications. It supports a 256K-token context length and outperforms larger models on mathematical reasoning and document understanding benchmarks, including surpassing…
Verified: 0
Insufficient evidence: 0
Unverifiable: 0
Reproducibility: N/A
Confidence: 0%

Core Question

How can an effective open-weight vision-language model be developed for industrial applications by integrating visual processing capabilities into a large language model architecture?

Core Method

Approach: The approach trains a 1.2B-parameter vision encoder from scratch and integrates it with the EXAONE 4.0 32B language model using a two-stage pre-training pipeline for cross-modal alignment. The model employs Grouped Query Attention, 2D RoPE for vision encoding, and embeds context extension into supervised fine-tuning to achieve 256K token context length. Multi-stage offline preference optimization and reinforcement learning enhance reasoning capabilities across text and vision domains.

Key components:
- A 1.2B-parameter vision encoder was trained from scratch to avoid performance degradation from visual token reduction.
- Grouped Query Attention (GQA) is employed in both vision encoder and language model for computational efficiency.
- 2D RoPE is used for vision encoding while 1D RoPE is maintained for language processing to optimize cross-modal performance.
- The K-EXAONE tokenizer provides enhanced multilingual support, particularly for Korean language processing.
- The license agreement governs use of the EXAONE AI Model between Licensee and LG Management Development Institute Co., Ltd.
- Users agree to the terms by downloading, installing, copying, or using the Model.
- The Agreement constitutes a binding legal contract between Licensee and Licensor.
- Users who do not agree to all terms must not download, install, copy, or use the Model.

Source sections: sec_2, sec_20
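The 2D RoPE component named in the approach can be illustrated with a small sketch. This summary does not give the paper's exact formulation, so the code below assumes one common variant: the feature vector is split in half, the first half is rotated by the patch's row index and the second half by its column index. The function names (`rope_1d`, `rope_2d`, `dot`) and the `base` value of 10000 are illustrative assumptions, not taken from the EXAONE 4.5 paper or code.

```python
import math


def rope_1d(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles proportional to `pos`.

    Standard 1D rotary embedding: pair i uses frequency base ** (-i / d).
    """
    d = len(vec)
    assert d % 2 == 0, "feature dimension must be even"
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def rope_2d(vec, row, col, base=10000.0):
    """Assumed 2D variant: first half encodes the row, second half the column."""
    half = len(vec) // 2
    assert half % 2 == 0, "each half must have an even dimension"
    return rope_1d(vec[:half], row, base) + rope_1d(vec[half:], col, base)


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


# Toy query/key vectors for two image patches (hypothetical values).
q = [0.5, -1.0, 2.0, 0.25, 1.5, -0.5, 0.0, 1.0]
k = [1.0, 0.5, -0.5, 2.0, 0.0, 1.0, -1.5, 0.75]

# Both pairs of patches are offset by (1 row, 2 columns).
score_a = dot(rope_2d(q, 2, 3), rope_2d(k, 1, 1))
score_b = dot(rope_2d(q, 7, 9), rope_2d(k, 6, 7))
```

Under this variant the attention score between two patches depends only on their relative row and column offsets, so `score_a` and `score_b` agree up to floating-point error. That is the property that makes a 2D scheme a natural fit for image grids, while text tokens keep the ordinary 1D form.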

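Claim Verification

Reproducibility Assessment

Low reproducibility (0%)

Missing Reproduction Details

Limitations (Author-Stated)

The paper does not explicitly list any limitations.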

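This analysis was generated automatically by PDF 阅读助手 (PDF Reading Assistant). It is for reference only and does not constitute an academic review. The verification conclusions and the reproducibility assessment are based on automated analysis of the paper's text and may be biased. For the original paper, see arXiv.

Analysis time: 2026-04-22T07:38:00+00:00 · Data source: Paper Collector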