Adam's Law: Textual Frequency Law on Large Language Models - In-Depth AI Paper Analysis

TL;DR
Textual Frequency Law (TFL) proposes preferring higher-frequency data for LLM training when meanings are identical. The authors introduce TFL, frequency distillation, and curriculum training methods, achieving up to 8% accuracy gains in math reasoning and BLEU improvements across 99/100 translation…

Core Question

When computational resources are limited, which data should be favored for LLM training and prompting? Specifically, do higher-frequency paraphrases outperform lower-frequency ones when their meanings are identical?

Core Method

Approach: The authors construct a Textual Frequency Paired Dataset from GSM8K, FLORES-200, and CommonsenseQA, using GPT-4o-mini to generate high-frequency and low-frequency paraphrases that are then validated by human annotators. Sentence-level frequency is estimated by position-unaware multiplication of word-level Zipf frequencies taken from existing corpora. Experiments cover both prompting and fine-tuning, on a closed-source LLM (GPT-4o-mini) and on open-source LLMs (DeepSeek-V3, Llama-3.3-70B-Instruct, qwen2.5-7b-instruct).

Key points:
Paraphrasing is useful for evaluating language models, mitigating data contamination, and data augmentation.
Computational budgets for training and prompting are usually limited, which raises the question of which paraphrase to select.
Results suggest that high-frequency paraphrases should be preferred for both prompting and fine-tuning.

Section: sec_3
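The sentence-level frequency estimate described above can be illustrated with a minimal sketch. This is not the authors' implementation: the word table TOY_ZIPF, the out-of-vocabulary floor UNKNOWN_ZIPF, the tokenizer, and all function names are hypothetical, chosen only for illustration. It assumes the standard Zipf scale, where a value z corresponds to a per-word probability of 10**(z - 9), and it multiplies word probabilities in log space so that word order does not matter.

```python
import math
import re

# Hypothetical Zipf-scale values (log10 of occurrences per billion words).
# A real setup would read these from a corpus-derived frequency list.
TOY_ZIPF = {
    "the": 7.7, "on": 7.1, "cat": 4.9, "sat": 4.6, "mat": 4.0,
    "feline": 3.3, "rested": 3.6, "rug": 3.9,
}
UNKNOWN_ZIPF = 1.0  # assumed floor for words missing from the table


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def log10_sentence_frequency(sentence, zipf=TOY_ZIPF):
    """Return log10 of the product of per-word probabilities.

    A Zipf value z maps to probability 10**(z - 9), so multiplying
    probabilities is the same as summing (z - 9). The sum ignores
    word positions, which makes the estimate position-unaware.
    """
    return sum(zipf.get(word, UNKNOWN_ZIPF) - 9.0 for word in tokenize(sentence))


def pick_higher_frequency(paraphrase_a, paraphrase_b):
    """Return whichever paraphrase has the higher estimated frequency."""
    score_a = log10_sentence_frequency(paraphrase_a)
    score_b = log10_sentence_frequency(paraphrase_b)
    return paraphrase_a if score_a >= score_b else paraphrase_b


if __name__ == "__main__":
    high = "The cat sat on the mat"
    low = "The feline rested on the rug"
    print(pick_higher_frequency(high, low))
```

Note that a plain product penalizes longer sentences, because every extra word multiplies in another probability below one. Whether and how the paper normalizes for length is not stated in this analysis, so the sketch leaves the product unnormalized.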

Reproducibility Assessment

Low reproducibility (0%)

Missing reproduction details

Specific LLM models/architectures used in experiments
Dataset details - what datasets were used, how they were sourced and processed
How 'frequency' was defined and calculated for paraphrases
Paraphrase generation methodology - how paraphrases were created or collected
Hyperparameters for fine-tuning (learning rate, batch size, epochs, optimizer settings)
Prompting configurations (prompt templates, number of shots, temperature, etc.)
Training/fine-tuning procedures and implementation details
Evaluation metrics and their exact implementation
Hardware specifications and computational environment
Random seeds and number of experimental runs

Limitations (as stated by the authors)

The paper does not explicitly list any limitations.

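This analysis was generated automatically by the PDF Reading Assistant and is for reference only; it does not constitute an academic review. The verification conclusions and the reproducibility assessment are based on automated analysis of the paper text and may be inaccurate. For the original paper, see arXiv.

Analysis time: 2026-04-08T13:18:09+00:00 · Data source: Paper Collector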