Key LLM Papers - 2025-12-18

1. Dynamic Learning Rate Scheduling based on Loss Changes Leads to Faster Convergence


2. Model-First Reasoning LLM Agents: Reducing Hallucinations through Explicit Problem Modeling


3. Seismology modeling agent: A smart assistant for geophysical researchers


4. PortAgent: LLM-driven Vehicle Dispatching Agent for Port Terminals


5. Massive Editing for Large Language Models Based on Dynamic Weight Generation


6. Leveraging LLMs for Collaborative Ontology Engineering in Parkinson Disease Monitoring and Alerting


7. Gödel’s Poetry


8. Georeferencing complex relative locality descriptions with large language models


9. Grammar Search for Multi-Agent Systems


10. RADAR: Accelerating Large Language Model Inference With RL-Based Dynamic Draft Trees


11. OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value


12. Intention Chain-of-Thought Prompting with Dynamic Routing for Code Generation


13. Evaluating Small Language Models for Agentic On-Farm Decision Support Systems


14. MobileWorldBench: Towards Semantic World Modeling For Mobile Agents


15. Sparsity-Controllable Dynamic Top-p MoE for Large Foundation Model Pre-training


16. ReflCtrl: Controlling LLM Reflection via Representation Engineering


17. Evaluating Frontier LLMs on PhD-Level Mathematical Reasoning: A Benchmark on a Textbook in Theoretical Computer Science about Randomized Algorithms


18. EvoLattice: Persistent Internal-Population Evolution through Multi-Alternative Quality-Diversity Graph Representations for LLM-Guided Program Discovery


19. State-Dependent Refusal and Learned Incapacity in RLHF-Aligned Language Models


20. Compressed Causal Reasoning: Quantization and GraphRAG Effects on Interventional and Counterfactual Accuracy


21. ValuePilot: A Two-Phase Framework for Value-Driven Decision-Making


22. AI-Powered Annotation Pipelines for Stabilizing Large Language Models: A Human-AI Synergy Approach


23. LoopBench: Discovering Emergent Symmetry Breaking Strategies with LLM Swarms


24. Adjudicator: Correcting Noisy Labels with a KG-Informed Council of LLM Agents


25. Leveraging LLMs for Structured Data Extraction from Unstructured Patient Records


26. TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs


27. Spoken DialogSum: An Emotion-Rich Conversational Dataset for Spoken Dialogue Summarization


28. Towards Nepali-language LLMs: Efficient GPT training with a Nepali BPE tokenizer


29. Polypersona: Persona-Grounded LLM for Synthetic Survey Responses



31. Dual Language Models: Balancing Training Efficiency and Overfitting Resilience


32. SASQ: Static Activation Scaling for Quantization-Aware Training in Large Language Models


33. Reasoning-Style Poisoning of LLM Agents via Stealthy Style Transfer: Process-Level Attacks and Runtime Monitoring in RSV Space


34. Effect of Document Packing on the Latent Multi-Hop Reasoning Capabilities of Large Language Models


35. DISCODE: Distribution-Aware Score Decoder for Robust Automatic Evaluation of Image Captioning


36. RePo: Language Models with Context Re-Positioning


37. Semantic Mismatch and Perceptual Degradation: A New Perspective on Image Editing Immunity


38. From YOLO to VLMs: Advancing Zero-Shot and Few-Shot Detection of Wastewater Treatment Plants Using Satellite Imagery in MENA Region


39. The Trust in AI-Generated Health Advice (TAIGHA) Scale and Short Version (TAIGHA-S): Development and Validation Study


40. SPARQL-LLM: Real-Time SPARQL Query Generation from Natural Language Questions


41. From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition


42. PentestEval: Benchmarking LLM-based Penetration Testing with Modular and Stage-Level Design


43. Estimating problem difficulty without ground truth using Large Language Model comparisons


44. IntentMiner: Intent Inversion Attack via Tool Call Analysis in the Model Context Protocol


45. TorchTraceAP: A New Benchmark Dataset for Detecting Performance Anti-Patterns in Computer Vision Models


46. LAPPI: Interactive Optimization with LLM-Assisted Preference-Based Problem Instantiation


47. UIXPOSE: Mobile Malware Detection via Intention-Behaviour Discrepancy Analysis


48. SportsGPT: An LLM-driven Framework for Interpretable Sports Motion Assessment and Training Guidance


49. Neurosymbolic Inference On Foundation Models For Remote Sensing Text-to-image Retrieval With Complex Queries


50. SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations


51. SDAR-VL: Stable and Efficient Block-wise Diffusion for Vision-Language Understanding


52. Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed


53. OmniDrive-R1: Reinforcement-driven Interleaved Multi-modal Chain-of-Thought for Trustworthy Vision-Language Autonomous Driving


54. PerfCoder: Large Language Models for Interpretable Code Performance Optimization


55. KFS-Bench: Comprehensive Evaluation of Key Frame Sampling in Long Video Understanding


56. Multi-Agent Collaborative Framework for Intelligent IT Operations: An AOI System with Context-Aware Compression and Dynamic Task Scheduling


57. Informing Acquisition Functions via Foundation Models for Molecular Discovery


58. Hierarchical Multi-agent Large Language Model Reasoning for Autonomous Functional Materials Discovery


59. Context Branching for LLM Conversations: A Version Control Approach to Exploratory Programming


60. OPTIMA: Optimal One-shot Pruning for LLMs via Quadratic Programming Reconstruction


61. Verification-Guided Context Optimization for Tool Calling via Hierarchical LLMs-as-Editors


62. Improvise, Adapt, Overcome – Telescopic Adapters for Efficient Fine-tuning of Vision Language Models in Medical Imaging


63. STAR: STacked AutoRegressive Scheme for Unified Multimodal Learning


64. MIDUS: Memory-Infused Depth Up-Scaling


65. Why Text Prevails: Vision May Undermine Multimodal Medical Decision Making


66. DL$^3$M: A Vision-to-Language Framework for Expert-Level Medical Reasoning through Deep Learning and Large Language Models


67. The Laminar Flow Hypothesis: Detecting Jailbreaks via Semantic Turbulence in Large Language Models


68. Low-Rank Compression of Language Models via Differentiable Rank Selection


69. Complex Mathematical Expression Recognition: Benchmark, Large-Scale Dataset and Strong Baseline


70. CurvaDion: Curvature-Adaptive Distributed Orthonormalization


71. Made-in China, Thinking in America: U.S. Values Persist in Chinese LLMs


72. Scaling and Transferability of Annealing Strategies in Large Language Model Training


73. Safe2Harm: Semantic Isomorphism Attacks for Jailbreaking Large Language Models


74. Writing in Symbiosis: Mapping Human Creative Agency in the AI Era


75. EDGC: Entropy-driven Dynamic Gradient Compression for Efficient LLM Training