Key LLM-Related Papers - 2025-12-16

1. AI Benchmark Democratization and Carpentry


2. AI-MASLD Metabolic Dysfunction and Information Steatosis of Large Language Models in Unstructured Clinical Narratives


3. EmeraldMind: A Knowledge Graph-Augmented Framework for Greenwashing Detection


4. General-purpose AI models can generate actionable knowledge on agroecological crop protection


5. Motif-2-12.7B-Reasoning: A Practitioner’s Guide to RL Training Recipes


6. AgentBalance: Backbone-then-Topology Design for Cost-Effective Multi-Agent Systems under Budget Constraints


7. Towards Trustworthy Multi-Turn LLM Agents via Behavioral Guidance


8. CAPTURE: A Benchmark and Evaluation for LVLMs in CAPTCHA Resolving


9. TriFlow: A Progressive Multi-Agent Framework for Intelligent Trip Planning


10. A-LAMP: Agentic LLM-Based Framework for Automated MDP Modeling and Policy Generation


11. FutureWeaver: Planning Test-Time Compute for Multi-Agent Systems with Modularized Collaboration


12. Super Suffixes: Bypassing Text Generation Alignment and Guard Models Simultaneously


13. From Verification Burden to Trusted Collaboration: Design Goals for LLM-Assisted Literature Reviews


14. Bounding Hallucinations: Information-Theoretic Guarantees for RAG Systems via Merlin-Arthur Protocols


15. DentalGPT: Incentivizing Multimodal Complex Reasoning in Dentistry


16. Does Less Hallucination Mean Less Creativity? An Empirical Investigation in LLMs


17. Towards Privacy-Preserving Code Generation: Differentially Private Code Language Models


18. Exploring MLLM-Diffusion Information Transfer with MetaCanvas


19. Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation


20. Task-Specific Sparse Feature Masks for Molecular Toxicity Prediction with Chemical Language Models


21. REMODEL-LLM: Transforming C code to Java using LLMs


22. Surveillance Video-Based Traffic Accident Detection Using Transformer Architecture


23. Few-Shot VLM-Based G-Code and HMI Verification in CNC Machining


24. CIP: A Plug-and-Play Causal Prompting Framework for Mitigating Hallucinations under Long-Context Noise


25. Adaptive Soft Rolling KV Freeze with Entropy-Guided Recovery: Sublinear Memory Growth for Efficient LLM Inference


26. amc: The Automated Mission Classifier for Telescope Bibliographies


27. Image Tiling for High-Resolution Reasoning: Balancing Local Detail with Global Context


28. MiniScope: A Least Privilege Framework for Authorizing Tool Calling Agents


29. FIBER: A Multilingual Evaluation Resource for Factual Inference Bias


30. Explanation Bias is a Product: Revealing the Hidden Lexical and Position Preferences in Post-Hoc Feature Attribution


31. MedBioRAG: Semantic Search and Retrieval-Augmented Generation with Large Language Models for Medical and Biological QA


32. Cognitive Mirrors: Exploring the Diverse Functional Roles of Attention Heads in LLM Reasoning