Key LLM-Related Papers - 2026-01-15

1. Automating Supply Chain Disruption Monitoring via an Agentic AI Approach


2. Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning


3. LLM for Large-Scale Optimization Model Auto-Formulation: A Lightweight Few-Shot Learning Approach


4. Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning


5. What Do LLM Agents Know About Their World? Task2Quiz: A Paradigm for Studying Environment Understanding


6. EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines


7. Long-term Task-oriented Agent: Proactive Long-term Intent Maintenance in Dynamic Environments


8. Cluster Workload Allocation: Semantic Soft Affinity Using Natural Language Processing


9. STaR: Sensitive Trajectory Regulation for Unlearning in Large Reasoning Models


10. RISER: Orchestrating Latent Reasoning Skills for Adaptive Activation Steering


11. Coordinated Pandemic Control with Large Language Model Agents as Policymaking Assistants


12. Efficient Paths and Dense Rewards: Probabilistic Flow Reasoning for Large Language Models


13. MAXS: Meta-Adaptive Exploration with LLM Agents


14. Position on LLM-Assisted Peer Review: Addressing Reviewer Gap through Mentoring and Feedback


15. PrivacyReasoner: Can LLM Emulate a Human-like Privacy Mind?


16. The AI Hippocampus: How Far are We From Human Memory?


17. DScheLLM: Enabling Dynamic Scheduling through a Fine-Tuned Dual-System Large Language Model


18. Programming over Thinking: Efficient and Robust Multi-Constraint Planning


19. The Hierarchy of Agentic Capabilities: Evaluating Frontier Models on Realistic RL Environments


20. ART: Action-based Reasoning Task Benchmarking for Medical AI Agents


21. ConvoLearn: A Dataset of Constructivist Tutor-Student Dialogue


22. Value-Aware Numerical Representations for Transformer Language Models


23. ShortCoder: Knowledge-Augmented Syntax Optimization for Token-Efficient Code Generation


24. LLMs can Compress LLMs: Adaptive Pruning by Agents


25. Routing with Generated Data: Annotation-Free LLM Skill Estimation and Expert Selection


26. Disentangling Task Conflicts in Multi-Task LoRA via Orthogonal Gradient Projection


27. From Prompt to Protocol: Fast Charging Batteries with Large Language Models


28. The Promptware Kill Chain: How Prompt Injections Gradually Evolved Into a Multi-Step Malware


29. Toward Understanding Unlearning Difficulty: A Mechanistic Perspective and Circuit-Guided Difficulty Metric


30. CogRail: Benchmarking VLMs in Cognitive Intrusion Perception for Intelligent Railway Transportation Systems


31. DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing


32. Hot-Start from Pixels: Low-Resolution Visual Tokens for Chinese Language Modeling


33. Benchmarking Post-Training Quantization of Large Language Models under Microscaling Floating Point Formats


34. Private LLM Inference on Consumer Blackwell GPUs: A Practical Guide for Cost-Effective Local Deployment in SMEs


35. Bridging Semantic Understanding and Popularity Bias with LLMs


36. SimMerge: Learning to Select Merge Operators from Similarity Signals


37. Personalized Multimodal Feedback Using Multiple External Representations: Strategy Profiles and Learning in High School Physics


38. Population-Aligned Audio Reproduction With LLM-Based Equalizers


39. Improving Symbolic Translation of Language Models for Logical Reasoning


40. Where Knowledge Collides: A Mechanistic Study of Intra-Memory Knowledge Conflict in Language Models


41. Bias Dynamics in BabyLMs: Towards a Compute-Efficient Sandbox for Democratising Pre-Training Debiasing


42. Ability Transfer and Recovery via Modularized Parameters Localization


43. Understanding or Memorizing? A Case Study of German Definite Articles in Language Models


44. On-Device Large Language Models for Sequential Recommendation


45. ReGraM: Region-First Knowledge Graph Reasoning for Medical Question Answering


46. RIFT: Repurposing Negative Samples via Reward-Informed Fine-Tuning


47. Mikasa: A Character-Driven Emotional AI Companion Inspired by Japanese Oshi Culture


48. A.X K1 Technical Report


49. ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection


50. SSVP: Synergistic Semantic-Visual Prompting for Industrial Zero-Shot Anomaly Detection


51. SkinFlow: Efficient Information Transmission for Open Dermatological Diagnosis via Dynamic Visual Encoding and Staged RL


52. LP-LLM: End-to-End Real-World Degraded License Plate Text Recognition via Large Multimodal Models


53. SubTokenTest: A Practical Benchmark for Real-World Sub-token Understanding


54. From Symbolic to Natural-Language Relations: Rethinking Knowledge Graph Construction in the Era of Large Language Models


55. Mi:dm 2.0 Korea-centric Bilingual Language Models


56. Is Grokking Worthwhile? Functional Analysis and Transferability of Generalization Circuits in Transformers


57. Can LLMs interpret figurative language as humans do?: surface-level vs representational similarity


58. A Decompilation-Driven Framework for Malware Detection with Large Language Models


59. Generalizable Geometric Prior and Recurrent Spiking Feature Learning for Humanoid Robot Manipulation


60. Proactively Detecting Threats: A Novel Approach Using LLMs


61. OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG


62. Navigating Ideation Space: Decomposed Conceptual Representations for Positioning Scientific Ideas


63. Evaluating Role-Consistency in LLMs for Counselor Training


64. Bridging the Gap: Empowering Small Models in Reliable OpenACC-based Parallelization via GEPA-Optimized Prompting


65. Semantic visually-guided acoustic highlighting with large vision-language models


66. Bias Detection and Rotation-Robustness Mitigation in Vision-Language Models and Generative Image Models


67. Adaptive Trust Metrics for Multi-LLM Systems: Enhancing Reliability in Regulated Industries


68. Revisiting Software Engineering Education in the Era of Large Language Models: A Curriculum Adaptation and Academic Integrity Framework


69. LAUDE: LLM-Assisted Unit Test Generation and Debugging of Hardware DEsigns


70. PediaMind-R1: A Temperament-Aware Language Model for Personalized Early Childhood Care Reasoning via Cognitive Modeling and Preference Alignment


71. Scalable and Reliable Evaluation of AI Knowledge Retrieval Systems: RIKER and the Coherent Simulated Universe


72. Directional Attractors in LLM Reasoning: How Similarity Retrieval Steers Iterative Summarization Based Reasoning


73. Emissions and Performance Trade-off Between Small and Large Language Models


74. Resisting Correction: How RLHF Makes Language Models Ignore External Safety Signals in Natural Conversation


75. Consistency-Aware Editing for Entity-level Unlearning in Language Models


76. DeliberationBench: When Do More Voices Hurt? A Controlled Study of Multi-LLM Deliberation Protocols


77. Revisiting Disaggregated Large Language Model Serving for Performance and Energy Implications


78. CrowdLLM: Building LLM-Based Digital Populations Augmented with Generative Models