Key LLM-Related Papers - 2026-03-19

1. AgentFactory: A Self-Evolving Framework Through Executable Subagent Accumulation and Reuse


2. RPMS: Enhancing LLM-Based Embodied Planning through Rule-Augmented Memory Synergy


3. Facts as First Class Objects: Knowledge Objects for Persistent LLM Memory



5. MALLES: A Multi-agent LLMs-based Economic Sandbox with Consumer Preference Alignment


6. Sensi: Learn One Thing at a Time – Curriculum-Based Test-Time Learning for LLM Game Agents


7. VeriGrey: Greybox Agent Validation


8. InfoDensity: Rewarding Information-Dense Traces for Efficient Reasoning


9. Graph-Native Cognitive Memory for AI Agents: Formal Belief Revision Semantics for Versioned Memory Architectures


10. How Clued up are LLMs? Evaluating Multi-Step Deductive Reasoning in a Text-Based Game Environment


11. Generative AI-assisted Participatory Modeling in Socio-Environmental Planning under Deep Uncertainty


12. Unified Spatio-Temporal Token Scoring for Efficient Video VLMs


13. Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models


14. VideoAtlas: Navigating Long-Form Video in Logarithmic Compute


15. IndicSafe: A Benchmark for Evaluating Multilingual LLM Safety in South Asia


16. Differential Privacy in Generative AI Agents: Analysis and Optimal Tradeoffs


17. scicode-lint: Detecting Methodology Bugs in Scientific Python Code with LLM-Generated Patterns


18. RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference


19. AI-Assisted Goal Setting Improves Goal Progress Through Social Accountability


20. Mitigating LLM Hallucinations through Domain-Grounded Tiered Retrieval


21. Text-to-Stage: Spatial Layouts from Long-form Narratives


22. FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair


23. Dropout Robustness and Cognitive Profiling of Transformer Models via Stochastic Inference


24. Fine-Grained Post-Training Quantization for Large Vision Language Models with Quantization-Aware Integrated Gradients


25. CoVerRL: Breaking the Consensus Trap in Label-Free Reasoning via Generator-Verifier Co-Evolution


26. SARE: Sample-wise Adaptive Reasoning for Training-free Fine-grained Visual Recognition


27. Can Blindfolded LLMs Still Trade? An Anonymization-First Framework for Portfolio Optimization


28. WeatherReasonSeg: A Benchmark for Weather-Aware Reasoning Segmentation in Visual Language Models


29. Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models


30. Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards


31. FINER: MLLMs Hallucinate under Fine-grained Negative Queries


32. Interpretable Cross-Domain Few-Shot Learning with Rectified Target-Domain Local Alignment


33. A Contextual Help Browser Extension to Assist Digital Illiterate Internet Users


34. Detecting the Machine: A Comprehensive Benchmark of AI-Generated Text Detectors Across Architectures, Domains, and Adversarial Conditions


35. VLM2Rec: Resolving Modality Collapse in Vision-Language Model Embedders for Multimodal Sequential Recommendation


36. AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement


37. Caging the Agents: A Zero Trust Security Architecture for Autonomous AI in Healthcare


38. Efficient Exploration at Scale



40. Recurrent Reasoning with Vision-Language Models for Estimating Long-Horizon Embodied Task Progress


41. From Words to Worlds: Benchmarking Cross-Cultural Cultural Understanding in Machine Translation


42. GUIDE: GenAI Units In Digital Design Education


43. Deployment and Evaluation of an EHR-integrated, Large Language Model-Powered Tool to Triage Surgical Patients


44. From Drop-off to Recovery: A Mechanistic Analysis of Segmentation in MLLMs


45. TharuChat: Bootstrapping Large Language Models for a Low-Resource Language via Synthetic Data and Human Validation


46. Alignment Makes Language Models Normative, Not Descriptive


47. Anonymous-by-Construction: An LLM-Driven Framework for Privacy-Preserving Text


48. OPERA: Online Data Pruning for Efficient Retrieval Model Adaptation


49. Catching rationalization in the act: detecting motivated reasoning before and after CoT via activation probing


50. Towards Unsupervised Adversarial Document Detection in Retrieval Augmented Generation Systems


51. Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning


52. Generalist Multimodal LLMs Gain Biometric Expertise via Human Salience


53. REAL: Regression-Aware Reinforcement Learning for LLM-as-a-Judge


54. Security Assessment and Mitigation Strategies for Large Language Models: A Comprehensive Defensive Framework


55. Hidden Clones: Exposing and Fixing Family Bias in Vision-Language Model Ensembles


56. Evaluating Ill-Defined Tasks in Large Language Models


57. Early Quantization Shrinks Codebook: A Simple Fix for Diversity-Preserving Tokenization


58. Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models


59. LLM NL2SQL Robustness: Surface Noise vs. Linguistic Variation in Traditional and Agentic Settings


60. Empirical Recipes for Efficient and Compact Vision-Language Models


61. The State of Generative AI in Software Development: Insights from Literature and a Developer Survey


62. Are a Thousand Words Better Than a Single Picture? Beyond Images – A Framework for Multi-Modal Knowledge Graph Dataset Enrichment


63. MSRAMIE: Multimodal Structured Reasoning Agent for Multi-instruction Image Editing


64. CineSRD: Leveraging Visual, Acoustic, and Linguistic Cues for Open-World Visual Media Speaker Diarization


65. Adversarial attacks against Modern Vision-Language Models


66. PhysQuantAgent: An Inference Pipeline of Mass Estimation for Vision-Language Models


67. EmergeNav: Structured Embodied Inference for Zero-Shot Vision-and-Language Navigation in Continuous Environments


68. TDMM-LM: Bridging Facial Understanding and Animation via Language Models


69. AgriChat: A Multimodal Large Language Model for Agriculture Image Understanding


70. Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs


71. Script-to-Slide Grounding: Grounding Script Sentences to Slide Objects for Automatic Instructional Video Generation


72. TerraLingua: Emergence and Analysis of Open-endedness in LLM Ecologies


73. From Language to Action in Arabic: Reliable Structured Tool Calling via Data-Centric Fine-Tuning


74. Social physics in the age of artificial intelligence


75. Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment