Key LLM-Related Papers - 2025-11-20

1. SkillGen: Learning Domain Skills for In-Context Sequential Decision Making


2. AutoTool: Efficient Tool Selection for Large Language Model Agents


3. Operationalizing Pluralistic Values in Large Language Model Alignment Reveals Trade-offs in Safety, Inclusivity, and Model Behavior


4. When Words Change the Model: Sensitivity of LLMs for Constraint Programming Modelling


5. DataSage: Multi-agent Collaboration for Insight Discovery with External Knowledge Retrieval, Multi-role Debating, and Multi-path Reasoning


6. PathMind: A Retrieve-Prioritize-Reason Framework for Knowledge Graph Reasoning with Large Language Models


7. Enhancing Regional Airbnb Trend Forecasting Using LLM-Based Embeddings of Accessibility and Human Mobility


8. DevPiolt: Operation Recommendation for IoT Devices at Xiaomi Home


9. Do Large Language Models (LLMs) Understand Chronology?


10. HFL-FlowLLM: Large Language Models for Network Traffic Flow Classification in Heterogeneous Federated Learning


11. Run, Ruminate, and Regulate: A Dual-process Thinking System for Vision-and-Language Navigation


12. PRISM: Prompt-Refined In-Context System Modelling for Financial Retrieval


13. APD-Agents: A Large Language Model-Driven Multi-Agents Collaborative Framework for Automated Page Design


14. Collaborative QA using Interacting LLMs. Impact of Network Structure, Node Capability and Distributed Data


15. Syn-STARTS: Synthesized START Triage Scenario Generation Framework for Scalable LLM Evaluation


16. ALEX: A Light Editing-knowledge Extractor


17. Jailbreaking Large Vision Language Models in Intelligent Transportation Systems


18. When AI Does Science: Evaluating the Autonomous AI Scientist KOSMOS in Radiation Biology


19. Imagine in Space: Exploring the Frontier of Spatial Intelligence and Reasoning Efficiency in Vision Language Models


20. ARC Is a Vision Problem!


21. Near-Lossless Model Compression Enables Longer Context Inference in DNA Large Language Models


22. Attention via Synaptic Plasticity is All You Need: A Biologically Inspired Spiking Neuromorphic Transformer


23. Ground Truth Generation for Multilingual Historical NLP using LLMs


24. Enhancing Agentic Autonomous Scientific Discovery with Vision-Language Model Capabilities


25. Failure to Mix: Large language models struggle to answer according to desired probability distributions


26. Is Your VLM for Autonomous Driving Safety-Ready? A Comprehensive Benchmark for Evaluating External and In-Cabin Risks


27. ReflexGrad: Three-Way Synergistic Architecture for Zero-Shot Generalization in LLM Agents


28. Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language


29. Agentic Video Intelligence: A Flexible Framework for Advanced Video Exploration and Understanding


30. Tell Me: An LLM-powered Mental Well-being Assistant with RAG, Synthetic Dialogue Generation, and Agentic Planning


31. Watchdogs and Oracles: Runtime Verification Meets Large Language Models for Autonomous Systems


32. The Tokenization Bottleneck: How Vocabulary Extension Improves Chemistry Representation Learning in Pretrained Language Models


33. AraLingBench: A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models


34. LLM-Aligned Geographic Item Tokenization for Local-Life Recommendation


35. Orion: A Unified Visual Agent for Multimodal Perception, Advanced Visual Reasoning and Execution


36. AdaTok: Adaptive Token Compression with Object-Aware Representations for Efficient Multimodal LLMs


37. SMART: Shot-Aware Multimodal Video Moment Retrieval with Audio-Enhanced MLLM


38. Multi-view Phase-aware Pedestrian-Vehicle Incident Reasoning Framework with Vision-Language Models


39. Real-Time Mobile Video Analytics for Pre-arrival Emergency Medical Services


40. FAPE-IR: Frequency-Aware Planning and Execution Framework for All-in-One Image Restoration


41. NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval


42. Error-Driven Scene Editing for 3D Grounding in Large Language Models


43. GRPO Privacy Is at Risk: A Membership Inference Attack Against Reinforcement Learning With Verifiable Rewards


44. Knowledge-Grounded Agentic Large Language Models for Multi-Hazard Understanding from Reconnaissance Reports


45. FlakyGuard: Automatically Fixing Flaky Tests at Industry Scale


46. LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering


47. Node-Level Uncertainty Estimation in LLM-Generated SQL


48. What Works for ‘Lost-in-the-Middle’ in LLMs? A Study on GM-Extract and Mitigations


49. Can QE-informed (Re)Translation lead to Error Correction?


50. GeoPl@ntNet: A Platform for Exploring Essential Biodiversity Variables


51. Uncovering and Aligning Anomalous Attention Heads to Defend Against NLP Backdoor Attacks


52. Scaling Patterns in Adversarial Alignment: Evidence from Multi-LLM Jailbreak Experiments


53. Can LLMs Create Legally Relevant Summaries and Analyses of Videos?


54. ExplainableGuard: Interpretable Adversarial Defense for Large Language Models Using Chain-of-Thought Reasoning


55. PROF: An LLM-based Reward Code Preference Optimization Framework for Offline Imitation Learning


56. What happens when nanochat meets DiLoCo?


57. Multi-Agent VLMs Guided Self-Training with PNU Loss for Low-Resource Offensive Content Detection


58. Robustness of LLM-enabled vehicle trajectory prediction under data security threats


59. AI Kill Switch for malicious web-based LLM agent


60. Signature vs. Substance: Evaluating the Balance of Adversarial Resistance and Linguistic Quality in Watermarking Large Language Models


61. From Legacy Fortran to Portable Kokkos: An Autonomous Agentic AI Workflow