Key LLM-Related Papers - 2025-11-17

1. Querying Labeled Time Series Data with Scenario Programs


2. SITA: A Framework for Structure-to-Instance Theorem Autoformalization


3. FactGuard: Event-Centric and Commonsense-Guided Fake News Detection


4. Fixed-Persona SLMs with Modular Memory: Scalable NPC Dialogue on Consumer Hardware


5. Causal-HalBench: Uncovering LVLMs Object Hallucinations Through Causal Intervention


6. PepTriX: A Framework for Explainable Peptide Analysis through Protein Language Models


7. ProgRAG: Hallucination-Resistant Progressive Retrieval and Reasoning over Knowledge Graphs


8. Bridging Synthetic and Real Routing Problems via LLM-Guided Instance Generation and Progressive Adaptation


9. Advanced Black-Box Tuning of Large Language Models with Limited API Calls


10. Enhancing the Medical Context-Awareness Ability of LLMs via Multifaceted Self-Refinement Learning


11. Efficient Thought Space Exploration through Strategic Intervention


12. Beyond ReAct: A Planner-Centric Framework for Complex Tool-Augmented LLM Reasoning


13. ChEmREF: Evaluating Language Model Readiness for Chemical Emergency Response


14. SPAN: Benchmarking and Improving Cross-Calendar Temporal Reasoning of Large Language Models


15. OIDA-QA: A Multimodal Benchmark for Analyzing the Opioid Industry Documents Archive


16. Learning to Pose Problems: Reasoning-Driven and Solver-Adaptive Data Synthesis for Large Reasoning Models


17. CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D



19. SlideBot: A Multi-Agent Framework for Generating Informative, Reliable, Multi-Modal Presentations


20. AI Annotation Orchestration: Evaluating LLM verifiers to Improve the Quality of LLM Annotations in Learning Analytics


21. Echoing: Identity Failures when LLM Agents Talk to Each Other


22. Proceedings of the Second International Workshop on Next-Generation Language Models for Knowledge Representation and Reasoning (NeLaMKRR 2025)


23. Black-Box On-Policy Distillation of Large Language Models


24. Instella: Fully Open Language Models with Stellar Performance


25. SSR: Socratic Self-Refine for Large Language Model Reasoning


26. Know Your Limits: Entropy Estimation Modeling for Compression and Generalization


27. Towards an Agentic Workflow for Internet Measurement Research


28. Textual understanding boost in the WikiRace


29. Evaluating Prompting Strategies with MedGemma for Medical Order Extraction


30. Say It Differently: Linguistic Styles as Jailbreak Vectors


31. Scalable Synthesis of distributed LLM workloads through Symbolic Tensor Graphs


32. Beyond Elicitation: Provision-based Prompt Optimization for Knowledge-Intensive Tasks


33. LocalBench: Benchmarking LLMs on County-Level Local Knowledge and Reasoning


34. Reasoning About Intent for Ambiguous Requests


35. Rethinking the Reliability of Multi-agent System: A Perspective from Byzantine Fault Tolerance


36. AgentEvolver: Towards Efficient Self-Evolving Agent System


37. Simulating Misinformation Propagation in Social Networks using Large Language Models


38. BhashaKritika: Building Synthetic Pretraining Data at Scale for Indic Languages


39. Rethinking Visual Information Processing in Multimodal LLMs


40. Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models


41. Quality Assurance of LLM-generated Code: Addressing Non-Functional Quality Characteristics


42. MTR-DuplexBench: Towards a Comprehensive Evaluation of Multi-Round Conversations for Full-Duplex Speech Language Models


43. Lost in Serialization: Invariance and Generalization of LLM Graph Reasoners


44. VocalNet-M2: Advancing Low-Latency Spoken Language Modeling via Integrated Multi-Codebook Tokenization and Multi-Token Prediction


45. Speech-Audio Compositional Attacks on Multimodal LLMs and Their Mitigation with SALMONN-Guard


46. Persona-Aware Alignment Framework for Personalized Dialogue Generation


47. On the Military Applications of Large Language Models


48. Opinion: Towards Unified Expressive Policy Optimization for Robust Robot Learning


49. BuddyMoE: Exploiting Expert Redundancy to Accelerate Memory-Constrained Mixture-of-Experts Inference


50. Anomagic: Crossmodal Prompt-driven Zero-shot Anomaly Generation


51. fastbmRAG: A Fast Graph-Based RAG Framework for Efficient Processing of Large-Scale Biomedical Literature


52. PustakAI: Curriculum-Aligned and Interactive Textbooks Using Large Language Models


53. Difference Vector Equalization for Robust Fine-tuning of Vision-Language Models


54. Owlgorithm: Supporting Self-Regulated Learning in Competitive Programming through LLM-Driven Reflection


55. EnvTrace: Simulation-Based Semantic Evaluation of LLM Code via Execution Trace Alignment – Demonstrated at Synchrotron Beamlines


56. EEGAgent: A Unified Framework for Automated EEG Analysis Using Large Language Models


57. Simulator and Experience Enhanced Diffusion Model for Comprehensive ECG Generation


58. Taught by the Flawed: How Dataset Insecurity Breeds Vulnerable AI Code


59. From Street to Orbit: Training-Free Cross-View Retrieval via Location Semantics and LLM Guidance


60. Test-Time Spectrum-Aware Latent Steering for Zero-Shot Generalization in Vision-Language Models


61. Predicate-Argument Structure Divergences in Chinese and English Parallel Sentences and their Impact on Language Transfer


62. How Small Can You Go? Compact Language Models for On-Device Critical Error Detection in Machine Translation


63. TawPipe: Topology-Aware Weight Pipeline Parallelism for Accelerating Long-Context Large Models Training


64. Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey


65. General Intelligence-based Fragmentation (GIF): A framework for peak-labeled spectra simulation


66. Probability-Biased Attention over Directed Bipartite Graphs for Long-Tail ICD Coding