Key LLM-Related Papers - 2025-09-09

1. LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation


2. Internet 3.0: Architecture for a Web-of-Agents with it’s Algorithm for Ranking Agents


3. Towards Ontology-Based Descriptions of Conversations with Qualitatively-Defined Concepts


4. SparkUI-Parser: Enhancing GUI Perception with Robust Grounding and Parsing


5. OSC: Cognitive Orchestration through Dynamic Knowledge Alignment in Multi-Agent LLM Collaboration


6. Cloning a Conversational Voice AI Agent from Call Recording Datasets for Telesales


7. Collaboration and Conflict between Humans and Language Models through the Lens of Game Theory


8. TalkToAgent: A Human-centric Explanation of Reinforcement Learning Agents with Large Language Models


9. What-If Analysis of Large Language Models: Explore the Game World Using Proactive Thinking


10. Language-Driven Hierarchical Task Structures as Explicit World Models for Multi-Agent Learning


11. Towards Personalized Explanations for Health Simulations: A Mixed-Methods Framework for Stakeholder-Centric Summarization


12. Maestro: Joint Graph & Config Optimization for Reliable AI Agents


13. The Ethical Compass of the Machine: Evaluating Large Language Models for Decision Support in Construction Project Management


14. Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining


15. SpikingBrain Technical Report: Spiking Brain-inspired Large Models


16. Scaling Performance of Large Language Model Pretraining


17. CURE: Controlled Unlearning for Robust Embeddings – Mitigating Conceptual Shortcuts in Pre-Trained Language Models


18. HoPE: Hyperbolic Rotary Positional Encoding for Stable Long-Range Dependency Modeling in Large Language Models


19. AI Agents for Web Testing: A Case Study in the Wild


20. GenAI-based test case generation and execution in SDV platform


21. LLM Enabled Multi-Agent System for 6G Networks: Framework and Method of Dual-Loop Edge-Terminal Collaboration


22. Artificial intelligence for representing and characterizing quantum systems


23. PLaMo 2 Technical Report


24. Enhancing Diversity in Large Language Models via Determinantal Point Processes


25. The LLM Has Left The Chat: Evidence of Bail Preferences in Large Language Models


26. Decoders Laugh as Loud as Encoders


27. FloodVision: Urban Flood Depth Estimation Using Foundation Vision-Language Models and Domain Knowledge Graph


28. MCANet: A Multi-Scale Class-Specific Attention Network for Multi-Label Post-Hurricane Damage Assessment using UAV Imagery


29. A Study of Large Language Models for Patient Information Extraction: Model Architecture, Fine-Tuning Strategy, and Multi-task Instruction Tuning


30. SePA: A Search-enhanced Predictive Agent for Personalized Health Coaching


31. KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering


32. ODKE+: Ontology-Guided Open-Domain Knowledge Extraction with LLMs


33. Polysemantic Dropout: Conformal OOD Detection for Specialized LLMs


34. Scaling Environments for Organoid Intelligence with LLM-Automated Design and Plasticity-Based Evaluation


35. Schema Inference for Tabular Data Repositories Using Large Language Models


36. Sample-efficient Integration of New Modalities into Large Language Models


37. Manipulating Transformer-Based Models: Controllability, Steerability, and Robust Interventions


38. Emergent Social Dynamics of LLM Agents in the El Farol Bar Problem


39. Quantized Large Language Models in Biomedical Natural Language Processing: Evaluation and Recommendation


40. Mitigation of Gender and Ethnicity Bias in AI-Generated Stories through Model Explanations


41. From Silent Signals to Natural Language: A Dual-Stage Transformer-LLM Approach


42. Behavioral Fingerprinting of Large Language Models


43. VaccineRAG: Boosting Multimodal Large Language Models’ Immunity to Harmful RAG Samples


44. Context Engineering for Trustworthiness: Rescorla Wagner Steering Under Mixed and Inappropriate Contexts


45. DeepTRACE: Auditing Deep Research AI Systems for Tracking Reliability Across Citations and Evidence


46. Where Should I Study? Biased Language Models Decide! Evaluating Fairness in LMs for Academic Recommendations


47. Learned Hallucination Detection in Black-Box LLMs using Token-level Entropy Production Rate


48. Serialized Output Prompting for Large Language Model-based Multi-Talker Speech Recognition


49. DecMetrics: Structured Claim Decomposition Scoring for Factually Consistent LLM Outputs


50. Energy Landscapes Enable Reliable Abstention in Retrieval-Augmented Large Language Models for Healthcare


51. Narrative-to-Scene Generation: An LLM-Driven Pipeline for 2D Game Environments


52. No Clustering, No Routing: How Transformers Actually Process Rare Tokens


53. Training Text-to-Molecule Models with Context-Aware Tokenization


54. ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute


55. Scaling Up, Speeding Up: A Benchmark of Speculative Decoding for Efficient LLM Test-Time Scaling


56. SpeechLLM: Unified Speech and Language Model for Enhanced Multi-Task Understanding in Low Resource Settings


57. RECAP: REwriting Conversations for Intent Understanding in Agentic Planning


58. MOSAIC: A Multilingual, Taxonomy-Agnostic, and Computationally Efficient Approach for Radiological Report Classification


59. COCORELI: Cooperative, Compositional Reconstitution & Execution of Language Instructions


60. Multi-Modal Vision vs. Text-Based Parsing: Benchmarking LLM Strategies for Invoice Processing


61. Evaluating Large Language Models for Financial Reasoning: A CFA-Based Benchmark Study


62. Enhancing LLM Efficiency: Targeted Pruning for Prefill-Decode Disaggregation in Inference


63. Just-in-time and distributed task representations in language models


64. Emotionally-Aware Agents for Dispute Resolution


65. Can Multiple Responses from an LLM Reveal the Sources of Its Uncertainty?


66. CoCoNUTS: Concentrating on Content while Neglecting Uninformative Textual Styles for AI-Generated Peer Review Detection


67. Efficient Training-Free Online Routing for High-Volume Multi-LLM Serving