Key LLM-Related Papers - 2025-10-24

1. Plan Then Retrieve: Reinforcement Learning-Guided Complex Reasoning over Knowledge Graphs


2. The Shape of Reasoning: Topological Analysis of Reasoning Traces in Large Language Models


3. Towards Reliable Evaluation of Large Language Models for Multilingual and Multimodal E-Commerce Applications


4. What Defines Good Reasoning in LLMs? Dissecting Reasoning Steps with Multi-Aspect Evaluation


5. A computational model and tool for generating more novel opportunities in professional innovation processes


6. IKnow: Instruction-Knowledge-Aware Continual Pretraining for Effective Domain Adaptation


7. LLM-empowered knowledge graph construction: A survey


8. Using Large Language Models for Abstraction of Planning Domains - Extended Version


9. Individualized Cognitive Simulation in Large Language Models: Evaluating Different Cognitive Representation Methods


10. Merge and Conquer: Evolutionarily Optimizing AI for 2048


11. The Lock-In Phase Hypothesis: Identity Consolidation as a Precursor to AGI


12. TRUST: A Decentralized Framework for Auditing Large Language Model Reasoning


13. Human-Centered LLM-Agent System for Detecting Anomalous Digital Asset Transactions


14. LLMs can hide text in other text of the same length.ipynb


15. Surfer 2: The Next Generation of Cross-Platform Computer Use Agents


16. DAG-Math: Graph-Guided Mathematical Reasoning in LLMs


17. Branch-and-Browse: Efficient and Controllable Web Exploration with Tree-Structured Reasoning and Action Memory


18. Benchmarking Reasoning Reliability in Artificial Intelligence Models for Energy-System Analysis


19. Small Drafts, Big Verdict: Information-Intensive Visual Reasoning via Speculation


20. On the Detectability of LLM-Generated Text: What Exactly Is LLM-Generated Text?


21. Compress to Impress: Efficient LLM Adaptation Using a Single Gradient Step on 100 Samples


22. Simple Context Compression: Mean-Pooling and Multi-Ratio Training


23. A Use-Case Specific Dataset for Measuring Dimensions of Responsible Performance in LLM-generated Text


24. RAGRank: Using PageRank to Counter Poisoning in CTI LLM Pipelines


25. Empathic Prompting: Non-Verbal Context Integration for Multimodal LLM Conversations


26. Thought Communication in Multiagent Collaboration



28. User Perceptions of Privacy and Helpfulness in LLM Responses to Privacy-Sensitive Scenarios


29. Fusing Narrative Semantics for Financial Volatility Forecasting


30. Exploring Large Language Models for Access Control Policy Synthesis and Summarization


31. Neural Diversity Regularizes Hallucinations in Small Models


32. Finding the Sweet Spot: Trading Quality, Cost, and Speed During Inference-Time LLM Reflection


33. Why Did Apple Fall To The Ground: Evaluating Curiosity In Large Language Model


34. Black Box Absorption: LLMs Undermining Innovative Ideas


35. The Dog the Cat Chased Stumped the Model: Measuring When Language Models Abandon Structure for Shortcuts


36. ARC-Encoder: learning compressed text representations for large language models


37. Fake-in-Facext: Towards Fine-Grained Explainable DeepFake Analysis


38. Metis-HOME: Hybrid Optimized Mixture-of-Experts for Multimodal Reasoning


39. Steering Evaluation-Aware Language Models To Act Like They Are Deployed


40. RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging


41. UniSE: A Unified Framework for Decoder-only Autoregressive LM-based Speech Enhancement


42. Relative-Based Scaling Law for Neural Language Models


43. The Impact of Negated Text on Hallucination with Large Language Models


44. Evaluating Latent Knowledge of Public Tabular Datasets in Large Language Models


45. Teaching Language Models to Reason with Tools


46. GhostEI-Bench: Do Mobile Agents Resilience to Environmental Injection in Dynamic On-Device Environments?


47. RAG-Stack: Co-Optimizing RAG Quality and Performance From the Vector Database Perspective


48. A Parameter-Efficient Mixture-of-Experts Framework for Cross-Modal Geo-Localization


49. Breakdance Video classification in the age of Generative AI


50. Context-level Language Modeling by Learning Predictive Context Embeddings


51. Limits of PRM-Guided Tree Search for Mathematical Reasoning with LLMs


52. Towards AI Agents for Course Instruction in Higher Education: Early Experiences from the Field


53. Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context


54. FinCARE: Financial Causal Analysis with Reasoning and Evidence


55. Stuck in the Matrix: Probing Spatial Reasoning in Large Language Models


56. Mixture-of-Minds: Multi-Agent Reinforcement Learning for Table Understanding


57. Collective Communication for 100k+ GPUs


58. Are Stereotypes Leading LLMs’ Zero-Shot Stance Detection?


59. SAID: Empowering Large Language Models with Self-Activating Internal Defense


60. Leveraging the Power of Large Language Models in Entity Linking via Adaptive Routing and Targeted Reasoning


61. CreativityPrism: A Holistic Benchmark for Large Language Model Creativity


62. Beyond One-Way Influence: Bidirectional Opinion Dynamics in Multi-Turn Human-LLM Interactions


63. Optimized Distortion in Linear Social Choice


64. Beyond MedQA: Towards Real-world Clinical Decision Making in the Era of LLMs


65. LLM-Augmented Symbolic NLU System for More Reliable Continuous Causal Statement Interpretation


66. A Tutorial on Cognitive Biases in Agentic AI-Driven 6G Autonomous Networks


67. On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization


68. Learning from Supervision with Semantic and Episodic Memory: A Reflective Approach to Agent Adaptation


69. Large Language Model enabled Mathematical Modeling


70. Can They Dixit? Yes they Can! Dixit as a Playground for Multimodal Language Model Capabilities


71. Stream: Scaling up Mechanistic Interpretability to Long Context in LLMs via Sparse Attention


72. From Large to Small: Transferring CUDA Optimization Expertise via Reasoning Graph


73. An Evaluation of the Pedagogical Soundness and Usability of AI-Generated Lesson Plans Across Different Models and Prompt Frameworks in High-School Physics


74. Prompt Decorators: A Declarative and Composable Syntax for Reasoning, Formatting, and Control in LLMs


75. CourtGuard: A Local, Multiagent Prompt Injection Classifier