Key LLM-Related Papers - 2025-11-21

1. Know Your Intent: An Autonomous Multi-Perspective LLM Agent Framework for DeFi User Transaction Intent Mining


2. SOLID: a Framework of Synergizing Optimization and LLMs for Intelligent Decision-Making


3. As If We’ve Met Before: LLMs Exhibit Certainty in Recognizing Seen Files


4. HISE-KT: Synergizing Heterogeneous Information Networks and LLMs for Explainable Knowledge Tracing with Meta-Path Optimization


5. SafeRBench: A Comprehensive Benchmark for Safety Assessment in Large Reasoning Models


6. Knowledge-Informed Automatic Feature Extraction via Collaborative Large Language Model Agents


7. ProRAC: A Neuro-symbolic Method for Reasoning about Actions with LLM-based Progression


8. Beyond GeneGPT: A Multi-Agent Architecture with Open-Source LLMs for Enhanced Genomic Question Answering


9. Subnational Geocoding of Global Disasters Using Large Language Models


10. Ask WhAI: Probing Belief Formation in Role-Primed LLM Agents


11. Learning Interestingness in Automated Mathematical Theory Formation


12. The Illusion of Procedural Reasoning: Measuring Long-Horizon FSM Execution in LLMs


13. VisPlay: Self-Evolving Vision-Language Models from Images


14. The SA-FARI Dataset: Segment Anything in Footage of Animals for Recognition and Identification


15. HSKBenchmark: Modeling and Benchmarking Chinese Second Language Acquisition in Large Language Models through Curriculum Tuning


16. Multimodal Evaluation of Russian-language Architectures


17. Insights from the ICLR Peer Review and Rebuttal Process


18. Small Language Models for Phishing Website Detection: Cost, Performance, and Privacy Trade-Offs


19. Towards Understanding Layer Contributions in Tabular In-Context Learning Models


20. NAMeGEn: Creative Name Generation via A Novel Agent-based Multiple Personalized Goal Enhancement Framework


21. DEPO: Dual-Efficiency Preference Optimization for LLM Agents



23. Parameter Importance-Driven Continual Learning for Foundation Models


24. The Empowerment of Science of Science by Large Language Models: New Tools and Methods


25. Reflexive Evidence-Based Multimodal Learning for Clean Energy Transitions: Causal Insights on Cooking Fuel Access, Urbanization, and Carbon Emissions


26. Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models


27. EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control


28. OEMA: Ontology-Enhanced Multi-Agent Collaboration Framework for Zero-Shot Clinical Named Entity Recognition


29. Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story


30. Physics-Based Benchmarking Metrics for Multimodal Synthetic Images


31. Taxonomy, Evaluation and Exploitation of IPI-Centric LLM Agent Defense Frameworks


32. Finetuning LLMs for Automatic Form Interaction on Web-Browser in Selenium Testing Framework


33. Can MLLMs Detect Phishing? A Comprehensive Security Benchmark Suite Focusing on Dynamic Threats and Multimodal Evaluation in Academic Environments


34. Teaching According to Students’ Aptitude: Personalized Mathematics Tutoring via Persona-, Memory-, and Forgetting-Aware LLMs


35. ItemRAG: Item-Based Retrieval-Augmented Generation for LLM-Based Recommendation


36. From Solving to Verifying: A Unified Objective for Robust Reasoning in LLMs


37. Effective Code Membership Inference for Code Completion Models via Adversarial Prompts


38. BBox DocVQA: A Large Scale Bounding Box Grounded Dataset for Enhancing Reasoning in Document Visual Question Answer


39. Dynamic Expert Quantization for Scalable Mixture-of-Experts Inference


40. Mathematical Analysis of Hallucination Dynamics in Large Language Models: Uncertainty Quantification, Advanced Decoding, and Principled Mitigation


41. SVBRD-LLM: Self-Verifying Behavioral Rule Discovery for Autonomous Vehicle Identification


42. Harmful Traits of AI Companions


43. MermaidSeqBench: An Evaluation Benchmark for LLM-to-Mermaid Sequence Diagram Generation


44. Fifty Shades of Greenwashing: The Political Economy of Climate Change Advertising on Social Media


45. Skin-R1: Toward Trustworthy Clinical Reasoning for Dermatological Diagnosis


46. Empowering Multi-Turn Tool-Integrated Reasoning with Group Turn Policy Optimization


47. Scalable and Efficient Large-Scale Log Analysis with LLMs: An IT Software Support Case Study


48. Evaluating Generative AI for CS1 Code Grading: Direct vs Reverse Methods


49. irace-evo: Automatic Algorithm Configuration Extended With LLM-Based Code Evolution


50. LiveCLKTBench: Towards Reliable Evaluation of Cross-Lingual Knowledge Transfer in Multilingual LLMs


51. Test-time Scaling of LLMs: A Survey from A Subproblem Structure Perspective


52. ExplainRec: Towards Explainable Multi-Modal Zero-Shot Recommendation with Preference Attribution and Large Language Models


53. Cluster-based Adaptive Retrieval: Dynamic Context Selection for RAG Applications


54. An LLM-Powered Agent for Real-Time Analysis of the Vietnamese IT Job Market


55. Optimizing Agricultural Research: A RAG-Based Approach to Mycorrhizal Fungi Information



57. Membership Inference Attack against Large Language Model-based Recommendation Systems: A New Distillation-based Paradigm


58. TacEleven: generative tactic discovery for football open play


59. ESA: Energy-Based Shot Assembly Optimization for Automatic Video Editing