Key LLM-Related Papers - 2025-09-17

1. RepIt: Representing Isolated Targets to Steer Language Models


2. Simulating Clinical AI Assistance using Multimodal LLMs: A Case Study in Diabetic Retinopathy


3. Reasoning with Preference Constraints: A Benchmark for Language Models in Many-to-One Matching Markets


4. A Visualized Framework for Event Cooperation with Generative Agents


5. Toward PDDL Planning Copilot


6. Black-box Model Merging for Language-Model-as-a-Service with Massive Model Repositories


7. The Anatomy of Alignment: Decomposing Preference Optimization by Steering Sparse Features


8. HLSMAC: A New StarCraft Multi-Agent Challenge for High-Level Strategic Decision-Making


9. Stochastic Streets: A Walk Through Random LLM Address Generation in four European Cities


10. LTA-thinker: Latent Thought-Augmented Training Framework for Large Language Models on Complex Reasoning


11. H$^2$R: Hierarchical Hindsight Reflection for Multi-Task LLM Agents


12. Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs


13. Large Language Models Imitate Logical Reasoning, but at what Cost?


14. Learn to Relax with Large Language Models: Solving Nonlinear Combinatorial Optimization Problems via Bidirectional Coevolution


15. ECG-aBcDe: Overcoming Model Dependence, Encoding ECG into a Universal Language for Any LLM


16. GBV-SQL: Guided Generation and SQL2Text Back-Translation Validation for Multi-Agent Text2SQL


17. Analogy-Driven Financial Chain-of-Thought (AD-FCoT): A Prompting Approach for Financial Sentiment Analysis


18. DaSAThco: Data-Aware SAT Heuristics Combinations Optimization via Large Language Models


19. Empowering Clinical Trial Design through AI: A Randomized Evaluation of PowerGPT


20. Reasoning Models Can be Accurately Pruned Via Chain-of-Thought Reconstruction


21. Building Coding Agents via Entropy-Enhanced Multi-Turn Preference Optimization


22. Small Models, Big Results: Achieving Superior Intent Extraction through Decomposition


23. AIssistant: An Agentic Approach for Human–AI Collaborative Scientific Work on Reviews and Perspectives in Machine Learning


24. LLMAP: LLM-Assisted Multi-Objective Route Planning with User Preferences


25. RadGame: An AI-Powered Platform for Radiology Education


26. Metacognitive Reuse: Turning Recurring LLM Reasoning Into Concise Behaviors


27. Single-stream Policy Optimization


28. FinSearchComp: Towards a Realistic, Expert-Level Evaluation of Financial Search and Reasoning


29. Shaping Explanations: Semantic Reward Modeling with Encoder-Only Transformers for GRPO


30. Multi-Model Synthetic Training for Mission-Critical Small Language Models


31. Perception Before Reasoning: Two-Stage Reinforcement Learning for Visual Reasoning in Vision-Language Models


32. GView: A Survey of Binary Forensics via Visual, Semantic, and AI-Enhanced Analysis


33. Validating Solidity Code Defects using Symbolic and Concrete Execution powered by Large Language Models


34. xOffense: An AI-driven autonomous penetration testing framework with offensive knowledge-enhanced LLMs and multi agent systems


35. Investigating ReLoRA: Effects on the Learning Dynamics of Small Language Models


36. Jailbreaking Large Language Models Through Content Concretization


37. All Roads Lead to Rome: Graph-Based Confidence Estimation for Large Language Model Reasoning


38. Cross-Layer Vision Smoothing: Enhancing Visual Understanding via Sustained Focus on Key Objects in Large Vision-Language Models


39. Conan-Embedding-v2: Training an LLM from Scratch for Text Embeddings


40. The LLM Already Knows: Estimating LLM-Perceived Question Difficulty via Hidden Representations


41. Multi-Robot Task Planning for Multi-Object Retrieval Tasks with Distributed On-Site Knowledge via Large Language Models


42. LLM-Based Approach for Enhancing Maintainability of Automotive Architectures


43. InfoGain-RAG: Boosting Retrieval-Augmented Generation via Document Information Gain-based Reranking and Filtering


44. Toward Ownership Understanding of Objects: Active Question Generation with Large Language Model and Probabilistic Generative Model


45. Defense-to-Attack: Bypassing Weak Defenses Enables Stronger Jailbreaks in Vision-Language Models


46. Instance-level Randomization: Toward More Stable LLM Evaluations


47. Don’t Change My View: Ideological Bias Auditing in Large Language Models


48. A Systematic Evaluation of Parameter-Efficient Fine-Tuning Methods for the Security of Code LLMs


49. ScaleDoc: Scaling LLM-based Predicates over Large Document Collections


50. EconProver: Towards More Economical Test-Time Scaling for Automated Theorem Proving


51. FunAudio-ASR Technical Report


52. MedFact: Benchmarking the Fact-Checking Capabilities of Large Language Models on Chinese Medical Texts


53. Understanding Prompt Management in GitHub Repositories: A Call for Best Practices


54. Evaluating Large Language Models for Functional and Maintainable Code in Industrial Settings: A Case Study at ASML


55. MORABLES: A Benchmark for Assessing Abstract Moral Reasoning in LLMs with Fables


56. An integrated process for design and control of lunar robotics using AI and simulation


57. PATIMT-Bench: A Multi-Scenario Benchmark for Position-Aware Text Image Machine Translation in Large Vision-Language Models


58. RL Fine-Tuning Heals OOD Forgetting in SFT


59. Towards Trustworthy Agentic IoEV: AI Agents for Explainable Cyberthreat Mitigation and State Analytics


60. Profiling LoRA/QLoRA Fine-Tuning Efficiency on Consumer GPUs: An RTX 4060 Case Study


61. MEUV: Achieving Fine-Grained Capability Activation in Large Language Models via Mutually Exclusive Unlock Vectors


62. TinyServe: Query-Aware Cache Selection for Efficient LLM Serving