All AI Papers - 2025-09-15

1. Mutual Information Tracks Policy Coherence in Reinforcement Learning


2. Abduct, Act, Predict: Scaffolding Causal Inference for Automated Failure Attribution in Multi-Agent Systems


3. State Algebra for Propositional Logic


4. The Morality of Probability: How Implicit Moral Biases in LLMs May Shape the Future of Human-AI Symbiosis


5. Investigating Language Model Capabilities to Represent and Process Formal Knowledge: A Preliminary Study to Assist Ontology Engineering


6. Compartmentalised Agentic Reasoning for Clinical NLI


7. Towards Fully Automated Molecular Simulations: Multi-Agent Framework for Simulation Setup and Force Field Extraction


8. Online Robust Planning under Model Uncertainty: A Sample-Based Approach


9. Virtual Agent Economies


10. AI Harmonics: a human-centric and harms severity-adaptive AI risk assessment framework


11. XAgents: A Unified Framework for Multi-Agent Cooperation via IF-THEN Rules and Multipolar Task Processing Graph


12. GAMA: A General Anonymizing Multi-Agent System for Privacy Preservation Enhanced by Domain Rules and Disproof Method


13. Evaluation of Black-Box XAI Approaches for Predictors of Values of Boolean Formulae


14. A Markovian Framing of WaveFunctionCollapse for Procedurally Generating Aesthetically Complex Environments


15. The (R)evolution of Scientific Workflows in the Agentic AI Era: Towards Autonomous Science


16. LLMs as Agentic Cooperative Players in Multiplayer UNO


17. Towards an AI-based knowledge assistant for goat farmers based on Retrieval-Augmented Generation


18. Towards a Common Framework for Autoformalization


19. A Modular and Multimodal Generative AI Framework for Urban Building Energy Data: Generating Synthetic Homes


20. How well can LLMs provide planning feedback in grounded environments?


21. Executable Ontologies: Synthesizing Event Semantics with Dataflow Architecture


22. Human-AI Collaboration Increases Efficiency in Regulatory Writing


23. Standards in the Preparation of Biomedical Research Metadata: A Bridge2AI Perspective


24. Is In-Context Learning Learning?


25. Multimodal SAM-adapter for Semantic Segmentation


26. Diversified recommendations of cultural activities with personalized determinantal point processes


27. Improving Audio Event Recognition with Consistency Regularization


28. Data distribution impacts the performance and generalisability of contrastive learning-based foundation models of electrocardiograms


29. Towards Understanding Visual Grounding in Visual Language Models


30. GLAM: Geometry-Guided Local Alignment for Multi-View VLP in Mammography


31. I-Segmenter: Integer-Only Vision Transformer for Efficient Semantic Segmentation


32. Generalizing Beyond Suboptimality: Offline Reinforcement Learning Learns Effective Scheduling through Random Data


33. We Need a New Ethics for a World of AI Agents


34. SignClip: Leveraging Mouthing Cues for Sign Language Translation by Multimodal Contrastive Fusion


35. Openness in AI and downstream governance: A global value chain approach


36. SI-FACT: Mitigating Knowledge Conflict via Self-Improving Faithfulness-Aware Contrastive Tuning


37. Benchmark of stylistic variation in LLM-generated texts


38. BenchECG and xECG: a benchmark and baseline for ECG foundation models


39. Efficient Learning-Based Control of a Legged Robot in Lunar Gravity


40. Population-Aligned Persona Generation for LLM-based Social Simulation


41. Realism Control One-step Diffusion for Real-World Image Super-Resolution


42. Generating Energy-Efficient Code via Large-Language Models – Where are we now?


43. Established Psychometric vs. Ecologically Valid Questionnaires: Rethinking Psychological Assessments in Large Language Models


44. Predictive Spike Timing Enables Distributed Shortest Path Computation in Spiking Neural Networks


45. TwinTac: A Wide-Range, Highly Sensitive Tactile Sensor with Real-to-Sim Digital Twin Sensor Model


46. Multimodal Mathematical Reasoning Embedded in Aerial Vehicle Imagery: Benchmarking, Analysis, and Exploration


47. Reinforcement learning for spin torque oscillator tasks


48. Exploring Expert Specialization through Unsupervised Training in Sparse Mixture of Experts


49. Intrinsic Dimension Estimating Autoencoder (IDEA) Using CancelOut Layer and a Projected Loss


50. Unsupervised Hallucination Detection by Inspecting Reasoning Processes


51. Drone-Based Multispectral Imaging and Deep Learning for Timely Detection of Branched Broomrape in Tomato Farms


52. Securing LLM-Generated Embedded Firmware through AI Agent-Driven Validation and Patching



54. Limited Reference, Reliable Generation: A Two-Component Framework for Tabular Data Generation in Low-Data Regimes


55. Zero-Shot Referring Expression Comprehension via Visual-Language True/False Verification


56. Adaptive Token Merging for Efficient Transformer Semantic Communication at the Edge


57. SmartCoder-R1: Towards Secure and Explainable Smart Contract Generation with Security-Aware Group Relative Policy Optimization


58. WALL: A Web Application for Automated Quality Assurance using Large Language Models


59. An Autoencoder and Vision Transformer-based Interpretability Analysis of the Differences in Automated Staging of Second and Third Molars


60. Tackling One Health Risks: How Large Language Models are leveraged for Risk Negotiation and Consensus-building


61. Self-Augmented Robot Trajectory: Efficient Imitation Learning via Safe Self-augmentation with Demonstrator-annotated Precision


62. Automated Tuning for Diffusion Inverse Problem Solvers without Generative Prior Retraining


63. From Hugging Face to GitHub: Tracing License Drift in the Open-Source AI Ecosystem


64. Emulating Public Opinion: A Proof-of-Concept of AI-Generated Synthetic Survey Responses for the Chilean Case


65. Vibe Check: Understanding the Effects of LLM-Based Conversational Agents’ Personality and Alignment on User Perceptions in Goal-Oriented Tasks


66. Surrogate Supervision for Robust and Generalizable Deformable Image Registration


67. Latency and Token-Aware Test-Time Compute


68. SWE-Effi: Re-Evaluating Software AI Agent System Effectiveness Under Resource Constraints


69. HGEN: Heterogeneous Graph Ensemble Networks


70. Revisiting Actor-Critic Methods in Discrete Action Off-Policy Reinforcement Learning


71. CoDiCodec: Unifying Continuous and Discrete Compressed Representations of Audio


72. SoilSound: Smartphone-based Soil Moisture Estimation


73. HEFT: A Coarse-to-Fine Hierarchy for Enhancing the Efficiency and Accuracy of Language Model Reasoning


74. ZORRO: Zero-Knowledge Robustness and Privacy for Split Learning (Full Version)


75. LAVa: Layer-wise KV Cache Eviction with Dynamic Budget Allocation


76. Meta-Learning Reinforcement Learning for Crypto-Return Prediction


77. A Co-Training Semi-Supervised Framework Using Faster R-CNN and YOLO Networks for Object Detection in Densely Packed Retail Images


78. D-CAT: Decoupled Cross-Attention Transfer between Sensor Modalities for Unimodal Inference


79. Structure Matters: Brain Graph Augmentation via Learnable Edge Masking for Data-efficient Psychiatric Diagnosis


80. HypoGeneAgent: A Hypothesis Language Agent for Gene-Set Cluster Resolution Selection Using Perturb-seq Datasets


81. World Modeling with Probabilistic Structure Integration


82. MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools


83. MITS: A Large-Scale Multimodal Benchmark Dataset for Intelligent Traffic Surveillance


84. MultimodalHugs: Enabling Sign Language Processing in Hugging Face


85. DiTTO-LLM: Framework for Discovering Topic-based Technology Opportunities via Large Language Model


86. ALIGNS: Unlocking nomological networks in psychological measurement through a large language model


87. A Multimodal RAG Framework for Housing Damage Assessment: Collaborative Optimization of Image Encoding and Policy Vector Retrieval


88. VStyle: A Benchmark for Voice Style Adaptation with Spoken Instructions


89. Investigating Symbolic Triggers of Hallucination in Gemma Models Across HaluEval and TruthfulQA


90. How Small Transformation Expose the Weakness of Semantic Similarity Measures


91. HANRAG: Heuristic Accurate Noise-resistant Retrieval-Augmented Generation for Multi-hop Question Answering


92. The Thinking Therapist: Training Large Language Models to Deliver Acceptance and Commitment Therapy using Supervised Fine-Tuning and Odds Ratio Policy Optimization


93. Psychiatry-Bench: A Multi-Task Benchmark for LLMs in Psychiatry


94. Generating Individual Travel Diaries Using Large Language Models Informed by Census and Land-Use Data


95. Assisting Research Proposal Writing with Large Language Models: Evaluation and Refinement


96. Beyond I’m Sorry, I Can’t: Dissecting Large Language Model Refusal


97. LLM-Based Instance-Driven Heuristic Bias In the Context of a Biased Random Key Genetic Algorithm


98. Differential Robustness in Transformer Language Models: Empirical Evaluation Under Adversarial Text Attacks


99. The Non-Determinism of Small LLMs: Evidence of Low Answer Consistency in Repetition Trials of Standard Multiple-Choice Benchmarks


100. Temporal Preferences in Language Models for Long-Horizon Assistance


101. CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor


102. Creativity Benchmark: A benchmark for marketing creativity for LLM models


103. Cross-Layer Attention Probing for Fine-Grained Hallucination Detection


104. Structured Information Matters: Explainable ICD Coding with Patient-Level Knowledge Graphs


105. Wave-Based Semantic Memory with Resonance-Based Retrieval: A Phase-Aware Alternative to Vector Embedding Stores


106. Personas within Parameters: Fine-Tuning Small Language Models with Low-Rank Adapters to Mimic User Behaviors


107. AI-Powered Assistant for Long-Term Access to RHIC Knowledge


108. GeoGPT.RAG Technical Report


109. TalkPlayData 2: An Agentic Synthetic Data Pipeline for Multimodal Conversational Music Recommendation


110. Text-to-SQL Oriented to the Process Mining Domain: A PT-EN Dataset for Query Translation


111. Forecasting Clicks in Digital Advertising: Multimodal Inputs and Interpretable Outputs


112. DB3 Team’s Solution For Meta KDD Cup’ 25


113. AEGIS: An Agent for Extraction and Geographic Identification in Scholarly Proceedings


114. Clip Your Sequences Fairly: Enforcing Length Fairness for Sequence-Level RL