All AI Papers - 2026-05-08

1. AI Co-Mathematician: Accelerating Mathematicians with Agentic AI


2. GlazyBench: A Benchmark for Ceramic Glaze Property Prediction and Image Generation


3. Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key


4. MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems


5. SkillOS: Learning Skill Curation for Self-Evolving Agents


6. NeuroAgent: LLM Agents for Multimodal Neuroimaging Analysis and Research


7. Improved techniques for fine-tuning flow models via adjoint matching: a deterministic control pipeline


8. Ex Ante Evaluation of AI-Induced Idea Diversity Collapse


9. SpatialEpiBench: Benchmarking Spatial Information and Epidemic Priors in Forecasting


10. Market-Alignment Risk in Pricing Agents: Trace Diagnostics and Trace-Prior RL under Hidden Competitor State


11. Process Matters more than Output for Distinguishing Humans from Machines


12. From Token Lists to Graph Motifs: Weisfeiler-Lehman Analysis of Sparse Autoencoder Features


13. Instrumental Choices: Measuring the Propensity of LLM Agents to Pursue Instrumental Behaviors


14. ReasonSTL: Bridging Natural Language and Signal Temporal Logic via Tool-Augmented Process-Rewarded Learning


15. Patch-Effect Graph Kernels for LLM Interpretability


16. Probabilistic Dating of Historical Manuscripts via Evidential Deep Regression on Visual Script Features


17. Beyond Task Success: Measuring Workflow Fidelity in LLM-Based Agentic Payment Systems


18. PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors


19. SCRuB: Social Concept Reasoning under Rubric-Based Evaluation



21. Automated alignment is harder than you think


22. Rethinking Vacuity for OOD Detection in Evidential Deep Learning


23. Debiased Multimodal Personality Understanding through Dual Causal Intervention


24. From Agent Loops to Deterministic Graphs: Execution Lineage for Reproducible AI-Native Work


25. Prediction and Empowerment: A Theory of Agency through Bridge Interfaces


26. More Than Can Be Said: A Benchmark and Framework for Pre-Question Scientific Ideation


27. Mind the Gap? A Distributional Comparison of Real and Synthetic Priors for Tabular Foundation Models


28. A Regime Theory of Controller Class Selection for LLM Action Decisions


29. Measuring Black-Box Confidence via Reasoning Trajectories: Geometry, Coverage, and Verbalization


30. Addressing Labelled Data Scarcity: Taxonomy-Agnostic Annotation of PII Values in HTTP Traffic using LLMs


31. Data Language Models: A New Foundation Model Class for Tabular Data


32. Safactory: A Scalable Agent Factory for Trustworthy Autonomous Intelligence


33. Price of Fairness in Short-Term and Long-Term Algorithmic Selections


34. A Versatile AI Agent for Rare Disease Diagnosis and Risk Gene Prioritization


35. Proactive Instance Navigation with Comparative Judgment for Ambiguous User Queries


36. Joint Consistency: A Unified Test-Time Aggregation Framework via Energy Minimization


37. Beyond Fixed Benchmarks and Worst-Case Attacks: Dynamic Boundary Evaluation for Language Models


38. Towards Annotation-Free Validation of MLLMs: A Vision-Language Logical Consistency Metric


39. The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models


40. Systematic Evaluation of Large Language Models for Post-Discharge Clinical Action Extraction


41. OPSD Compresses What RLVR Teaches: A Post-RL Compaction Stage for Reasoning Models


42. Event-Causal RAG: A Retrieval-Augmented Generation Framework for Long Video Reasoning in Complex Scenarios


43. Rethinking Adapter Placement: A Dominant Adaptation Module Perspective


44. BioMedArena: An Open-source Toolkit for Building and Evaluating Biomedical Deep Research Agents


45. Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost


46. Beyond Accuracy: Policy Invariance as a Reliability Test for LLM Safety Judges


47. Graphlets as Building Blocks for Structural Vocabulary in Knowledge Graph Foundation Models


48. Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning


49. P-Guide: Parameter-Efficient Prior Steering for Single-Pass CFG Inference


50. Back to the Beginning of Heuristic Design: Bridging Code and Knowledge with LLMs


51. Policy-Guided Stepwise Model Routing for Cost-Effective Reasoning


52. CrossCult-KIBench: A Benchmark for Cross-Cultural Knowledge Insertion in MLLMs


53. On Time, Within Budget: Constraint-Driven Online Resource Allocation for Agentic Workflows


54. Shallow Prefill, Deep Decoding: Efficient Long-Context Inference via Layer-Asymmetric KV Visibility


55. Safety Certification is Classification


56. VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?


57. Visual Fingerprints for LLM Generation Comparison


58. Novelty-based Tree-of-Thought Search for LLM Reasoning and Planning


59. Pathways to AGI


60. Strat-LLM: Stratified Strategy Alignment for LLM-based Stock Trading with Real-time Multi-Source Signals


61. BioResearcher: Scenario-Guided Multi-Agent for Translational Medicine


62. TACT: Mitigating Overthinking and Overacting in Coding Agents via Activation Steering


63. BehaviorGuard: Online Backdoor Defense for Deep Reinforcement Learning


64. TheraAgent: Self-Improving Therapeutic Agent for Precise and Comprehensive Treatment Planning


65. From Coordinate Matching to Structural Alignment: Rethinking Prototype Alignment in Heterogeneous Federated Learning


66. Temporal Smoothness Doubly Robust Learning for Debiased Knowledge Tracing


67. HaM-World: Soft-Hamiltonian World Models with Selective Memory for Planning


68. MAS-Algorithm: A Workflow for Solving Algorithmic Programming Problems with a Multi-Agent System


69. ICU-Bench: Benchmarking Continual Unlearning in Multimodal Large Language Models


70. In Data or Invisible: Toward a Better Digital Representation of Low-Resource Languages with Knowledge Graphs


71. Which Are the Low-Resource Languages of the Semantic Web?


72. Intentmaking and Sensemaking: Human Interaction with AI-Guided Mathematical Discovery


73. Wisteria: A Unified Multi-Scale Feature Learning Framework for DNA Language Model


74. PREFER: Personalized Review Summarization with Online Preference Learning


75. Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning


76. Agentic, Context-Aware Risk Intelligence in the Internet of Value


77. XDecomposer: Learning Prior-Free Set Decomposition for Multiphase X-ray Diffraction


78. SANEmerg: An Emergent Communication Framework for Semantic-aware Agentic AI Networking


79. AirQualityBench: A Realistic Evaluation Benchmark for Global Air Quality Forecasting


80. Taklif.AI: LLM-Powered Platform for Interest-Based Personalized College Assignments


81. On the Role of Language Representations in Auto-Bidding: Findings and Implications


82. MolRecBench-Wild: A Real-World Benchmark for Optical Chemical Structure Recognition


83. AGPO: Asymmetric Group Policy Optimization for Verifiable Reasoning and Search Ads Relevance at JD


84. Long-Horizon Q-Learning: Accurate Value Learning via n-Step Inequalities


85. Sheet as Token: A Graph-Enhanced Representation for Multi-Sheet Spreadsheet Understanding


86. Von Neumann Networks


87. HEDP: A Hybrid Energy-Distance Prompt-based Framework for Domain Incremental Learning


88. CircuitFormer: A Circuit Language Model for Analog Topology Design from Natural Language Prompt


89. Confidence is the key: how conformal prediction enhances the generative design of permeable peptides


90. Evaluating Explainability in Safety-Critical ATR Systems: Limitations of Post-Hoc Methods and Paths Toward Robust XAI


91. Best Arm Identification in Generalized Linear Bandits via Hybrid Feedback


92. HyperLens: Quantifying Cognitive Effort in LLMs with Fine-grained Confidence Trajectory


93. ReFlect: An Effective Harness System for Complex Long-Horizon LLM Reasoning


94. SDFlow: Similarity-Driven Flow Matching for Time Series Generation


95. Knee Osteoarthritis Severity Grading Using Optimized Deep Learning and LLM-Driven Intelligent AI on Computationally Limited Systems


96. SkillRet: A Large-Scale Benchmark for Skill Retrieval in LLM Agents


97. Detecting Time Series Anomalies Like an Expert: A Multi-Agent LLM Framework with Specialized Analyzers


98. More Is Not Always Better: Cross-Component Interference in LLM Agent Scaffolding


99. Decodable but Not Corrected by Fixed Residual-Stream Linear Steering: Evidence from Medical LLM Failure Regimes


100. Conceal, Reconstruct, Jailbreak: Exploiting the Reconstruction-Concealment Tradeoff in MLLMs


101. Resolving the bias-precision paradox with stochastic causal representation learning for personalized medicine


102. Knowledge-Graph Paths as Intermediate Supervision for Self-Evolving Search Agents


103. Inference-Time Budget Control for LLM Search Agents


104. Saliency-Aware Regularized Quantization Calibration for Large Language Models


105. GCCM: Enhancing Generative Graph Prediction via Contrastive Consistency Model


106. DataDignity: Training Data Attribution for Large Language Models


107. Attractor Geometry of Transformer Memory: From Conflict Arbitration to Confident Hallucination


108. Chain of Risk: Safety Failures in Large Reasoning Models and Mitigation via Adaptive Multi-Principle Steering


109. Large Vision-Language Models Get Lost in Attention


110. Retrieval-Conditioned Topology Selection with Provable Budget Conservation for Multi-Agent Code Generation


111. Text-Graph Synergy: A Bidirectional Verification and Completion Framework for RAG


112. Prober.ai: Gated Inquiry-Based Feedback via LLM-Constrained Personas for Argumentative Writing Development


113. Causal Probing for Internal Visual Representations in Multimodal Large Language Models


114. Belief Memory: Agent Memory Under Partial Observability


115. AlphaCrafter: A Full-Stack Multi-Agent Framework for Cross-Sectional Quantitative Trading


116. Locality-aware Private Class Identification for Domain Adaptation with Extreme Label Shift


117. Nonsense Helps: Prompt Space Perturbation Broadens Reasoning Exploration


118. BitCal-TTS: Bit-Calibrated Test-Time Scaling for Quantized Reasoning Models


119. Who Prices Cognitive Labor in the Age of Agents? A Position on Compute-Anchored Wages


120. SPARK: Self-Play with Asymmetric Reward from Knowledge Graphs


121. AgenticRAG: Agentic Retrieval for Enterprise Knowledge Bases


122. Housing Potential Common Data Model and City Digital Twin


123. FoodCHA: Multi-Modal LLM Agent for Fine-Grained Food Analysis


124. FinRAG-12B: A Production-Validated Recipe for Grounded Question Answering in Banking


125. LANTERN: LLM-Augmented Neurosymbolic Transfer with Experience-Gated Reasoning Networks


126. Intentionality is a Design Decision: Measuring Functional Intentionality for Accountable AI Systems


127. Agentic Discovery of Exchange-Correlation Density Functionals


128. Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure


129. The Geopolitics of AI Safety: A Causal Analysis of Regional LLM Bias


130. From History to State: Constant-Context Skill Learning for LLM Agents


131. LaTA: A Drop-in, FERPA-Compliant Local-LLM Autograder for Upper-Division STEM Coursework


132. Agentic Retrieval-Augmented Generation for Financial Document Question Answering


133. PRISM: Perception Reasoning Interleaved for Sequential Decision Making


134. When Helpfulness Becomes Sycophancy: Sycophancy is a Boundary Failure Between Social Alignment and Epistemic Integrity in Large Language Models


135. Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections


136. BALAR: A Bayesian Agentic Loop for Active Reasoning


137. Partial Evidence Bench: Benchmarking Authorization-Limited Evidence in Agentic Systems


138. ZAYA1-8B Technical Report


139. Understanding Annotator Safety Policy with Interpretability


140. ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generation


141. UniPool: A Globally Shared Expert Pool for Mixture-of-Experts


142. BAMI: Training-Free Bias Mitigation in GUI Grounding


143. Verifier-Backed Hard Problem Generation for Mathematical Reasoning


144. Optimizer-Model Consistency: Full Finetuning with the Same Optimizer as Pretraining Forgets Less


145. When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels


146. Superintelligent Retrieval Agent: The Next Frontier of Information Retrieval


147. Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study


148. StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction


149. Concept-Based Abductive and Contrastive Explanations for Behaviors of Vision Models


150. Recursive Agent Optimization


151. When and Why SignSGD Outperforms SGD: A Theoretical Study Based on $\ell_1$-norm Lower Bounds


152. The Structural Origin of Attention Sink: Variance Discrepancy, Super Neurons, and Dimension Disparity


153. AI CFD Scientist: Toward Open-Ended Computational Fluid Dynamics Discovery with Physics-Aware AI Agents


154. Patch2Vuln: Agentic Reconstruction of Vulnerabilities from Linux Distribution Binary Patches


155. UniSD: Towards a Unified Self-Distillation Framework for Large Language Models


156. Cross-Modal Navigation with Multi-Agent Reinforcement Learning


157. DINORANKCLIP: DINOv3 Distillation and Injection for Vision-Language Pretraining with High-Order Ranking Consistency


158. Towards Metric-Faithful Neural Graph Matching


159. Directional Consistency as a Complementary Optimization Signal: The GONO Framework


160. Coordination Matters: Evaluation of Cooperative Multi-Agent Reinforcement Learning


161. Continuous Latent Diffusion Language Model


162. Sparkle: Realizing Lively Instruction-Guided Video Background Replacement via Decoupled Guidance


163. On the Implicit Reward Overfitting and the Low-rank Dynamics in RLVR


164. Learning to Cut: Reinforcement Learning for Benders Decomposition


165. Is One Layer Enough? Understanding Inference Dynamics in Tabular Foundation Models


166. On the Security of Research Artifacts


167. PACZero: PAC-Private Fine-Tuning of Language Models via Sign Quantization


168. Operator-Guided Invariance Learning for Continuous Reinforcement Learning


169. 3D MRI Image Pretraining via Controllable 2D Slice Navigation Task


170. Litespark Inference on Consumer CPUs: Custom SIMD Kernels for Ternary Neural Networks


171. Q-MMR: Off-Policy Evaluation via Recursive Reweighting and Moment Matching


172. ORTHOBO: Orthogonal Bayesian Hyperparameter Optimization


173. Constraint Decay: The Fragility of LLM Agents in Backend Code Generation


174. COVID-19 Infodemic. Understanding content features in detecting fake news using a machine learning approach


175. E = T*H/(O+B): A Dimensionless Control Parameter for Mixture-of-Experts Ecology


176. WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling


177. Consistent Geometric Deep Learning via Hilbert Bundles and Cellular Sheaves


178. Asymmetric On-Policy Distillation: Bridging Exploitation and Imitation at the Token Level


179. MinMax Recurrent Neural Cascades


180. Continuous-Time Distribution Matching for Few-Step Diffusion Distillation


181. eXplaining to Learn (eX2L): Regularization Using Contrastive Visual Explanation Pairs for Distribution Shifts


182. Flow Matching with Arbitrary Auxiliary Paths


183. Memory Efficient Full-gradient Attacks (MEFA) Framework for Adversarial Defense Evaluations


184. Topological Signatures of Grokking


185. Is Escalation Worth It? A Decision-Theoretic Characterization of LLM Cascades


186. Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Systems Perspective


187. CoupleEvo: Evolving Heuristics for Coupled Optimization Problems Using Large Language Models


188. TinyBayes: Closed-Form Bayesian Inference via Jacobi Prior for Real-Time Image Classification on Edge Devices


189. Fine-Tuning Small Language Models for Solution-Oriented Windows Event Log Analysis


190. Measuring Evaluation-Context Divergence in Open-Weight LLMs: A Paired-Prompt Protocol with Pilot Evidence of Alignment-Pipeline-Specific Heterogeneity


191. Improving the Efficiency of Language Agent Teams with Adaptive Task Graphs


192. NavOne: One-Step Global Planning for Vision-Language Navigation on Top-Down Maps


193. Pro-KLShampoo: Projected KL-Shampoo with Whitening Recovered by Orthogonalization


194. Render, Don’t Decode: Weight-Space World Models with Latent Structural Disentanglement


195. Attributions All the Way Down? The Metagame of Interpretability


196. Log-Likelihood, Simpson’s Paradox, and the Detection of Machine-Generated Text


197. Multimodal Deep Generative Model for Semi-Supervised Learning under Class Imbalance


198. A Topological Sorting Criterion for Random Causal Directed Acyclic Graphs


199. Correct Code, Vulnerable Dependencies: A Large Scale Measurement Study of LLM-Specified Library Versions


200. Linear Semantic Segmentation for Low-Resource Spoken Dialects


201. Inference-Time Refinement Closes the Synthetic-Real Gap in Tabular Diffusion


202. The Weight Gram Matrix Captures Sequential Feature Linearization in Deep Networks


203. Cumulative-Goodness Free-Riding in Forward-Forward Networks: Real, Repairable, but Not Accuracy-Dominant


204. Band Together: Untargeted Adversarial Training with Multimodal Coordination against Evasion-based Promotion Attacks


205. OBLIQ-Bench: Exposing Overlooked Bottlenecks in Modern Retrievers with Latent and Implicit Queries


206. Soft Deterministic Policy Gradient with Gaussian Smoothing


207. Memory Inception: Latent-Space KV Cache Manipulation for Steering LLMs


208. When to Trust Imagination: Adaptive Action Execution for World Action Models


209. TIDE: Every Layer Knows the Token Beneath the Context


210. FunctionalAgent: Towards end-to-end on-top functional design


211. Contrastive Identification and Generation in the Limit


212. Super-Level-Set Regression: Conditional Quantiles via Volume Minimization


213. Taming the Entropy Cliff: Variable Codebook Size Quantization for Autoregressive Visual Generation


214. EA-WM: Event-Aware Generative World Model with Structured Kinematic-to-Visual Action Fields


215. In-Context Black-Box Optimization with Unreliable Feedback


216. Retina-RAG: Retrieval-Augmented Vision-Language Modeling for Joint Retinal Diagnosis and Clinical Report Generation


217. HNC: Leveraging Hard Negative Captions towards Models with Fine-Grained Visual-Linguistic Comprehension Capabilities


218. Entropy-Regularized Adjoint Matching for Offline RL


219. AdaGamma: State-Dependent Discounting for Temporal Adaptation in Reinforcement Learning


220. Learning Discrete Autoregressive Priors with Wasserstein Gradient Flow


221. Unifying Goal-Conditioned RL and Unsupervised Skill Learning via Control-Maximization


222. AI-Generated Images: What Humans and Machines See When They Look at the Same Image


223. IRC-Bench: Recognizing Entities from Contextual Cues in First-Person Reminiscences


224. SymDrift: One-Shot Generative Modeling under Symmetries


225. Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex


226. Autoregressive Visual Generation Needs a Prologue


227. BUILD-AND-FIND: An Effort-Aware Protocol for Evaluating Agent-Managed Codebases


228. Continuous Expert Assembly: Instance-Conditioned Low-Rank Residuals for All-in-One Image Restoration


229. Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking


230. Schedule-and-Calibrate: Utility-Guided Multi-Task Reinforcement Learning for Code LLMs


231. Beyond Autoregressive RTG: Conditioning via Injection Outside Sequential Modeling in Decision Transformer


232. CredibleDFGO: Differentiable Factor Graph Optimization with Credibility Supervision


233. VISD: Enhancing Video Reasoning via Structured Self-Distillation


234. Milestone-Guided Policy Learning for Long-Horizon Language Agents


235. Normalized Architectures are Natively 4-Bit


236. Causal Reinforcement Learning for Complex Card Games: A Magic The Gathering Benchmark


237. TFM-Retouche: A Lightweight Input-Space Adapter for Tabular Foundation Models


238. Optimal Transport for LLM Reward Modeling from Noisy Preference


239. Quantum Kernels for Audio Deepfake Detection Using Spectrogram Patch Features


240. When AI Meets Science: Research Diversity, Interdisciplinarity, Visibility, and Retractions across Disciplines in a Global Surge


241. Does Synthetic Data Help? Empirical Evidence from Deep Learning Time Series Forecasters


242. Quantizing With Randomized Hadamard Transforms: Efficient Heuristic Now Proven


243. T2I-VeRW: Part-level Fine-grained Perception for Text-to-Image Vehicle Retrieval


244. Adding Thermal Awareness to Visual Systems in Real-Time via Distilled Diffusion Models


245. PersonaKit (PK): A Plug-and-Play Platform for User Testing Diverse Roles in Full-Duplex Dialogue


246. A Fine-Grained Understanding of Uniform Convergence for Halfspaces


247. Safety Anchor: Defending Harmful Fine-tuning via Geometric Bottlenecks


248. iPhoneBlur: A Difficulty-Stratified Benchmark for Consumer Device Motion Deblurring


249. PragLocker: Protecting Agent Intellectual Property in Untrusted Deployments via Non-Portable Prompts


250. Towards Reliable LLM Evaluation: Correcting the Winner’s Curse in Adaptive Benchmarking


251. Beyond Uniform Credit Assignment: Selective Eligibility Traces for RLVR


252. Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits


253. LLM-Driven Design Space Exploration of FPGA-based Accelerators


254. Quantum-enhanced Large Language Models on Quantum Hardware via Cayley Unitary Adapters


255. Architecture-agnostic Lipschitz-constant Bayesian header and its application to resolve semantically proximal classification errors with vision transformers


256. VARS-FL: Validation-Aligned Client Selection for Non-IID Federated Learning in IoT Systems


257. Detecting AI-Generated Videos with Spiking Neural Networks


258. Logic-Regularized Verifier Elicits Reasoning from LLMs


259. MTL-MAD: Multi-Task Learners are Effective Medical Anomaly Detectors


260. DBMSolver: A Training-free Diffusion Bridge Sampler for High-Quality Image-to-Image Translation


261. Tuning Derivatives for Causal Fairness in Machine Learning


262. CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency


263. SOPE: Stabilizing Off-Policy Evaluation for Online RL with Prior Data


264. VideoRouter: Query-Adaptive Dual Routing for Efficient Long-Video Understanding


265. LoopTrap: Termination Poisoning Attacks on LLM Agents


266. LeakDojo: Decoding the Leakage Threats of RAG Systems


267. A Testable Certificate for Constant Collapse in Teacher-Guided VAEs


268. LCC-LLM: Leveraging Code-Centric Large Language Models for Malware Attribution


269. Revealing Modular Gradient Noise Imbalance in LLMs: Calibrating Adam via Signal-to-Noise Ratio


270. Steering Visual Generation in Unified Multimodal Models with Understanding Supervision


271. The autoPET3 Challenge – Automated Lesion Segmentation in Whole-Body PET/CT - Multitracer Multicenter Generalization


272. Adaptive Selection of LoRA Components in Privacy-Preserving Federated Learning


273. Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement


274. Fourier Feature Methods for Nonlinear Causal Discovery: FFML Scoring and FFCI Testing in Mixed Data


275. Multi-Dimensional Behavioral Evaluation of Agentic Stock Prediction Systems Using LLM Judges with Closed-Loop Reinforcement Learning Feedback


276. CoMemNet: Contrastive Sampling with Memory Replay Network for Continual Traffic Prediction


277. CRAFT: Forgetting-Aware Intervention-Based Adaptation for Continual Learning


278. WARP: A Benchmark for Primal-Dual Warm-Starting of Interior-Point Solvers


279. Auto Research with Specialist Agents Develops Effective and Non-Trivial Training Recipes


280. SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety


281. Active Learning for Communication Structure Optimization in LLM-Based Multi-Agent Systems


282. An Empirical Study of Proactive Coding Assistants in Real-World Software Development


283. When Quantization Is Free: An int4 KV Cache That Outruns fp16 on Apple Silicon


284. Budgeted Attention Allocation: Cost-Conditioned Compute Control for Efficient Transformers


285. Irminsul: MLA-Native Position-Independent Caching for Agentic LLM Serving


286. CFE-PPAR: Compression-friendly encryption for privacy-preserving action recognition leveraging video transformers


287. Temporal Functional Circuits: From Spline Plots to Faithful Explanations in KAN Forecasting


288. PersonaTeaming: Supporting Persona-Driven Red-Teaming for Generative AI


289. Decomposing the Basic Abilities of Large Language Models: Mitigating Cross-Task Interference in Multi-Task Instruct-Tuning


290. EGA: Adapting Frozen Encoders for Vector Search with Bounded Out-of-Distribution Degradation


291. XL-SafetyBench: A Country-Grounded Cross-Cultural Benchmark for LLM Safety and Cultural Sensitivity


292. The Missing Evaluation Axis: What 10,000 Student Submissions Reveal About AI Tutor Effectiveness


293. One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue


294. Leveraging Image Generators to Address Training Data Scarcity: The Gen4Regen Dataset for Forest Regeneration Mapping


295. When2Speak: A Dataset for Temporal Participation and Turn-Taking in Multi-Party Conversations for Large Language Models


296. X-Voice: Enabling Everyone to Speak 30 Languages via Zero-Shot Cross-Lingual Voice Cloning


297. Nearly Optimal Attention Coresets


298. Accelerating LMO-Based Optimization via Implicit Gradient Transport


299. AstroAlertBench: Evaluating the Accuracy, Reasoning, and Honesty of Multimodal LLMs in Astronomical Classification


300. MOSAIC: Module Discovery via Sparse Additive Identifiable Causal Learning for Scientific Time Series


301. When Semantic Communication Meets Queueing: Cross-Layer Latency and Task Fidelity Optimization


302. ReaComp: Compiling LLM Reasoning into Symbolic Solvers for Efficient Program Synthesis


303. GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation


304. A Unified Benchmark for Evaluating Knowledge Graph Construction Methods and Graph Neural Networks


305. The Pedagogy of AI Mistakes: Fostering Higher-Order Thinking


306. Robustness of Graph Self-Supervised Learning to Real-World Noise: A Case Study on Text-Driven Biomedical Graphs


307. SLAM: Structural Linguistic Activation Marking for Language Models


308. On Semantic Loss Fine-Tuning Approach for Preventing Model Collapse in Causal Reasoning


309. Information Theoretic Adversarial Training of Large Language Models


310. Creative Robot Tool Use by Counterfactual Reasoning


311. Mise en Place for Agentic Coding: Deliberate Preparation as Context Engineering Methodology


312. Generating Query-Focused Summarization Datasets from Query-Free Summarization Datasets


313. Two-Stage Learned Decomposition for Scalable Routing on Multigraphs


314. Two Steps Are All You Need: Efficient 3D Point Cloud Anomaly Detection with Consistency Models


315. SPADE: Faster Drug Discovery by Learning from Sparse Data


316. Towards an Inferentialist Account of Information Through Proof-theoretic Semantics


317. Tamaththul3D: High-Fidelity 3D Saudi Sign Language Avatars from Monocular Video


318. COPYCOP: Ownership Verification for Graph Neural Networks


319. Counterargument for Critical Thinking as Judged by AI and Humans


320. Making AI Drafts Count: A Quality Threshold in Audio Description Workflows


321. Open-SAT: LLM-Guided Query Embedding Refinement for Open-Vocabulary Object Retrieval in Satellite Imagery


322. Feature Starvation as Geometric Instability in Sparse Autoencoders


323. How Far Are VLMs from Privacy Awareness in the Physical World? An Empirical Study


324. ViTok-v2: Scaling Native Resolution Auto-Encoders to 5 Billion Parameters


325. Graph Normalization: Fast Binarizing Dynamics for Differentiable MWIS


326. Securing the Agent: Vendor-Neutral, Multitenant Enterprise Retrieval and Tool Use


327. Shattering the Echo Chamber: Hidden Safeguards in Manuscripts Against the AI Takeover of Peer Review


328. Bridging Generation and Training: A Systematic Review of Quality Issues in LLMs for Code


329. Maximizing Rollout Informativeness under a Fixed Budget: A Submodular View of Tree Search for Tool-Use Agentic Reinforcement Learning


330. Enhancing Cryo-EM Density Map Segmentation in Phenix for Improved Atomic Model Building


331. Career-Aware Resume Tailoring via Multi-Source Retrieval-Augmented Generation with Provenance Tracking: A Case Study


332. Automated Population-Level Audit Assurance via AI-Based Document Intelligence


333. Decision-aware User Simulation Agent for Evaluating Conversational Recommender Systems


334. Governed Metaprogramming for Intelligent Systems: Reclassifying Eval as a Governed Effect


335. Memory-Efficient EDA Denoising via Knowledge Distillation for Wearable IoT Under Severe Motion Artifacts and Underwater Conditions


336. Towards Dependable Retrieval-Augmented Generation Using Factual Confidence Prediction


337. Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction


338. PPO-Based Dynamic Positioning of HAPS-BS in Wind-Disturbed Stratospheric Maritime Networks


339. Topology-Driven Anti-Entanglement Control for Soft Robots


340. Evolutionary fine tuning of quantized convolution-based deep learning models


341. Rethinking Data Curation in LLM Training: Online Reweighting Offers Better Generalization than Offline Methods


342. Internalizing Outcome Supervision into Process Supervision: A New Paradigm for Reinforcement Learning for Reasoning


343. MACS: Modality-Aware Capacity Scaling for Efficient Multimodal MoE Inference


344. Channel-Level Semantic Perturbations: Unlearnable Examples for Diverse Training Paradigms


345. Structural Instability of Feature Composition


346. Adaptive Computation Depth via Learned Token Routing in Transformers


347. MidSteer: Optimal Affine Framework for Steering Generative Models


348. Sparse Prefix Caching for Hybrid and Recurrent LLM Serving


349. Horizon-Constrained Rashomon Sets for Chaotic Forecasting


350. Physics-Informed Neural Networks with Learnable Loss Balancing and Transfer Learning


351. Layout-Aware Representation Learning for Open-Set ID Fraud Discovery


352. MedMamba: Recasting Mamba for Medical Time Series Classification


353. A Review of Large Language Models for Stock Price Forecasting from a Hedge-Fund Perspective


354. Are Flat Minima an Illusion?


355. A Note on TurboQuant and the Earlier DRIVE/EDEN Line of Work