All AI Papers - 2026-02-20

1. CLEF HIPE-2026: Evaluating Accurate and Efficient Person-Place Relation Extraction from Multilingual Historical Texts


2. AutoNumerics: An Autonomous, PDE-Agnostic Multi-Agent Pipeline for Scientific Computing


3. MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models


4. AI Gamestore: Scalable, Open-Ended Evaluation of Machine General Intelligence with Human Games


5. A Hybrid Federated Learning Based Ensemble Approach for Lung Disease Diagnosis Leveraging Fusion of SWIN Transformer and CNN


6. ODESteer: A Unified ODE-Based Steering Framework for LLM Alignment


7. KLong: Training LLM Agent for Extremely Long-horizon Tasks


8. Evaluating Chain-of-Thought Reasoning through Reusability and Verifiability


9. Enhancing Large Language Models (LLMs) for Telecom using Dynamic Knowledge Graphs and Explainable Retrieval-Augmented Generation


10. Pareto Optimal Benchmarking of AI Models on ARM Cortex Processors for Sustainable Embedded Systems


11. WarpRec: Unifying Academic Rigor and Industrial Scale for Responsible, Reproducible, and Efficient Recommendation


12. A Privacy by Design Framework for Large Language Model-Based Applications for Children


13. A Contrastive Variational AutoEncoder for NSCLC Survival Prediction with Missing Modalities


14. Visual Model Checking: Graph-Based Inference of Visual Routines for Image Retrieval


15. Dataless Weight Disentanglement in Task Arithmetic via Kronecker-Factored Approximate Curvature


16. MedClarify: An information-seeking AI agent for medical diagnosis with case-specific follow-up questions


17. ArXiv-to-Model: A Practical Study of Scientific LM Training


18. Web Verbs: Typed Abstractions for Reliable Task Composition on the Agentic Web


19. All Leaks Count, Some Count More: Interpretable Temporal Contamination Detection in LLM Backtesting


20. Mechanistic Interpretability of Cognitive Complexity in LLMs via Linear Probing using Bloom’s Taxonomy


21. Decoding the Human Factor: High Fidelity Behavioral Prediction for Strategic Foresight


22. From Labor to Collaboration: A Methodological Experiment Using AI Agents to Augment Research Perspectives in Taiwan’s Humanities and Social Sciences


23. Continual learning and refinement of causal models through dynamic predicate invention


24. Texo: Formula Recognition within 20M Parameters


25. JEPA-DNA: Grounding Genomic Foundation Models through Joint-Embedding Predictive Architectures


26. Bonsai: A Framework for Convolutional Neural Network Acceleration Using Criterion-Based Pruning


27. Efficient Parallel Algorithm for Decomposing Hard CircuitSAT Instances


28. Epistemology of Generative AI: The Geometry of Knowing


29. Instructor-Aligned Knowledge Graphs for Personalized Learning


30. Owen-based Semantics and Hierarchy-Aware Explanation (O-Shap)


31. Toward Trustworthy Evaluation of Sustainability Rating Methodologies: A Human-AI Collaborative Framework for Benchmark Dataset Construction


32. Agentic Wireless Communication for 6G: Intent-Aware and Continuously Evolving Physical-Layer Intelligence


33. How AI Coding Agents Communicate: A Study of Pull Request Description Characteristics and Human Review Responses


34. Predictive Batch Scheduling: Accelerating Language Model Training Through Loss-Aware Sample Prioritization


35. Retaining Suboptimal Actions to Follow Shifting Optima in Multi-Agent Reinforcement Learning


36. RFEval: Benchmarking Reasoning Faithfulness under Counterfactual Reasoning Intervention in Large Reasoning Models


37. IntentCUA: Learning Intent-level Representations for Skill Abstraction and Multi-Agent Planning in Computer-Use Agents


38. Dynamic System Instructions and Tool Exposure for Efficient Agentic LLMs


39. Phase-Aware Mixture of Experts for Agentic Reinforcement Learning


40. Sales Research Agent and Sales Research Bench


41. M2F: Automated Formalization of Mathematical Literature at Scale


42. Cinder: A fast and fair matchmaking system


43. Sonar-TS: Search-Then-Verify Natural Language Querying for Time Series Databases


44. Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation


45. Fundamental Limits of Black-Box Safety Evaluation: Information-Theoretic and Computational Barriers from Latent Context Conditioning


46. HQFS: Hybrid Quantum Classical Financial Security with VQC Forecasting, QUBO Annealing, and Audit-Ready Post-Quantum Signing


47. Automating Agent Hijacking via Structural Template Injection


48. LLM4Cov: Execution-Aware Agentic Learning for High-coverage Testbench Generation


49. Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents


50. SourceBench: Can AI Answers Reference Quality Web Sources?


51. DeepContext: Stateful Real-Time Detection of Multi-Turn Adversarial Intent Drift in LLMs


52. Narrow fine-tuning erodes safety alignment in vision-language agents


53. LLM-WikiRace: Benchmarking Long-term Planning and Reasoning over Real-World Knowledge Graphs


54. AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks


55. OpenSage: Self-programming Agent Generation Engine


56. Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents


57. IndicJR: A Judge-Free Benchmark of Jailbreak Robustness in South Asian Languages


58. An order-oriented approach to scoring hesitant fuzzy elements


59. Node Learning: A Framework for Adaptive, Decentralised and Collaborative Network Edge AI


60. NeuDiff Agent: A Governed AI Workflow for Single-Crystal Neutron Crystallography


61. Improved Upper Bounds for Slicing the Hypercube


62. Simple Baselines are Competitive with Code Evolution


63. When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation


64. Mobility-Aware Cache Framework for Scalable LLM-Based Human Mobility Simulation


65. Contextuality from Single-State Representations: An Information-Theoretic Principle for Adaptive Intelligence


66. Retrieval Augmented (Knowledge Graph), and Large Language Model-Driven Design Structure Matrix (DSM) Generation of Cyber-Physical Systems


67. AIdentifyAGE Ontology for Decision Support in Forensic Dental Age Assessment


68. Sink-Aware Pruning for Diffusion Language Models


69. MARS: Margin-Aware Reward-Modeling with Self-Refinement


70. Pushing the Frontier of Black-Box LVLM Attacks via Fine-Grained Detail Targeting


71. FAMOSE: A ReAct Approach to Automated Feature Discovery


72. Reverso: Efficient Time Series Foundation Models for Zero-shot Forecasting


73. When to Trust the Cheap Check: Weak and Strong Verification for Reasoning


74. SMAC: Score-Matched Actor-Critics for Robust Offline-to-Online Transfer


75. Stable Asynchrony: Variance-Controlled Off-Policy RL for LLMs


76. Towards Anytime-Valid Statistical Watermarking


77. Adapting Actively on the Fly: Relevance-Guided Online Meta-Learning with Latent Concepts for Geospatial Discovery


78. The Cascade Equivalence Hypothesis: When Do Speech LLMs Behave Like ASR→LLM Pipelines?


79. Conditional Flow Matching for Continuous Anomaly Detection in Autonomous Driving on a Manifold-Aware Spectral Space


80. Be Wary of Your Time Series Preprocessing


81. Probability-Invariant Random Walk Learning on Gyral Folding-Based Cortical Similarity Networks for Alzheimer’s and Lewy Body Dementia Diagnosis


82. MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sample-Efficient LLM Reasoning


83. Toward a Fully Autonomous, AI-Native Particle Accelerator


84. Systematic Evaluation of Single-Cell Foundation Model Interpretability Reveals Attention Captures Co-Expression Rather Than Unique Regulatory Signal


85. Position: Evaluation of ECG Representations Must Be Fixed


86. The Anxiety of Influence: Bloom Filters in Transformer Attention Heads


87. LORA-CRAFT: Cross-layer Rank Adaptation via Frozen Tucker Decomposition of Pre-trained Attention Weights


88. Learning with Boolean threshold functions


89. Tracing Copied Pixels and Regularizing Patch Affinity in Copy Detection


90. What Do LLMs Associate with Your Name? A Human-Centered Black-Box Audit of Personal Data


91. Jolt Atlas: Verifiable Inference via Lookup Arguments in Zero Knowledge


92. Beyond Pipelines: A Fundamental Study on the Rise of Generative-Retrieval Architectures in Web Research


93. Fine-Grained Uncertainty Quantification for Long-Form Language Model Outputs: A Comparative Study


94. Convergence Analysis of Two-Layer Neural Networks under Gaussian Input Masking


95. Improving LLM-based Recommendation with Self-Hard Negatives from Intermediate Layers


96. A High-Level Survey of Optical Remote Sensing


97. SpectralGCD: Spectral Concept Selection and Cross-modal Representation Learning for Generalized Category Discovery


98. Voice-Driven Semantic Perception for UAV-Assisted Emergency Networks


99. A feature-stable and explainable machine learning framework for trustworthy decision-making under incomplete clinical data


100. What Breaks Embodied AI Security: LLM Vulnerabilities, CPS Flaws, or Something Else?


101. From Subtle to Significant: Prompt-Driven Self-Improving Optimization in Test-Time Graph OOD Detection


102. SubQuad: Near-Quadratic-Free Structure Inference with Distribution-Balanced Objectives in Adaptive Receptor framework


103. WebFAQ 2.0: A Multilingual QA Dataset with Mined Hard Negatives for Dense Retrieval


104. Same Meaning, Different Scores: Lexical and Syntactic Sensitivity in LLM Evaluation


105. Flickering Multi-Armed Bandits


106. Towards Cross-lingual Values Assessment: A Consensus-Pluralism Perspective


107. Federated Latent Space Alignment for Multi-user Semantic Communications


108. TAPO-Structured Description Logic for Information Behavior: Procedural and Oracle-Based Extensions


109. Extending quantum theory with AI-assisted deterministic game theory


110. Deeper detection limits in astronomical imaging using self-supervised spatiotemporal denoising


111. The Bots of Persuasion: Examining How Conversational Agents’ Linguistic Expressions of Personality Affect User Perceptions and Decisions


112. Robustness and Reasoning Fidelity of Large Language Models in Long-Context Code Question Answering


113. Universal Fine-Grained Symmetry Inference and Enforcement for Rigorous Crystal Structure Prediction


114. Continual uncertainty learning


115. In-Context Learning in Linear vs. Quadratic Attention Models: An Empirical Study on Regression Tasks


116. TimeOmni-VL: Unified Models for Time Series Understanding and Generation


117. VP-VAE: Rethinking Vector Quantization via Adaptive Vector Perturbation


118. 3D Scene Rendering with Multimodal Gaussian Splatting


119. TIFO: Time-Invariant Frequency Operator for Stationarity-Aware Representation Learning in Time Series


120. Deep Reinforcement Learning for Optimal Portfolio Allocation: A Comparative Study with Mean-Variance Optimization


121. FLoRG: Federated Fine-tuning with Low-rank Gram Matrices and Procrustes Alignment


122. AdvSynGNN: Structure-Adaptive Graph Neural Nets via Adversarial Synthesis and Self-Corrective Propagation


123. General sample size analysis for probabilities of causation: a delta method approach


124. Sign Lock-In: Randomly Initialized Weight Signs Persist and Bottleneck Sub-Bit Model Compression


125. ALPS: A Diagnostic Challenge Set for Arabic Linguistic & Pragmatic Reasoning


126. Evaluating Cross-Lingual Classification Approaches Enabling Topic Discovery for Multilingual Social Media Data


127. Wink: Recovering from Misbehaviors in Coding Agents


128. Forecasting Anomaly Precursors via Uncertainty-Aware Time-Series Ensembles


129. Transforming Behavioral Neuroscience Discovery with In-Context Learning and AI-Enhanced Tensor Methods


130. ReIn: Conversational Error Recovery with Reasoning Inception


131. Persona2Web: Benchmarking Personalized Web Agents for Contextual Reasoning with User History


132. Exploring LLMs for User Story Extraction from Mockups


133. DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers


134. Early-Warning Signals of Grokking via Loss-Landscape Geometry


135. A Unified Framework for Locality in Scalable MARL


136. Eigenmood Space: Uncertainty-Aware Spectral Graph Analysis of Psychological Patterns in Classical Persian Poetry


137. When Semantic Overlap Is Not Enough: Cross-Lingual Euphemism Transfer Between Turkish and English


138. Beyond Message Passing: A Symbolic Alternative for Expressive and Interpretable Graph Learning


139. RankEvolve: Automating the Discovery of Retrieval Algorithms via LLM-Driven Evolution


140. Say It My Way: Exploring Control in Conversational Visual Question Answering with Blind Users


141. Discovering Multiagent Learning Algorithms with Large Language Models


142. Xray-Visual Models: Scaling Vision models on Industry Scale Data


143. A Reversible Semantics for Janus


144. MALLVI: a multi agent framework for integrated generalized robotics manipulation


145. AdaptOrch: Task-Adaptive Multi-Agent Orchestration in the Era of LLM Performance Convergence


146. Position: Why a Dynamical Systems Perspective is Needed to Advance Time Series Modeling


147. SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation


148. Overseeing Agents Without Constant Oversight: Challenges and Opportunities


149. VAM: Verbalized Action Masking for Controllable Exploration in RL Post-Training – A Chess Case Study


150. Learning under noisy supervision is governed by a feedback-truth gap


151. HiVAE: Hierarchical Latent Variables for Scalable Theory of Mind


152. AI-Mediated Feedback Improves Student Revisions: A Randomized Trial with FeedbackWriter in a Large Undergraduate Course


153. One-step Language Modeling via Continuous Denoising


154. Evaluating Monolingual and Multilingual Large Language Models for Greek Question Answering: The DemosQA Benchmark


155. References Improve LLM Alignment in Non-Verifiable Domains


156. Large-scale online deanonymization with LLMs


157. Attending to Routers Aids Indoor Wireless Localization


158. PREFER: An Ontology for the PREcision FERmentation Community


159. LiveClin: A Live Clinical Benchmark without Leakage


160. Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking


161. PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency


162. DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning


163. Can Adversarial Code Comments Fool AI Security Reviewers – Large-Scale Empirical Study of Comment-Based Attacks and Defenses Against LLM Code Analysis


164. Quantifying LLM Attention-Head Stability: Implications for Circuit Universality


165. The Compute ICE-AGE: Invariant Compute Envelope under Addressable Graph Evolution


166. Intent Laundering: AI Safety Datasets Are Not What They Seem


167. Is Mamba Reliable for Medical Imaging?


168. APEX-SQL: Talking to the data via Agentic Exploration for Text-to-SQL


169. GPU-Accelerated Algorithms for Graph Vector Search: Taxonomy, Empirical Study, and Research Directions