2026-04-06 Tech Daily

38 min

🤖 AI in-depth analysis: 457 articles · 💻 Tech news: 15 items · 472 in total

⏰ Generated at 05

UTC


Part I: 🤖 AI In-Depth Daily (457 articles)

AI Tech Daily, 2026-04-06

📰 457 articles · 26 categories · 🤖 AI-generated summaries


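🔥 Today's highlights

Just in: Claude breaches the world's most secure system in 4 hours! Humanity's last line of defense falls

Source: 新智元 | Why it matters: An AI autonomously breaching a high-security system means traditional security defenses face a disruptive threat, and the cybersecurity industry will be forced to upgrade its defensive strategies across the board. AI's shift from assistive tool to autonomous attacker is a qualitative change with far-reaching implications for both national and enterprise security.

AI uncovers 10 real vulnerabilities a day! A Linux veteran posts a plea for help: there is no way to fix them all

Source: 新智元 | Why it matters: AI-driven vulnerability discovery far outpaces humans' capacity to fix what is found, exposing the staffing bottleneck in maintaining the security of open-source infrastructure. This trend could leave critical systems exposed to unpatched vulnerabilities for long periods, and it calls for a breakthrough in automated security remediation.

Karpathy sets Silicon Valley abuzz! He unveils his "second brain" setup, drawing 12.5 million viewers

Source: 新智元 | Why it matters: Karpathy is one of the most influential practitioners in AI, and his personal knowledge management setup represents a new AI-native workflow paradigm. It advances the bold claim that "RAG is dead", which could significantly shape the direction of personal knowledge management tools.

LLM Reasoning with Process Rewards for Outcome-Guided Steps

Source: ArXiv ML (cs.LG) | Why it matters: Process reward models (PRMs) are a hot topic in current LLM reasoning training, and the outcome-conditioned centering method proposed in this paper addresses reward hacking in PRMs. The method consistently improves Pass@1 on multiple math benchmarks without additional trainable components, making it directly useful for mainstream training pipelines such as GRPO.

Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains

Source: ArXiv ML (cs.LG) | Why it matters: This study shows how efficient knowledge transfer between models can be: just 10 yes/no questions recover 72% of the capability gap between a small model and a large one, at a compression ratio of 0.0006-0.004, more than a 100x improvement over prior methods. This has far-reaching implications for edge deployment, knowledge distillation, and the design of inter-model communication protocols.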


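AI Security (3 articles)

⭐ Must-read

1. Just in: Claude breaches the world's most secure system in 4 hours! Humanity's last line of defense falls

Source: 新智元 | Why it matters: An AI autonomously breaching a high-security system means traditional security defenses face a disruptive threat, and the cybersecurity industry will be forced to upgrade its defensive strategies across the board. AI's shift from assistive tool to autonomous attacker is a qualitative change with far-reaching implications for both national and enterprise security.

2. AI uncovers 10 real vulnerabilities a day! A Linux veteran posts a plea for help: there is no way to fix them all

Source: 新智元 | Why it matters: AI-driven vulnerability discovery far outpaces humans' capacity to fix what is found, exposing the staffing bottleneck in maintaining the security of open-source infrastructure. This trend could leave critical systems exposed to unpatched vulnerabilities for long periods, and it calls for a breakthrough in automated security remediation.

📰 Worth noting

# | Article | Source | Key point
1 | Three stages of risk as AI integrates into society! Reframing agent security threats along the axis of autonomous evolution | 新智元 | Offers an autonomy-tiered approach to safety assessment for deploying AI agents in high-risk settings such as healthcare and finance.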

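AI Applications (1 article)

⭐ Must-read

1. Karpathy sets Silicon Valley abuzz! He unveils his "second brain" setup, drawing 12.5 million viewers

Source: 新智元 | Why it matters: Karpathy is one of the most influential practitioners in AI, and his personal knowledge management setup represents a new AI-native workflow paradigm. It advances the bold claim that "RAG is dead", which could significantly shape the direction of personal knowledge management tools.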


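AI Ethics (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | The more they warn, the more they are attacked! The three AI giants are caught in an "Oppenheimer" deadlock | 新智元 | AI industry leaders who warn of risks while pushing the technology forward are stuck in a public relations bind in which they are criticized whatever they do.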

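AI Models (4 articles)

📰 Worth noting

# | Article | Source | Key point
1 | OpenAI's new model is not GPTX! All-new pretrained model "Potato" revealed, and the reason Sora was abandoned has been found | 量子位 | OpenAI dropping the GPT naming scheme hints at a major change in the underlying architecture, and the new pretraining approach could shift the industry's technical direction.
2 | LiME: Lightweight Mixture of Experts for Efficient Multimodal Multi-task Learning | ArXiv ML (cs.LG) | Proposes a lighter-weight way to combine MoE with parameter-efficient fine-tuning, lowering the compute cost of multi-task adaptation.
3 | Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models | ArXiv ML (cs.LG) | Offers a new approach to speeding up inference for diffusion language models, helping narrow the speed gap with autoregressive models.

📋 Briefs (1 article)
# | Article | Source
1 | SIEVE: Sample-Efficient Parametric Learning from Natural Language | ArXiv ML (cs.LG)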

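AI Industry (1 article)

📋 Briefs (1 article)
# | Article | Source
1 | 太初元碁 issues 10 billion compute tokens to employees and will co-build a university college integrating AI research and education | 量子位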

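AI in Healthcare (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | Generating Counterfactual Patient Timelines from Real-World Data | ArXiv ML (cs.LG) | Counterfactual clinical simulation can give physicians a decision reference for "what would happen if a different treatment were chosen".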

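AI/LLM Reasoning (1 article)

⭐ Must-read

1. LLM Reasoning with Process Rewards for Outcome-Guided Steps

Source: ArXiv ML (cs.LG) | Why it matters: Process reward models (PRMs) are a hot topic in current LLM reasoning training, and the outcome-conditioned centering method proposed in this paper addresses reward hacking in PRMs. The method consistently improves Pass@1 on multiple math benchmarks without additional trainable components, making it directly useful for mainstream training pipelines such as GRPO.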


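AI/LLM Compression (1 article)

⭐ Must-read

1. Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains

Source: ArXiv ML (cs.LG) | Why it matters: This study shows how efficient knowledge transfer between models can be: just 10 yes/no questions recover 72% of the capability gap between a small model and a large one, at a compression ratio of 0.0006-0.004, more than a 100x improvement over prior methods. This has far-reaching implications for edge deployment, knowledge distillation, and the design of inter-model communication protocols.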


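AI/Systems & Infrastructure (1 article)

⭐ Must-read

1. Characterizing WebGPU Dispatch Overhead for LLM Inference Across Four GPU Vendors, Three Backends, and Three Browsers

Source: ArXiv ML (cs.LG) | Why it matters: WebGPU is a key technology for running LLMs in the browser. This paper is the first to systematically quantify its dispatch overhead bottleneck (24-71 microseconds per operation), and it builds a complete torch-webgpu toolchain. It finds that backend choice is the dominant factor, providing important baseline data for optimizing in-browser AI inference.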


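AI/GUI Agents (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | UI-Oceanus: Scaling GUI Agents with Synthetic Environmental Dynamics | ArXiv ML (cs.LG) | Automatically generates training data by synthesizing environmental dynamics, offering a new paradigm for scaling data for general-purpose GUI agents.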

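AI/Drug Discovery (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | DrugPlayGround: Benchmarking Large Language Models and Embeddings for Drug Discovery | ArXiv ML (cs.LG) | Establishes the first systematic benchmarking framework for LLMs in drug discovery, filling a gap in objective evaluation.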

AI/Time Series (1 article)

📋 Briefs (1 article)
# | Article | Source
1 | FTimeXer: Frequency-aware Time-series Transformer with Exogenous variables for Robust Carbon Footprint Forecasting | ArXiv ML (cs.LG)

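AI/Neuro-Symbolic Reasoning (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | Differentiable Symbolic Planning: A Neural Architecture for Constraint Reasoning with Learned Feasibility | ArXiv ML (cs.LG) | Combines the interpretability of symbolic reasoning with the differentiability of neural networks, significantly outperforming traditional methods on constraint reasoning tasks.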

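ML/Ops & Deployment (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | Modeling and Controlling Deployment Reliability under Temporal Distribution Shift | ArXiv ML (cs.LG) | Treats deployment reliability as a controllable multi-objective system, offering a new framework for operational decisions about ML models in non-stationary environments.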

AI/Code Generation (1 article)

📋 Briefs (1 article)
# | Article | Source
1 | An Initial Exploration of Contrastive Prompt Tuning to Generate Energy-Efficient Code | ArXiv ML (cs.LG)

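AI/Image Generation (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | From Broad Exploration to Stable Synthesis: Entropy-Guided Optimization for Autoregressive Image Generation | ArXiv ML (cs.LG) | Reveals the entropy interaction between CoT exploration and RL optimization, offering a new optimization paradigm for autoregressive image generation.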

AI/Finance & Startups (1 article)

📋 Briefs (1 article)
# | Article | Source
1 | YC Bench: a Live Benchmark for Forecasting Startup Outperformance in Y Combinator Batches | ArXiv ML (cs.LG)

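Deep Learning Theory (1 article)

⭐ Must-read

1. Dynamical structure of vanishing gradient and overfitting in multi-layer perceptrons

Source: ArXiv ML (cs.LG) | Why it matters: This work gives a rigorous theoretical description of MLP learning dynamics, revealing an inherent link between saddle-point structure during training and overfitting. It is theoretically significant for understanding the root causes of generalization failure in deep learning, and it challenges the assumption that conventional regularization is sufficient.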


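LLM Reasoning Evaluation (1 article)

⭐ Must-read

1. Do We Need Frontier Models to Verify Mathematical Proofs?

Source: ArXiv ML (cs.LG) | Why it matters: The study shows that small models are in fact capable of verifying mathematical proofs, and that the key is prompt engineering rather than model scale: Qwen3.5-35B can match Gemini 3.1 Pro. This offers practical guidance for lowering the cost of AI mathematical verification and advancing the use of open-source models in formal reasoning.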


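LLM Mechanism Research (1 article)

⭐ Must-read

1. On the Geometric Structure of Layer Updates in Deep Language Models

Source: ArXiv ML (cs.LG) | Why it matters: The study proposes an architecture-agnostic analysis framework and finds that layer updates in Transformer and SSM models decompose into a dominant tokenwise component and a geometrically independent residual, with the Spearman correlation between the residual and output perturbation reaching 0.95. This offers a new geometric perspective on the internal computation of LLMs.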


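Applied Machine Learning/Recommender Systems (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | VALOR: Value-Aware Revenue Uplift Modeling with Treatment-Gated Representation for B2B Sales | ArXiv ML (cs.LG) | Validated a 2.7x incremental revenue uplift in production A/B tests, offering a practical causal inference solution for B2B sales.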

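Generative Models/Causal Inference (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | SEDGE: Structural Extrapolated Data Generation | ArXiv ML (cs.LG) | Provides the first theoretical identifiability guarantee for extrapolated data generation, along with a practical algorithm that incorporates diffusion posterior sampling.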

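LLM Training Optimization/Quantization (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | AdaHOP: Fast and Accurate Low-Precision Training via Outlier-Pattern-Aware Rotation | ArXiv ML (cs.LG) | The first systematic study classifying outlier patterns in LLM training; an adaptive strategy achieves BF16-equivalent training quality at MXFP4 precision.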

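LLM Applications/Reinforcement Learning (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | Jump Start or False Start? A Theoretical and Empirical Evaluation of LLM-initialized Bandits | ArXiv ML (cs.LG) | Gives a theoretical proof of sufficient conditions under which LLM warm-starting beats cold-starting, providing reliability bounds for deploying LLMs in recommender systems.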

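LLM Inference Optimization/Systems (1 article)

📰 Worth noting

# | Article | Source | Key point
1 | Fast NF4 Dequantization Kernels for Large Language Model Inference | ArXiv ML (cs.LG) | Provides a plug-and-play, HuggingFace-compatible solution with a 1.54x end-to-end inference speedup, lowering the barrier to deploying large models on existing GPUs.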

🧠 AI Research Frontier (427 articles)

📰 Worth noting

# | Article | Source | Key point
1 | Communication-Efficient Distributed Learning with Differential Privacy | ArXiv ML (cs.LG)
2 | ROMAN: A Multiscale Routing Operator for Convolutional Time Series Models | ArXiv ML (cs.LG)
3 | VoxelCodeBench: Benchmarking 3D World Modeling Through Code Generation | ArXiv ML (cs.LG)
4 | WGFINNs: Weak formulation-based GENERIC formalism informed neural networks | ArXiv ML (cs.LG)
5 | Steerable but Not Decodable: Function Vectors Operate Beyond the Logit Lens | ArXiv ML (cs.LG)
6 | Complex-Valued GNNs for Distributed Basis-Invariant Control of Planar Systems | ArXiv ML (cs.LG)
7 | Analytic Drift Resister for Non-Exemplar Continual Graph Learning | ArXiv ML (cs.LG)
8 | AXELRAM: Quantize Once, Never Dequantize | ArXiv ML (cs.LG)
9 | Conditional Sampling via Wasserstein Autoencoders and Triangular Transport | ArXiv ML (cs.LG)
10 | Communication-free Sampling and 4D Hybrid Parallelism for Scalable Mini-batch GNN Training | ArXiv ML (cs.LG)
11 | Generalization Limits of Reinforcement Learning Alignment | ArXiv ML (cs.LG)
12 | Product-Stability: Provable Convergence for Gradient Descent on the Edge of Stability | ArXiv ML (cs.LG)
13 | Low-Rank Compression of Pretrained Models via Randomized Subspace Iteration | ArXiv ML (cs.LG)
14 | A Numerical Method for Coupling Parameterized Physics-Informed Neural Networks and FDM for Advanced Thermal-Hydraulic System Simulation | ArXiv ML (cs.LG)
15 | Cross-subject Muscle Fatigue Detection via Adversarial and Supervised Contrastive Learning with Inception-Attention Network | ArXiv ML (cs.LG)
16 | Finding Belief Geometries with Sparse Autoencoders | ArXiv ML (cs.LG)
17 | Beyond Semantic Manipulation: Token-Space Attacks on Reward Models | ArXiv ML (cs.LG)
18 | Adaptive Semantic Communication for Wireless Image Transmission Leveraging Mixture-of-Experts Mechanism | ArXiv ML (cs.LG)
19 | LieTrunc-QNN: Lie Algebra Truncation and Quantum Expressivity Phase Transition from LiePrune to Provably Stable Quantum Neural Networks | ArXiv ML (cs.LG)
20 | FluxMoE: Decoupling Expert Residency for High-Performance MoE Serving | ArXiv ML (cs.LG)
21 | Generative Frontiers: Why Evaluation Matters for Diffusion Language Models | ArXiv ML (cs.LG)
22 | Understanding Latent Diffusability via Fisher Geometry | ArXiv ML (cs.LG)
23 | STDDN: A Physics-Guided Deep Learning Framework for Crowd Simulation | ArXiv ML (cs.LG)
24 | Towards Realistic Class-Incremental Learning with Free-Flow Increments | ArXiv ML (cs.LG)
25 | Random Is Hard to Beat: Active Selection in online DPO with Modern LLMs | ArXiv ML (cs.LG)
26 | Structure-Aware Commitment Reduction for Network-Constrained Unit Commitment with Solver-Preserving Guarantees | ArXiv ML (cs.LG)
27 | Toward an Operational GNN-Based Multimesh Surrogate for Fast Flood Forecasting | ArXiv ML (cs.LG)
28 | Extracting Money Laundering Transactions from Quasi-Temporal Graph Representation | ArXiv ML (cs.LG)
29 | Efficient Logistic Regression with Mixture of Sigmoids | ArXiv ML (cs.LG)
30 | Towards Near-Real-Time Telemetry-Aware Routing with Neural Routing Algorithms | ArXiv ML (cs.LG)
31 | Explainable Machine Learning Reveals 12-Fold Ucp1 Upregulation and Thermogenic Reprogramming in Female Mouse White Adipose Tissue After 37 Days of Microgravity: First AI/ML Analysis of NASA OSD-970 | ArXiv ML (cs.LG)
32 | Mitigating Reward Hacking in RLHF via Advantage Sign Robustness | ArXiv ML (cs.LG)
33 | FedSQ: Optimized Weight Averaging via Fixed Gating | ArXiv ML (cs.LG)
34 | Generating DDPM-based Samples from Tilted Distributions | ArXiv ML (cs.LG)
35 | Co-Evolution of Policy and Internal Reward for Language Agents | ArXiv ML (cs.LG)
36 | Self-Distilled RLVR | ArXiv ML (cs.LG)
37 | HyperFitS - Hypernetwork Fitting Spectra for metabolic quantification of $^1$H MR spectroscopic imaging | ArXiv ML (cs.LG)
38 | DSBD: Dual-Aligned Structural Basis Distillation for Graph Domain Adaptation | ArXiv ML (cs.LG)
39 | Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models | ArXiv ML (cs.LG)
40 | PRISM: LLM-Guided Semantic Clustering for High-Precision Topics | ArXiv ML (cs.LG)
41 | Reflective Context Learning: Studying the Optimization Primitives of Context Space | ArXiv ML (cs.LG)
42 | Gradient Boosting within a Single Attention Layer | ArXiv ML (cs.LG)
43 | Real-Time Surrogate Modeling for Personalized Blood Flow Prediction and Hemodynamic Analysis | ArXiv ML (cs.LG)
44 | Hierarchical Planning with Latent World Models | ArXiv ML (cs.LG)
45 | Enhancing Robustness of Federated Learning via Server Learning | ArXiv ML (cs.LG)
46 | MLFCIL: A Multi-Level Forgetting Mitigation Framework for Federated Class-Incremental Learning in LEO Satellites | ArXiv ML (cs.LG)
47 | Fighting AI with AI: AI-Agent Augmented DNS Blocking of LLM Services during Student Evaluations | ArXiv ML (cs.LG)
48 | TRACE: Traceroute-based Internet Route change Analysis with Ensemble Learning | ArXiv ML (cs.LG)
49 | Backdoor Attacks on Decentralised Post-Training | ArXiv ML (cs.LG)
50 | Photonic convolutional neural network with pre-trained in-situ training | ArXiv ML (cs.LG)
51 | PlayGen-MoG: Framework for Diverse Multi-Agent Play Generation via Mixture-of-Gaussians Trajectory Prediction | ArXiv ML (cs.LG)
52 | Guideline2Graph: Profile-Aware Multimodal Parsing for Executable Clinical Decision Graphs | ArXiv ML (cs.LG)
53 | Failing to Falsify: Evaluating and Mitigating Confirmation Bias in Language Models | ArXiv ML (cs.LG)
54 | Optimal Projection-Free Adaptive SGD for Matrix Optimization | ArXiv ML (cs.LG)
55 | Reinforcement Learning from Human Feedback: A Statistical Perspective | ArXiv ML (cs.LG)
56 | Neural posterior estimation for scalable and accurate inverse parameter inference in Li-ion batteries | ArXiv ML (cs.LG)
57 | AQVolt26: High-Temperature r$^2$SCAN Halide Dataset for Universal ML Potentials and Solid-State Batteries | ArXiv ML (cs.LG)
58 | Interpretable Deep Reinforcement Learning for Element-level Bridge Life-cycle Optimization | ArXiv ML (cs.LG)
59 | Feature Attribution Stability Suite: How Stable Are Post-Hoc Attributions? | ArXiv ML (cs.LG)
60 | Synapse: Evolving Job-Person Fit with Explainable Two-phase Retrieval and LLM-guided Genetic Resume Optimization | ArXiv ML (cs.LG)
61 | Overconfidence and Calibration in Medical VQA: Empirical Findings and Hallucination-Aware Mitigation | ArXiv ML (cs.LG)
62 | Contrastive Language-Colored Pointmap Pretraining for Unified 3D Scene Understanding | ArXiv ML (cs.LG)
63 | Financial Anomaly Detection for the Canadian Market | ArXiv ML (cs.LG)
64 | Robust Learning with Optimal Error | ArXiv ML (cs.LG)
65 | WSVD: Weighted Low-Rank Approximation for Fast and Efficient Execution of Low-Precision Vision-Language Models | ArXiv ML (cs.LG)
66 | Understanding the Effects of Safety Unalignment on Large Language Models | ArXiv ML (cs.LG)
67 | Learning interacting particle systems from unlabeled data | ArXiv ML (cs.LG)
68 | Structure-Preserving Multi-View Embedding Using Gromov-Wasserstein Optimal Transport | ArXiv ML (cs.LG)
69 | AutoVerifier: An Agentic Automated Verification Framework Using Large Language Models | ArXiv ML (cs.LG)
70 | Reinforcement Learning-based Knowledge Distillation with LLM-as-a-Judge | ArXiv ML (cs.LG)
71 | Transfer Learning for Meta-analysis Under Covariate Shift | ArXiv ML (cs.LG)
72 | Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy | ArXiv ML (cs.LG)
73 | MOMO: Mars Orbital Model Foundation Model for Mars Orbital Applications | ArXiv ML (cs.LG)
74 | State estimations and noise identifications with intermittent corrupted observations via Bayesian variational inference | ArXiv ML (cs.LG)
75 | Transfer Learning for Loan Recovery Prediction under Distribution Shifts with Heterogeneous Feature Spaces | ArXiv ML (cs.LG)
76 | Lipschitz bounds for integral kernels | ArXiv ML (cs.LG)
77 | Rethinking Forward Processes for Score-Based Data Assimilation in High Dimensions | ArXiv ML (cs.LG)
78 | Toward an Artificial General Teacher: Procedural Geometry Data Generation and Visual Grounding with Vision-Language Models | ArXiv ML (cs.LG)
79 | Split and Conquer Partial Deepfake Speech | ArXiv ML (cs.LG)
80 | Scalable Mean-Variance Portfolio Optimization via Subspace Embeddings and GPU-Friendly Nesterov-Accelerated Projected Gradient | ArXiv ML (cs.LG)
81 | Learning from Synthetic Data via Provenance-Based Input Gradient Guidance | ArXiv ML (cs.LG)
82 | Inversion-Free Natural Gradient Descent on Riemannian Manifolds | ArXiv ML (cs.LG)
83 | A semicontinuous relaxation of Saito’s criterion and freeness as angular minimization | ArXiv ML (cs.LG)
84 | Learning Contractive Integral Operators with Fredholm Integral Neural Operators | ArXiv ML (cs.LG)
85 | On Data-Driven Koopman Representations of Nonlinear Delay Differential Equations | ArXiv ML (cs.LG)
86 | SkillRT: Compiling Skills for Efficient Execution Everywhere | ArXiv ML (cs.LG)
87 | Characterization of Gaussian Universality Breakdown in High-Dimensional Empirical Risk Minimization | ArXiv ML (cs.LG)
88 | The Compression Gap: Why Discrete Tokenization Limits Vision-Language-Action Model Scaling | ArXiv ML (cs.LG)
89 | Learning the Signature of Memorization in Autoregressive Language Models | ArXiv ML (cs.LG)
90 | PR3DICTR: A modular AI framework for medical 3D image-based detection and outcome prediction | ArXiv ML (cs.LG)
91 | A Tsetlin Machine-driven Intrusion Detection System for Next-Generation IoMT Security | ArXiv ML (cs.LG)
92 | Efficient Causal Graph Discovery Using Large Language Models | ArXiv ML (cs.LG)
93 | Output-Constrained Decision Trees | ArXiv ML (cs.LG)
94 | Supplementary Materials to Graph Convolutional Branch and Bound | ArXiv ML (cs.LG)
95 | Amortized Inference of Causal Models via Conditional Fixed-Point Iterations | ArXiv ML (cs.LG)
96 | Distributional Statistics Restore Training Data Auditability in One-step Distilled Diffusion Models | ArXiv ML (cs.LG)
97 | Zero-shot Concept Bottleneck Models | ArXiv ML (cs.LG)
98 | A Unified Approach to Analysis and Design of Denoising Markov Models | ArXiv ML (cs.LG)
99 | Accelerated Learning with Linear Temporal Logic using Differentiable Simulation | ArXiv ML (cs.LG)
100 | PVD-ONet: A Multi-scale Neural Operator Method for Singularly Perturbed Boundary Layer Problems | ArXiv ML (cs.LG)
101 | A Unifying Framework for Parallelizing Sequential Models with Linear Dynamical Systems | ArXiv ML (cs.LG)
102 | Beyond Noisy-TVs: Noise-Robust Exploration Via Learning Progress Monitoring | ArXiv ML (cs.LG)
103 | ARMOR: High-Performance Semi-Structured Pruning via Adaptive Matrix Factorization | ArXiv ML (cs.LG)
104 | High-probability Convergence Guarantees of Decentralized SGD | ArXiv ML (cs.LG)
105 | Local Reinforcement Learning with Action-Conditioned Root Mean Squared Q-Functions | ArXiv ML (cs.LG)
106 | f-INE: A Hypothesis Testing Framework for Estimating Influence under Training Randomness | ArXiv ML (cs.LG)
107 | Diffusion Models as Dataset Distillation Priors | ArXiv ML (cs.LG)
108 | Towards best practices in low-dimensional semi-supervised latent Bayesian optimization for the design of antimicrobial peptides | ArXiv ML (cs.LG)
109 | Steering Autoregressive Music Generation with Recursive Feature Machines | ArXiv ML (cs.LG)
110 | Fast and Robust Simulation-Based Inference With Optimization Monte Carlo | ArXiv ML (cs.LG)
111 | Goal-Driven Reward by Video Diffusion Models for Reinforcement Learning | ArXiv ML (cs.LG)
112 | Pushing the Limits of Distillation-Based Continual Learning via Classifier-Proximal Lightweight Plugins | ArXiv ML (cs.LG)
113 | Resting Neurons, Active Insights: Robustify Activation Sparsity for Large Language Models | ArXiv ML (cs.LG)
114 | Community-Based Early-Stage Chronic Kidney Disease Screening using Explainable Machine Learning for Low-Resource Settings | ArXiv ML (cs.LG)
115 | Forget Many, Forget Right: Scalable and Precise Concept Unlearning in Diffusion Models | ArXiv ML (cs.LG)
116 | On the Extreme Variance of Certified Local Robustness Across Model Seeds | ArXiv ML (cs.LG)
117 | Textual Equilibrium Propagation for Deep Compound AI Systems | ArXiv ML (cs.LG)
118 | Early Classification of Time Series in Non-Stationary Cost Regimes | ArXiv ML (cs.LG)
119 | ChronoSpike: An Adaptive Spiking Graph Neural Network for Dynamic Graphs | ArXiv ML (cs.LG)
120 | When RL Meets Adaptive Speculative Training: A Unified Training-Serving System | ArXiv ML (cs.LG)
121 | Infusion: Shaping Model Behavior by Editing Training Data via Influence Functions | ArXiv ML (cs.LG)
122 | Equivariant Evidential Deep Learning for Interatomic Potentials | ArXiv ML (cs.LG)
123 | Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking | ArXiv ML (cs.LG)
124 | Early-Warning Signals of Grokking via Loss-Landscape Geometry | ArXiv ML (cs.LG)
125 | The Geometry of Multi-Task Grokking: Transverse Instability, Superposition, and Weight Decay Phase Structure | ArXiv ML (cs.LG)
126 | CeRA: Overcoming the Linear Ceiling of Low-Rank Adaptation via Capacity Expansion | ArXiv ML (cs.LG)
127 | Learning Physical Operators using Neural Operators | ArXiv ML (cs.LG)
128 | SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond | ArXiv ML (cs.LG)
129 | CRISP: Compressed Reasoning via Iterative Self-Policy Distillation | ArXiv ML (cs.LG)
130 | Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis | ArXiv ML (cs.LG)
131 | JointFM-0.1: A Foundation Model for Multi-Target Joint Distributional Prediction | ArXiv ML (cs.LG)
132 | Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows | ArXiv ML (cs.LG)
133 | $\lambda$-GELU: Learning Gating Hardness for Controlled ReLU-ization in Deep Networks | ArXiv ML (cs.LG)
134 | ERPO: Token-Level Entropy-Regulated Policy Optimization for Large Reasoning Models | ArXiv ML (cs.LG)
135 | Temporal Credit Is Free | ArXiv ML (cs.LG)
136 | The Spectral Edge Thesis: A Mathematical Framework for Intra-Signal Phase Transitions in Neural Network Training | ArXiv ML (cs.LG)
137 | Transfer learning for nonparametric Bayesian networks | ArXiv ML (cs.LG)
138 | Efficient and Principled Scientific Discovery through Bayesian Optimization: A Tutorial | ArXiv ML (cs.LG)
139 | annbatch unlocks terabyte-scale training of biological data in anndata | ArXiv ML (cs.LG)
140 | ResidualPlanner+: a scalable matrix mechanism for marginals and beyond | ArXiv ML (cs.LG)
141 | Central Limit Theorems for Stochastic Gradient Descent Quantile Estimators | ArXiv ML (cs.LG)
142 | Learn then Decide: A Learning Approach for Designing Data Marketplaces | ArXiv ML (cs.LG)
143 | gen2seg: Generative Models Enable Generalizable Instance Segmentation | ArXiv ML (cs.LG)
144 | LMask: Learn to Solve Constrained Routing Problems with Lazy Masking | ArXiv ML (cs.LG)
145 | Are Statistical Methods Obsolete in the Era of Deep Learning? A Study of ODE Inverse Problems | ArXiv ML (cs.LG)
146 | AI-informed model-analogs for understanding subseasonal-to-seasonal jet stream and North American temperature predictability | ArXiv ML (cs.LG)
147 | Decoding RWA Tokenized U.S. Treasuries: Functional Dissection and Address Role Inference | ArXiv ML (cs.LG)
148 | Constrained free energy minimization for the design of thermal states and stabilizer thermodynamic systems | ArXiv ML (cs.LG)
149 | DRtool: An Interactive Tool for Analyzing High-Dimensional Clusterings | ArXiv ML (cs.LG)
150 | LLM Analysis of 150+ years of German Parliamentary Debates on Migration Reveals Shift from Post-War Solidarity to Anti-Solidarity in the Last Decade | ArXiv ML (cs.LG)
151 | ROPA: Synthetic Robot Pose Generation for RGB-D Bimanual Data Augmentation | ArXiv ML (cs.LG)
152 | Adaptive randomized pivoting and volume sampling | ArXiv ML (cs.LG)
153 | Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference | ArXiv ML (cs.LG)
154 | Fast Best-in-Class Regret for Contextual Bandits | ArXiv ML (cs.LG)
155 | Stability of the Kim-Milman flow map | ArXiv ML (cs.LG)
156 | Tensor Computation of Euler Characteristic Functions and Transforms | ArXiv ML (cs.LG)
157 | Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning | ArXiv ML (cs.LG)
158 | Investigating Test Overfitting on SWE-bench | ArXiv ML (cs.LG)
159 | Reward-Forcing: Autoregressive Video Generation with Reward Feedback | ArXiv ML (cs.LG)
160 | Parameter-Efficient Fine-Tuning of DINOv2 for Large-Scale Font Classification | ArXiv ML (cs.LG)
161 | Fisher-Geometric Diffusion in Stochastic Gradient Descent: Optimal Rates, Oracle Complexity, and Information-Theoretic Limits | ArXiv ML (cs.LG)
162 | Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models | ArXiv ML (cs.LG)
163 | Amortized Inference for Correlated Discrete Choice Models via Equivariant Neural Networks | ArXiv ML (cs.LG)
164 | Privacy-Accuracy Trade-offs in High-Dimensional LASSO under Perturbation Mechanisms | ArXiv ML (cs.LG)
165 | Kill-Chain Canaries: Stage-Level Tracking of Prompt Injection Across Attack Surfaces and Model Safety Tiers | ArXiv ML (cs.LG)
166 | Yau’s Affine Normal Descent: Algorithmic Framework and Convergence Analysis | ArXiv ML (cs.LG)
167 | Functional Natural Policy Gradients | ArXiv ML (cs.LG)
168 | Multimodal Language Models Cannot Spot Spatial Inconsistencies | ArXiv ML (cs.LG)
169 | When AI Gets it Wrong: Reliability and Risk in AI-Assisted Medication Decision Systems | ArXiv ML (cs.LG)
170 | ProdCodeBench: A Production-Derived Benchmark for Evaluating AI Coding Agents | ArXiv ML (cs.LG)
171 | Language-Pretraining-Induced Bias: A Strong Foundation for General Vision Tasks | ArXiv ML (cs.LG)
172 | (PAC-)Learning state machines from data streams: A generic strategy and an improved heuristic (Extended version) | ArXiv ML (cs.LG)
173 | Copilot is ‘for entertainment purposes only,’ according to Microsoft’s terms of use | TechCrunch AI
174 | Can orbital data centers help justify a massive valuation for SpaceX? | TechCrunch AI
175 | In Japan, the robot isn’t coming for your job; it’s filling the one nobody wants | TechCrunch AI
176 | The New York Times drops freelancer whose AI tool copied from an existing book review | The Decoder
177 | Study maps developer frustration over “AI slop” as a “tragedy of the commons” in software development | The Decoder
178 | AI offensive cyber capabilities are doubling every six months, safety researchers find | The Decoder
179 | AI benchmarks systematically ignore how humans disagree, Google study finds | The Decoder
180 | AI chatbot traffic grows seven times faster than social media but still trails by a factor of four | The Decoder
181 | Proxy-Pointer RAG: Achieving Vectorless Accuracy at Vector RAG Scale and Cost | Towards Data Science
182 | A Data Scientist’s Take on the $599 MacBook Neo | Towards Data Science
183 | Using LLM-as-a-Judge/Jury to Advance Scalable, Clinically-Validated Safety Evaluations of Model Responses to Users Demonstrating Psychosis | ArXiv CL (cs.CL)
184 | CIPHER: Conformer-based Inference of Phonemes from High-density EEG | ArXiv CL (cs.CL)
185 | SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy | ArXiv CL (cs.CL)
186 | Skeleton-based Coherence Modeling in Narratives | ArXiv CL (cs.CL)
187 | Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking Token Budgets | ArXiv CL (cs.CL)
188 | Social Meaning in Large Language Models: Structure, Magnitude, and Pragmatic Prompting | ArXiv CL (cs.CL)
189 | PolyJarvis: LLM Agent for Autonomous Polymer MD Simulations | ArXiv CL (cs.CL)
190 | Principled and Scalable Diversity-Aware Retrieval via Cardinality-Constrained Binary Quadratic Programming | ArXiv CL (cs.CL)
191 | Pragmatics Meets Culture: Culturally-adapted Artwork Description Generation and Evaluation | ArXiv CL (cs.CL)
192 | Dependency-Guided Parallel Decoding in Discrete Diffusion Language Models | ArXiv CL (cs.CL)
193 | An Empirical Study of Many-Shot In-Context Learning for Machine Translation of Low-Resource Languages | ArXiv CL (cs.CL)
194 | Train Yourself as an LLM: Exploring Effects of AI Literacy on Persuasion via Role-playing LLM Training | ArXiv CL (cs.CL)
195 | Overcoming the “Impracticality” of RAG: Proposing a Real-World Benchmark and Multi-Dimensional Diagnostic Framework | ArXiv CL (cs.CL)
196 | Speaking of Language: Reflections on Metalanguage Research in NLP | ArXiv CL (cs.CL)
197 | Revealing the Learning Dynamics of Long-Context Continual Pre-training | ArXiv CL (cs.CL)
198 | SocioEval: A Template-Based Framework for Evaluating Socioeconomic Status Bias in Foundation Models | ArXiv CL (cs.CL)
199 | Too Polite to Disagree: Understanding Sycophancy Propagation in Multi-Agent Systems | ArXiv CL (cs.CL)
200 | Redirected, Not Removed: Task-Dependent Stereotyping Reveals the Limits of LLM Alignments | ArXiv CL (cs.CL)
201 | Trivial Vocabulary Bans Improve LLM Reasoning More Than Deep Linguistic Constraints | ArXiv CL (cs.CL)
202 | Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts | ArXiv CL (cs.CL)
203 | Multiple-Debias: A Full-process Debiasing Method for Multilingual Pre-trained Language Models | ArXiv CL (cs.CL)
204 | When Modalities Remember: Continual Learning for Multimodal Knowledge Graphs | ArXiv CL (cs.CL)
205 | Rubrics to Tokens: Bridging Response-level Rubrics and Token-level Rewards in Instruction Following Tasks | ArXiv CL (cs.CL)
206 | Student-in-the-Loop Chain-of-Thought Distillation via Generation-Time Selection | ArXiv CL (cs.CL)
207 | GRADE: Probing Knowledge Gaps in LLMs through Gradient Subspace Dynamics | ArXiv CL (cs.CL)
208 | LLM-based Atomic Propositions help weak extractors: Evaluation of a Propositioner for triplet extraction | ArXiv CL (cs.CL)
209 | One Model to Translate Them All? A Journey to Mount Doom for Multilingual Model Merging | ArXiv CL (cs.CL)
210 | BioUNER: A Benchmark Dataset for Clinical Urdu Named Entity Recognition | ArXiv CL (cs.CL)
211 | Council Mode: Mitigating Hallucination and Bias in LLMs via Multi-Agent Consensus | ArXiv CL (cs.CL)
212 | A Multi-head-based architecture for effective morphological tagging in Russian with open dictionary | ArXiv CL (cs.CL)
213 | How Annotation Trains Annotators: Competence Development in Social Influence Recognition | ArXiv CL (cs.CL)
214 | LogicPoison: Logical Attacks on Graph Retrieval-Augmented Generation | ArXiv CL (cs.CL)
215 | NeuReasoner: Towards Explainable, Controllable, and Unified Reasoning via Mixture-of-Neurons | ArXiv CL (cs.CL)
216 | R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning | ArXiv CL (cs.CL)
217 | JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency | ArXiv CL (cs.CL)
218 | Querying Structured Data Through Natural Language Using Language Models | ArXiv CL (cs.CL)
219 | Verbalizing LLMs’ assumptions to explain and control sycophancy | ArXiv CL (cs.CL)
220 | Multi-Aspect Knowledge Distillation for Language Model with Low-rank Factorization | ArXiv CL (cs.CL)
221 | Domain-Adapted Retrieval for In-Context Annotation of Pedagogical Dialogue Acts | ArXiv CL (cs.CL)
222 | StoryScope: Investigating idiosyncrasies in AI fiction | ArXiv CL (cs.CL)
223 | Beyond Precision: Importance-Aware Recall for Factuality Evaluation in Long-Form LLM Generation | ArXiv CL (cs.CL)
224 | Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control | ArXiv CL (cs.CL)
225 | Detecting and Correcting Reference Hallucinations in Commercial LLMs and Deep Research Agents | ArXiv CL (cs.CL)
226 | Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation | ArXiv CL (cs.CL)
227 | Reliability Gated Multi-Teacher Distillation for Low Resource Abstractive Summarization | ArXiv CL (cs.CL)
228 | BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence | ArXiv CL (cs.CL)
229 | Evaluating Small Language Models for Front-Door Routing: A Harmonized Benchmark and Synthetic-Traffic Experiment | ArXiv CL (cs.CL)
230 | Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation | ArXiv CL (cs.CL)
231 | Internalized Reasoning for Long-Context Visual Document Understanding | ArXiv CL (cs.CL)
232 | Measuring What Cannot Be Surveyed: LLMs as Instruments for Latent Cognitive Variables in Labor Economics | ArXiv CL (cs.CL)
233 | VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors | ArXiv CL (cs.CL)
234 | High Volatility and Action Bias Distinguish LLMs from Humans in Group Coordination | ArXiv CL (cs.CL)
235 | Mitigating LLM biases toward spurious social contexts using direct preference optimization | ArXiv CL (cs.CL)
236 | IndustryCode: A Benchmark for Industry Code Generation | ArXiv CL (cs.CL)
237 | EnsemHalDet: Robust VLM Hallucination Detection via Ensemble of Internal State Detectors | ArXiv CL (cs.CL)
238 | Analysis of Optimality of Large Language Models on Planning Problems | ArXiv CL (cs.CL)
239 | Open-Loop Planning, Closed-Loop Verification: Speculative Verification for VLA | ArXiv CL (cs.CL)
240 | FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models | ArXiv CL (cs.CL)
241 | Prompt Compression in the Wild: Measuring Latency, Rate Adherence, and Quality for Faster LLM Inference | ArXiv CL (cs.CL)
242 | Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR | ArXiv CL (cs.CL)
243 | Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems | ArXiv CL (cs.CL)
244 | An Independent Safety Evaluation of Kimi K2.5 | ArXiv CL (cs.CL)
245 | InCoder-32B-Thinking: Industrial Code World Model for Thinking | ArXiv CL (cs.CL)
246 | BibTeX Citation Hallucinations in Scientific Publishing Agents: Evaluation and Mitigation | ArXiv CL (cs.CL)
247 | Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Output Prefilling | ArXiv CL (cs.CL)
248 | Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen! | ArXiv CL (cs.CL)
249 | Debating Truth: Debate-driven Claim Verification with Multiple Large Language Model Agents | ArXiv CL (cs.CL)
250 | AutoPCR: Automated Phenotype Concept Recognition by Prompting | ArXiv CL (cs.CL)
251 | Quick on the Uptake: Eliciting Implicit Intents from Human Demonstrations for Personalized Mobile-Use Agents | ArXiv CL (cs.CL)
252 | VeriOS: Query-Driven Proactive Human-Agent-GUI Interaction for Trustworthy OS Agents | ArXiv CL (cs.CL)
253 | SciNLP: A Domain-Specific Benchmark for Full-Text Scientific Entity and Relation Extraction in NLP | ArXiv CL (cs.CL)
254 | Human Psychometric Questionnaires Mischaracterize LLM Psychology: Evidence from Generation Behavior | ArXiv CL (cs.CL)
255 | Future Policy Approximation for Offline Reinforcement Learning Improves Mathematical Reasoning | ArXiv CL (cs.CL)
256 | What Is The Political Content in LLMs’ Pre- and Post-Training Data? | ArXiv CL (cs.CL)
257 | CQA-Eval: Designing Reliable Evaluations of Multi-paragraph Clinical QA under Resource Constraints | ArXiv CL (cs.CL)
258 | Escaping the BLEU Trap: A Signal-Grounded Framework with Decoupled Semantic Guidance for EEG-to-Text Decoding | ArXiv CL (cs.CL)
259 | IslamicMMLU: A Benchmark for Evaluating LLMs on Islamic Knowledge | ArXiv CL (cs.CL)
260 | APEX-EM: Non-Parametric Online Learning for Autonomous Agents via Structured Procedural-Episodic Experience Replay | ArXiv CL (cs.CL)
261 | Are Finer Citations Always Better? Rethinking Granularity for Attributed Generation | ArXiv CL (cs.CL)
262 | Expressive Prompting: Improving Emotion Intensity and Speaker Consistency in Zero-Shot TTS | ArXiv CL (cs.CL)
263 | WiseMind: a knowledge-guided multi-agent framework for accurate and empathetic psychiatric diagnosis | ArXiv CL (cs.CL)
264 | StructEval: Benchmarking LLMs’ Capabilities to Generate Structural Outputs | ArXiv CL (cs.CL)
265 | AutiHero: Engaging Parents in Creating Personalized, Multi-path Social Narratives for Autistic Children | ArXiv CL (cs.CL)
266 | Glia: A Human-Inspired AI for Automated Systems Design and Optimization | ArXiv CL (cs.CL)
267 | CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents | ArXiv CL (cs.CL)
268 | Machine Translation in the Wild: User Reaction to Xiaohongshu’s Built-In Translation Feature | ArXiv CL (cs.CL)
269 | The Silent Thought: Modeling Internal Cognition in Full-Duplex Spoken Dialogue Models via Latent Reasoning | ArXiv CL (cs.CL)
270 | Borderless Long Speech Synthesis | ArXiv CL (cs.CL)
271 | Terminal Agents Suffice for Enterprise Automation | ArXiv CL (cs.CL)
272 | OSCAR: Orchestrated Self-verification and Cross-path Refinement | ArXiv CL (cs.CL)
273 | Beyond Fixed Inference: Quantitative Flow Matching for Adaptive Image Denoising | ArXiv CV (cs.CV)
274 | Environment-Aware Channel Prediction for Vehicular Communications: A Multimodal Visual Feature Fusion Framework | ArXiv CV (cs.CV)
275 | Variational Encoder-Multi-Decoder (VE-MD) for Privacy-by-functional-design (Group) Emotion Recognition | ArXiv CV (cs.CV)
276 | LumiVideo: An Intelligent Agentic System for Video Color Grading | ArXiv CV (cs.CV)
277 | From Elevation Maps To Contour Lines: SVM and Decision Trees to Detect Violin Width Reduction | ArXiv CV (cs.CV)
278 | Street-Legal Physical-World Adversarial Rim for License Plates | ArXiv CV (cs.CV)
279VERTIGO: Visual Preference Optimization for Cinematic Camera Trajectory GenerationArXiv CV (cs.CV)
280Hierarchical, Interpretable, Label-Free Concept Bottleneck ModelArXiv CV (cs.CV)
281Generating Satellite Imagery Data for Wildfire Detection through Mask-Conditioned Generative AIArXiv CV (cs.CV)
282Token-Efficient Multimodal Reasoning via Image Prompt PackagingArXiv CV (cs.CV)
283Delaunay Canopy: Building Wireframe Reconstruction from Airborne LiDAR Point Clouds via Delaunay GraphArXiv CV (cs.CV)
284An Explainable Vision-Language Model Framework with Adaptive PID-Tversky Loss for Lumbar Spinal Stenosis DiagnosisArXiv CV (cs.CV)
285Rapidly deploying on-device eye tracking by distilling visual foundation modelsArXiv CV (cs.CV)
286FusionBERT: Multi-View Image-3D Retrieval via Cross-Attention Visual Fusion and Normal-Aware 3D EncoderArXiv CV (cs.CV)
287TrackerSplat: Exploiting Point Tracking for Fast and Robust Dynamic 3D Gaussians ReconstructionArXiv CV (cs.CV)
288Moondream Segmentation: From Words to MasksArXiv CV (cs.CV)
289Rascene: High-Fidelity 3D Scene Imaging with mmWave Communication SignalsArXiv CV (cs.CV)
290Unlocking Multi-Site Clinical Data: A Federated Approach to Privacy-First Child Autism Behavior AnalysisArXiv CV (cs.CV)
291Smart Transfer: Leveraging Vision Foundation Model for Rapid Building Damage Mapping with Post-Earthquake VHR ImageryArXiv CV (cs.CV)
292Cross-Vehicle 3D Geometric Consistency for Self-Supervised Surround Depth Estimation on Articulated VehiclesArXiv CV (cs.CV)
293Drift-Resilient Temporal Priors for Visual TrackingArXiv CV (cs.CV)
294Efficient3D: A Unified Framework for Adaptive and Debiased Token Reduction in 3D MLLMsArXiv CV (cs.CV)
295Parser-Oriented Structural Refinement for a Stable Layout Interface in Document ParsingArXiv CV (cs.CV)
296DocShield: Towards AI Document Safety via Evidence-Grounded Agentic ReasoningArXiv CV (cs.CV)
297XrayClaw: Cooperative-Competitive Multi-Agent Alignment for Trustworthy Chest X-ray DiagnosisArXiv CV (cs.CV)
298VBGS-SLAM: Variational Bayesian Gaussian Splatting Simultaneous Localization and MappingArXiv CV (cs.CV)
299ExploreVLA: Dense World Modeling and Exploration for End-to-End Autonomous DrivingArXiv CV (cs.CV)
300THOM: Generating Physically Plausible Hand-Object Meshes From TextArXiv CV (cs.CV)
301Visual Instruction-Finetuned Language Model for Versatile Brain MR Image TasksArXiv CV (cs.CV)
302Differentiable Stroke Planning with Dual Parameterization for Efficient and High-Fidelity Painting CreationArXiv CV (cs.CV)
303DeCo-DETR: Decoupled Cognition DETR for efficient Open-Vocabulary Object DetectionArXiv CV (cs.CV)
304InverseDraping: Recovering Sewing Patterns from 3D Garment Surfaces via BoxMesh BridgingArXiv CV (cs.CV)
305Generalized Small Object Detection: Point-Prompted Paradigm and BenchmarkArXiv CV (cs.CV)
306A Unified Perspective on Adversarial Membership Manipulation in Vision ModelsArXiv CV (cs.CV)
307CANDLE: Illumination-Invariant Semantic Priors for Color Ambient Lighting NormalizationArXiv CV (cs.CV)
308LumaFlux: Lifting 8-Bit Worlds to HDR Reality with Physically-Guided Diffusion TransformersArXiv CV (cs.CV)
309UNICA: A Unified Neural Framework for Controllable 3D AvatarsArXiv CV (cs.CV)
310PaveBench: A Versatile Benchmark for Pavement Distress Perception and Interactive Vision-Language AnalysisArXiv CV (cs.CV)
311CMCC-ReID: Cross-Modality Clothing-Change Person Re-IdentificationArXiv CV (cs.CV)
312QAPruner: Quantization-Aware Vision Token Pruning for Multimodal Large Language ModelsArXiv CV (cs.CV)
313MMPhysVideo: Scaling Physical Plausibility in Video Generation via Joint Multimodal ModelingArXiv CV (cs.CV)
314NavCrafter: Exploring 3D Scenes from a Single ImageArXiv CV (cs.CV)
315STRNet: Visual Navigation with Spatio-Temporal Representation through Dynamic Graph AggregationArXiv CV (cs.CV)
316Factorized Multi-Resolution HashGrid for Efficient Neural Radiance Fields: Execution on Edge-DevicesArXiv CV (cs.CV)
317Deformation-based In-Context Learning for Point Cloud UnderstandingArXiv CV (cs.CV)
318Adaptive Local Frequency Filtering for Fourier-Encoded Implicit Neural RepresentationsArXiv CV (cs.CV)
319HiDiGen: Hierarchical Diffusion for B-Rep Generation with Explicit Topological ConstraintsArXiv CV (cs.CV)
320A Paradigm Shift: Fully End-to-End Training for Temporal Sentence Grounding in VideosArXiv CV (cs.CV)
321HairOrbit: Multi-view Aware 3D Hair Modeling from Single PortraitsArXiv CV (cs.CV)
322Token Warping Helps MLLMs Look from Nearby ViewpointsArXiv CV (cs.CV)
323SPG: Sparse-Projected Guides with Sparse Autoencoders for Zero-Shot Anomaly DetectionArXiv CV (cs.CV)
324Unlocking Positive Transfer in Incrementally Learning Surgical Instruments: A Self-reflection Hierarchical Prompt FrameworkArXiv CV (cs.CV)
325InstructTable: Improving Table Structure Recognition Through InstructionsArXiv CV (cs.CV)
326Information-Regularized Constrained Inversion for Stable Avatar Editing from Sparse SupervisionArXiv CV (cs.CV)
327Progressive Video Condensation with MLLM Agent for Long-form Video UnderstandingArXiv CV (cs.CV)
328EvaNet: Towards More Efficient and Consistent Infrared and Visible Image Fusion AssessmentArXiv CV (cs.CV)
329RayMamba: Ray-Aligned Serialization for Long-Range 3D Object DetectionArXiv CV (cs.CV)
330UniSpector: Towards Universal Open-set Defect Recognition via Spectral-Contrastive Visual PromptingArXiv CV (cs.CV)
331SentiAvatar: Towards Expressive and Interactive Digital HumansArXiv CV (cs.CV)
332GP-4DGS: Probabilistic 4D Gaussian Splatting from Monocular Video via Variational Gaussian ProcessesArXiv CV (cs.CV)
333BEVPredFormer: Spatio-temporal Attention for BEV Instance Prediction in Autonomous DrivingArXiv CV (cs.CV)
334PolyReal: A Benchmark for Real-World Polymer Science WorkflowsArXiv CV (cs.CV)
335Modality-Specific Hierarchical Enhancement for RGB-D Camouflaged Object DetectionArXiv CV (cs.CV)
336MMTalker: Multiresolution 3D Talking Head Synthesis with Multimodal Feature FusionArXiv CV (cs.CV)
337CrossWeaver: Cross-modal Weaving for Arbitrary-Modality Semantic SegmentationArXiv CV (cs.CV)
338Collaborative Multi-Mode Pruning for Vision-Language ModelsArXiv CV (cs.CV)
339Visual Prototype Conditioned Focal Region Generation for UAV-Based Object DetectionArXiv CV (cs.CV)
340Exploring Motion-Language Alignment for Text-driven Motion GenerationArXiv CV (cs.CV)
341Effect of Input Resolution on Retinal Vessel Segmentation Performance: An Empirical Study Across Five DatasetsArXiv CV (cs.CV)
342Not All Frames Deserve Full Computation: Accelerating Autoregressive Video Generation via Selective Computation and Predictive ExtrapolationArXiv CV (cs.CV)
343Rendering Multi-Human and Multi-Object with 3D Gaussian SplattingArXiv CV (cs.CV)
344Explicit Time-Frequency Dynamics for Skeleton-Based Gait RecognitionArXiv CV (cs.CV)
345GenSmoke-GS: A Multi-Stage Method for Novel View Synthesis from Smoke-Degraded Images Using a Generative ModelArXiv CV (cs.CV)
346QVAD: A Question-Centric Agentic Framework for Efficient and Training-Free Video Anomaly DetectionArXiv CV (cs.CV)
347STEAR: Layer-Aware Spatiotemporal Evidence Intervention for Hallucination Mitigation in Video Large Language ModelsArXiv CV (cs.CV)
348Can Nano Banana 2 Replace Traditional Image Restoration Models? An Evaluation of Its Performance on Image Restoration TasksArXiv CV (cs.CV)
349Gram-MMD: A Texture-Aware Metric for Image Realism AssessmentArXiv CV (cs.CV)
350SparseSplat: Towards Applicable Feed-Forward 3D Gaussian Splatting with Pixel-Unaligned PredictionArXiv CV (cs.CV)
351MI-Pruner: Crossmodal Mutual Information-guided Token Pruner for Efficient MLLMsArXiv CV (cs.CV)
352A Data-Centric Vision Transformer Baseline for SAR Sea Ice ClassificationArXiv CV (cs.CV)
353Can VLMs Truly Forget? Benchmarking Training-Free Visual Concept UnlearningArXiv CV (cs.CV)
354Revealing Physical-World Semantic Vulnerabilities: Universal Adversarial Patches for Infrared Vision-Language ModelsArXiv CV (cs.CV)
355Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video GenerationArXiv CV (cs.CV)
356SCC-Loc: A Unified Semantic Cascade Consensus Framework for UAV Thermal Geo-LocalizationArXiv CV (cs.CV)
357SD-FSMIS: Adapting Stable Diffusion for Few-Shot Medical Image SegmentationArXiv CV (cs.CV)
358CAMEO: A Conditional and Quality-Aware Multi-Agent Image Editing OrchestratorArXiv CV (cs.CV)
359EffiMiniVLM: A Compact Dual-Encoder Regression FrameworkArXiv CV (cs.CV)
360SFFNet: Synergistic Feature Fusion Network With Dual-Domain Edge Enhancement for UAV Image Object DetectionArXiv CV (cs.CV)
361The Eleventh NTIRE 2026 Efficient Super-Resolution Challenge ReportArXiv CV (cs.CV)
362ProtoFlow: Mitigating Forgetting in Class-Incremental Remote Sensing Segmentation via Low-Curvature Prototype FlowArXiv CV (cs.CV)
363VOSR: A Vision-Only Generative Model for Image Super-ResolutionArXiv CV (cs.CV)
364CoME-VL: Scaling Complementary Multi-Encoder Vision-Language LearningArXiv CV (cs.CV)
365Managing Diabetic Retinopathy with Deep Learning: A Data Centric OverviewArXiv CV (cs.CV)
366Why Invariance is Not Enough for Biomedical Domain Generalization and How to Fix ItArXiv CV (cs.CV)
367Wavelength-multiplexed massively parallel diffractive optical information storage and image projectionArXiv CV (cs.CV)
368A Rapid Instrument Exchange System for Humanoid Robots in Minimally Invasive SurgeryArXiv CV (cs.CV)
369V2X-QA: A Comprehensive Reasoning Dataset and Benchmark for Multimodal Large Language Models in Autonomous Driving Across Ego, Infrastructure, and Cooperative ViewsArXiv CV (cs.CV)
370Task-Guided Prompting for Unified Remote Sensing Image RestorationArXiv CV (cs.CV)
371Few-Shot Distribution-Aligned Flow Matching for Data Synthesis in Medical Image SegmentationArXiv CV (cs.CV)
372ARM: Advantage Reward Modeling for Long-Horizon ManipulationArXiv CV (cs.CV)
373ARIQA-3DS: A Stereoscopic Image Quality Assessment Dataset for Realistic Augmented RealityArXiv CV (cs.CV)
374Multi-View Video Diffusion Policy: A 3D Spatio-Temporal-Aware Video Action ModelArXiv CV (cs.CV)
375HyperCT: Low-Rank Hypernet for Unified Chest CT AnalysisArXiv CV (cs.CV)
376Motion Capture from Inertial and Vision SensorsArXiv CV (cs.CV)
377Generalized SAM: Efficient Fine-Tuning of SAM for Variable Input Image SizesArXiv CV (cs.CV)
378Accuracy Improvement of Cell Image Segmentation Using Feedback FormerArXiv CV (cs.CV)
379ForgeryGPT: A Multimodal LLM for Interpretable Image Forgery Detection and LocalizationArXiv CV (cs.CV)
380FaVChat: Hierarchical Prompt-Query Guided Facial Video Understanding with Data-Efficient GRPOArXiv CV (cs.CV)
381We’ll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic FeedbackArXiv CV (cs.CV)
382FLEX: A Largescale Multimodal, Multiview Dataset for Learning Structured Representations for Fitness Action Quality AssessmentArXiv CV (cs.CV)
383TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMsArXiv CV (cs.CV)
384SmartCLIP: Modular Vision-language Alignment with Identification GuaranteesArXiv CV (cs.CV)
385PAOLI: Pose-free Articulated Object Learning from Sparse-view ImagesArXiv CV (cs.CV)
386MedGS: Gaussian Splatting for Multi-Modal 3D Medical ImagingArXiv CV (cs.CV)
387Learning Adaptive Pseudo-Label Selection for Semi-Supervised 3D Object DetectionArXiv CV (cs.CV)
388Privacy Beyond Pixels: Latent Anonymization for Privacy-Preserving Video UnderstandingArXiv CV (cs.CV)
389SAGA: Source Attribution of Generative AI VideosArXiv CV (cs.CV)
390SING3R-SLAM: Submap-based Indoor Monocular Gaussian SLAM with 3D Reconstruction PriorsArXiv CV (cs.CV)
391Can Vision-Language Models Count? A Synthetic Benchmark and Analysis of Attention-Based InterventionsArXiv CV (cs.CV)
392The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal AlignmentArXiv CV (cs.CV)
393FACT-GS: Frequency-Aligned Complexity-Aware Texture Reparameterization for 2D Gaussian SplattingArXiv CV (cs.CV)
394Analysis of Invasive Breast Cancer in Mammograms Using YOLO, Explainability, and Domain AdaptationArXiv CV (cs.CV)
395DM3D: Deformable Mamba via Offset-Guided Differentiable Scanning for Point Cloud UnderstandingArXiv CV (cs.CV)
396Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic QualityArXiv CV (cs.CV)
397Training Multi-Image Vision Agents via End2End Reinforcement LearningArXiv CV (cs.CV)
398GimbalDiffusion: Gravity-Aware Camera Control for Video GenerationArXiv CV (cs.CV)
399DePT3R: Joint Dense Point Tracking and 3D Reconstruction of Dynamic Scenes in a Single Forward PassArXiv CV (cs.CV)
400FedVideoMAE: Efficient Privacy-Preserving Federated Video ModerationArXiv CV (cs.CV)
401Unified Thinker: A General Reasoning Modular Core for Image GenerationArXiv CV (cs.CV)
402EGM: Efficient Visual Grounding Language ModelsArXiv CV (cs.CV)
403ReWeaver: Towards Simulation-Ready and Topology-Accurate Garment ReconstructionArXiv CV (cs.CV)
404PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document ParsingArXiv CV (cs.CV)
405Video Understanding: Through A Temporal LensArXiv CV (cs.CV)
406Uncertainty-Aware 4D Gaussian Splatting for Monocular Occluded Human RenderingArXiv CV (cs.CV)
4073DXTalker: Unifying Identity, Lip Sync, Emotion, and Spatial Dynamics in Expressive 3D Talking AvatarsArXiv CV (cs.CV)
408Efficient Test-Time Optimization for Depth Completion via Low-Rank Decoder AdaptationArXiv CV (cs.CV)
409Text-Phase Synergy Network with Dual Priors for Unsupervised Cross-Domain Image RetrievalArXiv CV (cs.CV)
410DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and SynchronizationArXiv CV (cs.CV)
411Edge-Efficient Two-Stream Multimodal Architecture for Non-Intrusive Bathroom Fall DetectionArXiv CV (cs.CV)
412CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language ModelsArXiv CV (cs.CV)
413When Negation Is a Geometry Problem in Vision-Language ModelsArXiv CV (cs.CV)
414Semantic Iterative Reconstruction: One-Shot Universal Anomaly DetectionArXiv CV (cs.CV)
415Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual ProcessingArXiv CV (cs.CV)
416MuRF: Unlocking the Multi-Scale Potential of Vision Foundation ModelsArXiv CV (cs.CV)
417Scene Grounding In the WildArXiv CV (cs.CV)
418Better Rigs, Not Bigger Networks: A Body Model Ablation for Gaussian AvatarsArXiv CV (cs.CV)
419UniRecGen: Unifying Multi-View 3D Reconstruction and GenerationArXiv CV (cs.CV)
420Satellite-Free Training for Drone-View Geo-LocalizationArXiv CV (cs.CV)
421Semantic Richness or Geometric Reasoning? The Fragility of VLM’s Visual InvarianceArXiv CV (cs.CV)
422Light-ResKAN: A Parameter-Sharing Lightweight KAN with Gram Polynomials for Efficient SAR Image RecognitionArXiv CV (cs.CV)
423SDesc3D: Towards Layout-Aware 3D Indoor Scene Generation from Short DescriptionsArXiv CV (cs.CV)
424Attention at Rest Stays at Rest: Breaking Visual Inertia for Cognitive Hallucination MitigationArXiv CV (cs.CV)
425Neural Field-Based 3D Surface Reconstruction of Microstructures from Multi-Detector Signals in Scanning Electron MicroscopyArXiv CV (cs.CV)
426Geometric Analysis of Magnetic Labyrinthine Stripe Evolution via U-Net SegmentationArXiv CV (cs.CV)
427Look, Zoom, Understand: The Robotic Eyeball for Embodied PerceptionArXiv CV (cs.CV)

Report generated: 2026-04-06 05 UTC


Part II: 💻 Tech News (15 items)