arXiv cs.CV | Paper | 5 hours ago | WildTableBench: Benchmarking Multimodal Foundation Models on Table Understanding In the Wild
arXiv cs.CV | Paper | 5 hours ago | Towards Generalizable Mapping of Hedges and Linear Woody Features from Earth Observation Data: a national Product for Germany
arXiv cs.CV | Paper | 5 hours ago | VFM$^{4}$SDG: Unveiling the Power of VFMs for Single-Domain Generalized Object Detection
arXiv cs.CV | Paper | 5 hours ago | Towards Brain MRI Foundation Models for the Clinic: Findings from the FOMO25 Challenge
arXiv cs.CV | Paper | 5 hours ago | Anatomy-Guided Vision-Language Learning with Angular Prototype Separation for Multi-Label Video Capsule Endoscopy Classification Under Class Imbalance
arXiv cs.CV | Paper | 5 hours ago | ProGIC: Progressive and Lightweight Generative Image Compression with Residual Vector Quantization
arXiv cs.CV | Paper | 5 hours ago | VideoTemp-o3: Harmonizing Temporal Grounding and Video Understanding in Agentic Thinking-with-Videos
arXiv cs.CV | Paper | 5 hours ago | PipeMFL-240K: A Large-scale Dataset and Benchmark for Object Detection in Pipeline Magnetic Flux Leakage Imaging
arXiv cs.CV | Paper | 5 hours ago | GT-SVJ: Generative-Transformer-Based Self-Supervised Video Judge For Efficient Video Reward Modeling
arXiv cs.CV | Paper | 5 hours ago | CLEAR-HPV: Interpretable concept discovery for human-papillomavirus-associated morphology in whole-slide histology
arXiv cs.CV | Paper | 5 hours ago | Progressive $\mathcal{J}$-Invariant Self-supervised Learning for Low-Dose CT Denoising