Research & Analytics
Explore the latest academic publications in AI and machine learning, pulled live from the arXiv database.
MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation
The rapid progress of Artificial Intelligence Generated Content (AIGC) tools enables images, videos, and visualizations to be created on demand for webpage design, offering a flexible and increasingly adopted paradigm for modern UI/UX. However, direc...
Generalization in LLM Problem Solving: The Case of the Shortest Path
Whether language models can systematically generalize remains actively debated. Yet empirical performance is jointly shaped by multiple factors such as training data, training paradigms, and inference-time strategies, making failures difficult to int...
Diagnosing LLM Judge Reliability: Conformal Prediction Sets and Transitivity Violations
LLM-as-judge frameworks are increasingly used for automatic NLG evaluation, yet their per-instance reliability remains poorly understood. We present a two-pronged diagnostic toolkit applied to SummEval: (1) a transitivity analysis that rev...
Benchmarking Optimizers for MLPs in Tabular Deep Learning
MLP is a heavily used backbone in modern deep learning (DL) architectures for supervised learning on tabular data, and AdamW is the go-to optimizer used to train tabular DL models. Unlike architecture design, however, the choice of optimizer for tabu...
How Do LLMs and VLMs Understand Viewpoint Rotation Without Vision? An Interpretability Study
Over the past year, spatial intelligence has drawn increasing attention. Many prior works study it from the perspective of visual-spatial intelligence, where models have access to visuospatial information from visual inputs. However, in the absence o...
AD4AD: Benchmarking Visual Anomaly Detection Models for Safer Autonomous Driving
The reliability of a machine vision system for autonomous driving depends heavily on its training data distribution. When a vehicle encounters significantly different conditions, such as atypical obstacles, its perceptual capabilities can degrade sub...
Structural interpretability in SVMs with truncated orthogonal polynomial kernels
We study post-training interpretability for Support Vector Machines (SVMs) built from truncated orthogonal polynomial kernels. Since the associated reproducing kernel Hilbert space is finite-dimensional and admits an explicit tensor-product orthonorm...
Why Do Vision Language Models Struggle To Recognize Human Emotions?
Understanding emotions is a fundamental ability for intelligent systems to be able to interact with humans. Vision-language models (VLMs) have made tremendous progress in the last few years for many visual tasks, potentially offering a promising solu...
How Embeddings Shape Graph Neural Networks: Classical vs Quantum-Oriented Node Representations
Node embeddings act as the information interface for graph neural networks, yet their empirical impact is often reported under mismatched backbones, splits, and training budgets. This paper provides a controlled benchmark of embedding choices for gra...
arXiv API Integration
This page pulls data directly from the arXiv.org database. All listed items are open-access publications shared by the global academic community.
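
As a rough illustration of how a listing like this one might be produced, the sketch below builds a query URL for the public arXiv API and parses the Atom feed it returns into title and shortened-abstract pairs. The category `cs.AI`, the result count, the 250-character cutoff, and the helper names `build_query_url` and `parse_feed` are all assumptions made for the example; the page's actual implementation is not shown in the source.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Public arXiv API endpoint; responses are Atom XML feeds.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace prefix for tag lookups


def build_query_url(category="cs.AI", max_results=9):
    """Build a URL asking for the newest submissions in one category.

    The category and result count are hypothetical defaults.
    """
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def parse_feed(atom_xml, limit=250):
    """Extract titles and shortened abstracts from an Atom feed string.

    `limit` is an assumed cutoff; longer abstracts are cut and marked with "...".
    """
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall(f"{ATOM}entry"):
        # Collapse the line breaks and repeated spaces found in feed text.
        title = " ".join(entry.findtext(f"{ATOM}title", "").split())
        summary = " ".join(entry.findtext(f"{ATOM}summary", "").split())
        if len(summary) > limit:
            summary = summary[:limit] + "..."
        papers.append({"title": title, "summary": summary})
    return papers
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the response body to `parse_feed` would yield a list of dictionaries ready to render as entries like those above.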