Research — Page 2 · NeuralCoreNews

Research

Papers that actually matter

28 articles in this section.

AURA: Solving the KV Cache Problem for Continuous Embodied AI

AURA introduces action-gated memory to prevent VRAM bloat in robots, allowing long-term policies to run indefinitely without crashing or hallucinating.

Jun 3, 2026 · 3 min read

Reducing LLM Long-Context Latency with Adaptive Runtime Termination

Explore how Adaptive Runtime Termination (ART) reduces memory bandwidth bottlenecks to improve token throughput during long-context LLM inference.

Jun 2, 2026 · 3 min read

BitsMoE: Reducing VRAM Requirements for Mixture-of-Experts Models

BitsMoE uses spectral energy to guide non-uniform bit allocation, potentially allowing massive MoE models to fit on consumer GPUs.

Jun 2, 2026 · 3 min read

EAGLE 3.1: Fixing Attention Drift in Speculative Decoding

EAGLE 3.1 addresses attention drift to provide more consistent and predictable throughput for LLM inference via speculative decoding.

May 27, 2026 · 3 min read

Together AI’s OSCAR: 2-Bit KV Cache Quantization for Long Context

Together AI’s OSCAR system uses attention-aware rotation to compress KV caches to 2 bits, significantly expanding context windows on consumer GPUs.

May 26, 2026 · 3 min read

ByteDance Research: QA-Centric Training Improves LMM Document Analysis

A ByteDance study suggests that training multimodal models via question-answering outperforms transcription-heavy methods for analyzing long, complex documents.

May 24, 2026 · 3 min read

Recurrent Depth in Transformers: Balancing Compute and Memory Efficiency

An analysis of recurrent depth and sparse MoE as ways to trade memory efficiency for gradient stability in transformer architectures.

May 22, 2026 · 3 min read

Multi-Pass Prompt Verification: Addressing Qualitative Loss in Quantized LLMs

A new study explores using multi-pass verification to recover accuracy lost in 2-bit and 3-bit quantized models, though critics argue it’s a workaround rather than a fix.

May 21, 2026 · 3 min read

OpenAI Model Disproves Discrete Geometry Conjecture via Counterexample Search

An OpenAI model has disproven a discrete geometry conjecture, highlighting the shift in mathematics from human intuition to high-speed automated counterexample search.

May 21, 2026 · 3 min read

AI Science Assistants and the Reality of Drug Retargeting

Two AI assistants are accelerating drug retargeting by filtering medical literature, though physical lab validation remains the primary bottleneck in drug discovery.

May 19, 2026 · 3 min read