Learn2Splat: Extending the Horizon of
Learned 3DGS Optimization

A meta-learned optimizer for 3D Gaussian Splatting achieving faster early convergence while remaining stable across long optimization trajectories.

Naama Pearl*¹ Stefano Esposito*¹ Haofei Xu¹,³ Amit Peleg¹ Patricia Gschossmann¹

Lorenzo Porzi² Peter Kontschieder² Gerard Pons-Moll¹ Andreas Geiger¹

¹University of Tübingen, Tübingen AI Center ²Meta Reality Labs ³ETH Zurich

Code (Will be released) Paper arXiv

[Interactive comparison viewer: controls for setting, dataset, scene, and iterations (log scale, starting at t = 0), shown next to the reference image.]

We compare our two checkpoints, Learn2Splat^Sparse and Learn2Splat^Dense, to standard 3DGS optimization, 3DGS with a tuned learning rate (3DGS*), our learned-optimizer baseline (LO Baseline), and ReSplat. Zoom in for details.

Learn2Splat is a learned optimizer for 3DGS that reaches higher reconstruction quality in the early stages of optimization while remaining effective over long optimization horizons. Prior learned optimizers rely on learning-rate (LR) schedules or time encodings, which limits them to short or predefined horizons. Learn2Splat maintains performance through a meta-learning scheme (checkpoint buffer and optimizer rollout) and architectural modifications, and it generalizes zero-shot to unseen datasets and resolutions within the initialization setting it was trained on. We train two checkpoints: Learn2Splat^Sparse (sparse setting, ReSplat init) and Learn2Splat^Dense (dense setting, SfM init). Both are evaluated in both settings: the matched checkpoint performs best, while the cross-setting checkpoint remains stable but suboptimal.

Abstract

3D Gaussian Splatting (3DGS) optimization is most commonly performed using standard optimizers (Adam, SGD). While stable across diverse scenes, standard optimizers are general-purpose and not tailored to the structure of the problem. In particular, they produce independent parameter updates that do not capture the structural and spatial relationships within a scene, leading to inefficient optimization and slow convergence. Recent works introduced learned optimizers that predict correlated updates informed by inter-parameter and inter-Gaussian dependencies. However, these methods are trained for a fixed number of optimization iterations and rely on manually scheduled learning rates to avoid degradation.

In this paper, we introduce Learn2Splat, a learned optimizer for 3DGS that avoids degradation over extended optimization horizons without auxiliary mechanisms. To enable this, we propose a meta-learning scheme that extends the optimization horizon via a checkpoint buffer and an optimizer rollout strategy, combined with an architecture that encodes gradient scale information in its latent states. Results show improved early novel view synthesis quality while remaining stable over long horizons, with zero-shot generalization to unseen reconstruction settings. To support our findings, we introduce the first unified framework for training and evaluating both learned and conventional optimizers across sparse and dense view settings. Code and models will be released publicly.

Method

Meta-training Scheme & Architecture

Learn2Splat replaces Adam with a meta-learned network that predicts per-Gaussian updates from Adam-normalized gradients and per-Gaussian latent states maintained across iterations. Its core architectural module is a kNN-based Point Transformer that captures spatial relationships between Gaussians.
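As a rough illustration of the optimizer's gradient input, the sketch below computes an Adam-normalized gradient in plain Python. It assumes the standard Adam moment estimates with bias correction and the usual default coefficients; this page does not state which variant or constants Learn2Splat uses, and the class name AdamNormalizer is hypothetical.

```python
import math


class AdamNormalizer:
    """Tracks Adam's first/second moments and returns the normalized gradient.

    Assumption: standard Adam estimates with bias correction. The exact
    normalization used by Learn2Splat is not specified on this page.
    """

    def __init__(self, size, beta1=0.9, beta2=0.999, eps=1e-8):
        self.beta1, self.beta2, self.eps = beta1, beta2, eps
        self.m = [0.0] * size   # running mean of gradients
        self.v = [0.0] * size   # running mean of squared gradients
        self.t = 0              # number of updates seen so far

    def update(self, grad):
        self.t += 1
        out = []
        for i, g in enumerate(grad):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)   # bias-corrected mean
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)   # bias-corrected variance
            out.append(m_hat / (math.sqrt(v_hat) + self.eps))
        return out


norm = AdamNormalizer(size=2)
for _ in range(3):
    # Two parameters whose raw gradients differ by several orders of magnitude.
    normalized = norm.update([0.002, -50.0])
```

With a constant gradient, each normalized entry approaches the sign of the raw gradient, so the raw magnitude is largely factored out of this input.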

3DGS Optimization Paradigms.

Per-scene optimization: iterative updates via loss evaluation, backpropagation, and standard optimizer rules.
Feed-forward networks (FFN): scene representation predicted in a single forward pass with a frozen pre-trained model.
Learned optimizers: iterative updates using a frozen meta-trained model that predicts steps from signals such as image-space errors or loss gradients.

Meta-training and Architecture.

Meta-iteration initialization. At each meta-iteration, a 3D scene is sampled and its Gaussians are initialized from either (1) SfM points or feed-forward (FFN) predictions at t = 0, or (2) an intermediate optimization state retrieved from a Checkpoint Buffer. The buffer stores partially optimized scenes together with their optimizer states, exposing the learned optimizer to both early- and late-stage optimization regimes. After the meta-iteration, the scene is further optimized using several frozen rollout steps before being pushed back into the buffer.
Meta-iteration. Starting from the sampled scene state, the inner loop rolls out the learned optimizer for τ iterations. At each step, gradients of a reconstruction loss with respect to the Gaussian parameters are computed and passed to the learned optimizer, which predicts parameter updates. The outer meta-loop evaluates the quality of these updates and updates the learned optimizer's parameters through meta-gradients.
Model architecture. The optimizer takes as input the current Gaussian parameters, Adam-normalized gradients, and per-Gaussian latent states. A kNN-based Point Transformer propagates information across neighboring Gaussians and predicts updated latent states, while a parallel State Scale MLP predicts state-scaling coefficients that preserve gradient magnitude information. The scaled latent states are then passed to an Update MLP, which predicts the final Gaussian parameter updates. Dashed lines denote concatenation and ⊙ indicates element-wise scaling.
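The inner/outer structure of a meta-iteration can be sketched on a toy problem. Below, the "learned optimizer" is reduced to a single learnable step size theta, the reconstruction loss is replaced by f(x) = 0.5 * x**2, and the meta-gradient is obtained by differentiating by hand through the τ unrolled inner steps. All of these are illustrative assumptions: the actual optimizer is a network, and this page does not specify the meta-loss.

```python
def unrolled_meta_gradient(theta, x0, tau):
    """Inner loop: roll out x <- x - theta * grad for tau steps on f(x) = 0.5 * x**2.

    Returns the meta-loss f(x_tau) and its derivative with respect to theta,
    obtained by propagating dx/dtheta through every inner step.
    """
    x, dx_dtheta = x0, 0.0
    for _ in range(tau):
        grad = x                                   # f'(x) for the toy loss
        # Derivative of the update (x - theta * grad) with respect to theta.
        dx_dtheta = dx_dtheta - grad - theta * dx_dtheta
        x = x - theta * grad
    meta_loss = 0.5 * x * x
    return meta_loss, x * dx_dtheta


def meta_train(theta=0.1, x0=1.0, tau=4, meta_lr=0.05, meta_iters=200):
    """Outer loop: plain gradient descent on theta using the meta-gradient."""
    for _ in range(meta_iters):
        _, meta_grad = unrolled_meta_gradient(theta, x0, tau)
        theta -= meta_lr * meta_grad
    return theta


loss_before, grad_before = unrolled_meta_gradient(0.1, 1.0, 4)
theta_after = meta_train()
loss_after, _ = unrolled_meta_gradient(theta_after, 1.0, 4)
```

In this toy setting the meta-loss after training is lower than before, because the outer loop increases theta toward the step size that minimizes the loss reached after τ inner steps.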

Checkpoint Buffer

Diverse optimization states

Stores intermediate scene states from previous meta-iterations. Exposes the optimizer to states ranging from early iterations with large gradients to fine late-stage refinements, without extending the computational graph.
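A minimal sketch of such a buffer is given below. The names SceneState and CheckpointBuffer, the capacity, the random eviction rule, and the probability p_fresh of starting from t = 0 are all assumptions made for illustration; the page only states that partially optimized scenes and their optimizer states are stored and later retrieved.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SceneState:
    scene_id: str
    step: int = 0                                         # optimization steps applied so far
    gaussians: list = field(default_factory=list)         # placeholder for Gaussian parameters
    optimizer_state: dict = field(default_factory=dict)   # placeholder for latent states


class CheckpointBuffer:
    """Holds scene states as plain values, so no computational graph is kept."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.rng = random.Random(seed)

    def push(self, state):
        if len(self.items) >= self.capacity:
            # Assumed policy: evict a random entry when the buffer is full.
            self.items.pop(self.rng.randrange(len(self.items)))
        self.items.append(state)

    def sample(self, fresh_init, p_fresh=0.5):
        """Return a fresh t = 0 state or resume a stored intermediate state."""
        if not self.items or self.rng.random() < p_fresh:
            return fresh_init()
        return self.items.pop(self.rng.randrange(len(self.items)))


def run(meta_iters=20, tau=4, rollout=2, capacity=8):
    buf = CheckpointBuffer(capacity)
    for _ in range(meta_iters):
        state = buf.sample(lambda: SceneState("toy_scene"))
        state.step += tau + rollout    # stand-in for the inner loop plus frozen rollout
        buf.push(state)
    return buf


buffer = run(capacity=64)
steps_in_buffer = sorted(s.step for s in buffer.items)
```

Since resumed states keep accumulating steps, the buffer ends up holding a mix of early and later optimization stages even though each meta-iteration only unrolls a short window.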

Optimizer Rollout

Learning from rollouts

Scenes are further optimized with a frozen snapshot of the optimizer before buffering. The rollout horizon grows from 1 to 50 steps over the first 10k meta-iterations, teaching the optimizer to recover from its own mistakes.
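The growing rollout horizon could be scheduled as in the sketch below. The linear ramp is an assumption, since the page gives only the endpoints (1 and 50 steps) and the ramp length (10k meta-iterations); the function names and the toy step function are hypothetical.

```python
import copy


def rollout_horizon(meta_iter, start=1, end=50, ramp_iters=10_000):
    """Rollout length at a given meta-iteration (assumed linear ramp, then constant)."""
    frac = min(max(meta_iter / ramp_iters, 0.0), 1.0)
    return start + int(frac * (end - start))


def frozen_rollout(state, optimizer_params, step_fn, horizon):
    """Apply `horizon` optimizer steps using a frozen copy of the parameters."""
    snapshot = copy.deepcopy(optimizer_params)   # later meta-updates do not affect it
    for _ in range(horizon):
        state = step_fn(state, snapshot)
    return state


def toy_step(x, params):
    """Toy stand-in for one learned-optimizer step on a scalar state."""
    return x - params["lr"] * x


horizon = rollout_horizon(2_500)
x_after = frozen_rollout(1.0, {"lr": 0.1}, toy_step, horizon)
```

The result of the rollout is what would be pushed into the buffer, so later meta-iterations start from states the optimizer itself produced.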

State Scale MLP

Gradient scale encoding

Predicts per-Gaussian scaling coefficients from Adam-normalized gradients, restoring magnitude information suppressed by transformer normalization so updates decay naturally as the loss decreases.
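To see why a separate scale path can matter, the sketch below shows that normalization discards input scale and that an element-wise coefficient applied afterwards can carry a magnitude again. It assumes the normalization behaves like standard layer normalization without affine terms. The function toy_scale is a hand-written stand-in computed directly from the un-normalized vector, purely for illustration; it is not the learned State Scale MLP, which per the description above takes Adam-normalized gradients as input.

```python
import math


def layer_norm(x, eps=1e-12):
    """Standard layer normalization over one feature vector (no affine terms)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def toy_scale(x):
    """Hypothetical stand-in for a scale predictor: mean absolute value."""
    return sum(abs(v) for v in x) / len(x)


def scaled_state(x):
    """Normalize, then reapply a magnitude via element-wise scaling."""
    s = toy_scale(x)
    return [s * v for v in layer_norm(x)]


small = [0.001, 0.002, 0.003]
large = [1.0, 2.0, 3.0]          # same direction, 1000x larger

# Normalization alone maps both inputs to almost the same vector ...
same_after_norm = all(
    abs(a - b) < 1e-4 for a, b in zip(layer_norm(small), layer_norm(large))
)
# ... while the scaled states differ by roughly the original factor of 1000.
ratio = scaled_state(large)[2] / scaled_state(small)[2]
```

If the scaling coefficient shrinks as optimization converges, the scaled state shrinks with it, which is the kind of decay behavior the card describes.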

Zero-shot Generalization

Unseen datasets & resolutions

Trained on low-resolution DL3DV scenes, Learn2Splat generalizes zero-shot to new datasets and resolutions. In the sparse setting, it transfers to RealEstate10K, and in the dense setting to DTU, LLFF, and Mip-NeRF360. Cross-setting application remains stable but is suboptimal.

Training Configurations

Learn2Splat^Sparse — Sparse Setting

ReSplat Initialization

Trained on DL3DV in a sparse-view, forward-facing setup using ReSplat feed-forward initialization. Uses a fixed set of 8 context views at low resolution (256×448). Latent states are initialized from the FFN output. Generalizes zero-shot to RealEstate10K and higher resolutions.

Init: ReSplat FFN (57K primitives at train res.)
Views: 8 fixed context + 6 target views
Resolution: 256×448
Zero-shot to: RealEstate10K, higher resolutions

Learn2Splat^Dense — Dense Setting

SfM Initialization

Trained on DL3DV in a dense-view, large-baseline setup using SfM point cloud initialization with random latent states. Samples 8 views per iteration via furthest-point sampling. As data augmentation, 10–100% of the initial SfM points are used. Generalizes zero-shot to DTU, LLFF, and Mip-NeRF360.

Init: SfM points (random latent states)
Views: 8 sampled via furthest-point + 6 target views
Resolution: 256×448
Zero-shot to: DTU, LLFF, Mip-NeRF360, higher resolutions
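The view selection and point-count augmentation of the dense setting could look like the sketch below. It assumes furthest-point sampling is run on camera centers with Euclidean distance and that the kept fraction of SfM points is drawn uniformly from 10–100%; neither detail is stated on this page, and both function names are hypothetical.

```python
import math
import random


def furthest_point_sampling(points, k, start=0):
    """Greedily pick k indices, each maximizing the distance to those already chosen."""
    selected = [start]
    # Distance from every point to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    min_dist[start] = -1.0                       # never pick the same index twice
    while len(selected) < min(k, len(points)):
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
        min_dist[nxt] = -1.0
    return selected


def subsample_sfm(points, rng, lo=0.1, hi=1.0):
    """Keep a random fraction of the initial points (fraction assumed uniform)."""
    n_keep = max(1, round(rng.uniform(lo, hi) * len(points)))
    return rng.sample(points, n_keep)


# Toy camera centers along a line: the sampler spreads its picks out.
cameras = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
picked = furthest_point_sampling(cameras, k=3)

rng = random.Random(0)
sfm_points = [(float(i), 0.0, 0.0) for i in range(100)]
kept = subsample_sfm(sfm_points, rng)
```

Furthest-point sampling favors views that are far apart, which matches the large-baseline character of the dense setting.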

Results & Comparisons

DL3DV scenes optimization in the sparse setting

