Learn2Splat: Extending the Horizon of
Learned 3DGS Optimization

A meta-learned optimizer for 3D Gaussian Splatting achieving faster early convergence while remaining stable across long optimization trajectories.

¹University of Tübingen, Tübingen AI Center  ²Meta Reality Labs  ³ETH Zurich
Code (will be released) | Paper | arXiv

We compare our two checkpoints, Learn2SplatSparse and Learn2SplatDense, against standard 3DGS optimization, 3DGS with a tuned learning rate (3DGS*), our learned-optimizer baseline (LO Baseline), and ReSplat.

Learn2Splat is a learned optimizer for 3DGS that reaches higher reconstruction quality in early stages while remaining effective across long optimization horizons. Prior learned optimizers rely on learning-rate schedules or time encodings, which limits them to short or predefined horizons. Learn2Splat maintains performance through a meta-learning scheme (checkpoint buffer and optimizer rollout) and architectural modifications, and generalizes zero-shot to unseen datasets and resolutions within the trained initialization setting. We train two checkpoints: Learn2SplatSparse (sparse setting, ReSplat init) and Learn2SplatDense (dense setting, SfM init). Each checkpoint is evaluated in both settings: the matched checkpoint performs best, while the cross-setting checkpoint remains stable but suboptimal.

Abstract
3D Gaussian Splatting (3DGS) optimization is most commonly performed using standard optimizers (Adam, SGD). While stable across diverse scenes, standard optimizers are general-purpose and not tailored to the structure of the problem. In particular, they produce independent parameter updates that do not capture the structural and spatial relationships within a scene, leading to inefficient optimization and slow convergence. Recent works introduced learned optimizers that predict correlated updates informed by inter-parameter and inter-Gaussian dependencies. However, these methods are trained for a fixed number of optimization iterations and rely on manually scheduled learning rates to avoid degradation.

In this paper, we introduce Learn2Splat, a learned optimizer for 3DGS that avoids degradation over extended optimization horizons without auxiliary mechanisms. To enable this, we propose a meta-learning scheme that extends the optimization horizon via a checkpoint buffer and an optimizer rollout strategy, combined with an architecture that encodes gradient scale information in its latent states. Results show improved early novel view synthesis quality while remaining stable over long horizons, with zero-shot generalization to unseen reconstruction settings. To support our findings, we introduce the first unified framework for training and evaluating both learned and conventional optimizers across sparse and dense view settings. Code and models will be released publicly.
Method

Meta-training Scheme & Architecture

Learn2Splat replaces Adam with a meta-learned network predicting per-Gaussian updates from Adam-normalized gradients and maintained latent states. The core architecture module is a kNN-based Point Transformer that captures spatial Gaussian relationships.
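One common reading of "Adam-normalized gradient" is the bias-corrected Adam direction, first moment divided by the root of the second moment. The sketch below assumes that definition; the function name and the default hyperparameters are illustrative and not taken from the Learn2Splat code. It shows two properties of this quantity: it is unchanged when the whole gradient history is rescaled by a constant, and it shrinks when gradients decay relative to their recent history.

```python
import math


def adam_normalized_grads(grads, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return m_hat / (sqrt(v_hat) + eps) for each step of a scalar gradient
    sequence. Assumed definition of an 'Adam-normalized gradient'."""
    m, v = 0.0, 0.0
    normalized = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1.0 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)           # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        normalized.append(m_hat / (math.sqrt(v_hat) + eps))
    return normalized


small = adam_normalized_grads([0.02, 0.01, 0.03])
large = adam_normalized_grads([20.0, 10.0, 30.0])          # same history, x1000
decaying = adam_normalized_grads([1.0, 0.5, 0.25, 0.125])  # shrinking gradients
```

Here `small` and `large` agree up to the effect of `eps`, while the last entry of `decaying` is smaller than its first.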

3DGS Optimization Paradigms.
  • Per-scene optimization: iterative updates via loss evaluation, backpropagation, and standard optimizer rules.
  • Feed-forward networks (FFN): scene representation predicted in a single forward pass with a frozen pre-trained model.
  • Learned optimizers: iterative updates using a frozen meta-trained model that predicts steps from signals such as image-space errors or loss gradients.
Meta-training and Architecture.
  • Meta iteration initialization. At each meta-iteration, a 3D scene is sampled and its Gaussians are initialized either from (1) SfM points or feed-forward (FFN) predictions at t = 0, or (2) an intermediate optimization state retrieved from a Checkpoint Buffer. The buffer stores partially optimized scenes together with their optimizer states, exposing the learned optimizer to both early- and late-stage optimization regimes. After the meta-iteration, the scene is further optimized using several frozen rollout steps before being pushed back into the buffer.
  • Meta iteration. Starting from the sampled scene state, the inner loop rolls out the learned optimizer for τ iterations. At each step, gradients of a reconstruction loss with respect to the Gaussian parameters are computed and passed to the learned optimizer, which predicts the parameter updates. The outer meta-loop evaluates the quality of these updates and adjusts the learned optimizer's parameters through meta-gradients.
  • Model architecture. The optimizer takes as input the current Gaussian parameters, Adam-normalized gradients, and per-Gaussian latent states. A kNN-based Point Transformer propagates information across neighboring Gaussians and predicts updated latent states, while a parallel State Scale MLP predicts state-scaling coefficients that preserve gradient magnitude information. The scaled latent states are then passed to an Update MLP, which predicts the final Gaussian parameter updates. Dashed lines denote concatenation and ⊙ indicates element-wise scaling.
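To illustrate what "propagating information across neighboring Gaussians" means, the sketch below builds a brute-force kNN neighborhood over Gaussian centers and mixes per-Gaussian latent states within it. Plain averaging is used here as a deliberately simple stand-in for the Point Transformer's attention; the function names, the toy data, and the choice of k are all assumptions made for the example.

```python
import math


def knn_indices(centers, k):
    """For every point, return the indices of its k nearest other points.
    Brute force, O(N^2): suitable for a toy example only."""
    neighbors = []
    for i, ci in enumerate(centers):
        dists = [(math.dist(ci, cj), j) for j, cj in enumerate(centers) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def mean_over_neighbors(states, neighbors):
    """Stand-in for attention (NOT the actual Point Transformer):
    replace each state by the average of its neighbors' states."""
    dim = len(states[0])
    return [
        [sum(states[j][d] for j in nbr) / len(nbr) for d in range(dim)]
        for nbr in neighbors
    ]


# Four toy Gaussian centers: three close together, one far away.
centers = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.3, 0.0), (5.0, 5.0, 5.0)]
states = [[1.0], [3.0], [5.0], [7.0]]   # one-dimensional latent states

nbrs = knn_indices(centers, k=2)
mixed = mean_over_neighbors(states, nbrs)
```

A practical implementation would use a spatial index or a GPU kNN routine rather than the quadratic loop shown here.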
Checkpoint Buffer

Diverse optimization states

Stores intermediate scene states from previous meta-iterations. Exposes the optimizer to states from large early gradients through fine late-stage refinements without extending the computational graph.
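A minimal sketch of such a buffer follows. The capacity, the oldest-first eviction policy, the probability `p_fresh` of starting from a fresh initialization, and the dictionary layout of a checkpoint are all assumptions made for illustration; the source does not specify them.

```python
import copy
import random


class CheckpointBuffer:
    """Toy buffer of partially optimized scene states (illustrative only)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity          # hypothetical size limit
        self.items = []
        self.rng = random.Random(seed)

    def push(self, checkpoint):
        # Store a deep copy so later edits to the live scene do not alter the
        # buffered state (in an autodiff framework: detach before storing).
        self.items.append(copy.deepcopy(checkpoint))
        if len(self.items) > self.capacity:
            self.items.pop(0)             # assumed policy: evict the oldest

    def sample(self, fresh_init, p_fresh=0.5):
        """Return a fresh t = 0 state with probability p_fresh (or when the
        buffer is empty); otherwise pop a random buffered state."""
        if not self.items or self.rng.random() < p_fresh:
            return fresh_init()
        return self.items.pop(self.rng.randrange(len(self.items)))


def fresh_init():
    # Placeholder for SfM or feed-forward initialization at t = 0.
    return {"step": 0, "gaussians": [0.0, 0.0], "optimizer_state": [0.0, 0.0]}


buffer = CheckpointBuffer(capacity=2)
for step in (10, 20, 30):
    buffer.push({"step": step, "gaussians": [1.0, 2.0], "optimizer_state": [0.1, 0.2]})
```

Because each stored state is a plain copy, resuming from it starts a new computational graph, which is how late-stage states can be visited without backpropagating through the steps that produced them.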

Optimizer Rollout

Learning from rollouts

Scenes are further optimized with a frozen snapshot before buffering. Rollout horizon grows from 1→50 steps over the first 10k meta-iterations, teaching the optimizer to recover from its own mistakes.
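The growth of the rollout horizon could be scheduled as below. The source gives only the endpoints (1 and 50 steps) and the duration (10k meta-iterations); the linear ramp, the function names, and the toy step function are assumptions.

```python
def rollout_horizon(meta_iter, start=1, end=50, warmup=10_000):
    """Number of frozen rollout steps at a given meta-iteration.
    Assumes a linear ramp from `start` to `end` over `warmup` iterations."""
    if meta_iter >= warmup:
        return end
    frac = meta_iter / warmup
    return start + int(frac * (end - start))


def frozen_rollout(state, step_fn, n_steps):
    """Apply a frozen optimizer snapshot n_steps times. No meta-gradients are
    tracked here; the result is what would be pushed to the buffer."""
    for _ in range(n_steps):
        state = step_fn(state)
    return state


# Toy usage: the "optimizer" halves a scalar error at every step.
halve = lambda x: 0.5 * x
horizons = [rollout_horizon(i) for i in (0, 5_000, 10_000, 20_000)]
after_three = frozen_rollout(8.0, halve, 3)
```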

State Scale MLP

Gradient scale encoding

Predicts per-Gaussian scaling coefficients from Adam-normalized gradients, restoring magnitude information suppressed by transformer normalization so updates decay naturally as the loss decreases.
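The effect can be illustrated with a toy example: layer normalization maps a latent vector and a 10x larger copy of it to (almost) the same output, so magnitude is lost, and an element-wise scaling coefficient puts a magnitude back. The one-neuron `state_scale` function with fixed weights is a hypothetical stand-in for the learned State Scale MLP; its input features, weights, and softplus output are assumptions.

```python
import math


def layer_norm(x, eps=1e-5):
    """Standard layer normalization without learned affine parameters."""
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]


def state_scale(norm_grad, w=2.0, b=-1.0):
    """Hypothetical stand-in for the State Scale MLP: a single neuron with
    arbitrary fixed weights, softplus(w * mean|g| + b) > 0."""
    feat = sum(abs(g) for g in norm_grad) / len(norm_grad)
    return math.log1p(math.exp(w * feat + b))


latent = [0.2, -0.4, 0.9, 0.1]
normed = layer_norm(latent)
normed_x10 = layer_norm([10.0 * v for v in latent])   # magnitude is erased

# Element-wise scaling of the normalized state (the ⊙ in the architecture).
early_scale = state_scale([0.9, -1.0, 0.8])      # large normalized gradients
late_scale = state_scale([0.05, -0.02, 0.01])    # small normalized gradients
early_state = [early_scale * s for s in normed]
late_state = [late_scale * s for s in normed]
```

With these placeholder weights, the scaled state is larger early in optimization than late, which is the kind of behavior that lets predicted updates shrink as the loss decreases.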

Zero-shot Generalization

Unseen datasets & resolutions

Trained on low-resolution DL3DV scenes, Learn2Splat generalizes zero-shot to new datasets and resolutions. In the sparse setting, it transfers to RealEstate10K, and in the dense setting to DTU, LLFF, and Mip-NeRF360. Cross-setting application remains stable but is suboptimal.

Training Configurations
Learn2SplatSparse: Sparse Setting

ReSplat Initialization

Trained on DL3DV in a sparse-view, forward-facing setup using ReSplat feed-forward initialization. Uses a fixed set of 8 context views at low resolution (256×448). Latent states are initialized from the FFN output. Zero-shot generalizes to RealEstate10K and higher resolutions.

  • Init: ReSplat FFN (57K primitives at train res.)
  • Views: 8 fixed context + 6 target views
  • Resolution: 256×448
  • Zero-shot to: RealEstate10K, higher resolutions
Learn2SplatDense: Dense Setting

SfM Initialization

Trained on DL3DV in a dense-view, large-baseline setup using SfM point cloud initialization with random latent states. Samples 8 views per iteration via furthest-point sampling. Data-augmented with 10–100% of initial SfM points. Zero-shot generalizes to DTU, LLFF, and Mip-NeRF360.

  • Init: SfM points (random latent states)
  • Views: 8 sampled via furthest-point + 6 target views
  • Resolution: 256×448
  • Zero-shot to: DTU, LLFF, Mip-NeRF360, higher resolutions
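The two sampling steps of the dense configuration can be sketched as follows. This is a minimal illustration, assuming that furthest-point sampling operates on Euclidean distances between camera positions and that the kept fraction of SfM points is drawn uniformly from 10–100%; neither detail is stated in the source, and all names are hypothetical.

```python
import math
import random


def furthest_point_sample(positions, k, start=0):
    """Greedy furthest-point sampling: repeatedly pick the position that is
    furthest from everything chosen so far. Returns the chosen indices."""
    k = min(k, len(positions))
    chosen = [start]
    min_dist = [math.dist(p, positions[start]) for p in positions]
    while len(chosen) < k:
        nxt = max(range(len(positions)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        min_dist = [min(d, math.dist(p, positions[nxt]))
                    for d, p in zip(min_dist, positions)]
    return chosen


def subsample_points(points, rng, lo=0.1, hi=1.0):
    """Keep a random fraction of the points (assumed uniform in [lo, hi])."""
    frac = rng.uniform(lo, hi)
    n = max(1, round(frac * len(points)))
    return rng.sample(points, n)


# Toy camera positions along a line, and 100 toy SfM points.
cameras = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
           (10.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
views = furthest_point_sample(cameras, k=3)

rng = random.Random(0)
sfm_points = [(float(i), 0.0, 0.0) for i in range(100)]
kept = subsample_points(sfm_points, rng)
```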

Results & Comparisons

DL3DV scene optimization in the sparse setting

All methods are initialized from ReSplat at t = 0 and optimize for up to 2000 steps with 16 views at 512×960. Learn2Splat achieves higher PSNR in early iterations while remaining stable throughout.
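For reference, PSNR is computed from the mean squared error between a rendered view and its ground-truth image. The sketch below uses the standard definition on flat lists of pixel values in [0, 1]; the function name and the toy data are illustrative and not the project's evaluation code.

```python
import math


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return math.inf   # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


rendered = [0.5, 0.5]
reference = [0.6, 0.4]
value = psnr(rendered, reference)   # MSE = 0.01, so about 20 dB
```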


Citation

BibTeX

If you find this work useful, please cite:

@inproceedings{learn2splat2026,
  title     = {Learn2Splat: Extending the Horizon of Learned {3DGS} Optimization},
  author    = {Author One and Author Two and Author Three and Author Four},
  booktitle = {arxiv:***}, 
  year      = {2026}
}