SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction

Dec 27, 2023

3D imaging and machine learning algorithms are quickly improving their ability to capture the unique characteristics of physical objects. The digital twin industry stands at the cusp of a significant transformation with a novel method to extract a highly detailed mesh from any 3D Gaussian Splatting scene within minutes on a single GPU.

Enter SuGaR (Surface-Aligned Gaussian Splatting), a method that promises to impact 3D graphics and rendering by offering precise, fast mesh extraction from 3D Gaussian Splatting scenes.

SuGaR's extracted meshes can be imported into traditional 3D programs for further composition and animation, allowing for editing, sculpting, rigging, animating, or relighting. This integration into conventional 3D workflows underscores the flexibility and practical utility of the SuGaR pipeline.

%[https://youtu.be/QkAGvAHfctI]

Understanding SuGaR

Gaussian Splatting has recently become very popular because it yields realistic rendering while being significantly faster to train than NeRFs. It is, however, challenging to extract a mesh from the millions of tiny 3D Gaussians, because these Gaussians tend to be unorganized after optimization. Building on Gaussian Splatting, SuGaR allows for the creation of highly detailed, editable meshes, a feat that eludes many traditional methods.

Technical Innovations

The genius of SuGaR lies in its approach to mesh extraction. First, it derives a volume density from Gaussians that, thanks to its regularization, can be assumed to be flat and well distributed over the scene surface. Then, using a newly introduced method that very efficiently samples points on the visible part of a level set of this density function, it runs a Poisson reconstruction algorithm on the sampled points to produce a detailed triangle mesh.
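To make the density idea concrete, here is a minimal pure-Python sketch. It assumes that the density at a point is the opacity-weighted sum of the Gaussian values there, and that for a flat, opaque Gaussian only the offset along its thinnest axis (roughly the surface normal) matters. The dictionary layout and the function names are illustrative and are not taken from the SuGaR codebase.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def density(p, gaussians):
    """Opacity-weighted sum of Gaussian values at point p."""
    total = 0.0
    for g in gaussians:
        diff = [pi - mi for pi, mi in zip(p, g["mean"])]
        # Squared Mahalanobis distance, computed from the Gaussian's
        # orthonormal axes and the scale along each axis.
        maha = sum((dot(diff, axis) / s) ** 2
                   for axis, s in zip(g["axes"], g["scales"]))
        total += g["alpha"] * math.exp(-0.5 * maha)
    return total

def flat_density(p, g):
    """Idealized density of a single flat, opaque Gaussian: only the
    offset along its thinnest axis contributes."""
    diff = [pi - mi for pi, mi in zip(p, g["mean"])]
    i = min(range(3), key=lambda k: g["scales"][k])  # thinnest axis
    return math.exp(-0.5 * (dot(diff, g["axes"][i]) / g["scales"][i]) ** 2)

# A disk-like Gaussian lying in the xy-plane (very thin along z).
disk = {
    "mean": (0.0, 0.0, 0.0),
    "axes": [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    "scales": (1.0, 1.0, 0.01),
    "alpha": 1.0,
}
on_surface = (0.1, 0.0, 0.0)    # slides along the disk
off_surface = (0.0, 0.0, 0.02)  # leaves the disk along its normal

print(density(on_surface, [disk]), flat_density(on_surface, disk))
print(density(off_surface, [disk]), flat_density(off_surface, disk))
```

When the Gaussians really are flat and aligned with the surface, the two functions nearly agree, so the gap between them is one way to picture what an alignment regularizer has to shrink.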

The SuGaR pipeline addresses the complexities and limitations of previous approaches such as Neural Radiance Fields (NeRFs) and the Marching Cubes algorithm.

The key contributions of the SuGaR project include:

Aligning the Gaussians with the Surface: Introducing a regularization term that encourages the alignment of 3D Gaussians with the surface of the scene. It encourages the Gaussians to be well distributed over the scene surface, ensuring a more accurate representation of the scene geometry.

![Extracting a mesh from Gaussians. Without regularization, the Gaussians have no special arrangement after optimization, which makes extracting a mesh very difficult. Without our regularization term, Marching Cubes fail to extract an acceptable mesh. With our regularization term, Marching Cubes recover an extremely noisy mesh even with a very fine 3D grid. Our scalable extraction method obtains a mesh even without our regularization term. Still, the mesh is noisy. By contrast, our full method succeeds in reconstructing an accurate mesh very efficiently.](https://cdn.hashnode.com/res/hashnode/image/upload/v1703636836201/bb47850d-baee-4484-99f8-197ee88d3e3e.png align="center")

Efficient Mesh Extraction: Utilizing the alignment of 3D Gaussians to sample points on the real surface of the scene and extracting a mesh using Poisson reconstruction, a method that is fast, scalable, and detail-preserving.

![Examples of (a) renderings and (b) reconstructed meshes with SuGaR. The (c) normal maps help visualize the geometry](https://cdn.hashnode.com/res/hashnode/image/upload/v1703636991059/2715c49b-391d-4482-995f-45971c8509d1.png align="center")
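Sampling points on a level set can be illustrated along a single ray. The sketch below is a simplified illustration, not SuGaR's implementation: it assumes a generic `density` callable, evenly spaced samples along the ray, and linear interpolation between the two samples that bracket the chosen level. All names are hypothetical.

```python
import math

def first_level_crossing(density, origin, direction, level, t_max, n_samples=64):
    """Return the first point along the ray where the density rises
    through `level`, or None if no crossing is found in [0, t_max]."""
    def point_at(t):
        return tuple(o + t * d for o, d in zip(origin, direction))

    step = t_max / (n_samples - 1)
    t_prev, d_prev = 0.0, density(point_at(0.0))
    for i in range(1, n_samples):
        t_cur = i * step
        d_cur = density(point_at(t_cur))
        # The density increases through the level as the ray reaches the surface.
        if d_prev < level <= d_cur:
            w = (level - d_prev) / (d_cur - d_prev)  # linear interpolation weight
            return point_at(t_prev + w * (t_cur - t_prev))
        t_prev, d_prev = t_cur, d_cur
    return None

# Toy density: a thin slab of material centered on the plane z = 1.
def slab(p, sigma=0.05):
    return math.exp(-0.5 * ((p[2] - 1.0) / sigma) ** 2)

hit = first_level_crossing(slab, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                           level=0.3, t_max=2.0, n_samples=201)
print(hit)
```

Repeating this for many rays cast from the training cameras would yield a point cloud lying on the visible side of the level set, which is the kind of input a Poisson reconstruction step consumes.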

Refinement Strategy: An optional refinement strategy binds new Gaussians to the triangles of the mesh and jointly optimizes these Gaussians and the mesh using the Gaussian Splatting rasterizer. The time spent on refinement can be configured as "short" (2k iterations), "medium" (7k iterations), or "long" (15k iterations).

![Joint refinement of mesh and Gaussians. Left: We bind Gaussians to the triangles of the mesh. Depending on the number of triangles in the scene, we bind a different number of Gaussians per triangle, with predefined barycentric coordinates. Right: Mesh before and after joint refinement.](https://cdn.hashnode.com/res/hashnode/image/upload/v1703634170575/74fe99d4-83c7-4e37-956a-83d0d9542809.png align="center")
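Binding a Gaussian to a triangle with predefined barycentric coordinates means that its center is a fixed weighted average of the triangle's vertices, so moving the vertices moves the Gaussian with them. A minimal sketch of that relationship follows; the function name and the weights are illustrative and are not taken from the SuGaR code.

```python
def bound_center(triangle, bary):
    """Center of a Gaussian bound to `triangle` (three 3D vertices)
    with barycentric weights `bary` (three weights summing to 1)."""
    if abs(sum(bary) - 1.0) > 1e-9:
        raise ValueError("barycentric weights must sum to 1")
    return tuple(sum(w * v[k] for w, v in zip(bary, triangle)) for k in range(3))

triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
centroid_weights = (1 / 3, 1 / 3, 1 / 3)
print(bound_center(triangle, centroid_weights))

# Editing the mesh: lift one vertex and the bound Gaussian follows.
edited = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 3.0)]
print(bound_center(edited, centroid_weights))
```

This is why editing, rigging, or animating the mesh in a traditional tool is enough to carry the Gaussians along.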

Improved Efficiency and Quality: The method enables quick retrieval of an editable mesh for realistic rendering, with significant improvements in rendering quality and speed compared to state-of-the-art methods on neural SDFs.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1703637204877/0b646205-3f1d-49ad-9ffa-8ef93df8a988.gif align="center")

![](https://github.com/Anttwo/SuGaR/raw/main/media/examples/walk.gif align="center")

This pipeline enables high-quality rendering of the mesh using Gaussian splatting rendering rather than traditional textured mesh rendering. This allows for easy editing, sculpting, rigging, animating, or relighting of the Gaussians using traditional software like Blender, Unity, Unreal Engine, etc. by manipulating the mesh instead of the Gaussians themselves.

Requirements

Before starting the main pipeline steps of the SuGaR project, a vanilla 3D Gaussian Splatting model optimized for 7,000 iterations must be provided. This pre-optimized model serves as the foundation for the subsequent steps in the SuGaR pipeline.

You can obtain a pre-optimized model by setting up the Gaussian Splatting environment on a suitable machine and running the provided training script on a dataset in a supported format (such as a COLMAP scene or NeRF Synthetic).
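As a sketch, the command for that training run can be assembled in Python. The `gaussian_splatting/train.py` script and its `-s`, `-m`, and `--iterations` options are those of the Gaussian Splatting implementation bundled with SuGaR as we understand it; check the repository README for the current form. The function name and the example paths are placeholders.

```python
import shlex

def vanilla_3dgs_command(dataset_dir, output_dir, iterations=7000):
    """Build (but do not run) the vanilla Gaussian Splatting training command."""
    return [
        "python", "gaussian_splatting/train.py",
        "-s", dataset_dir,                # path to the input dataset
        "--iterations", str(iterations),  # 7,000 iterations, as SuGaR expects
        "-m", output_dir,                 # where the optimized model is written
    ]

cmd = vanilla_3dgs_command("data/my_scene", "output/vanilla_gs/my_scene")
print(shlex.join(cmd))
```

To actually launch the run, the list can be passed to `subprocess.run(cmd, check=True)` from the root of the SuGaR checkout.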

DIY: Tips for using SuGaR on your own data and obtaining better reconstructions

The SuGaR repository provides several tips aimed at helping users obtain the best possible results when using SuGaR for 3D mesh reconstruction from their own captured data.

Capturing Images or Videos: Use a smartphone or camera to capture images or a video covering the entire surface of the 3D scene you wish to reconstruct.
- Move around the scene slowly and smoothly to avoid motion blur.
- Maintain a uniform focal length and constant exposure time for consistent reconstruction and easier camera pose estimation with COLMAP.
- Disable auto-focus to ensure a constant focal length.

Estimating Camera Poses with COLMAP: Install a recent version of COLMAP, ideally CUDA-powered. Put the images in an input subdirectory of your scene directory (<location>/input), then run the gaussian_splatting/convert.py script from the Gaussian Splatting implementation on that scene directory to compute camera poses from the images using COLMAP.

Choosing Regularization Method: Decide between density regularization and SDF regularization based on your scene. Density regularization works well for objects centered in the scene, while SDF regularization provides stronger regularization, especially in background regions.

Adapting Scale and Bounding Box: If you're reconstructing a large scene, you may need to adjust the scale and bounding box. By default, the bounding box is computed as the bounding box of all camera centers, but you can provide a custom bounding box to the train.py script using the --bboxmin and --bboxmax parameters.
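The last three tips map onto command-line options, which the sketch below assembles in Python without running anything. The `-s`, `-c`, `-r`, `--refinement_time`, `--bboxmin`, and `--bboxmax` options reflect our reading of the SuGaR README at the time of writing and should be verified against the repository. The `"(x,y,z)"` string format for the bounding box corners is an assumption, and the helper names and paths are placeholders.

```python
def colmap_convert_command(location):
    """Pose estimation step; expects the images in <location>/input."""
    return ["python", "gaussian_splatting/convert.py", "-s", location]

def sugar_train_command(scene_dir, checkpoint_dir, regularization="density",
                        refinement_time=None, bboxmin=None, bboxmax=None):
    """Build (but do not run) a SuGaR train.py command."""
    if regularization not in ("density", "sdf"):
        raise ValueError("regularization must be 'density' or 'sdf'")
    cmd = ["python", "train.py",
           "-s", scene_dir,        # COLMAP scene
           "-c", checkpoint_dir,   # vanilla Gaussian Splatting checkpoint
           "-r", regularization]
    if refinement_time is not None:
        if refinement_time not in ("short", "medium", "long"):
            raise ValueError("refinement_time must be short, medium or long")
        cmd += ["--refinement_time", refinement_time]
    # Assumed format: each corner is passed as a "(x,y,z)" string.
    for flag, corner in (("--bboxmin", bboxmin), ("--bboxmax", bboxmax)):
        if corner is not None:
            cmd += [flag, "(" + ",".join(str(c) for c in corner) + ")"]
    return cmd

print(" ".join(colmap_convert_command("data/my_scene")))
print(" ".join(sugar_train_command(
    "data/my_scene", "output/vanilla_gs/my_scene",
    regularization="sdf", refinement_time="short",
    bboxmin=(-2, -2, -1), bboxmax=(2, 2, 3))))
```

Building the argument list in one place makes it easy to compare density and SDF regularization on the same scene by changing a single parameter.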

Implications for the Digital Twin Industry

The digital twin industry, which thrives on creating virtual replicas of physical entities, stands to gain immensely from SuGaR. The ability to quickly generate detailed and editable 3D models can accelerate the development of digital twins, leading to more accurate simulations and analyses. Retrieving an editable mesh for realistic rendering is done within minutes with SuGaR, compared to hours with the state-of-the-art method on neural SDFs, while providing a better rendering quality in terms of PSNR, SSIM and LPIPS.
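Of these metrics, PSNR is the simplest to compute by hand: it is a log-scaled inverse of the mean squared error between a rendered image and the ground-truth image. The sketch below uses flat lists of pixel values in [0, 1] purely for illustration; SSIM and LPIPS need more machinery and are omitted.

```python
import math

def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

reference = [0.0, 0.5, 1.0, 0.25]
rendered = [0.1, 0.5, 0.9, 0.25]
print(round(psnr(rendered, reference), 2))
```

Higher PSNR means the rendering is closer to the reference, so a method that reports better PSNR at a fraction of the optimization time is improving on both axes at once.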

![This image illustrates the workflow of the SuGaR pipeline for 3D mesh reconstruction and composition using traditional 3D tools. On the left, there are two sets of input data, each showing a set of images or a video of a scene with different objects—a robotic action figure and a toy bulldozer. In the middle, these inputs are processed by the SuGaR pipeline, resulting in a hybrid representation of each object, composed of a mesh overlaid with 3D Gaussians, which are visual approximations for rendering. On the right, the images show the results of further processing. The action figure and the bulldozer have been enhanced with traditional color textures for the mesh. These textures give the objects a more lifelike appearance, closer to the original input images. Additionally, the image in the top right corner shows a 3D editing software interface (like Blender), suggesting that the reconstructed objects can be imported into such a program for further composition and animation, allowing for editing, sculpting, rigging, animating, or relighting with traditional software tools. This integration into conventional 3D workflows underscores the flexibility and practical utility of the SuGaR pipeline.](https://cdn.hashnode.com/res/hashnode/image/upload/v1703632190289/11c35040-d11a-4627-8e40-b4b9b0e0c00a.png align="center")

![In this image, the workflow is similar to the previous one, focusing on the SuGaR pipeline's process for mesh reconstruction and subsequent editing in 3D software. The input data on the left shows images or videos of different scenes: an action figure in one and a child's playroom in another. These are processed by SuGaR to create a hybrid representation consisting of a mesh combined with 3D Gaussians. The second set of images showcases the outcome after further processing, where the objects and the room have traditional color textures applied to the mesh, giving a realistic look to the 3D models. Finally, the image in the top right corner displays a 3D editing software interface (like Blender), indicating that these reconstructed models can be imported into such programs for additional composition and animation, leveraging traditional 3D modeling and animation tools. The workflow exemplifies the integration from raw image data to a fully textured and editable 3D model.](https://cdn.hashnode.com/res/hashnode/image/upload/v1703632122341/16c81833-866a-4e4a-956b-b58d4e929f35.png align="center")

Evaluation and Analysis

While SuGaR represents a significant advancement, its scalability and performance across diverse scenarios remain to be demonstrated. Its reliance on Gaussian alignment and Poisson reconstruction, though innovative, may present challenges in extremely complex environments.

Gaussian Splatting representations of real scenes typically end up with one to several million 3D Gaussians with different scales and rotations, the majority of them being extremely small in order to reproduce texture and details in the scene.

2,000 iterations are usually enough to obtain high-quality rendering, since the extracted mesh “textured” with surface Gaussians is already an excellent initialization for optimizing the model. However, further refinement helps the Gaussians capture texturing details and reconstruct extremely thin geometry that is finer than the resolution of the mesh, such as the spokes of the bicycle.

![Refined SuGaR renderings with different numbers of refinement iterations. ](https://cdn.hashnode.com/res/hashnode/image/upload/v1703633875570/d609af26-2da3-4534-9617-f75f97a161a0.png align="center")

SuGaR extracts a mesh in 30 to 35 minutes on average on a single GPU. After mesh extraction, refinement can take from a few minutes up to an hour. A short refinement time is enough to produce a good-looking hybrid representation in most cases, but longer refinement may be necessary in complex environments.

The optimization time may vary (from 20 to 45 minutes) depending on the complexity of the scene and the GPU used. Moreover, the current implementation splits the optimization into 3 scripts that can be run separately (SuGaR optimization, mesh extraction, model refinement), so the data is reloaded at each step, which is not optimal and takes several minutes. The authors plan to optimize this in the near future.

Hopeful Outlook

As SuGaR matures, its integration into mainstream 3D modeling and rendering workflows could revolutionize how we interact with digital twins. It opens a door to recreating real-world scenes into detailed virtual environments, which could be pivotal in fields like urban planning, healthcare, and manufacturing.

The latest updates on the SuGaR GitHub page as of December 2023 are:

December 20, 2023: A short notebook was added to demonstrate how to render images with the hybrid representation using the Gaussian Splatting rasterizer.
December 18, 2023: The initial release of the code.

Additionally, the to-do list includes:

Adapting the code for compatibility with Windows, as the current code is not compatible due to path-writing conventions.
Adding the capability to use the NeRF synthetic dataset, which differs in format from COLMAP scenes.
Completing and cleaning the code for composition and animation, and incorporating it into the sugar_scene/sugar_compositor.py script.
Creating a tutorial on using the scripts in the blender directory and the sugar_scene/sugar_compositor.py class for importing composition and animation data into PyTorch and applying it to the SuGaR hybrid representation.

Conclusion

SuGaR sweetens the prospects for the digital twin industry, offering a blend of precision, efficiency, and versatility. Its role in the future of AI-driven 3D graphics is not just promising but potentially game-changing. As we embrace these advancements, it's crucial to continue critically evaluating and contributing to the evolution of such technologies.