3D Gaussian Splatting Code Analysis
3D Gaussian Splatting for Real-Time Radiance Field Rendering
Code Analysis
Entry-point scripts
- `convert.py`

  ```
  python convert.py -s <location> [--resize] # If not resizing, ImageMagick is not needed
  ```

  - Runs COLMAP on the input images
  - Produces downscaled copies of the images at 1/2, 1/4, and 1/8 resolution
- `train.py`

  ```
  python train.py -s <path to COLMAP or NeRF Synthetic dataset>
  ```
  - Initialization
    - Create the Gaussian model

      ```python
      gaussians = GaussianModel(dataset.sh_degree, opt.optimizer_type)
      ```

    - Create the scene

      ```python
      scene = Scene(dataset, gaussians)
      ```

    - Exponential decay scheduling via `get_expon_lr_func` (the same helper that schedules the position learning rate); here it schedules the depth L1 loss weight

      ```python
      depth_l1_weight = get_expon_lr_func(opt.depth_l1_weight_init, opt.depth_l1_weight_final, max_steps=opt.iterations)
      ```
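      At its core, `get_expon_lr_func` is a log-linear interpolation between an initial and a final value. A simplified sketch (the actual helper in `utils/general_utils.py` also supports a delayed warm-up, which this omits):

      ```python
      import numpy as np

      def expon_lr(step, lr_init, lr_final, max_steps):
          # Interpolate linearly in log space, so the value decays exponentially.
          t = np.clip(step / max_steps, 0.0, 1.0)
          return float(np.exp(np.log(lr_init) * (1.0 - t) + np.log(lr_final) * t))

      print(expon_lr(15000, 1.0, 0.01, 30000))  # 0.1, the geometric mean of the endpoints
      ```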
  - Training
    - Trains for `opt.iterations` (30,000) iterations
    - Every 1,000 iterations, the SH degree is raised by one (up to a maximum of 3)
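      The SH schedule is a small piece of the training loop; roughly (condensed from train.py and gaussian_model.py):

      ```python
      # train.py: every 1000 iterations, unlock one more SH band
      if iteration % 1000 == 0:
          gaussians.oneupSHdegree()

      # gaussian_model.py: raise the active degree until it reaches the maximum
      def oneupSHdegree(self):
          if self.active_sh_degree < self.max_sh_degree:
              self.active_sh_degree += 1
      ```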
    - Randomly pick a camera viewpoint

      ```python
      rand_idx = randint(0, len(viewpoint_indices) - 1)
      viewpoint_cam = viewpoint_stack.pop(rand_idx)
      vind = viewpoint_indices.pop(rand_idx)
      ```
    - Render the Gaussians
      - `image`: the rendered image
      - `viewspace_point_tensor`: the Gaussian centers projected into 2D image space
      - `visibility_filter`: filters for Gaussians whose radii are greater than 0
      - `radii`: screen-space radius of each projected Gaussian

      ```python
      render_pkg = render(viewpoint_cam, gaussians, pipe, bg, use_trained_exp=dataset.train_test_exp, separate_sh=SPARSE_ADAM_AVAILABLE)
      image, viewspace_point_tensor, visibility_filter, radii = render_pkg["render"], render_pkg["viewspace_points"], render_pkg["visibility_filter"], render_pkg["radii"]
      ```
    - Loss computation

      ```python
      loss = (1.0 - opt.lambda_dssim) * Ll1 + opt.lambda_dssim * (1.0 - ssim_value)
      ```
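      For context, the inputs to this line come from the photometric comparison against the ground-truth view, roughly (condensed from train.py; `l1_loss` and `ssim` live in `utils/loss_utils.py`, and newer revisions may substitute a fused SSIM kernel):

      ```python
      gt_image = viewpoint_cam.original_image.cuda()
      Ll1 = l1_loss(image, gt_image)          # mean absolute error
      ssim_value = ssim(image, gt_image)      # structural similarity, in [0, 1]
      loss = (1.0 - opt.lambda_dssim) * Ll1 + opt.lambda_dssim * (1.0 - ssim_value)
      ```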
    - Depth regularization

      ```python
      if depth_l1_weight(iteration) > 0 and viewpoint_cam.depth_reliable:
          invDepth = render_pkg["depth"]
          mono_invdepth = viewpoint_cam.invdepthmap.cuda()
          depth_mask = viewpoint_cam.depth_mask.cuda()

          Ll1depth_pure = torch.abs((invDepth - mono_invdepth) * depth_mask).mean()
          Ll1depth = depth_l1_weight(iteration) * Ll1depth_pure
          loss += Ll1depth
          Ll1depth = Ll1depth.item()
      else:
          Ll1depth = 0
      ```
    - Densification
      - From `opt.densify_from_iter` (500) until `opt.densify_until_iter` (15,000), densify & prune runs every `opt.densification_interval` (100) iterations
      - To keep the number of Gaussians in check, the alpha (opacity) is reset to a small value every `opt.opacity_reset_interval` (3,000) iterations (see the sketch after the code block)

      ```python
      # Keep track of max radii in image-space for pruning
      gaussians.max_radii2D[visibility_filter] = torch.max(gaussians.max_radii2D[visibility_filter], radii[visibility_filter])
      gaussians.add_densification_stats(viewspace_point_tensor, visibility_filter)

      if iteration > opt.densify_from_iter and iteration % opt.densification_interval == 0:
          size_threshold = 20 if iteration > opt.opacity_reset_interval else None
          gaussians.densify_and_prune(opt.densify_grad_threshold, 0.005, scene.cameras_extent, size_threshold, radii)

      if iteration % opt.opacity_reset_interval == 0 or (dataset.white_background and iteration == opt.densify_from_iter):
          gaussians.reset_opacity()
      ```
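      Note that the "reset" clamps the opacity to a small value rather than literally zeroing it; roughly (condensed from `reset_opacity` in gaussian_model.py):

      ```python
      def reset_opacity(self):
          # Clamp every opacity to at most 0.01 (stored through the inverse sigmoid),
          # then swap the clamped tensor into the optimizer state
          opacities_new = self.inverse_opacity_activation(
              torch.min(self.get_opacity, torch.ones_like(self.get_opacity) * 0.01))
          optimizable_tensors = self.replace_tensor_to_optimizer(opacities_new, "opacity")
          self._opacity = optimizable_tensors["opacity"]
      ```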
- `render.py`
  - Renders the train/test sets

  ```
  python render.py -m <path to trained model>
  ```
- `metrics.py`
  - Computes SSIM / PSNR / LPIPS

  ```
  python metrics.py -m <path to trained model>
  ```
- `full_eval.py`
  - Runs the full evaluation pipeline (training, rendering, and metrics; individual stages can be skipped via flags)

  ```
  python full_eval.py -m <directory with evaluation images>/garden ... --skip_training --skip_rendering
  ```
scene
- `__init__.py`
  - Defines the Scene class

    ```python
    class Scene:
        gaussians : GaussianModel

        def __init__(self, args : ModelParams, gaussians : GaussianModel, load_iteration=None, shuffle=True, resolution_scales=[1.0]):
            self.model_path = args.model_path
            self.loaded_iter = None
            self.gaussians = gaussians
    ```
  - load iteration
    - For rendering, looks up the iteration count of an already-trained model
    - Defaults to `None`

    ```python
    if load_iteration:
        if load_iteration == -1: # render
            self.loaded_iter = searchForMaxIteration(os.path.join(self.model_path, "point_cloud"))
        else: # train
            self.loaded_iter = load_iteration
        print("Loading trained model at iteration {}".format(self.loaded_iter))
    ```
  - Scene setup (Colmap/Blender)
    - Loads the camera parameters, depths, and point cloud via `sceneLoadTypeCallbacks` and builds `scene_info`

    ```python
    if os.path.exists(os.path.join(args.source_path, "sparse")):
        scene_info = sceneLoadTypeCallbacks["Colmap"](args.source_path, args.images, args.depths, args.eval, args.train_test_exp)
    elif os.path.exists(os.path.join(args.source_path, "transforms_train.json")):
        print("Found transforms_train.json file, assuming Blender data set!")
        scene_info = sceneLoadTypeCallbacks["Blender"](args.source_path, args.white_background, args.depths, args.eval)
    else:
        assert False, "Could not recognize scene type!"
    ```
  - Camera list
    - When `self.loaded_iter` is `None` (i.e. training), copies the input PLY into the model directory and serializes the train/test camera list to `cameras.json`

    ```python
    if not self.loaded_iter:
        with open(scene_info.ply_path, 'rb') as src_file, open(os.path.join(self.model_path, "input.ply"), 'wb') as dest_file:
            dest_file.write(src_file.read())
        json_cams = []
        camlist = []
        if scene_info.test_cameras:
            camlist.extend(scene_info.test_cameras)
        if scene_info.train_cameras:
            camlist.extend(scene_info.train_cameras)
        for id, cam in enumerate(camlist):
            json_cams.append(camera_to_JSON(id, cam))
        with open(os.path.join(self.model_path, "cameras.json"), 'w') as file:
            json.dump(json_cams, file)
    ```
  - Gaussians
    - For rendering, loads the PLY and rebuilds the Gaussians (`load_ply`)
    - For training, initializes the Gaussians from the sparse COLMAP point cloud (`create_from_pcd`)

    ```python
    if self.loaded_iter:
        self.gaussians.load_ply(os.path.join(self.model_path,
                                             "point_cloud",
                                             "iteration_" + str(self.loaded_iter),
                                             "point_cloud.ply"), args.train_test_exp)
    else:
        self.gaussians.create_from_pcd(scene_info.point_cloud, scene_info.train_cameras, self.cameras_extent)
    ```
- `cameras.py`
  - Defines the Camera class

    ```python
    class Camera(nn.Module):
        def __init__(self, resolution, colmap_id, R, T, FoVx, FoVy, depth_params, image, invdepthmap,
                     image_name, uid,
                     trans=np.array([0.0, 0.0, 0.0]), scale=1.0, data_device="cuda",
                     train_test_exp=False, is_test_dataset=False, is_test_view=False):
            super(Camera, self).__init__()
    ```
  - Class variables
    - uid, colmap_id, R, T, FoVx, FoVy, image_name
    - alpha_mask
    - original_image, image_width, image_height
    - invdepthmap, depth_reliable
    - zfar, znear
    - trans, scale
    - world_view_transform (built from R, T, trans, scale)
    - projection_matrix (built from znear, zfar, FoVx, FoVy)
    - full_proj_transform (view and projection combined)
    - camera_center (the camera position in world space; see the sketch below)
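    How the derived matrices are assembled, roughly (condensed from cameras.py; `getWorld2View2` and `getProjectionMatrix` live in `utils/graphics_utils.py`):

    ```python
    self.world_view_transform = torch.tensor(getWorld2View2(R, T, trans, scale)).transpose(0, 1).cuda()
    self.projection_matrix = getProjectionMatrix(znear=self.znear, zfar=self.zfar,
                                                 fovX=self.FoVx, fovY=self.FoVy).transpose(0, 1).cuda()
    # View and projection composed into a single transform for the rasterizer
    self.full_proj_transform = (self.world_view_transform.unsqueeze(0).bmm(
        self.projection_matrix.unsqueeze(0))).squeeze(0)
    # The camera position is read off the inverse view matrix
    self.camera_center = self.world_view_transform.inverse()[3, :3]
    ```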
- `colmap_loader.py`
  - Parses COLMAP outputs (bin, txt)
- `dataset_readers.py`
  - Defines the CameraInfo and SceneInfo classes
  - Loads the COLMAP cameras into CameraInfo (`readColmapCameras`)
  - Loads the full COLMAP reconstruction into SceneInfo (`readColmapSceneInfo`)
- `gaussian_model.py`
  - Defines the GaussianModel class
  - `setup_functions`
    - Defines the activation functions (e.g. `build_covariance_from_scaling_rotation`)
  - Class variables
    - SH degree (active/max)
    - optimizer / optimizer type
    - xyz (position)
    - features_dc, features_rest (SH coefficients)
    - scaling
    - rotation
    - opacity
    - max_radii2D
    - xyz_gradient_accum
    - denom
    - percent_dense
    - spatial_lr_scale
  - Class functions
    - capture/restore
    - `create_from_pcd`
      - Initializes the Gaussians from the sparse point cloud at the start of training

      ```python
      def create_from_pcd(self, pcd : BasicPointCloud, cam_infos : int, spatial_lr_scale : float):
          self.spatial_lr_scale = spatial_lr_scale
          fused_point_cloud = torch.tensor(np.asarray(pcd.points)).float().cuda()

          ### RGB -> SH -> features
          fused_color = RGB2SH(torch.tensor(np.asarray(pcd.colors)).float().cuda())
          features = torch.zeros((fused_color.shape[0], 3, (self.max_sh_degree + 1) ** 2)).float().cuda()
          features[:, :3, 0 ] = fused_color
          features[:, 3:, 1:] = 0.0

          print("Number of points at initialisation : ", fused_point_cloud.shape[0])

          ### dist: taken from the point cloud
          ### scale: average distance to the 3 nearest neighbors; the activation is exp, so log is applied here
          ### rot: initialized to the identity quaternion (w=1, x=y=z=0)
          dist2 = torch.clamp_min(distCUDA2(torch.from_numpy(np.asarray(pcd.points)).float().cuda()), 0.0000001)
          scales = torch.log(torch.sqrt(dist2))[...,None].repeat(1, 3)
          rots = torch.zeros((fused_point_cloud.shape[0], 4), device="cuda")
          rots[:, 0] = 1

          ### opacity: inverse sigmoid
          opacities = self.inverse_opacity_activation(0.1 * torch.ones((fused_point_cloud.shape[0], 1), dtype=torch.float, device="cuda"))

          ### create nn.Parameter
          self._xyz = nn.Parameter(fused_point_cloud.requires_grad_(True))
          self._features_dc = nn.Parameter(features[:,:,0:1].transpose(1, 2).contiguous().requires_grad_(True))
          self._features_rest = nn.Parameter(features[:,:,1:].transpose(1, 2).contiguous().requires_grad_(True))
          self._scaling = nn.Parameter(scales.requires_grad_(True))
          self._rotation = nn.Parameter(rots.requires_grad_(True))
          self._opacity = nn.Parameter(opacities.requires_grad_(True))
          self.max_radii2D = torch.zeros((self.get_xyz.shape[0]), device="cuda")
          self.exposure_mapping = {cam_info.image_name: idx for idx, cam_info in enumerate(cam_infos)}
          self.pretrained_exposures = None
          exposure = torch.eye(3, 4, device="cuda")[None].repeat(len(cam_infos), 1, 1)
          self._exposure = nn.Parameter(exposure.requires_grad_(True))
      ```
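      The `RGB2SH` call above stores the point color as the SH DC coefficient. The conversion only involves the constant 0th-band basis value (from `utils/sh_utils.py`):

      ```python
      C0 = 0.28209479177387814  # Y_0^0, the constant 0th SH basis function

      def RGB2SH(rgb):
          # Center RGB around 0 and divide by the basis value, so that
          # evaluating the SH at degree 0 reproduces the original color
          return (rgb - 0.5) / C0

      def SH2RGB(sh):
          return sh * C0 + 0.5
      ```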
    - `load_ply`
      - Loads a PLY file and turns it back into Gaussians (for rendering)
    - `save_ply`
      - Serializes the Gaussians to a PLY file
    - `reset_opacity`
    - `densify_and_split`, `densify_and_clone`, `densify_and_prune` (see the selection sketch after this list)
    - `prune_points`
      - Prunes points from the optimizer
      - Also prunes the other per-point buffers (xyz_gradient_accum, denom, max_radii2D, tmp_radii)
    - `densification_postfix`
      - After densification (clone, split), registers the new Gaussians with the optimizer
gaussian_renderer
- `__init__.py`
  - `render` function: rasterizes the Gaussians into a 2D image with `GaussianRasterizer` from diff_gaussian_rasterization
  - Output:

    ```python
    out = {
        "render": rendered_image,
        "viewspace_points": screenspace_points,
        "visibility_filter": (radii > 0).nonzero(),
        "radii": radii,
        "depth": depth_image
    }
    return out
    ```
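    Before rasterizing, `render` packs the camera state into the rasterizer settings; roughly (condensed from gaussian_renderer/__init__.py, with some fields omitted and exact arguments depending on the repo revision):

    ```python
    raster_settings = GaussianRasterizationSettings(
        image_height=int(viewpoint_camera.image_height),
        image_width=int(viewpoint_camera.image_width),
        tanfovx=math.tan(viewpoint_camera.FoVx * 0.5),
        tanfovy=math.tan(viewpoint_camera.FoVy * 0.5),
        bg=bg_color,
        scale_modifier=scaling_modifier,
        viewmatrix=viewpoint_camera.world_view_transform,
        projmatrix=viewpoint_camera.full_proj_transform,
        sh_degree=pc.active_sh_degree,
        campos=viewpoint_camera.camera_center,
        prefiltered=False,
        debug=pipe.debug
    )
    rasterizer = GaussianRasterizer(raster_settings=raster_settings)
    ```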
- `network_gui.py`
  - Implements the local network connection used by the real-time viewer
utils
- `camera_utils.py`
  - Functions for loading cameras
  - Builds the camera list from CameraInfo
  - Builds the dictionary used to serialize a camera to JSON
- `general_utils.py`
  - inverse sigmoid
  - PIL-to-torch conversion
  - exponential learning-rate scheduling
  - rotation helpers
- `graphics_utils.py`
  - view matrix, projection matrix
- `image_utils.py`
  - mse, psnr
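  For reference, PSNR here is computed per batch element over images normalized to [0, 1]; roughly (from `utils/image_utils.py`):

  ```python
  import torch

  def psnr(img1, img2):
      # Mean squared error per image, flattened over all pixels/channels
      mse = ((img1 - img2) ** 2).view(img1.shape[0], -1).mean(1, keepdim=True)
      # Peak signal is 1.0 since images are normalized
      return 20 * torch.log10(1.0 / torch.sqrt(mse))
  ```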
- `loss_utils.py`
  - Loss-related functions (L1 loss, SSIM)
- `make_depth_scale.py`
  - Computes the per-image scale parameters that align the monocular depth maps with the scene (relevant for large-scale environments)
- `read_write_model.py`
  - Reads/writes COLMAP model files (bin/txt)
- `sh_utils.py`
  - spherical harmonics helpers
- `system_utils.py`
  - mkdir helpers, `searchForMaxIteration`