3D Gaussian Splatting Code Analysis
3D Gaussian Splatting for Real-Time Radiance Field Rendering
Code Analysis
Entry-point scripts
- `convert.py`

  ```
  python convert.py -s <location> [--resize] # If not resizing, ImageMagick is not needed
  ```

  - Runs COLMAP on the input images
  - Produces downscaled copies of the images at 1/2, 1/4, and 1/8 resolution
- `train.py`

  ```
  python train.py -s <path to COLMAP or NeRF Synthetic dataset>
  ```
  - Initialization
    - Create the Gaussian model

      ```python
      gaussians = GaussianModel(dataset.sh_degree, opt.optimizer_type)
      ```

    - Create the scene

      ```python
      scene = Scene(dataset, gaussians)
      ```

    - Exponential decay scheduling via `get_expon_lr_func` (the same helper that schedules the position learning rate); here it schedules the depth L1 loss weight

      ```python
      depth_l1_weight = get_expon_lr_func(opt.depth_l1_weight_init, opt.depth_l1_weight_final, max_steps=opt.iterations)
      ```
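      At its core, `get_expon_lr_func` is a log-linear interpolation between an initial and a final value. A simplified sketch (the actual helper in `utils/general_utils.py` also supports a delayed warm-up, which this omits):

      ```python
      import numpy as np

      def expon_lr(step, lr_init, lr_final, max_steps):
          # Interpolate linearly in log space, so the value decays exponentially.
          t = np.clip(step / max_steps, 0.0, 1.0)
          return float(np.exp(np.log(lr_init) * (1.0 - t) + np.log(lr_final) * t))

      print(expon_lr(15000, 1.0, 0.01, 30000))  # 0.1, the geometric mean of the endpoints
      ```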
  - Training
    - Trains for `opt.iterations` (30,000) iterations
    - Every 1,000 iterations, the SH degree is raised by one (up to a maximum of 3)
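      The SH schedule is a small piece of the training loop; roughly (condensed from train.py and gaussian_model.py):

      ```python
      # train.py: every 1000 iterations, unlock one more SH band
      if iteration % 1000 == 0:
          gaussians.oneupSHdegree()

      # gaussian_model.py: raise the active degree until it reaches the maximum
      def oneupSHdegree(self):
          if self.active_sh_degree < self.max_sh_degree:
              self.active_sh_degree += 1
      ```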
    - Randomly pick a camera viewpoint

      ```python
      rand_idx = randint(0, len(viewpoint_indices) - 1)
      viewpoint_cam = viewpoint_stack.pop(rand_idx)
      vind = viewpoint_indices.pop(rand_idx)
      ```
    - Render the Gaussians
      - `image`: the rendered image
      - `viewspace_point_tensor`: the Gaussian centers projected into 2D image space
      - `visibility_filter`: filters for Gaussians whose radii are greater than 0
      - `radii`: screen-space radius of each projected Gaussian

      ```python
      render_pkg = render(viewpoint_cam, gaussians, pipe, bg, use_trained_exp=dataset.train_test_exp, separate_sh=SPARSE_ADAM_AVAILABLE)
      image, viewspace_point_tensor, visibility_filter, radii = render_pkg["render"], render_pkg["viewspace_points"], render_pkg["visibility_filter"], render_pkg["radii"]
      ```
    - Loss computation

      ```python
      loss = (1.0 - opt.lambda_dssim) * Ll1 + opt.lambda_dssim * (1.0 - ssim_value)
      ```
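      For context, the inputs to this line come from the photometric comparison against the ground-truth view, roughly (condensed from train.py; `l1_loss` and `ssim` live in `utils/loss_utils.py`, and newer revisions may substitute a fused SSIM kernel):

      ```python
      gt_image = viewpoint_cam.original_image.cuda()
      Ll1 = l1_loss(image, gt_image)          # mean absolute error
      ssim_value = ssim(image, gt_image)      # structural similarity, in [0, 1]
      loss = (1.0 - opt.lambda_dssim) * Ll1 + opt.lambda_dssim * (1.0 - ssim_value)
      ```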
    - Depth regularization

      ```python
      if depth_l1_weight(iteration) > 0 and viewpoint_cam.depth_reliable:
          invDepth = render_pkg["depth"]
          mono_invdepth = viewpoint_cam.invdepthmap.cuda()
          depth_mask = viewpoint_cam.depth_mask.cuda()

          Ll1depth_pure = torch.abs((invDepth - mono_invdepth) * depth_mask).mean()
          Ll1depth = depth_l1_weight(iteration) * Ll1depth_pure
          loss += Ll1depth
          Ll1depth = Ll1depth.item()
      else:
          Ll1depth = 0
      ```
    - Densification
      - From `opt.densify_from_iter` (500) until `opt.densify_until_iter` (15,000), densify & prune runs every `opt.densification_interval` (100) iterations
      - To keep the number of Gaussians in check, the alpha (opacity) is reset to a small value every `opt.opacity_reset_interval` (3,000) iterations (see the sketch after the code block)

      ```python
      # Keep track of max radii in image-space for pruning
      gaussians.max_radii2D[visibility_filter] = torch.max(gaussians.max_radii2D[visibility_filter], radii[visibility_filter])
      gaussians.add_densification_stats(viewspace_point_tensor, visibility_filter)

      if iteration > opt.densify_from_iter and iteration % opt.densification_interval == 0:
          size_threshold = 20 if iteration > opt.opacity_reset_interval else None
          gaussians.densify_and_prune(opt.densify_grad_threshold, 0.005, scene.cameras_extent, size_threshold, radii)

      if iteration % opt.opacity_reset_interval == 0 or (dataset.white_background and iteration == opt.densify_from_iter):
          gaussians.reset_opacity()
      ```
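      Note that the "reset" clamps the opacity to a small value rather than literally zeroing it; roughly (condensed from `reset_opacity` in gaussian_model.py):

      ```python
      def reset_opacity(self):
          # Clamp every opacity to at most 0.01 (stored through the inverse sigmoid),
          # then swap the clamped tensor into the optimizer state
          opacities_new = self.inverse_opacity_activation(
              torch.min(self.get_opacity, torch.ones_like(self.get_opacity) * 0.01))
          optimizable_tensors = self.replace_tensor_to_optimizer(opacities_new, "opacity")
          self._opacity = optimizable_tensors["opacity"]
      ```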
- `render.py`
  - Renders the train/test sets

  ```
  python render.py -m <path to trained model>
  ```
- `metrics.py`
  - Computes SSIM / PSNR / LPIPS

  ```
  python metrics.py -m <path to trained model>
  ```
- `full_eval.py`
  - Runs the full evaluation pipeline (training, rendering, and metrics; individual stages can be skipped via flags)

  ```
  python full_eval.py -m <directory with evaluation images>/garden ... --skip_training --skip_rendering
  ```
scene
- `__init__.py`
  - Defines the Scene class

    ```python
    class Scene:
        gaussians : GaussianModel

        def __init__(self, args : ModelParams, gaussians : GaussianModel, load_iteration=None, shuffle=True, resolution_scales=[1.0]):
            self.model_path = args.model_path
            self.loaded_iter = None
            self.gaussians = gaussians
    ```
  - load iteration
    - For rendering, looks up the iteration count of an already-trained model
    - Defaults to `None`

    ```python
    if load_iteration:
        if load_iteration == -1: # render
            self.loaded_iter = searchForMaxIteration(os.path.join(self.model_path, "point_cloud"))
        else: # train
            self.loaded_iter = load_iteration
        print("Loading trained model at iteration {}".format(self.loaded_iter))
    ```
  - Scene setup (Colmap/Blender)
    - Loads the camera parameters, depths, and point cloud via `sceneLoadTypeCallbacks` and builds `scene_info`

    ```python
    if os.path.exists(os.path.join(args.source_path, "sparse")):
        scene_info = sceneLoadTypeCallbacks["Colmap"](args.source_path, args.images, args.depths, args.eval, args.train_test_exp)
    elif os.path.exists(os.path.join(args.source_path, "transforms_train.json")):
        print("Found transforms_train.json file, assuming Blender data set!")
        scene_info = sceneLoadTypeCallbacks["Blender"](args.source_path, args.white_background, args.depths, args.eval)
    else:
        assert False, "Could not recognize scene type!"
    ```
  - Camera list
    - When `self.loaded_iter` is `None` (i.e. training), copies the input PLY into the model directory and serializes the train/test camera list to `cameras.json`

    ```python
    if not self.loaded_iter:
        with open(scene_info.ply_path, 'rb') as src_file, open(os.path.join(self.model_path, "input.ply"), 'wb') as dest_file:
            dest_file.write(src_file.read())
        json_cams = []
        camlist = []
        if scene_info.test_cameras:
            camlist.extend(scene_info.test_cameras)
        if scene_info.train_cameras:
            camlist.extend(scene_info.train_cameras)
        for id, cam in enumerate(camlist):
            json_cams.append(camera_to_JSON(id, cam))
        with open(os.path.join(self.model_path, "cameras.json"), 'w') as file:
            json.dump(json_cams, file)
    ```
  - Gaussians
    - For rendering, loads the PLY and rebuilds the Gaussians (`load_ply`)
    - For training, initializes the Gaussians from the sparse COLMAP point cloud (`create_from_pcd`)

    ```python
    if self.loaded_iter:
        self.gaussians.load_ply(os.path.join(self.model_path,
                                             "point_cloud",
                                             "iteration_" + str(self.loaded_iter),
                                             "point_cloud.ply"), args.train_test_exp)
    else:
        self.gaussians.create_from_pcd(scene_info.point_cloud, scene_info.train_cameras, self.cameras_extent)
    ```
- `cameras.py`
  - Defines the Camera class

    ```python
    class Camera(nn.Module):
        def __init__(self, resolution, colmap_id, R, T, FoVx, FoVy, depth_params, image, invdepthmap,
                     image_name, uid,
                     trans=np.array([0.0, 0.0, 0.0]), scale=1.0, data_device="cuda",
                     train_test_exp=False, is_test_dataset=False, is_test_view=False):
            super(Camera, self).__init__()
    ```
  - Class variables
    - uid, colmap_id, R, T, FoVx, FoVy, image_name
    - alpha_mask
    - original_image, image_width, image_height
    - invdepthmap, depth_reliable
    - zfar, znear
    - trans, scale
    - world_view_transform (built from R, T, trans, scale)
    - projection_matrix (built from znear, zfar, FoVx, FoVy)
    - full_proj_transform (view and projection combined)
    - camera_center (the camera position in world space; see the sketch below)
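    How the derived matrices are assembled, roughly (condensed from cameras.py; `getWorld2View2` and `getProjectionMatrix` live in `utils/graphics_utils.py`):

    ```python
    self.world_view_transform = torch.tensor(getWorld2View2(R, T, trans, scale)).transpose(0, 1).cuda()
    self.projection_matrix = getProjectionMatrix(znear=self.znear, zfar=self.zfar,
                                                 fovX=self.FoVx, fovY=self.FoVy).transpose(0, 1).cuda()
    # View and projection composed into a single transform for the rasterizer
    self.full_proj_transform = (self.world_view_transform.unsqueeze(0).bmm(
        self.projection_matrix.unsqueeze(0))).squeeze(0)
    # The camera position is read off the inverse view matrix
    self.camera_center = self.world_view_transform.inverse()[3, :3]
    ```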
- `colmap_loader.py`
  - Parses COLMAP outputs (bin, txt)
- `dataset_readers.py`
  - Defines the CameraInfo and SceneInfo classes
  - Loads the COLMAP cameras into CameraInfo (`readColmapCameras`)
  - Loads the full COLMAP reconstruction into SceneInfo (`readColmapSceneInfo`)
- `gaussian_model.py`
  - Defines the GaussianModel class
  - `setup_functions`
    - Defines the activation functions (e.g. `build_covariance_from_scaling_rotation`)
  - Class variables
    - SH degree (active/max)
    - optimizer / optimizer type
    - xyz (position)
    - features_dc, features_rest (SH coefficients)
    - scaling
    - rotation
    - opacity
    - max_radii2D
    - xyz_gradient_accum
    - denom
    - percent_dense
    - spatial_lr_scale
  - Class functions
    - capture/restore
    - `create_from_pcd`
      - Initializes the Gaussians from the sparse point cloud at the start of training

      ```python
      def create_from_pcd(self, pcd : BasicPointCloud, cam_infos : int, spatial_lr_scale : float):
          self.spatial_lr_scale = spatial_lr_scale
          fused_point_cloud = torch.tensor(np.asarray(pcd.points)).float().cuda()

          ### RGB -> SH -> features
          fused_color = RGB2SH(torch.tensor(np.asarray(pcd.colors)).float().cuda())
          features = torch.zeros((fused_color.shape[0], 3, (self.max_sh_degree + 1) ** 2)).float().cuda()
          features[:, :3, 0 ] = fused_color
          features[:, 3:, 1:] = 0.0

          print("Number of points at initialisation : ", fused_point_cloud.shape[0])

          ### dist: taken from the point cloud
          ### scale: average distance to the 3 nearest neighbors; the activation is exp, so log is applied here
          ### rot: initialized to the identity quaternion (w=1, x=y=z=0)
          dist2 = torch.clamp_min(distCUDA2(torch.from_numpy(np.asarray(pcd.points)).float().cuda()), 0.0000001)
          scales = torch.log(torch.sqrt(dist2))[...,None].repeat(1, 3)
          rots = torch.zeros((fused_point_cloud.shape[0], 4), device="cuda")
          rots[:, 0] = 1

          ### opacity: inverse sigmoid
          opacities = self.inverse_opacity_activation(0.1 * torch.ones((fused_point_cloud.shape[0], 1), dtype=torch.float, device="cuda"))

          ### create nn.Parameter
          self._xyz = nn.Parameter(fused_point_cloud.requires_grad_(True))
          self._features_dc = nn.Parameter(features[:,:,0:1].transpose(1, 2).contiguous().requires_grad_(True))
          self._features_rest = nn.Parameter(features[:,:,1:].transpose(1, 2).contiguous().requires_grad_(True))
          self._scaling = nn.Parameter(scales.requires_grad_(True))
          self._rotation = nn.Parameter(rots.requires_grad_(True))
          self._opacity = nn.Parameter(opacities.requires_grad_(True))
          self.max_radii2D = torch.zeros((self.get_xyz.shape[0]), device="cuda")
          self.exposure_mapping = {cam_info.image_name: idx for idx, cam_info in enumerate(cam_infos)}
          self.pretrained_exposures = None
          exposure = torch.eye(3, 4, device="cuda")[None].repeat(len(cam_infos), 1, 1)
          self._exposure = nn.Parameter(exposure.requires_grad_(True))
      ```
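      The `RGB2SH` call above stores the point color as the SH DC coefficient. The conversion only involves the constant 0th-band basis value (from `utils/sh_utils.py`):

      ```python
      C0 = 0.28209479177387814  # Y_0^0, the constant 0th SH basis function

      def RGB2SH(rgb):
          # Center RGB around 0 and divide by the basis value, so that
          # evaluating the SH at degree 0 reproduces the original color
          return (rgb - 0.5) / C0

      def SH2RGB(sh):
          return sh * C0 + 0.5
      ```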
    - `load_ply`
      - Loads a PLY file and turns it back into Gaussians (for rendering)
    - `save_ply`
      - Serializes the Gaussians to a PLY file
    - `reset_opacity`
    - `densify_and_split`, `densify_and_clone`, `densify_and_prune` (see the selection sketch after this list)
    - `prune_points`
      - Prunes points from the optimizer
      - Also prunes the other per-point buffers (xyz_gradient_accum, denom, max_radii2D, tmp_radii)
    - `densification_postfix`
      - After densification (clone, split), registers the new Gaussians with the optimizer
gaussian_renderer
- `__init__.py`
  - `render` function: rasterizes the Gaussians into a 2D image with `GaussianRasterizer` from diff_gaussian_rasterization
  - Output:

    ```python
    out = {
        "render": rendered_image,
        "viewspace_points": screenspace_points,
        "visibility_filter": (radii > 0).nonzero(),
        "radii": radii,
        "depth": depth_image
    }
    return out
    ```
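    Before rasterizing, `render` packs the camera state into the rasterizer settings; roughly (condensed from gaussian_renderer/__init__.py, with some fields omitted and exact arguments depending on the repo revision):

    ```python
    raster_settings = GaussianRasterizationSettings(
        image_height=int(viewpoint_camera.image_height),
        image_width=int(viewpoint_camera.image_width),
        tanfovx=math.tan(viewpoint_camera.FoVx * 0.5),
        tanfovy=math.tan(viewpoint_camera.FoVy * 0.5),
        bg=bg_color,
        scale_modifier=scaling_modifier,
        viewmatrix=viewpoint_camera.world_view_transform,
        projmatrix=viewpoint_camera.full_proj_transform,
        sh_degree=pc.active_sh_degree,
        campos=viewpoint_camera.camera_center,
        prefiltered=False,
        debug=pipe.debug
    )
    rasterizer = GaussianRasterizer(raster_settings=raster_settings)
    ```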
- `network_gui.py`
  - Implements the local network connection used by the real-time viewer
utils
- `camera_utils.py`
  - Functions for loading cameras
  - Builds the camera list from CameraInfo
  - Builds the dictionary used to serialize a camera to JSON
- `general_utils.py`
  - inverse sigmoid
  - PIL-to-torch conversion
  - exponential learning-rate scheduling
  - rotation helpers
- `graphics_utils.py`
  - view matrix, projection matrix
- `image_utils.py`
  - mse, psnr
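  For reference, PSNR here is computed per batch element over images normalized to [0, 1]; roughly (from `utils/image_utils.py`):

  ```python
  import torch

  def psnr(img1, img2):
      # Mean squared error per image, flattened over all pixels/channels
      mse = ((img1 - img2) ** 2).view(img1.shape[0], -1).mean(1, keepdim=True)
      # Peak signal is 1.0 since images are normalized
      return 20 * torch.log10(1.0 / torch.sqrt(mse))
  ```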
- `loss_utils.py`
  - Loss-related functions (L1 loss, SSIM)
- `make_depth_scale.py`
  - Computes the per-image scale parameters that align the monocular depth maps with the scene (relevant for large-scale environments)
- `read_write_model.py`
  - Reads/writes COLMAP model files (bin/txt)
- `sh_utils.py`
  - spherical harmonics helpers
- `system_utils.py`
  - mkdir helpers, `searchForMaxIteration`