Real-time 3D Reconstruction using Kinect

Real-time 3D Reconstruction

Jiakai Zhang, Prof. Davi Geiger
New York University
July 2012 – September 2012

To reconstruct an indoor scene with a moving Kinect camera, I first align the point clouds from different frames, then integrate them and rebuild the surface, and finally run the whole reconstruction in real time using CUDA.

More details are in my report.

Here is the pipeline:

Figure 1 pipeline

1. Input raw data – depth image

Figure 2 shows the raw data from the Kinect: an RGB image and a depth image.

Figure 2 Raw Data from Kinect

The Kinect camera runs at 30 FPS. The resolution of the depth image is 640 by 480.

2. Noise reduction – bilateral filtering

The raw depth data from the Kinect is quite noisy, which makes it hard to use for camera tracking. If I render the normal map with Phong shading, the noisy normal vectors make the object surfaces look irregular.

Figure 3 Raw Normal Map
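
To make the idea of a normal map concrete, here is a minimal sketch of one common way to derive per-pixel normals from a depth image: back-project each pixel to a 3D point with a pinhole camera model, then take the cross product of the differences to the right and lower neighbours. The function names, the intrinsics (fx, fy, cx, cy) and the convention that a depth of 0 means "no reading" are assumptions for illustration and are not taken from the original implementation.

```python
import math

def back_project(depth, fx, fy, cx, cy):
    """Turn a depth image (list of rows, 0 = no reading) into a grid of 3D points."""
    return [[((u - cx) * z / fx, (v - cy) * z / fy, z)
             for u, z in enumerate(row)]
            for v, row in enumerate(depth)]

def normal_map(vertices):
    """Estimate one unit normal per pixel from neighbouring 3D points."""
    h, w = len(vertices), len(vertices[0])
    normals = [[None] * w for _ in range(h)]
    for v in range(h - 1):
        for u in range(w - 1):
            p = vertices[v][u]
            right = vertices[v][u + 1]
            down = vertices[v + 1][u]
            # Skip pixels whose neighbourhood contains an invalid depth.
            if p[2] <= 0 or right[2] <= 0 or down[2] <= 0:
                continue
            dx = [right[i] - p[i] for i in range(3)]
            dy = [down[i] - p[i] for i in range(3)]
            # Cross product dx x dy; the sign is a convention and can be flipped.
            n = (dx[1] * dy[2] - dx[2] * dy[1],
                 dx[2] * dy[0] - dx[0] * dy[2],
                 dx[0] * dy[1] - dx[1] * dy[0])
            length = math.sqrt(sum(c * c for c in n))
            if length > 0:
                normals[v][u] = tuple(c / length for c in n)
    return normals

# A flat wall facing the camera: every valid normal is parallel to the z axis.
flat = [[2.0] * 4 for _ in range(4)]
wall_normals = normal_map(back_project(flat, 525.0, 525.0, 1.5, 1.5))
```

Because each normal depends on differences between neighbouring depth values, small depth noise is amplified in the normal map, which is why the raw normals look so rough.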

Thus I implement bilateral filtering, which smooths the depth image and removes noise while still preserving edges. The details of this algorithm are shown on this Web Page. Figure 4 shows the results for different filter parameters.


Figure 4 bilateral filtering process results
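
As a rough illustration of the filter, the following is a minimal CPU sketch in plain Python, not the CUDA version used in the project. Each output pixel is a weighted average of its neighbours, where the weight combines a spatial Gaussian (distance in the image) and a range Gaussian (difference in depth). The parameter names radius, sigma_s and sigma_r and their default values are assumptions chosen for the example.

```python
import math

def bilateral_filter(depth, radius=2, sigma_s=2.0, sigma_r=0.05):
    """Edge-preserving smoothing of a depth image (list of rows, 0 = no reading)."""
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = depth[y][x]
            if center <= 0:
                continue  # leave invalid pixels at 0
            total = 0.0
            norm = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < h and 0 <= xx < w):
                        continue
                    d = depth[yy][xx]
                    if d <= 0:
                        continue
                    # Spatial weight: nearby pixels count more.
                    ws = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                    # Range weight: pixels at a similar depth count more.
                    wr = math.exp(-((d - center) ** 2) / (2 * sigma_r ** 2))
                    total += ws * wr * d
                    norm += ws * wr
            out[y][x] = total / norm
    return out

# A noisy surface at depth ~1.0 next to a clean surface at depth 2.0.
image = [[1.0 + 0.01 * (-1) ** (x + y) if x < 3 else 2.0 for x in range(6)]
         for y in range(5)]
smoothed = bilateral_filter(image)
```

In this toy image the noise on the left surface is averaged out, while the range weight keeps the two surfaces from bleeding into each other across the depth edge. A larger sigma_r smooths more but starts to blur edges, which is the trade-off the different parameter choices explore.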

3. Camera Pose Estimation – ICP

The input to ICP is the point clouds and normal vectors of consecutive frames. The output is the 6DOF transformation matrix T, which gives the pose of the camera. Figure 5 shows the results before and after applying ICP. The two images show the same scene from two different viewpoints.

Figure 5 ICP Result
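
The core of one ICP iteration can be sketched as follows. This is a minimal illustration of the linearized point-to-plane variant, assuming the correspondences between the two frames are already known and the rotation between frames is small; it is not necessarily the exact formulation used in the project. For each pair (source point p, target point q, target normal n) it builds one linear equation in the six unknowns (three small rotation angles and a translation) and solves the normal equations. All function names here are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def point_to_plane_step(src, dst, normals):
    """Return [rx, ry, rz, tx, ty, tz] that moves src towards dst.

    Minimizes sum(((p + w x p + t - q) . n)^2), the small-angle
    approximation of the point-to-plane error.
    """
    ata = [[0.0] * 6 for _ in range(6)]
    atb = [0.0] * 6
    for p, q, n in zip(src, dst, normals):
        row = list(cross(p, n)) + list(n)
        rhs = dot((q[0] - p[0], q[1] - p[1], q[2] - p[2]), n)
        for i in range(6):
            atb[i] += row[i] * rhs
            for j in range(6):
                ata[i][j] += row[i] * row[j]
    return solve_linear(ata, atb)

# Points on three walls of a room corner, observed after a small camera shift.
dst = [(0, 1, 0), (0, 0, 1), (0, 1, 1),
       (1, 0, 0), (0, 0, 1), (1, 0, 1),
       (1, 0, 0), (0, 1, 0), (1, 1, 0)]
normals = [(1, 0, 0)] * 3 + [(0, 1, 0)] * 3 + [(0, 0, 1)] * 3
shift = (0.02, -0.01, 0.03)
src = [tuple(q[i] - shift[i] for i in range(3)) for q in dst]
pose_update = point_to_plane_step(src, dst, normals)
```

In a full ICP loop this step would be repeated: find correspondences, solve for the update, apply it to the source points, and stop when the update becomes negligible. The accumulated updates form the 6DOF transformation T.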

6. Update reconstruction – TSDF and Ray Casting

Once I know the relative position and rotation between frames, I can merge the depth maps of all frames into one. I store the merged data in a truncated signed distance function (TSDF). The TSDF is a 3D tensor, which I call a cube, that represents the space being measured. The value of each voxel in the cube is the distance to the closest surface, and this distance is signed and truncated. If the voxel is behind the surface as seen from the camera, I give the distance a negative value. If the distance between the voxel and the surface is too large, I set the distance to 1 or -1. I use truncation to efficiently get parallel surfaces.
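
The update rule can be illustrated with a deliberately simplified sketch: a single column of voxels along one camera ray instead of the full 3D cube. The signed distance is the measured surface depth minus the voxel depth, scaled by a truncation distance mu and clamped to [-1, 1]. Fusing several frames with a per-voxel running average is a common choice and is assumed here rather than taken from the original implementation; the names integrate, truncated_sdf and mu are hypothetical.

```python
def truncated_sdf(voxel_depth, surface_depth, mu):
    """Signed distance to the surface, scaled by mu and clamped to [-1, 1].

    Positive in front of the surface (towards the camera), negative behind it.
    """
    sdf = (surface_depth - voxel_depth) / mu
    return max(-1.0, min(1.0, sdf))

def integrate(tsdf, weights, voxel_depths, surface_depth, mu):
    """Fuse one depth measurement into the voxels along a single ray, in place."""
    for i, z in enumerate(voxel_depths):
        d = truncated_sdf(z, surface_depth, mu)
        w = weights[i]
        # Running average over all frames seen so far.
        tsdf[i] = (tsdf[i] * w + d) / (w + 1)
        weights[i] = w + 1

# Five voxels along one ray; two frames see the surface at slightly different depths.
voxel_depths = [0.5, 0.95, 1.0, 1.05, 1.5]
tsdf = [0.0] * 5
weights = [0] * 5
integrate(tsdf, weights, voxel_depths, surface_depth=1.0, mu=0.1)
integrate(tsdf, weights, voxel_depths, surface_depth=1.02, mu=0.1)
```

After both frames, the voxels far in front of the surface hold 1, those far behind hold -1, and the values in between change sign near the averaged surface depth. Averaging over frames is what suppresses the remaining depth noise.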

After updating the TSDF cube, I choose a particular camera position and cast rays into the volume of the TSDF cube. If the sign of the TSDF value changes along a ray, the ray has hit a point on the surface. I then compute the normal vector at this point from the gradient of the TSDF. Figure 6 shows the result of ray casting.

Figure 6 Ray Casting
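
The ray casting step can be sketched as below. To keep the example self-contained, an analytic TSDF of a unit sphere stands in for the voxel cube; in the real system the values would be read (and interpolated) from the cube. The ray is marched in fixed steps, the surface is located where the value changes from positive to non-positive, and the normal is the normalized central-difference gradient. The function names, the step size and the truncation distance are assumptions for illustration.

```python
import math

def sphere_tsdf(p, radius=1.0, mu=0.2):
    """Stand-in for a TSDF lookup: truncated signed distance to a sphere."""
    d = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius
    return max(-1.0, min(1.0, d / mu))

def raycast(tsdf, origin, direction, step=0.03, max_dist=5.0):
    """March along the ray and return the first surface point, or None."""
    prev_t = 0.0
    prev_v = tsdf(origin)
    for i in range(1, int(max_dist / step) + 1):
        t = i * step
        v = tsdf(tuple(o + t * d for o, d in zip(origin, direction)))
        if prev_v > 0 and v <= 0:
            # Sign change: interpolate linearly to estimate the zero crossing.
            t_hit = prev_t + (t - prev_t) * prev_v / (prev_v - v)
            return tuple(o + t_hit * d for o, d in zip(origin, direction))
        prev_t, prev_v = t, v
    return None

def tsdf_normal(tsdf, p, eps=1e-3):
    """Unit normal from the central-difference gradient of the TSDF at p."""
    grad = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grad.append(tsdf(hi) - tsdf(lo))
    length = math.sqrt(sum(c * c for c in grad))
    return tuple(c / length for c in grad)

# One ray from a camera at z = -3 looking along +z at the sphere.
hit = raycast(sphere_tsdf, (0.0, 0.0, -3.0), (0.0, 0.0, 1.0))
normal = tsdf_normal(sphere_tsdf, hit)
```

The step should stay smaller than the truncation distance so that both samples around the crossing lie inside the untruncated band; otherwise the linear interpolation loses accuracy. Casting one such ray per pixel is independent work for each pixel, which makes this step a natural fit for a GPU.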

