3dpcp/doc/high_level_doc/documentation.tex
2012-09-16 14:33:11 +02:00

540 lines
No EOL
18 KiB
TeX

\documentclass[12pt,dvips]{article}
\usepackage[T1]{fontenc}
\usepackage[isolatin]{inputenc}
\usepackage{a4}
\usepackage{url}
\usepackage{afterpage}
\usepackage{latexsym}
\usepackage{graphicx}
\usepackage{longtable}
\usepackage{amsmath}
\usepackage[rflt]{floatflt}
\usepackage{fancybox}
\usepackage{bbm}
\usepackage{cite}
\usepackage{picins}
\usepackage[dvips]{epsfig}
\parindent0mm
\parskip5pt plus1pt minus2pt
\textwidth16cm
\textheight22cm
\topmargin-1cm
\oddsidemargin0cm
\renewcommand\floatpagefraction{1.0}
\renewcommand\topfraction{1.0}
\renewcommand\bottomfraction{1.0}
\renewcommand\textfraction{0.0}
\def\dbltopfraction{1.0}
\def\bottomfraction{1.0}
\def\dblfloatpagefraction{1.0}
\setcounter{secnumdepth}{5}
%% \setcounter{section}{-1}
\newcommand{\OO}{{\cal O}}
\newcommand{\bc}{\begin{center}}
\newcommand{\ec}{\end{center}}
\newcommand{\VNull}{\mathbf{0}}
%\newcommand{\transp}{^{\textsf{T}}}
\newcommand{\transp}{^T}
\renewcommand{\phi}{\varphi}
\newcommand{\verf}[1]{\textsf{#1}}
\renewcommand{\v}[1]{\text{\boldmath $#1$}}
\newcommand{\V}[1]{\text{\boldmath $#1$}}
\newcommand{\m}[1]{\v{#1}} % Format "Matrix"
\newcommand{\M}[1]{\v{#1}} % Format "Matrix"
\newcommand{\sR}{\ensuremath{{\cal R}}} % Referenz-Scan
\newcommand{\sS}{\ensuremath{{\cal S}}} % aktueller Scan
\renewcommand{\O}{{\cal O}}
\newcommand{\norm}[1]{\left | \left | #1 \right | \right |}
\newcommand{\Vector}[2]{\begin{pmatrix} #1 \\ #2 \end{pmatrix}}
\newcommand{\VEctor}[3]{\begin{pmatrix} #1 \\ #2 \\ #3 \end{pmatrix}}
\newcommand{\hist}[1]{{\cal #1}}
\newcommand{\trace}[1]{\mbox{tr}\left( #1 \right)}
\newcommand{\rVEctor}[3]{
\left(\!\!\!\begin{array}{l}
#1 \\ #2 \\ #3
\end{array}\!\!\right)}
\newenvironment{rmatrix}
{\left(\!\!\!\begin{array}{rr}}
{\end{array}\!\!\right)}
\newenvironment{Rmatrix}
{\left(\!\!\!\begin{array}{rrr}}
{\end{array}\!\!\right)}
\newcommand{\filter}[3]{\ensuremath{\bigl[\,#1,\;#2,\;#3\,\bigr]}}
\newcommand{\R}{\mathbbm{R}}
\newcommand{\N}{\mathbbm{N}}
%\newcommand{\R}{\mathbb{R}}
%\newcommand{\N}{\mathbb{N}}
\newcommand{\grad}{\ensuremath{^{\circ}}}
\newcommand{\textmenge}[2]{\{\, #1 \,| \; #2 \}}
\newcommand{\menge}[2]{\left\{\, #1 \,\left| \; #2 \right.\right\} }
\newcommand{\Menge}[2]{\bigl\{\, #1 \,| \; #2 \bigr\} }
\newcommand{\fueralle}[2]{\forall \, #1\!: \; #2}
\newcommand{\esgibt}[2]{\exists \, #1\!: \; #2}
\newcommand{\Widehat}[1]{\widehat{#1\hspace{+.5ex}}\hspace{-.5ex}}
\newcommand{\textSF}[1]{\textsf{\small #1}}
\newcommand{\textSFsmall}[1]{\textsf{\scriptsize #1}}
\newcommand{\verfSmall}[1]{\textSFsmall{#1}}
\newcommand{\eps}{{\epsilon}}
\renewcommand{\epsilon}{\varepsilon}
\newcommand{\vR}{v_{\text{set}}}
\newcommand{\oR}{\omega_{\text{set}}}
\newcommand{\ATAN}{\text{atan2}}
%\renewcommand{\familydefault}{\sfdefault}
%%----------------------------------------------------------------
%%----------------------------------------------------------------
\begin{document}
{
\title{\vspace*{-5mm}
\bf 6D SLAM -- \\%[.5cm]
Simultaneous 6 D.O.F. Localization \\and 3D Mapping\\%[1cm]
}
\author{
\scalebox{.65}{\includegraphics{unilogo.eps}}\\[3ex]
Andreas N\"uchter, Kai Lingemann, Joachim Hertzberg\\
Department of Mathematics/Computer Science\\
Institute of Computer Science\\
Knowledge-Based Systems Research Group\\
University of Osnabr\"uck\\
{\small \texttt{http://www.informatik.uni-osnabrueck.de/kbs/}}
}
\maketitle
\thispagestyle{empty}
\vspace*{-12mm}
\begin{center}
\includegraphics[height=50mm]{stylish_scanner}
\end{center}
\vspace*{1mm}
\begin{center}
\textbf{Documentation}
\end{center}
\begin{quote}
This document describes the algorithms for 6D SLAM --
Simultaneous 6 D.O.F. Localization and 3D Mapping system. 6D SLAM
with mobile robots considers six dimensions for the robot pose,
namely the $x$, $y$ and $z$ coordinates and the roll, yaw and
pitch angles. Robot motion and localization on natural surfaces,
e.g., driving with a mobile robot outdoor, must necessarily
regard these degrees of freedom.
\end{quote}
\newpage
\setcounter{page}{1}
\section{Range Image Registration and Robot Relocalization}
Multiple 3D scans are necessary to digitalize environments
without occlusions. To create a correct and consistent model, the
scans have to be merged into one coordinate system. This process
is called registration. If robot carrying the 3D scanner were
precisely localized, the registration could be done directly
based on the robot pose. However, due to the unprecise robot
sensors, self localization is erroneous, so the geometric
structure of overlapping 3D scans has to be considered for
registration.
The following method for registration of point sets is part of
many publications, so only a short summary is given here. The
complete algorithm was invented in 1992 and can be found, e.g.,
in \cite{Besl_1992}. The method is called \emph{Iterative Closest
Points (ICP) algorithm}.
Given two independently acquired sets of 3D points, $M$ (model
set, $|M| = N_m$) and $D$ (data set, $|D| = N_d$) which
correspond to a single shape, we aim to find the transformation
consisting of a rotation $\M R$ and a translation $\V t$ which
minimizes the following cost function:
\begin{equation}\label{DMin}
E(\M R, \V t) =
\sum_{i=1}^{N_m}\sum_{j=1}^{N_d}w_{i,j}\norm{\V m_{i}-(\M R
\V d_j+\V t)}^2.
\end{equation}
$w_{i,j}$ is assigned 1 if the $i$-th point of $M$ describes the
same point in space as the $j$-th point of $D$. Otherwise
$w_{i,j}$ is 0. Two things have to be calculated: First, the
corresponding points, and second, the transformation ($\M R$,
$\V t$) that minimize $E(\M R, \V t)$ on the base of the
corresponding points.
The ICP algorithm calculates iteratively the point
correspondences. In each iteration step, the algorithm selects
the closest points as correspondences and calculates the
transformation ($\M R, \V t$) for minimizing equation
(\ref{DMin}). The assumption is that in the last iteration step
the point correspondences are correct. Besl et al. prove that
the method terminates in a minimum \cite{Besl_1992}. However,
this theorem does not hold in our case, since we use a maximum
tolerable distance $d_\text{max}$ for associating the scan
data. Such a threshold is required, given that the 3D scans
overlap only partially. Fig. \ref{samplematch} (top) shows three
frames, i.e., iteration steps, of the ICP algorithm. The bottom
part shows the start poses $(x,z,\theta_y)$ from which a correct
matching is possible, here with only three degrees of freedom.
\begin{figure*}
\begin{center}
\includegraphics[width=50mm]{frame1_final}~~
\includegraphics[width=50mm]{frame2_final}~~
\includegraphics[width=50mm]{frame3_final} \\[1.5ex]
\caption{{Left: Initial odometry based pose of two 3D
scans. Middle: Pose after five ICP iterations. Right: final
alignment, pairwise matching.
\vspace*{-6mm}
}}\label{samplematch}
\end{center}
\end{figure*}
\subsection{Calculation of the rotation and translation}
In every iteration the optimal tranformation ($\M R$, $\V t$)
has to be computed. Eq. (\ref{DMin}) can be reduced to
\begin{eqnarray}
E(\M R, \V t) & \propto & \frac{1}{N} \sum_{i=1}^N
\norm{\V m_i - (\M R \V d_i + \V t)}^2,\label{DualDMin}
\end{eqnarray}
with $N = \sum_{i=1}^{N_m}\sum_{j=1}^{N_d}w_{i,j}$, since the
correspondence matix can be represented by a vector containing
the point pairs.
Four methods are known to minimize eq. (\ref{DualDMin})
\cite{Lorusso_1995}. The 6D SLAM system uses the following one,
based on singular value decomposition (SVD), is robust and easy
to implement, thus we give a brief overview of the SVD-based
algorithms. It was first published by Arun, Huang and Blostein
\cite{Arun_1987}. The difficulty of this minimization problem is
to enforce the orthonormality of matrix $\M R$. The first step of
the computation is to decouple the calculation of the rotation
$\M R$ from the translation $\V t$ using the centroids of the
points belonging to the matching, i.e.,
\begin{eqnarray}
\V c_m = \frac{1}{N} \sum_{i=1}^{N} \V m_{i}, \qquad \qquad \V c_d = \frac{1}{N}
\sum_{i=1}^{N} \V d_{j}\label{schwerpunkt2}
\end{eqnarray}
and
\begin{eqnarray}
M' &=& \{ \V m'_{i} = \V m_{i} - \V c_{m} \}_{1,\ldots,N}, \label{d_neu1}\\
\qquad
D' &=& \{ \V d'_{i}\ = \V d_{i}\, - \V c_{d} \}_{1,\ldots,N}\label{d_neu2}.
\end{eqnarray}
After replacing (\ref{schwerpunkt2}), (\ref{d_neu1}) and
(\ref{d_neu2}) in the error function, $E(\M R,\V t)$
eq. (\ref{DualDMin}) becomes:
\begin{subequations}
\begin{eqnarray}
E(\M R, \V t)
\!\!\!\!\!&\propto&\!\!\!\!\! \frac{1}{N} \sum_{i=1}^{N}
\lvert\lvert{\V m'_{i}-\M R \V d'_i-\underbrace{(\V t-\V c_m+\M R
\V c_d)}_{= \tilde {\V t}}\lvert\lvert}^2
\nonumber \\
&=&\!\!\!\!\! \frac{1}{N}\sum_{i=1}^{N}\norm{\V m'_{i}-\M R \V d'_i}^2
\label{DMinpart1} \\
&&- \frac{2}{N} \tilde {\V t} \cdot \sum_{i=1}^{N} \left( \V
m'_{i}-\M R \V d'_i \right) \label{DMinpart2} \\
&&+ \frac{1}{N}
\sum_{i=1}^{N}\norm{\tilde {\V
t}}^2. \label{DMinpart3}\label{fehlerneu}
\end{eqnarray}
\end{subequations}
In order to minimize the sum above, all terms have to be
minimized. The second sum (\ref{DMinpart2}) is zero, since all
values refer to centroid. The third part (\ref{DMinpart3}) has
its minimum for $\tilde {\V t} = \VNull$ or
\begin{eqnarray}\label{translatB}
\V t = \V c_m - \M R \V c_d.
\end{eqnarray}
Therefore the algorithm has to minimize only the first
term, and the error function is expressed in terms of the
rotation only:
\begin{eqnarray}\label{DMinn}
E(\M R, \V t) \propto
\sum_{i=1}^{N}\norm{\V m'_{i}-\M R \V d'_i}^2.
\end{eqnarray}
\noindent \textit{Theorem:} The optimal rotation is calculated
by $\M R = \M V \M U^T$. Herby the matrices $\M V$ and $\M U$ are
derived by the singular value decomposition $\M H = \M U \M
\Lambda \M V^T$ of a correlation matrix $\M H$. This $3 \times 3$ matrix
$\M H$ is given by
\begin{eqnarray}
\M H = \sum_{i=1}^{N} \V m'^T_i \V d'_i
= \left(
\begin{array}{ccc}
S_{xx} & S_{xy} & S_{xz} \\
S_{yx} & S_{yy} & S_{yz} \\
S_{zx} & S_{zy} & S_{zz} \\
\end{array}
\right), \label{Korrelationsmatrix}
\end{eqnarray}
with $S_{xx} = \sum_{i=1}^{N} \ m'_{ix} d'_{ix}, \ S_{xy} =
\sum_{i=1}^{N} \ m'_{ix} d'_{iy}, \ \ldots \, $. The analogous
algorithm is derived directly from this theorem.
\medskip
\textit{Proof:} Since rotation is length preserving, i.e.,
$\lvert\lvert\M R\V d'_i\lvert\lvert^2 = \lvert\lvert \V
d'_i\lvert \lvert^2$ the error function (\ref{DMinn}) is expanded
\begin{eqnarray*}
E(\M R, \V t) \propto
\sum_{i=1}^{N}\norm{\V m'_i}^2
- 2 \sum_{i=1}^{N} \V m'_{i} \cdot \M R
\V d'_i
+ \sum_{i=1}^{N} \norm{\V d'_i}^2.
\end{eqnarray*}
The rotation affects only the middle term, thus it is sufficient
to maximize
\begin{eqnarray}\label{max}
\sum_{i=1}^{N} \V m'_{i} \cdot \M R \V
d'_i
& = &
\sum_{i=1}^{N} \V {m'_{i}}^T \M R \V
d'_i.
\end{eqnarray}
Using the trace of a matrix, (\ref{max}) can be rewritten to
obtain
\begin{eqnarray*}
\trace{\sum_{i=1}^{N} \M R \V d'_i \V {m'_{i}}^T } =
\trace{\M R \M H},
\end{eqnarray*}
With $\M H$ defined as in (\ref{Korrelationsmatrix}). Now we have
to find the matrix $\M R$ that maximizes $\trace{\M R \M H}$.
Assume that the singular value decomposition of $\M H$ is
\begin{eqnarray*}
\M H = \M U \M \Lambda \M V^T,
\end{eqnarray*}
with $\M U$ and $\M V$ orthonormal $3 \times 3$ matrices and $\M
\Lambda$ a $3 \times 3$ diagonal matrix without negative
elements. Suppose
\begin{eqnarray*}
\M R = \M V \M U^T.
\end{eqnarray*}
$\M R$ is orthonormal and
\begin{eqnarray*}
\M R \M H & = & \M V \M U^T \M U \M \Lambda \M V^T \\
& = & \M V \M \Lambda \M V^T
\end{eqnarray*}
is a symmetric, positive definite matrix. Arun, Huang and
Blostein provide a lemma to show that
\begin{eqnarray*}
\trace{\M R \M H} \geq \trace{\M B \M R \M H}
\end{eqnarray*}
for any orthonormal matrix $\M B$. Therefore the matrix $\M R$ is
optimal. Prooving the lemma is straightforward using the
Cauchy-Schwarz \cite{Arun_1987}. Finally, the
optimal translation is calculated as (cf. eq. (\ref{DMinpart3})
and (\ref{translatB}))
\begin{eqnarray*}
\V t = \V c_m - \M R \V c_d.
\end{eqnarray*}
\section{ICP-based 6D SLAM}
To match two 3D scans with the ICP algorithm it is necessary to
have a sufficient starting guess for the second scan pose.
\begin{itemize}
\item
Extrapolate the odometry readings to all six degrees of freedom
using previous registration matrices. The change of the robot
pose $\Delta \M P$ given the odometry information
$(x_n,z_n,\theta_{y,n})$, $(x_{n+1},z_{n+1},\theta_{y,n+1})$
and the registration matrix $\M R({\theta_{x,{n}}},
{\theta_{y,{n}}}, {\theta_{z,{n}}})$ is calculated by solving:
\begin{small}
\begin{eqnarray}
\left(
\begin{array}{c}
x_{n+1} \\
y_{n+1} \\
z_{n+1} \\
{\theta_{x,{n+1}}} \\
{\theta_{y,{n+1}}} \\
{\theta_{z,{n+1}}} \\
\end{array}
\right)
=
\left(
\begin{array}{c}
x_{n} \\
y_{n} \\
z_{n} \\
{\theta_{x,{n}}} \\
{\theta_{y,{n}}} \\
{\theta_{z,{n}}} \\
\end{array}
\right)
+
\left(
\begin{array}{ccc|ccc}
& & & & & \\
& \M R({\theta_{x,{n}}},{\theta_{y,{n}}}, {\theta_{z,{n}}}) & & & \M 0 &
\\
& & & & & \\
\hline
& & & 1 & 0 & 0 \\
& \M 0 & & 0 & 1 & 0 \\
& & & 0 & 0 & 1 \\
\end{array}
\right)
\cdot
\underbrace{\left(
\begin{array}{c}
\Delta x_{n+1} \\
\Delta y_{n+1} \\
\Delta z_{n+1} \\
\Delta {\theta_{x,{n+1}}} \\
\Delta {\theta_{y,{n+1}}} \\
\Delta {\theta_{z,{n+1}}} \\
\end{array}
\right).}_{\Delta \V P} \label{PosUpdate6D}
\label{extrapol}
\end{eqnarray}
\end{small}
Therefore, calculating $\Delta \V P$ requires a matrix
inversion. Finally, the 6D pose $\M P_{n+1}$ is calculated by
\vspace*{-2mm}
\begin{small}
\begin{eqnarray*}\label{inital6DPose}
\M P_{n+1} = \Delta \M P \cdot \M P_{n}
\end{eqnarray*}
\end{small}
using the poses' matrix representations.\\[-1.5ex]
\end{itemize}
\section{Variable Correspondences}
\begin{tabular}{lll}
$(\M R, \V t)$ & \texttt{double alignxf[16]} & Transformation
Matrix \\
$\V m_i$, $\V d_i$ & \texttt{class PtPair} & Point Pair\\
$\V c_m$, $\V c_d$ & \texttt{double cm[3], cd[3]} & Centroids\\
$\V m'_i$, $\V d'_i$ & \texttt{double** m, d} & Centered
Point Pairs\\
$\M H$, $\M U$, $\M \Lambda$, $\M V$ & \texttt{Matrix} & SVD
Matrices\\
$\M R$ & \texttt{double transMat[16]} & Pose as Matrix\\
$(x_n, y_n, z_n)$ & \texttt{double rPos[3]} & Position of $n$-th 3D Scan\\
$(\theta_{x,n}, \theta_{y,n}, \theta_{z,n})$ & \texttt{double
rPostheta[3]} & Rotation of $n$-th 3D Scan\\
\end{tabular}
\section{File Formats and Units}
\begin{figure}
\begin{center}
\includegraphics[width=3.2cm]{coordinate_system_white}
\caption{Left handed coordinate system.}\label{coord}
\end{center}
\end{figure}
The coordinate system is left handed, with the $y$ axis pointing
upwards, and the depth axis $z$ (cf. Figure~\ref{coord}). Input
and output files are:
\footnotetext[1]{stating
with \texttt{XXX} = 000, 001,\dots, until no more files are
found in the specified directory.}
\begin{enumerate}
\item The 3D scan files (\texttt{scanXXX.3d})\footnotemark[1]\
have to be of the following structure:\\ The first line
contains the scan's resolution (w x b), followed by lines of
data points (x, y, z).
%
\item The pose files (\texttt{scanXXX.pose})\footnotemark[1]\ associated with each
3D scan contain information of the estimated pose of the
respective scan as given by, e.g., odometry. The first line
contains the 3 translatorial positions ($x$, $y$, $z$), the
second the rotations pitch, yaw and roll ($\theta_x$,
$\theta_y$, $\theta_z$ around the respective axis) in deg.\\
Values that are not estimated by the robot (odometry) can be
set to 0 and are extrapolated as described by Eq.~\eqref{extrapol}.
%
\item The SLAM program generated files \texttt{scanXXX.frames},
consisting of the transformations computed from the scan
matching. Each line contains a $4 \times 4$ OpenGL-style
matrix. The very last matrix is the final transformation for
registering the scan into the common coordinate system.
The matrix is stored in the following format:
\begin{quote}
$(R[1,1], R[1,2], 0, R[1,3], 0, R[2,1], R[2,2], R[2,3], 0,
R[3,1], R[3,2], R[3,3], 0,$\\$t[1], t[2], t[3], 1),$
\end{quote}
with $R[x,y]$ the $(x,y)$-th entry of the rotation matrix $\M R$, and
$t[x]$ the $x$-th translation component of $\V t$.
\end{enumerate}
\section{Requirements}
All executables can be compiled and used both with Linux and Windows.
\begin{description}
\item[Linux:] The system was developed and tested under Linux 9.1 with
the g++ compiler version 3.3.3. As additional library,
\texttt{OpenGL} and \texttt{glut} have to be installed, which
should be included in your Linux distribution (tested with
freeglut 2.2.0-78).\\
To compile, type in \texttt{make} in the main directory. The
executables are generated in the \texttt{./bin} directory.
\item[Windows:]
The system was tested with the C++ compiler from Microsoft Visual
Studio.NET 2005.
To compile, load the respective project file (\texttt{.sln})
from the directory \texttt{.$\backslash$Visual\_Studio\_Projects$\backslash$}.
The executables are generated in the respective \texttt{Debug}
or \texttt{Release} directories, depending on your compiler settings.\\
Precompiled versions can be found in the \texttt{./bin}
directory, too. If moving the executables, take care about the \texttt{glut}
directory as well.
\end{description}
\section{Usage}
For a detailed explanation about the programs' usage, just start
the respective binary. Both applcations can be configured by a
set of command line parameters, which are explained when starting
the program as mentioned above.
Especially, take care of the reduction parameters
\texttt{-r}/\texttt{-R} of the SLAM system: Without using one of
those, the registration is being slowed down tremendously due to
taking \emph{all} data points as input. Other potentially
critical parameters are the maximal distance of points that may
form corresponding point pairs (matrix entries $w_{ij}$,
parameter \texttt{-d}), as well as the maximal range distance of
points used for scan matching or displaying (\texttt{-m}),
especially used for eliminating outliers (i.e., data points with
the maximal range distance of the range finder).
\bibliographystyle{plain}
\bibliography{diss,paper,diplom}
\end{document}