The Epipolar Geometry Toolbox: multiple view geometry and visual servoing for MATLAB

Gian Luca Mariottini and Domenico Prattichizzo
Dipartimento di Ingegneria dell'Informazione, Università di Siena
Via Roma 56, 53100 Siena, Italy
Email: {gmariottini,prattichizzo}@dii.unisi.it

Abstract: The Epipolar Geometry Toolbox (EGT) was created to provide MATLAB users with an extensible framework for creating and visualizing multi-camera scenarios and for manipulating the visual information and the geometry relating them. The functions provided, for both pin-hole and panoramic vision sensors, include camera placement and visualization, computation and estimation of epipolar geometry entities, and many others. The compatibility of EGT with the Robotics Toolbox [7] makes it possible to address general vision-based control problems. Two applications of EGT to visual servoing tasks are provided here. This article introduces the Toolbox in tutorial form, with examples that show its capabilities. The complete toolbox, the detailed manual and demo examples are freely available on the EGT web site [21].

I. INTRODUCTION

The Epipolar Geometry Toolbox (EGT) is a toolbox designed for MATLAB [29]. MATLAB is a software environment, available for a wide range of platforms, designed around linear algebra principles and graphical presentation, also for large datasets. Its core functionality is extended by many additional toolboxes. Combined with the interactive MATLAB environment and its advanced graphical functions, EGT provides a wide set of functions for approaching computer vision problems, especially those involving multiple views.

The Epipolar Geometry Toolbox makes it possible to design vision-based control systems for both pin-hole and central panoramic cameras. EGT is fully compatible with the well-known Robotics Toolbox by Corke [7]. The development of EGT was motivated by the increasing interest in robotic visual servoing for both 6DOF kinematic chains and mobile robots, equipped with pin-hole or panoramic cameras fixed either to the workspace or to the robot.

Several authors, such as [4], [9], [18], [20], [24], have proposed new visual servoing strategies based on the geometry relating multiple views acquired from different camera configurations, i.e. the Epipolar Geometry [14].

In recent years we have observed the need for a software environment that helps researchers to rapidly create a multiple camera setup, use visual data and design new visual servoing algorithms. With EGT we provide a wide set of easy-to-use and completely customizable functions to design general multi-camera scenarios and manipulate the visual information between them.

Let us emphasize that EGT can also be successfully employed in many other contexts in which single and multiple view geometry is involved, for example in visual odometry and structure from motion applications [23], [22]. In the first of these works, an interesting "visual odometry" approach for robot SLAM is proposed, in which multiple view geometry is used to estimate the camera motion from pairs of images without requiring knowledge of the observed scene.

EGT, like the Robotics Toolbox, is a simulation environment, but the EGT functions can easily be embedded by the user in Simulink models. In this way, thanks to the MATLAB Real-Time Workshop, the user can generate and execute stand-alone C code for many off-line and real-time applications.

A distinguishing feature of EGT is that it can be used to create and manipulate visual data provided by both pin-hole and panoramic cameras. Catadioptric cameras, due to their wide field of view, have recently been applied to visual servoing [32].

The second motivation that led to the development of EGT was the increasing distribution of "free" software in recent years, following the principles of the Free Software Foundation [10]. Users are thus allowed, and also encouraged, to adapt and improve the program as dictated by their needs. Examples of programs that follow these principles include the Robotics Toolbox [7], for the creation of simulations in robotics, and Intel's OpenCV C++ libraries, for the implementation of computer vision algorithms such as image processing and object recognition [1].

The third important motivation for EGT was the availability and increasing sophistication of MATLAB. EGT could have been written in other languages, such as C or C++, and this would have freed it from dependency on other software. However, these low-level languages are not as conducive to rapid program development as MATLAB.

This tutorial assumes that the reader is familiar with MATLAB. It presents the basic EGT functions, after short theory recalls, together with intuitive examples. We also present two applications of EGT to visual servoing.

Section 2 presents the basic vector notation in EGT, while Section 3 presents the pin-hole and omnidirectional camera models together with the basic EGT functions. Section 4 presents the setup for multiple camera geometry (Epipolar Geometry), while Section 5 presents two applications of EGT to visual servoing together with simulation results. In Section 6 we compare EGT with other software packages. EGT can be freely downloaded at [21], can be used under Windows and requires MATLAB 6.5 or later. The detailed manual is provided on the EGT web site, with a large set of examples, figures and source code, also suitable for beginners.

II. BASIC VECTOR NOTATION

We present here the basic vector notation adopted in the Epipolar Geometry Toolbox. All scene points $X_w \in \mathbb{R}^3$ are expressed in the world frame $S_w = \langle O_w x_w y_w z_w \rangle$ (Fig. 1). When referred to the pin-hole camera frame $S_c = \langle O_c x_c y_c z_c \rangle$ they will be indicated with $X_c$. Moreover, all scene points expressed w.r.t. a central catadioptric camera frame $S_m = \langle O_m x_m y_m z_m \rangle$ will be indicated with $X_m$.

Fig. 1. Main reference frames notation and vector representation in EGT (world frame $S_w$, pin-hole camera frame $S_c$, central catadioptric camera frame $S_m$).

For the reader's convenience we briefly present the basic vector notation and transformations [25]. Refer to Fig. 1 and consider the $3 \times 1$ vector $X_c \in S_c$. It can be expressed in $S_w$ as follows:

$$X_w = R^c_w X_c + t^c_w \qquad (1)$$

where $t^c_w$ is the translation vector centered in $S_w$ and pointing toward the $S_c$ frame (Fig. 1). The matrix $R^c_w$ is the rotation necessary to align the world frame with the camera frame. For example, we may choose $R^c_w = R_{roll,pitch,yaw} = R_{z,\theta} R_{y,\phi} R_{x,\psi}$. The homogeneous notation expresses (1) in linear form:

$$\tilde{X}_w = H^c_w \tilde{X}_c$$

where $\tilde{X}_w = [X_w^T \; 1]^T$ and $\tilde{X}_c = [X_c^T \; 1]^T$. The $4 \times 4$ matrix $H^c_w$ is referred to as the homogeneous transformation matrix:

$$H^c_w = \begin{bmatrix} R^c_w & t^c_w \\ 0^T & 1 \end{bmatrix}$$

Analogously, a point $X_w$ can be expressed in the camera frame by the following transformation:

$$\tilde{X}_c = \begin{bmatrix} R_w^{cT} & -R_w^{cT} t^c_w \\ 0^T & 1 \end{bmatrix} \tilde{X}_w \qquad (2)$$

Consider now the more general case in which two camera frames, referred to as actual and desired, are observing the same point $X_w$. From (1):

$$X_w = R^d_w X_d + t^d_w \qquad (3)$$

$$X_w = R^a_w X_a + t^a_w \qquad (4)$$

Substituting (4) in (3) it follows that

$$X_d = \underbrace{R_w^{dT} R^a_w}_{R^a_d} X_a + \underbrace{R_w^{dT} \left( t^a_w - t^d_w \right)}_{t^a_d} \qquad (5)$$

Equation (5) will be very useful in EGT for the analytical computation of epipolar geometry, where it is necessary to know the relative displacement $t^a_d$ and orientation $R^a_d$ between the two camera frames.

III. PIN-HOLE AND OMNIDIRECTIONAL CAMERA MODELS

EGT provides easy-to-use functions for the placement of pin-hole and central catadioptric (or omnidirectional) cameras. Their imaging models have been implemented to allow users to manipulate the visual information. In this section the fundamentals of the perspective and omnidirectional camera models are quickly reviewed. The reader is referred to [14], [17], [5] for a detailed treatment. In keeping with the purposes of this tutorial, some basic EGT code examples are reported together with the theory.

A. Perspective camera

Consider a pin-hole camera located at $O_c$ as in Fig. 2. The full perspective model describes the relationship between a 3D point (in homogeneous coordinates) $\tilde{X}_w = [X \; Y \; Z \; 1]^T$ expressed in the world frame and its projection $\tilde{m} = [u \; v \; 1]^T$ onto the image plane according to

$$\tilde{m} = K \Pi \tilde{X}_w$$

where $K \in \mathbb{R}^{3 \times 3}$ is the camera intrinsic parameters matrix given by:

$$K = \begin{bmatrix} k_u f & \gamma & u_0 \\ 0 & k_v f & v_0 \\ 0 & 0 & 1 \end{bmatrix}$$

Fig. 2. The pin-hole camera model. The 3D point $X_w$ is projected onto $m$ through the optical center $O_c$. Note that $m$ is expressed in the image plane coordinates $(u,v)$ (pixels).

Here $(u_0, v_0)$ are the pixel coordinates, in the image frame, of the principal point (i.e. the intersection point between the image plane and the optical axis $z_c$), $k_u$ and $k_v$ are the number of pixels per unit distance in image coordinates, $f$ is the focal length (in meters) and $\gamma$ is the orthogonality factor of the CCD image axes (skew factor).

The matrix $\Pi = [R \,|\, t] \in \mathbb{R}^{3 \times 4}$ is the so-called external parameters camera matrix, which contains the rotation $R$ and the translation $t$ between the world and the camera frames. According to the commonly used notation, in the case of no camera rotation the optical axis $z_c$ of pin-hole cameras is parallel to the $y_w$ axis of $S_w$. We then define:

$$R = \left( R_{x,-\pi/2} R_{rpy} \right)^T$$

$$t = -\left( R_{x,-\pi/2} R_{rpy} \right)^T t^c_w$$

In order to directly obtain the $4 \times 4$ homogeneous matrix $H^c_w$, the function f_Rt2H is provided:

>> H=f_Rt2H(R,t);

Note that with f_Rt2H the position t of a pin-hole camera is specified with respect to the world frame, while the rotation R is referred to the pin-hole camera frame axes. During the testing phase at the University of Siena this choice was appreciated by students of the Robotics and Vision classes, who found it very intuitive.

Example 1 (3D scene and pin-hole camera placement): Consider now a pin-hole camera rotated by $R = R_{y,\pi/4} \in \mathbb{R}^{3 \times 3}$ and translated by $t = [-10, -5, 0]^T$:

>> R=rotoy(pi/4);
>> t=[-10,-5,0]';
>> H=f_Rt2H(R,t);

In EGT the camera frame and the associated 3D camera can be visualized with the functions f_3Dframe(H) and f_3Dcamera(H) respectively, where H is the $4 \times 4$ homogeneous transformation describing the position and orientation of the camera with respect to $S_w$:

>> f_3Dframe(H,1); %camera frame
>> hold on
>> f_3Dcamera(H); %3D pin-hole camera
>> axis equal, grid on, view(12,34)
>> title('3D setup - EGT Tutorial - Ex.1')

The plot of the 3D view is reported in Fig. 3(a). All the functions have further options; see the EGT Manual [21] for details.

We can also place a set of $N$ 3D points $X^i_w = [X^i, Y^i, Z^i]^T$ (e.g. the vertices of a rectangular panel) defined as

$$X_w = \begin{bmatrix} X^1 & X^2 & \cdots & X^N \\ Y^1 & Y^2 & \cdots & Y^N \\ Z^1 & Z^2 & \cdots & Z^N \end{bmatrix} \in \mathbb{R}^{3 \times N}$$

by using the function f_scenepnt(X):

>> Xi=[-3, 3, 3, -3];
>> Yi=[ 3, 3, 3, 3];
>> Zi=[-3, -3, 3, 3];
>> Xw=[Xi; Yi; Zi];
>> f_scenepnt(Xw);
>> f_3Dwfenum(Xw); %enumerate points

The perspective projection $m = [u, v]^T$ of the points $X_w$ is obtained with f_perspproj(Xw,H,K):

>> K=eye(3); %K = I for simplicity (see Fig. 3)
>> [u,v]=f_perspproj(Xw,H,K);
>> plot(u,v,'rO')

The projection of the scene points is represented in Fig. 3(b).

Fig. 3. Example 1. (a) A pin-hole camera is positioned at $t = [-10, -5, 0]^T$ in the 3D world frame and rotated by $\pi/4$ around the y-axis (panel title: "3D setup - EGT Tutorial - Ex.1"). (b) The 3D scene points are projected onto the image plane (panel title: "Image Plane - EGT Tutorial - Ex.1"). Note that in this case $K = I$ for simplicity.

Note that while the above example describes the placement of 3D points $X_w$, EGT is also able to build scenes with more complex 3D objects, returning surface points and normals (see function f_3Dsurface in [21]).

B. Omnidirectional Camera Model

Omnidirectional cameras combine reflective surfaces (mirrors) and lenses. Several types of panoramic cameras can be obtained simply by combining cameras (pin-hole or orthographic) and mirrors (hyperbolic, parabolic or elliptical) [5].

Panoramic cameras are classified according to whether or not they satisfy the single viewpoint constraint, which guarantees that the visual sensor only measures the light passing through a single point. Note that this constraint is required for the existence of epipolar geometry and for the generation of geometrically correct images [28], [12].

In [3], Baker et al. derive the entire class of catadioptric systems satisfying the single viewpoint constraint. Among these, EGT takes into account catadioptric systems consisting of pin-hole cameras coupled with hyperbolic mirrors, and orthographic cameras coupled with parabolic mirrors.

In [11] a unified projection model for central catadioptric camera systems has been proposed. In particular, it was shown that all central panoramic cameras can be modelled by a particular mapping on a sphere, followed by a projection from a point on the camera optical axis onto the image plane.

In order to keep a physically meaningful graphical representation in EGT we decided, without loss of generality, to represent central panoramic cameras not as spheres in space but as the pair formed by a CCD camera and a parabolic or hyperbolic mirror (see for example Fig. 5).

In what follows the imaging model for a pin-hole camera with hyperbolic mirror is described.

Consider now the basic scheme in Fig. 4. Note that in this case all frames (for both the camera and the mirror) are aligned with the world frame. Three important reference frames are defined: (1) the world reference system centered at $O_w$, whose vector is $X_w$; (2) the mirror coordinate system centered at the focus $O_m$, whose vector is $X = [X, Y, Z]^T$; and (3) the camera coordinate system centered at $O_c$, whose vector is $X_c$.

Fig. 4. The panoramic camera model (pin-hole camera and hyperbolic mirror). The 3D point $X_w$ is projected at $m$ through the optical center $O_c$, after being projected at $X_h$ through the mirror center $O_m$.

Henceforth all equations will be expressed in the mirror reference frame unless stated otherwise.

Refer to Fig. 4 and let $a$ and $b$ be the hyperbolic mirror parameters:

$$\frac{(z+e)^2}{a^2} - \frac{x^2 + y^2}{b^2} = 1$$

with eccentricity $e = \sqrt{a^2 + b^2}$. The transformation that yields the projection $m$ in the pin-hole camera frame (see Fig. 4) is given by

$$m = \frac{1}{2e} K R^m_c \left( \lambda R_w^{mT} \left( X_w - t^m_w \right) + t^m_c \right) \qquad (6)$$

where

$$\lambda = \frac{b^2 \left( -eZ \pm a \|X\| \right)}{b^2 Z^2 - a^2 X^2 - a^2 Y^2}$$

is a nonlinear function of $X$. $K$ is the internal calibration matrix of the CCD camera looking at the mirror. $t^m_c$ is the mirror center expressed in the camera frame and corresponds to $[0, 0, 2e]^T$. $R^m_c$ is the matrix representing the rotation between the camera and mirror frames. Analogously, $t^m_w$ and $R^m_w$ represent the mirror configuration (position and orientation) with respect to the world frame.

In EGT a central catadioptric camera is defined by specifying the homogeneous transformation matrix between the mirror and world frames:

$$H^m_w = \begin{bmatrix} R^m_w & t^m_w \\ 0^T & 1 \end{bmatrix}$$

Example 2 (Panoramic camera placement): In EGT a panoramic camera can be placed and visualized. Let us place the camera at t=[-5,-5,0]' with orientation $R \equiv R_{z,\pi/4}$. EGT provides a function to visualize the panoramic camera in the 3D world frame, as in Fig. 5:

>> H=[rotoz(pi/4) , [-5,-5,0]';
       0 0 0 , 1 ];
>> f_3Dpanoramic(H);

Moreover, for an assigned camera calibration matrix K:

>> K=[10^3 0 320;
      0 10^3 240;
      0 0 1 ];

the projection of a 3D point Xw=[0,0,4]' in both the camera (m) and mirror (Xh) frames can be obtained from:

>> [m,Xh] = f_panproj(Xw,H,K)
m =
  4.1048e+002
  2.4000e+002
  1.0000e+000
Xh =
  6.0317e-001
  3.7881e-017
  3.4120e-001

Fig. 5. Example 2. (a) A panoramic camera is positioned at $[-5, -5, 0]^T$ in the 3D world frame (panel title: "EGT - Central Catadioptric Imaging of 3D scene points") and (b) the 3D point $X_w = [0, 0, 4]^T$ is projected to $m$ in the pin-hole camera after being projected at $X_h$ (panel title: "EGT CCD Panoramic Camera Plane"; axes: u [pixels], v [pixels]).