The computation of the transform parameters needs a preliminary manual identification of some reference points on the floor. In [12], the top view is obtained using a self-improving method, whereas in [13] the authors derive another solution based on a so-called "V-disparity image" technique. Similarly to the configuration previously discussed, the problem of partial occlusions can still exist. One of these situations occurs, for example, when the subject to monitor is behind a bulky object, like a couch or an armchair.

Taking into consideration the techniques described above, our solution has the following advantages:

- the top view depth frames are directly available, without the need of a transformation process applied to the spatial coordinates;
- the direct top view allows better monitoring of the scene than the ones in [9,13], and the occlusion phenomenon is therefore reduced;
- avoiding a machine learning solution in our approach strongly reduces the computational demand;
- the algorithm is portable to different hardware platforms, as it works on raw depth data, possibly captured by different sensors, not only Kinect. This is not the case for the system proposed in [10], which is bound to the NITE 2 middleware.

3. The Proposed Method

The system setup adopts a Kinect sensor in top view configuration, at a distance of 3 m (MaxHeight) from the floor, thus providing a coverage area of 8.25 m². To extend the monitored area, the sensor can be elevated up to around 7 m; beyond this distance the depth data become unreliable.
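As a rough illustration of how the floor footprint grows with mounting height, the rectangular coverage area can be estimated from the depth camera's field of view. The sketch below assumes the commonly cited nominal Kinect v1 depth FOV of 57° × 43°; the exact figure depends on the sensor model, so the result only approximates the 8.25 m² reported above.

```python
import math

# Assumed nominal Kinect v1 depth-camera field of view (degrees).
H_FOV_DEG = 57.0
V_FOV_DEG = 43.0

def coverage_area(height_m):
    """Estimate the floor footprint (m^2) of a top-mounted depth camera
    at the given height, from the horizontal and vertical FOV."""
    width = 2.0 * height_m * math.tan(math.radians(H_FOV_DEG / 2.0))
    depth = 2.0 * height_m * math.tan(math.radians(V_FOV_DEG / 2.0))
    return width * depth
```

At 3 m this yields roughly 7.7 m² with the assumed FOV values, in the same range as the paper's 8.25 m²; the area scales with the square of the mounting height, which is why elevating the sensor toward 7 m extends the monitored area considerably.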

The algorithm works with raw depth data (given in millimeters), captured at a frame rate of 30 fps with a resolution of 320 × 240 pixels, using Microsoft SDK v.1.5.

3.1. Preprocessing and Segmentation

The input depth frame (DF) is represented in Figure 2a. As discussed in Section 2, the operation of floor identification implemented in our system is simpler than the solutions proposed in [11–13]. In DF, all the pixels whose depth value differs from the MaxHeight value by less than 200 mm are set as belonging to the floor surface, thus obtaining a modified depth frame (DFm). This range is empirically evaluated, and it depends on the MaxHeight value. When the sensor cannot evaluate the depth information for some pixels, such as those corresponding to corners, shadowed areas, and dark objects, it assigns them a null depth value.
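The floor-segmentation step described above reduces to a simple threshold on each pixel's distance from MaxHeight. A minimal sketch, assuming NumPy arrays of depth values in millimeters (the paper does not specify what value floor pixels receive; setting them to MaxHeight is our assumption):

```python
import numpy as np

MAX_HEIGHT = 3000   # sensor-to-floor distance in mm (MaxHeight)
FLOOR_RANGE = 200   # empirical tolerance in mm, tied to MaxHeight

def segment_floor(df):
    """Build the modified depth frame DFm: every pixel whose depth
    lies within FLOOR_RANGE of MaxHeight is marked as floor."""
    floor_mask = np.abs(MAX_HEIGHT - df.astype(np.int64)) < FLOOR_RANGE
    dfm = df.copy()
    dfm[floor_mask] = MAX_HEIGHT   # assumption: flatten floor pixels
    return dfm, floor_mask
```

Any pixel above the floor by more than the 200 mm tolerance (e.g. a person or a piece of furniture) survives the mask and remains available for later segmentation.
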

There are various approaches to handle null depth values: differently from [9], where the null values are discarded, in [14] the authors propose a substitution process, and in [15] a so-called "flood-fill" algorithm is used to resolve this problem. In this work, the null pixels are replaced by the first valid depth value occurring in the same row of the frame.

Figure 2.
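The row-wise replacement of null pixels can be sketched as follows. The paper does not state which valid value counts as "first" when the null pixel precedes all valid ones; here we assume the first valid value in reading order of the row:

```python
import numpy as np

def fill_null_pixels(df):
    """Replace each null (0) depth value with the first valid depth
    value occurring in the same row of the frame (our reading of the
    rule; rows with no valid pixel are left unchanged)."""
    out = df.copy()
    for r in range(out.shape[0]):
        row = out[r]
        valid = row[row > 0]
        if valid.size:
            row[row == 0] = valid[0]
    return out
```

This is cheaper than a flood-fill and needs no neighborhood analysis, at the cost of occasionally propagating a depth value across an unrelated region of the row.
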
