LIDAR Image Segmentation Using Hierarchical Clustering

Tahereh Sharafi, Gholamreza Akbarizadeh

American Journal of Computing Research Repository

LIDAR Image Segmentation Using Hierarchical Clustering

Tahereh Sharafi1,, Gholamreza Akbarizadeh2

1Islamic Azad University Fasa Branch

2Shahid Chamran University of Ahvaz


Hierarchical clustering method is adopted for LIDAR image segmentation after extracting the intended features for identifying complex objects. In the experiments, four LIDAR images with different numbers of areas (sea, forest, desert, and urban) were used for examining the algorithm. The efficiency of image segmentation was generally evaluated visually because the segments of the main image typically lack certain, fixed features and depend on the criterion used for pattern distance/similarity as well as the threshold for cluster separation. For each experiment, hierarchical clustering method was employed by creating the clustering hierarchy tree and specifying the optimal number of clusters on the basis of tree data. Once the optimal number of clusters was determined, the similarity matrix of data image patterns was separated according to Euclidean distance algorithm in terms of greatest similarity among the patterns to the number of clusters. Clustering was then performed. The program output comprised labeled images for samples specifying which pattern pertains to which cluster. The images associated with each cluster are displayed separated from other clusters with other areas eliminated. The results indicated that for LIDAR images that lack a certain, fixed feature, the hierarchical clustering method for segmentation can perform separation and labeling.

Cite this article:

  • Tahereh Sharafi, Gholamreza Akbarizadeh. LIDAR Image Segmentation Using Hierarchical Clustering. American Journal of Computing Research Repository. Vol. 4, No. 1, 2016, pp 21-29.
  • Sharafi, Tahereh, and Gholamreza Akbarizadeh. "LIDAR Image Segmentation Using Hierarchical Clustering." American Journal of Computing Research Repository 4.1 (2016): 21-29.
  • Sharafi, T. , & Akbarizadeh, G. (2016). LIDAR Image Segmentation Using Hierarchical Clustering. American Journal of Computing Research Repository, 4(1), 21-29.
  • Sharafi, Tahereh, and Gholamreza Akbarizadeh. "LIDAR Image Segmentation Using Hierarchical Clustering." American Journal of Computing Research Repository 4, no. 1 (2016): 21-29.

Import into BibTeX Import into EndNote Import into RefMan Import into RefWorks

At a glance: Figures

1. Introduction

LIDAR technology was introduced as an effective tool for collecting 3D data from complex surfaces, such as an urban residential area. These data constitute the base of such extensive applications as urban 3D modeling, urban management, and communications. Practical applications and analyses have demonstrated that the variety of 3D objects and the great number of locations and communications in images require the adoption of a system for data organization with regard to information extraction. Depending on any application, this can turn out to be highly effective and useful. In the existing methods, building, forest, desert, and sea data are normally separated from one another. The 3D model is reconstructed through modeling different areas and their boundaries and the processing computations pertaining to different areas are carried out. Therefore, in such applications, the important point is to identify the boundary of areas and the differences in their features. In this respect, the first step is to extract information and area features and subsequently separate area patterns in terms of the differences in their features.

High-resolution satellite images contains adequate locational information and details for depicting surface characteristics pertaining to geological sciences such as faults, notches, gaps, and cliffs making possible the analysis of subsections. Contrary to low-resolution images, the intended textures on the earth can be considered in the form of a series of complex patterns including the objects of our interest in these images [1]. The visual analysis of these images is very time-consuming where errors are likely to occur. Thus, an analytical tool for extracting the intended features (such as the intended objects) from these images is sought for [2]. Numerous tools for feature extraction and acquiring the necessary information from these images have been introduced including clustering algorithms and LIDAR data clustering algorithms [3]. Clustering techniques are widely used for area separation and image segmentation and they have yielded satisfactory results.

The use of images with a great number of locational resolutions brings about a series of successive relationships between their radiometric information [4]. These considerations compel us to try to make use of both of these ideas, i.e. numerous images and various locational resolutions for images [5]. In particular, what is intended is the use of radiometric and locational enrichment advantages for proposing multi-resolution representation of data leading to an unsupervised hierarchical feature extraction method [6]. Hence, the capabilities of this method is investigated. Then, feature extraction and image segmentation is put forward aiming to detect (or in other words segment) images using hierarchical clustering technique and extract the intended information.

The edge points are extracted from the depth images produced by the cloud of LIDAR points. They were then classified in terms of their geometry for separating buildings from other features [7]. Image processing techniques were applied to the depth images created by the irregular cloud of LIDAR points. Such criteria as vegetation index and surface normal variances were used for extracting building points [8]. Filtering methods of the cloud of laser points and surface roughness parameters were respectively utilized for separating earth from non-earth points and the plants and trees from buildings [9]. Certain techniques such as K-means clustering (Maboudi, 2005), Fuzzy C-means (Sattari, 2011), surface expansion method [10], as well as area expansion method based on such criteria as surface parameters such as slope in x- and y-direction [11], 2D probability distribution criteria of depth, and 3D probability distribution of points [12], as well as the normal vector [13] were also adopted for segmentation. The separation and incorporation process is employed according to a three-fold data structure span in the 3D space on the basis of the coplanarity condition for the set of points for implementing coplanar points [14]. Different techniques were proposed for extracting building edges. Provided that roof surfaces intersect, the edge of the building roof is usually extracted using this intersection. For extracting the outer edge of the building (i.e. the boundary), other methods were proposed by various researchers [15].

Put forward a hybrid method for classifying trees in LIDAR data using an aggregation of point-based supervised classification and area-based unsupervised clustering techniques. In addition, they proposed a powerful 3D statistical feature method that could overcome the limitations of the existing methods in separating the boundary points of the buildings from tree points. Their proposed method was presented for solving the problem of incorrect classification of the boundary points of buildings and trees. It consisted of three steps: 1- clustering on the basis of continuity and locational similarity, 2- classification using supervised machine learning techniques using a 3D formal feature, and 3- classification improvement and modification using an unsupervised clustering technique. They used SVM algorithm with an RBF kernel for classification and desirable results were obtained from the hybrid method [16].

An object-based system for combining and extracting features from LIDAR images was presented. In spite of the application of LIDAR image technology and other locational information recording equipment of urban regions, a regional coverage with a high resolution is still lacking in many urban regions. The complexity of many existing landscapes, the available bulk of data, and the challenge of fusing the data recorded at different times and with various standards may account for this fact. Object-based techniques are appropriate for overcoming these limitations. The design, development, and application of an object-based system is put forward, which employs LIDAR images and data for creating an inclusive set of data with several million pixels for Philadelphia. A new method that adopts parallel processing enables us to distribute the feature extraction burden to several cores so that a high efficiency and the possibility of the continuous changes of the expert system would be facilitated until achieving the precise objectives of the project [17].

Numerous papers and references may be found that have used clustering techniques for extracting and classifying LIDAR data. In a paper titled “LIDAR Data Clustering using Particle Swarm Algorithm in Urban Regions,” a method based on data clustering was put forward, where the use of the clustering technique and also the complexity and diversity of objects within the image in urban regions including buildings, streets, trees, etc. were justified. Taking into account the possibility to access other information sources such as laser intensity and range information in the first and last echoes, more information may be utilized in the clustering process, so that they would be used for reconstructing and recognizing the intended objects and items. The multi-dimensionality property of LIDAR data with a high sampling rate in urban regions provides a huge bulk of information. The large volume of information increases the complexity of the problem for optimally finding the existing structure in clustering methods. In the paper by Samadzadegan and Saeedi, PSO algorithm was adopted for optimizing the clustering process of LIDAR multi-dimensional data in urban regions. The simultaneous use of the simple k-means algorithm and the capabilities of PSO algorithm, a global clustering technique without getting caught in local optimums was obtained. The proposed algorithm was applied to the LIDAR data of urban regions with different sizes and complexities, confirming the increased efficiency results of clustering using PSO algorithm in terms of accuracy and time.

To extract the intended features and segment LIDAR images, various clustering algorithms were combined and employed for identifying complex objects. It may be stated as a result that the majority of common clustering techniques, which were used for solving other different problems with limited bulk of data and feature space and yielded desirable results, will not exhibit proper efficiency for the process of extracting the intended feature and object from LIDAR data. Thus, an efficient, suitable method for this purpose was sought for leading to hierarchical methods that function as a local optimization. Adopting a hierarchical clustering method for analyzing and segmenting LIDAR images is one of the major innovations of this paper.

Hierarchical clustering creates a hierarchical structure of the clusters or, in other words, a tree of clusters known as dendrogram. Each node is a cluster that consists of child clusters. The child nodes partition the points covered by the parent cluster. Such methods allow us to review data at different magnification levels. Hierarchical clustering has the following advantages: 1- the intrinsic flexibility of this method in relation to the magnification level, 2- the simplicity of using any similarity/distance criterion between clusters, and 3- its applicability to any attribute. The disadvantages of this type of clustering are as follows: 1- the vagueness of the end condition, and 2- a great number of hierarchical clustering algorithms fail to examine the created clusters to improve the results. Since the number of clusters in LIDAR images is limited to 4 or 5 areas at most, the first disadvantage is not allowed for in implementation. Due to the use of top-down partitioning and the lack of need for examining LIDAR image areas, which is owing to the lack of a considerable overlap among areas, the second disadvantage is also alleviated during implementation, causing no problem.

2. Research Methodology

The hierarchical clustering method for segmenting LIDAR images using a hierarchical tree and the determination of the optimal number of clusters based on tree data is explained in this section. The proposed method is implemented using MATLAB. Hierarchical clustering techniques function on the basis of an increase in the number of clusters to more than 2 and an investigation of clustering efficiency at any stage of cluster separation. Finally, the optimal number of the clusters is obtained. The algorithm of the proposed method is as follows:

Step 1 – The input to this code is LIDAR images containing different areas such as forests, deserts, seas, and urban regions. This step is the introduction of the image to the program.

Step 2 – The input image format is converted from RGB to L*a*b, and ab matrix is created, which is the vectorial feature matrix.

Step 3 – Determining the criterion for calculating distance/similarity for the patterns within the image. Either Euclidean distance or cosine distance may be selected in the proposed algorithm. For certain geometries, Euclidean distance is more appropriate, whereas for some others, where the clusters are intertwined with more complex shapes, cosine distance better expresses similarity.

Step 4 – Similarity matrix (which is defined as the inverse of distance) is developed for each of the two Euclidean and cosine distance criteria. Then, the tree of the hierarchy of clusters is formed and displayed.

Step 5 – Once the hierarchy of clusters is formed, the optimal number of clusters is asked from the user according to the found data representation tree.

3. Findings

Four samples of LIDAR images were considered for experiments. The segmentation of these images is explained in what follows using the proposed method. Image samples may be seen in Figure 1.

Figure 2 shows how the proposed algorithm is run. Once image name is entered as program input in the first mode by entering 1, the type of pattern distance is determined to be Euclidean distance in the image space. In this case, the hierarchy of clusters is displayed as Figure 3 with Euclidean space and Figure 4 with cosine distance.

Figure 3. Hierarchy of clusters by entering Figure 1-a with Euclidean distance
Figure 4. Hierarchy of clusters by entering Figure 1-a with cosine distance

The basis of hierarchical clustering is the selection of two clusters and then dividing each to two other clusters. The number of clusters is optimal where the distance between surface nodes and the nodes of the next surface is relatively big or, in other words, the length of tree distances is greater compared to other lengths. In certain images where an optimal number of clusters is lacking according to the image details, the selection of the optimal number of clusters based on the tree is subject to the user’s view. Clustering is performed according to the number of clusters selected by the user. With a surface cut of the hierarchy of clusters, the number of lines intersected by a lateral cut indicates the number of selected clusters. Afterward, the samples are labeled specifying which pattern (pixel in image) belongs to which cluster. Different sections/clusters are separated from one another. Upon displaying the segmented image, the color of each image changes compared to other clusters and is displayed at the end of the program. What may be realized through Figure 3 and Figure 4 is that the optimal number of clusters may be considered 2 or 3 or itself. At this stage, the program asks the user the number of clusters. Entering 4 as the selected number of clusters, the figure is segmented to 4 different sections. The process of program execution may be seen in Figure 5, where user is being asked about the number of clusters.

Figure 5. Program execution process while asking the number of clusters from the user

By entering 4 as the number of clusters, the program displays the outputs of Figure 6.

Figure 6. Program output for the segmentation of Figure 1-a with 4 clusters

To segment Figure 1-b, the program is run by entering the name of this figure. The hierarchy of clusters for this figure may be seen as Figure 7 with Euclidean distance and Figure 8 with cosine distance.

Figure 7. Hierarchy of clusters by entering Figure 1-b with Euclidean distance
Figure 8. Hierarchy of clusters by entering Figure 1-b with cosine distance

Selecting the number of clusters as 3 and entering 3 at this stage, the program output would be as Figure 9.

Figure 9. Program output for Figure 1-b with 3 selected clusters

For Figure 1-c, the hierarchy of clusters may be seen with Euclidean distance in Figure 10 and with cosine distance in Figure 11.

Figure 10. Hierarchy of clusters by entering Figure 1-c with Euclidean distance
Figure 11. Hierarchy of clusters by entering Figure (1-c) with cosine distance

Given the diagram of the hierarchy of clusters for Figure 1-c, 2 clusters are selected for segmenting this image. Entering 2 as the number of clusters, program output at the next stage would be as Figure 12.

Figure 12. Program output for Figure (1-c) with 2 selected clusters

For segmenting Figure 1-d, the name of the figure is given as program input and the program is run. The hierarchy of clusters for this figure may be seen with Euclidean distance in Figure 13 and with cosine distance in Figure 14.

Figure 13. Hierarchy of clusters by entering Figure (1-d) with Euclidean distance
Figure 14. Hierarchy of clusters by entering Figure 1-d with cosine distance

Given the hierarchy of clusters demonstrated in Figure 13 and Figure 14, the number of optimal clusters is selected to be 3. Entering 3 as the number of clusters at the next stage of running the algorithm, the program displays the outputs as in Figure 15.

Figure 15. Program output for Figure (1-d) with 3 selected clusters

4. Discussion and Conclusion

The proposed scheme of the hierarchy of clusters basically aims to determine the optimal number of clusters in the image. In the experiments, four LIDAR images with different number of areas (sea, forest, desert, and urban) were used for investigating the algorithm performance. For each image under experiment in the hierarchy of clusters, the number of clusters, i.e. the number of image sections, for LIDAR images was selected as per the number of areas existing in the image. Accordingly, the optimal number of clusters in the image was estimated.

The experiment results indicate the capability of the hierarchical clustering technique for LIDAR image segmentation. The efficiency of image segmentation was generally evaluated visually because the sections of the main image typically lack certain, fixed features and depend on the criterion used for pattern distance/similarity as well as the threshold for cluster separation. For future research in this field, it may be proposed that an algorithm be developed for identifying the optimal number of clusters according to the hierarchical tree along with the hierarchical clustering technique, in a way that the algorithm fundamentals may be run automatically. Thus, in this case, the program may be adopted for a huge bulk of data or in online processing of video images. This code may be utilized as a developed software together with imaging systems for LIDAR image data online processing, or any other type of image.


[1]  Fiorucci, F., Cardinali, M., Carlà, R., Rossi, M., Mondini, A.C., Santurri, L., Ardizzone, F., Guzzetti, F., 2011. Seasonal landslide mapping and estimation of landslide mobilization rates using aerial and satellite images. Geomorphology 129, 59-70.
In article      View Article
[2]  Galli, M., Ardizzone, F., Cardinali, M., Guzzetti, F., Reichenbach, P., 2008. Comparing landslide inventory maps. Geomorphology 94, 268-289.
In article      View Article
[3]  Blaschke, T., 2010. Object based image analysis for remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing 65, 2-16.
In article      View Article
[4]  Benediktsson, J.A., Bruzzone, L., Chanussot, J., Dalla Mura, M., Salembier, P., Valero, S., 2011. Hierarchical analysis of remote sensing data: morphological attribute profiles and binary partition trees. In: Soille, P., Pesaresi, M., Ouzounis, G.K. (Eds.), Proceedings of the International Symposium on Mathematical Morphology – ISMM. Springer, pp. 306-319.
In article      View Article
[5]  Li, Xiaoling, Wenjun Zeng, and Ye Duan. "A hybrid approach for tree classification in airborne LIDAR data." In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pp. 2183-2187. IEEE, 2013.
In article      View Article
[6]  Thiery, Y., Malet, J.P., Sterlacchini, S., Puissant, A., Maquaire, O., 2007. Landslide susceptibility assessment by bivariate methods at large scales: application to a complex mountainous environment. Journal of Geomorphology 92, 38-59.
In article      View Article
[7]  Wang, z. and schenk, t. (2000). “Building extraction and reconstruction from lidar data”. international archives of photogrammetry and remote sensing. vol. xxxiii, part b3, Amsterdam.
In article      
[8]  Arefi, H., Engels, J., Hahn, M. and Mayer, H. (2008). “Levels of detail in 3d building reconstruction from lidar data”. International archives of the photogrammetry, Remote sensing and spatial information sciences. vol. xxxvii, part b3b., pp. 485-190.
In article      
[9]  Kabolizadeh, M., Ebadi, H. and Mohammadzadeh, A. (2012). “Design and implementation of an algorithm for automatic 3d reconstruction of building models using genetic algorithm international”. International Journal of Applied Earth Observation and Geoinformation. Vol. 19, pp.104-114.
In article      View Article
[10]  Oude Elberink, S. and Vosselman, G. (2009). “Building reconstruction by target based graph matching on incomplete laser data: analysis and limitations”. Sensors, vol. 9, pp. 6101-6118.
In article      View Article
[11]  Alharty, A. and Bethel, J. (2004) “Detailed building reconstruction from airborne laser data using a moving surface method”. IAPRS international archive of photogrammetry and remote sensing, vol.34, part 3 b3, pp. 213-219.
In article      
[12]  Poullis, C. and You, S. (2009) “Automatic reconstruction of cities from remote sensor data”. Computer Vision and Pattern Recognition, 20-25 Jun, Miami. pp. 2775-2782.
In article      View Article
[13]  Wang, Lu, and CH, Chu. (2009). "3D building reconstruction from LiDAR data." In Systems, Man and Cybernetics, SMC 2009. IEEE International Conference on, pp.3054-3059.
In article      View Article
[14]  Wang, W, and Tseng, Y.H. (2010). “Automatic segmentation of lidar data into coplanar point clusters using anoctree-based split-and-merge algorithm”. Photogrammetric Engineering & Remote Sensing. Vol. 76, No.4, April 2010, pp. 407-420.
In article      View Article
[15]  Arefi, H., and Reinartz, P. (2013). “Building Reconstruction Using DSM and Orthorectified Images. Remote Sensing”. 5(4), pp. 1681-1703.
In article      View Article
[16]  Li, Xiaoling, Wenjun Zeng, and Ye Duan. "A hybrid approach for tree classification in airborne LIDAR data." In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pp. 2183-2187. IEEE, 2013.
In article      View Article
[17]  O’Neil-Dunne, J. P. M., S. W. MacFaden, A. R. Royar, and K. C. Pelletier. 2013. “An Object-Based System for LiDAR Data Fusion and Feature Extraction.” Geocarto International 28: 227-242.
In article      View Article
  • CiteULikeCiteULike
  • MendeleyMendeley
  • StumbleUponStumbleUpon
  • Add to DeliciousDelicious
  • FacebookFacebook
  • TwitterTwitter
  • LinkedInLinkedIn