Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

Loop closing is an important part of SLAM for autonomous mobile systems.
BoW features can be used for loop searching and 6-DoF loop correction.
BoW3D is a novel method for real-time loop closing in 3D LiDAR SLAM.
BoW3D is efficient, pose-invariant and can be used for accurate point-to-point matching.
BoW3D is tested on public datasets and shows better performance than other state-of-the-art algorithms.
BoW3D takes an average of 48 ms to recognize and correct the loops on KITTI 00.

Paper Content

I. Introduction

Fast and robust loop closing is essential for long-term SLAM
Standard frame-to-frame point registration algorithms may fail due to large drift in pose estimation
Place recognition is done by building a database from images or point clouds
Distance-based association can be used for loop closing with drift in a certain range
BoW is used for efficient image retrieval in visual SLAM
Challenges in 3D LiDAR SLAM due to irregularity, sparsity and disorder of LiDAR point cloud
Proposed loop closing method builds BoW for 3D LiDAR point clouds
Hash table used as basic structure of database
Accurate point-to-point LinK3D matching used to calculate 6-DoF loop pose

Feature extraction

Feature Extraction, Odometry and Mapping of A-LOAM, and Loop Closing are the three modules of the system
BoW3D is embedded in the loop closing thread
Experimental results show that the method can reduce drifts and improve accuracy of 3D LiDAR SLAM
Place recognition and pose graph optimization are two steps of loop closing
Different sensor modalities have been used for loop closing, including camera images and 3D LiDAR points
BoW is suitable for camera SLAM due to its efficiency
BoW compresses image information and builds a tree to speed up loop retrieval
Most existing methods are too time-consuming or cannot provide 6-DoF pose estimation

III. Background Review

A. Review of LinK3D Features

BoW3D is based on the LinK3D feature
LinK3D consists of three parts: keypoint extraction, descriptor generation and feature matching
LinK3D descriptor is represented by a 180-dimension vector
LinK3D is lightweight and takes an average of 32 ms to extract features from the point cloud
LinK3D can be used to achieve accurate point-to-point matching, which enables it to be applied to fast 3D registration

B. Review of Bag of Words

BoW is used to recognize revisited places by retrieving 2D features
BoW creates a visual vocabulary as a tree structure from a training image dataset
BoW converts extracted features of a new image into a low-dimensional vector
Vector contains term frequency and inverse document frequency (tf-idf) score
A word that appears often in the image but rarely in the training dataset receives a higher tf-idf score
Similarity between word and words in database is computed if score is high enough
Inverted index is used to search corresponding images
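
As a rough illustration of the tf-idf weighting described above, the sketch below scores the visual words of one image against a small training set. The function name `tf_idf`, the integer word IDs, and the plain `log(N / n)` form of the inverse document frequency are assumptions made for this example; they are not taken from the paper.

```python
import math
from collections import Counter


def tf_idf(image_words, training_docs):
    """Return {word: tf-idf score} for the visual words of one image.

    image_words   -- list of word IDs found in the image (hypothetical input)
    training_docs -- list of sets, each holding the word IDs of one training image
    """
    counts = Counter(image_words)
    total = len(image_words)
    n_docs = len(training_docs)
    scores = {}
    for word, count in counts.items():
        # Term frequency: share of this word among the image's words.
        tf = count / total
        # Inverse document frequency: words seen in few training images score higher.
        docs_with_word = sum(1 for doc in training_docs if word in doc)
        idf = math.log(n_docs / max(1, docs_with_word))
        scores[word] = tf * idf
    return scores


training_docs = [{1, 2}, {1, 3}, {1, 4}, {1, 5}]
scores = tf_idf([1, 1, 2], training_docs)
```

Here word 1 occurs in every training image, so its score collapses to zero even though it is the most frequent word in the query image, while the rarer word 2 keeps a positive score.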

IV. Methodology

Proposed loop closing system based on BoW3D
System embedded in state-of-the-art A-LOAM
System consists of three parts: extracting LinK3D features, BoW3D encoding, and loop closure detection

A. BoW3D Algorithm

BoW3D algorithm proposed
LinK3D descriptor used, no further conversion needed
Hash table used to build one-to-one mapping between words and places
Retrieval algorithm used to retrieve words and count frequency of each place
Inverse document frequency used to measure difference between number of places
Loop correction used to provide constraint for pose graph optimization
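
The hash-table retrieval described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class `WordPlaceDatabase`, the tuple representation of words, the integer place IDs, and the `min_votes` threshold are all hypothetical. Each word maps to the set of places where it was observed, and a query votes for the place that shares the most words.

```python
from collections import Counter, defaultdict


class WordPlaceDatabase:
    """Toy hash-table database mapping words to the places that contain them."""

    def __init__(self):
        # word (any hashable value) -> set of place IDs
        self.table = defaultdict(set)

    def add(self, place_id, words):
        """Register every word observed at a place."""
        for word in words:
            self.table[word].add(place_id)

    def retrieve(self, query_words, min_votes):
        """Return the place sharing the most words with the query, or None."""
        votes = Counter()
        for word in query_words:
            # .get avoids creating empty entries for unknown words.
            for place in self.table.get(word, ()):
                votes[place] += 1
        if not votes:
            return None
        place, count = votes.most_common(1)[0]
        # Reject weak candidates (min_votes is an assumed threshold).
        return place if count >= min_votes else None


db = WordPlaceDatabase()
db.add(0, [("a", 1), ("b", 2), ("c", 3)])
db.add(1, [("a", 1), ("d", 4)])
best = db.retrieve([("a", 1), ("b", 2), ("x", 9)], min_votes=2)
```

Lookups and insertions are constant time on average, which is the usual reason for choosing a hash table over a vocabulary tree in this role.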

Place set1

BoW3D data structure consists of words and places
Cost function is used to compute the loop
Rotation and translation of the loop transformation are calculated
Update algorithm is proposed to add new words and places to the database
Descriptors are selected based on distance to LiDAR centers
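
The cost function is not spelled out in this summary. A common choice for matched point pairs is the sum of squared residuals between the transformed query points and their matches, and the sketch below evaluates that cost under this assumption. The helper names (`apply_pose`, `registration_cost`, `yaw_rotation`) and the sample points are hypothetical.

```python
import math


def apply_pose(R, t, p):
    """Apply rotation matrix R (3x3 nested lists) and translation t to point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def registration_cost(R, t, matches):
    """Sum of squared distances between transformed points and their matches."""
    cost = 0.0
    for p, q in matches:
        rp = apply_pose(R, t, p)
        cost += sum((rp[i] - q[i]) ** 2 for i in range(3))
    return cost


def yaw_rotation(theta):
    """Rotation about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Synthetic matches generated from a known pose (illustrative values only).
R_true = yaw_rotation(0.3)
t_true = (1.0, -2.0, 0.5)
points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
matches = [(p, apply_pose(R_true, t_true, p)) for p in points]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cost_at_truth = registration_cost(R_true, t_true, matches)
cost_at_identity = registration_cost(identity, (0.0, 0.0, 0.0), matches)
```

The cost vanishes at the true pose and is positive elsewhere, so the rotation and translation that minimize it give the 6-DoF loop pose.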

B. Loop Optimization

Pose graph is built for loop optimization
Pose graph consists of global poses and observation constraints
Cost function is minimized to optimize the pose graph
Points in local map are updated based on optimized global poses

V. Experiments

Evaluated performance of algorithm
Used KITTI dataset with Velodyne HDL-64E S2
11 sequences with ground truth poses
Used Euclidean distance and time difference to determine true positive loop pairs
Experiments performed on notebook with Intel Core i7 @2.2 GHz processor and 16 GB RAM
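
The ground-truth criterion above (Euclidean distance plus time difference) could be checked along these lines. The threshold values are placeholders chosen for the example, since the summary does not state the ones used in the paper, and the `(x, y, z, timestamp)` pose layout is likewise an assumption.

```python
import math


def is_true_loop(pose_a, pose_b, max_dist, min_time_gap):
    """Decide whether two poses form a true positive loop pair.

    Each pose is (x, y, z, timestamp). The pair counts as a loop when the
    positions are close in space but the visits are far apart in time.
    """
    xa, ya, za, ta = pose_a
    xb, yb, zb, tb = pose_b
    dist = math.dist((xa, ya, za), (xb, yb, zb))
    return dist <= max_dist and abs(ta - tb) >= min_time_gap


# Placeholder thresholds: 4 m and 30 s (not taken from the paper).
revisit = is_true_loop((0.0, 0.0, 0.0, 0.0), (1.0, 2.0, 0.0, 120.0), 4.0, 30.0)
neighbour = is_true_loop((0.0, 0.0, 0.0, 0.0), (1.0, 2.0, 0.0, 1.0), 4.0, 30.0)
```

The time condition keeps consecutive scans, which are always spatially close, from being counted as loops.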

A. Place Recognition Performance

Our method outperforms other state-of-the-art LiDAR loop closure detection and place recognition methods
Our method can be used to correct the full 6-DoF loop pose
Our method is based on the LinK3D descriptors, which are pose invariant
Our system forms constraints based on accurate point-to-point matching results

B. Performance on LiDAR-Based SLAM

Evaluated performance of loop closing system used in 3D LiDAR-based SLAM
Verified accuracy of loop correction and whole trajectories using Euclidean distance, Θ and RMSE
Results showed loop closing system can effectively correct cumulative errors and reduce drifts of 3D LiDAR SLAM system

C. Hyperparameters Setup and Robustness Analysis

Performance of BoW3D measured by F1 score and average detection time
As Th_r increases, runtime increases and F1 score remains the same
Setting Th_f less than 5 increases F1 score but requires more time
Setting Th_f larger than 8 reduces robustness of algorithm
Optimal settings for robustness: Th_r = 4, Th_f = 5
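
For reference, the F1 score used in this analysis is the harmonic mean of precision and recall. A minimal sketch, with the function name `f1_score` and the detection counts invented for illustration:

```python
def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall from raw detection counts."""
    if true_pos == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only, not results from the paper.
score = f1_score(true_pos=8, false_pos=2, false_neg=2)
```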

D. System Runtime

Evaluated average runtime of each module in SLAM system after integrating loop closing
Used KITTI 00 dataset with 4K+ LiDAR scans
Set Th_r = 4, Th_f = 5, number of closer features as 5 when adding to database, 3 when retrieving from database
Runtime of each module shown in Fig. 7
Each module operates separately in different threads
Runtime of mapping thread and PGO exceeds 100 ms, but both can be performed online due to their low frequency
BoW3D takes less than 100 ms to process one frame, ensuring real-time performance of the system

VI. Conclusion

Proposed a novel 3D-feature-based bag of words algorithm for place recognition
Consists of three parts: place retrieval, loop correction and database update
Hash table used as overall structure of database
Achieves competitive results compared to state-of-the-art methods
Does not require pre-training or GPU resources
