Abstract
6D object pose estimation is widely applied in robotic grasping, virtual reality, and visual navigation. However, heavy occlusion, changing lighting conditions, and cluttered scenes make the problem challenging. To address these issues, we propose a novel approach that extracts color and depth features from RGB-D images while taking local and global geometric relationships into account. We then apply a graph attention mechanism to fully exploit the relationships between these features and fuse them to predict the 6D pose of a given object. Evaluation results indicate that our method significantly improves the accuracy of the estimated 6D pose and achieves state-of-the-art performance on LineMOD, YCB-Video, and a new dataset. Ablation studies demonstrate the effect of each network module.
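As a rough illustration of the attend-then-fuse idea outlined in the abstract, the following is a minimal sketch and not the authors' network. It assumes per-point color and depth feature vectors are already available, builds a k-nearest-neighbor graph over the 3D points, and uses a parameter-free scaled dot-product attention as a stand-in for a learned graph attention layer. The function names (`knn_indices`, `graph_attention`, `fuse`), the choice of `k`, and fusion by concatenation are all assumptions made for illustration; the pose regression head is omitted.

```python
import math
import random


def knn_indices(points, k):
    """For each 3D point, return the indices of its k nearest neighbours (self excluded)."""
    neighbours = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def graph_attention(features, neighbours):
    """Hypothetical single-head attention without learned weights.

    Each node is replaced by a weighted average of itself and its graph
    neighbours; the weights are a softmax over scaled dot-product similarity.
    """
    dim = len(features[0])
    out = []
    for i, f in enumerate(features):
        idx = [i] + neighbours[i]
        weights = softmax([dot(f, features[j]) / math.sqrt(dim) for j in idx])
        out.append([
            sum(w * features[j][d] for w, j in zip(weights, idx))
            for d in range(dim)
        ])
    return out


def fuse(color_feats, depth_feats, points, k=3):
    """Attend over each modality on a shared k-NN graph, then concatenate per point."""
    neighbours = knn_indices(points, k)
    color_att = graph_attention(color_feats, neighbours)
    depth_att = graph_attention(depth_feats, neighbours)
    return [c + d for c, d in zip(color_att, depth_att)]  # list concatenation


# Toy usage with random stand-in data: 6 points, 4 channels per modality.
random.seed(0)
points = [[random.random() for _ in range(3)] for _ in range(6)]
color = [[random.random() for _ in range(4)] for _ in range(6)]
depth = [[random.random() for _ in range(4)] for _ in range(6)]
fused = fuse(color, depth, points, k=3)
print(len(fused), len(fused[0]))  # 6 points, each with 4 + 4 = 8 fused channels
```

In a trained model the attention scores would come from learned projections, and the fused per-point features would feed a head that regresses rotation and translation; both are left out here to keep the sketch self-contained.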
Original language | English |
---|---|
Title of host publication | 2020 16th International Conference on Control, Automation, Robotics and Vision (ICARCV) |
Publisher | IEEE |
Pages | 529-535 |
Number of pages | 7 |
ISBN (Electronic) | 978-1-7281-7709-0 |
ISBN (Print) | 978-1-7281-7710-6 |
Publication status | Published - 2020 |
Keywords
- Feature extraction
- Three-dimensional displays
- Pose estimation
- Image color analysis
- Image edge detection
- Fuses
- Data mining