SeqViews2SeqLabels: Learning 3D global features via aggregating sequential views by RNN with attention
Han Z. (4); Shang M. (4); Liu Z. (2); Vong C.-M. (3); Liu Y.-S. (4); Zwicker M. (1); Han J. (2); Chen C.L.P. (3)
2019-02-01
Source Publication: IEEE Transactions on Image Processing
ISSN: 1057-7149
Volume: 28, Issue: 2, Pages: 658-672
Abstract

Learning 3D global features by aggregating multiple views has been introduced as a successful strategy for 3D shape analysis. In recent deep learning models with end-to-end training, pooling is a widely adopted procedure for view aggregation. However, pooling merely retains the max or mean value over all views, which disregards the content information of almost all views and also the spatial information among the views. To resolve these issues, we propose Sequential Views To Sequential Labels (SeqViews2SeqLabels) as a novel deep learning model with an encoder-decoder structure based on recurrent neural networks (RNNs) with attention. SeqViews2SeqLabels consists of two connected parts, an encoder-RNN followed by a decoder-RNN, that aim to learn the global features by aggregating sequential views and then performing shape classification from the learned global features, respectively. Specifically, the encoder-RNN learns the global features by simultaneously encoding the spatial and content information of sequential views, which captures the semantics of the view sequence. With the proposed prediction of sequential labels, the decoder-RNN performs more accurate classification using the learned global features by predicting sequential labels step by step. Learning to predict sequential labels provides more and finer discriminative information among shape classes to learn, which alleviates the overfitting problem inherent in training using a limited number of 3D shapes. Moreover, we introduce an attention mechanism to further improve the discriminative ability of SeqViews2SeqLabels. This mechanism increases the weight of views that are distinctive to each shape class, and it dramatically reduces the effect of selecting the first view position. Shape classification and retrieval results under three large-scale benchmarks verify that SeqViews2SeqLabels learns more discriminative global features by more effectively aggregating sequential views than state-of-the-art methods.
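The attention-weighted view aggregation the abstract describes can be sketched as follows. This is an illustrative NumPy sketch under assumed names and shapes, not the paper's implementation: `view_feats` stands in for the encoder-RNN's per-view hidden states, `query` for a decoder state, and `W` for a learned alignment matrix. The attention scores each view, normalizes the scores with a softmax so distinctive views get higher weight, and forms the global feature as a weighted sum instead of a max or mean pool.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: weights are positive and sum to 1.
    e = np.exp(x - x.max())
    return e / e.sum()

def attend_and_aggregate(view_feats, query, W):
    # view_feats: (V, d) per-view features from the encoder-RNN
    # query:      (d,)   decoder state used to score the views
    # W:          (d, d) learned bilinear alignment matrix (assumed form)
    scores = view_feats @ W @ query      # (V,) one alignment score per view
    alpha = softmax(scores)              # (V,) attention weights over views
    global_feat = alpha @ view_feats     # (d,) weighted sum, not max/mean pool
    return global_feat, alpha

# Toy usage: 12 views, 8-dimensional features.
rng = np.random.default_rng(0)
V, d = 12, 8
views = rng.normal(size=(V, d))
q = rng.normal(size=d)
W = rng.normal(size=(d, d))
g, alpha = attend_and_aggregate(views, q, W)
```

Unlike max or mean pooling, every view contributes to `g` in proportion to its weight, which is what lets the model down-weight uninformative views and reduce sensitivity to the first view position.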

Keywords: 3D Feature Learning; Attention; RNN; Sequential Labels; Sequential Views; View Aggregation
DOI: 10.1109/TIP.2018.2868426
URL: View the original
Indexed By: SCI
Language: English
WOS Research Area: Computer Science; Engineering
WOS Subject: Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
WOS ID: WOS:000446255300010
Cited Times [WOS]: 4
Document Type: Journal article
Collection: Department of Computer and Information Science
Affiliations:
1. University of Maryland
2. Northwestern Polytechnical University
3. Universidade de Macau
4. Tsinghua University
Recommended Citation
GB/T 7714: Han Z., Shang M., Liu Z., et al. SeqViews2SeqLabels: Learning 3D global features via aggregating sequential views by RNN with attention[J]. IEEE Transactions on Image Processing, 2019, 28(2): 658-672.
APA: Han, Z., Shang, M., Liu, Z., Vong, C.-M., Liu, Y.-S., ... & Chen, C. L. P. (2019). SeqViews2SeqLabels: Learning 3D global features via aggregating sequential views by RNN with attention. IEEE Transactions on Image Processing, 28(2), 658-672.
MLA: Han, Z., et al. "SeqViews2SeqLabels: Learning 3D global features via aggregating sequential views by RNN with attention." IEEE Transactions on Image Processing 28.2 (2019): 658-672.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.