Published Versions 1 Vol 2 (4) : 513–528 2020
Deep Learning, Feature Learning, and Clustering Analysis for SEM Image Classification
: 2019 - 09 - 06
: 2020 - 07 - 27
: 2020 - 07 - 30
Abstract & Keywords
Abstract: In this paper, we report on our recent work aimed at improving and adapting machine learning algorithms to automatically classify nanoscience images acquired with the Scanning Electron Microscope (SEM). This is done by coupling supervised and unsupervised learning approaches. We first investigate supervised learning on a ten-category data set of images and compare the performance of the different models in terms of training accuracy. Then, we reduce the dimensionality of the features through autoencoders to perform unsupervised learning on a subset of images within a selected scale range (from 1 µm to 2 µm). Finally, we compare different clustering methods to uncover intrinsic structures in the images.
Keywords: Neural networks; Feature learning; Clustering analysis; Scanning Electron Microscope (SEM); Image classification
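Comparing clustering methods, as the abstract describes, requires some score of agreement between a clustering and the known image categories. The following is a minimal sketch, assuming normalized mutual information (NMI) as that score; the abstract does not name the metric used, and the function name and toy labels are hypothetical.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_a, labels_b):
    """Return NMI = 2 * I(A;B) / (H(A) + H(B)), a value in [0, 1].

    labels_a and labels_b assign each sample to a group, e.g. cluster
    ids from a clustering method and annotated image categories.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("label sequences must not be empty")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))  # contingency table

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    h_a, h_b = entropy(count_a), entropy(count_b)
    if h_a + h_b == 0.0:
        # Both partitions put every sample in one group (assumed convention).
        return 1.0

    mutual_info = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )
    return 2.0 * mutual_info / (h_a + h_b)


# Hypothetical toy data: annotated categories and two candidate clusterings.
categories = ["fibres", "fibres", "particles", "particles"]
clustering_1 = [0, 0, 1, 1]  # same grouping as the categories, other names
clustering_2 = [0, 1, 0, 1]  # grouping unrelated to the categories

score_1 = normalized_mutual_information(categories, clustering_1)
score_2 = normalized_mutual_information(categories, clustering_2)
print(score_1, score_2)
```

The score depends only on how samples are grouped, not on the names of the groups, so a clustering that reproduces the annotated categories scores close to 1 and an unrelated one scores close to 0.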
This work has been done within the NFFA-EUROPE project and has received funding from the European Union's Horizon 2020 Research and Innovation Program under grant agreement No. 654360 NFFA-EUROPE. The authors thank A. Cazzaniga for his contribution in preparing the final version of the plots.
Article and author information
Cite As
R. Aversa, P. Coronica, C. De Nobili & S. Cozzini. Deep learning, feature learning, and clustering analysis for SEM image classification. Data Intelligence 2(2020). doi: 10.1162/dint_a_00062
Rossella Aversa
R. Aversa wrote and revised the manuscript. She planned the overall work and performed the supervised approach.
Rossella Aversa earned a PhD in Astrophysics at the International School for Advanced Studies (SISSA-ISAS) in Trieste, Italy. In the same city, she completed her studies with a Master in High Performance Computing (MHPC) and worked as a postdoc at CNR-IOM for three years, gaining experience in machine learning techniques and data management. She is currently employed at Karlsruhe Institute of Technology (KIT) in Karlsruhe, Germany.
Piero Coronica
P. Coronica performed the unsupervised approach and reviewed the manuscript.
Piero Coronica holds a PhD in Geometry from the International School for Advanced Studies (SISSA-ISAS, Trieste) combined with further postgraduate studies in High Performance Computing and data science. He currently works as part of the HPC team at the University of Cambridge’s Research Computing Services, assisting national and international research groups to carry out academic work. The projects he has been involved in apply state-of-the-art machine learning techniques to advance active research fields ranging from astronomy to digital humanities and medical imaging.
Cristiano De Nobili
C. De Nobili performed the supervised approach and reviewed the manuscript.
Cristiano De Nobili is a theoretical particle physicist. After his PhD in statistical physics at the International School for Advanced Studies (SISSA-ISAS) (2016) and a further Master in High Performance Computing (MHPC), he has been involved in deep learning. Starting from computer vision, he is now a senior deep learning scientist in the field of Natural Language Processing, working on Samsung's virtual assistant. Moreover, Cristiano is a machine/deep learning instructor for several master's programs in both the private and academic sectors. He is an active speaker and recently gave a TEDx talk on Artificial Intelligence.
Stefano Cozzini
S. Cozzini planned and supervised the overall work and reviewed the manuscript.
Stefano Cozzini is presently director of the Institute of Research and Technologies at Area Science Park, where he coordinates several scientific infrastructures and projects at national and international level. He has more than 20 years’ experience in the area of scientific computing and HPC/Data e-infrastructures. His main scientific interests are scientific computing and machine learning techniques applied to scientific data management. He is presently actively involved in the master’s degree program in Data Science and Scientific Computing at the University of Trieste, Italy.
Publication records
Published: Dec. 17, 2020 (Versions 1)
Data Intelligence