Pre-Trained Variational Autoencoder Approaches for Generating 3D Objects from 2D Images
| dc.contributor.author | Serin, Zafer | |
| dc.contributor.author | Yüzgeç, Uğur | |
| dc.contributor.author | Karakuzu, Cihan | |
| dc.date.accessioned | 2025-05-20T18:47:18Z | |
| dc.date.issued | 2024 | |
| dc.department | Bilecik Şeyh Edebali Üniversitesi | |
| dc.description | 2nd International Congress of Electrical and Computer Engineering, ICECENG 2023 -- 22 November 2023 through 25 November 2023 -- Bandirma -- 309799 | |
| dc.description.abstract | In this study, we focus on 3D-VAE-GAN models, a combination of generative adversarial networks (GANs) and variational autoencoders (VAEs) for generating 3D objects from 2D images. Specifically, we explore the use of several pre-trained convolutional neural networks (CNNs) as encoder networks in the VAE component of the structure. These pre-trained models are DenseNet121, EfficientNetB0, RegNet16, and ResNet18. Additionally, a standard CNN with a fully connected layer and a CNN without a fully connected layer were also used. For training, the binary cross-entropy loss function is used for the generator and discriminator networks of the GAN model, while the Kullback–Leibler divergence is used for the encoder network of the VAE model. The training and testing stages use the chair category of the ShapeNet dataset. 3D objects are represented as voxels, a format well suited to neural networks and deep learning methods. Our approach generates a 3D object from a single 2D input image. Tests and evaluations show that using pre-trained networks as the VAE encoder yields very successful results. The average Kullback–Leibler divergence values obtained were 1129.660 for RegNet16, 1219.067 for ResNet18, 1352.815 for EfficientNetB0, 1538.489 for the CNN without a fully connected layer, 1696.749 for the CNN with a fully connected layer, and 2893.807 for DenseNet121. The pre-trained RegNet16 outperforms the other methods. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. | |
| dc.identifier.doi | 10.1007/978-3-031-52760-9_7 | |
| dc.identifier.endpage | 101 | |
| dc.identifier.isbn | 978-303152759-3 | |
| dc.identifier.issn | 2522-8595 | |
| dc.identifier.scopus | 2-s2.0-85189565116 | |
| dc.identifier.scopusquality | Q3 | |
| dc.identifier.startpage | 87 | |
| dc.identifier.uri | https://doi.org/10.1007/978-3-031-52760-9_7 | |
| dc.identifier.uri | https://hdl.handle.net/11552/6293 | |
| dc.indekslendigikaynak | Scopus | |
| dc.language.iso | en | |
| dc.publisher | Springer Science and Business Media Deutschland GmbH | |
| dc.relation.ispartof | EAI/Springer Innovations in Communication and Computing | |
| dc.relation.publicationcategory | Konferans Öğesi - Uluslararası - Kurum Öğretim Elemanı | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.snmz | KA_Scopus_20250518 | |
| dc.subject | 3D reconstruction | |
| dc.subject | Generative adversarial networks | |
| dc.subject | Pre-trained networks | |
| dc.title | Pre-Trained Variational Autoencoder Approaches for Generating 3D Objects from 2D Images | |
| dc.type | Conference Object |
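The abstract names the two training losses used in the 3D-VAE-GAN: binary cross-entropy for the GAN's generator and discriminator, and the Kullback–Leibler divergence for the VAE encoder. A minimal NumPy sketch of these two losses follows; the latent dimension, batch size, and variable names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    # Binary cross-entropy, as used for the GAN generator and
    # discriminator networks (predictions are probabilities in (0, 1)).
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def kl_divergence(mu, log_var):
    # KL divergence between the encoder's Gaussian q(z|x) = N(mu, sigma^2)
    # and the standard normal prior N(0, I), summed over latent dimensions:
    # KL = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    return -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var))

# Illustrative values (hypothetical 200-dim latent code):
rng = np.random.default_rng(0)
mu = rng.normal(size=200) * 0.1
log_var = rng.normal(size=200) * 0.1

d_loss_real = bce_loss(np.full(8, 0.9), np.ones(8))  # discriminator on real voxels
encoder_kl = kl_divergence(mu, log_var)              # VAE encoder regularizer
```

The reported per-model scores in the abstract are averages of this KL term over the test set, so a lower value (RegNet16's 1129.660) indicates an encoder whose latent distribution stays closest to the prior.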