Mass Prediction from 2D Images

Yılmaz İ. C. S., EFE M. Ö.

6th Mediterranean Conference on Pattern Recognition and Artificial Intelligence, MedPRAI 2024, İstanbul, Türkiye, 18-19 October 2024, vol. 1393 LNNS, pp. 1207-1222 (Full-Text Paper)

Publication Type: Conference Paper / Full-Text Paper
Volume: 1393 LNNS
DOI: 10.1007/978-3-031-90893-4_81
City of Publication: İstanbul
Country of Publication: Türkiye
Pages: pp. 1207-1222
Keywords: 3D object reconstruction, Computer vision, Deep learning, Depth map prediction, Feature extraction, Machine learning, Regression models, U-Net architecture, Volume estimation
Affiliated with Hacettepe University: Yes

Abstract

This study aims to predict depth maps and estimate object volume accurately from 2D images. Prediction from 2D data is a common problem in medical imaging, robotics, and computer vision, and traditional methods have struggled to generalize across diverse scenarios and datasets. To overcome these limitations, a novel method combining deep learning with feature extraction was devised. A U-Net model, whose convolutional architecture captures complex spatial hierarchies, predicted depth maps more accurately. Feature extraction with Histogram of Oriented Gradients (HOG) and Oriented FAST and Rotated BRIEF (ORB) improved the model's object volume estimation. The proposed approach was validated on a custom Blender-generated dataset and the NYU Depth V2 dataset. In tests with Random Forest, Support Vector Regression (SVR), and Gradient Boosting Machine (GBM), the proposed technique outperformed them, with improvements observed in both the R-squared (R²) and Mean Squared Error (MSE) metrics. The results demonstrate the effectiveness of combining deep learning with traditional feature extraction and regression models, paving the way for more accurate volume estimation from 2D images. The study presents a robust framework that improves depth and volume estimation accuracy and is scalable and adaptable across domains. The findings indicate that the proposed approach may be applied to other tasks that require precise 3D reconstruction from 2D data, which is significant for future research.
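
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how MSE and R-squared are conventionally computed for a set of volume predictions. It is not the authors' code: the function names and the sample values are hypothetical and chosen only for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical volumes in arbitrary units; these are NOT results from the study.
true_volumes = [1.0, 2.0, 3.0, 4.0]
predicted_volumes = [1.1, 1.9, 3.2, 3.8]

mse = mean_squared_error(true_volumes, predicted_volumes)
r2 = r_squared(true_volumes, predicted_volumes)
print(f"MSE = {mse:.4f}, R^2 = {r2:.4f}")
```

A lower MSE and an R-squared closer to 1 both indicate that the predicted volumes track the true volumes more closely, which is the sense in which the abstract reports improvements.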