Summary:
Schizophrenia is still diagnosed primarily through clinical assessment, which motivates the search for quantifiable digital phenotypes. Prior studies suggest that skeletal differences are associated with schizophrenia, leading us to explore whether bone structure contains detectable disease-related digital signals. To investigate this possibility, we developed an end-to-end framework for recovering image-derived digital phenotypes from routine DXA scans of the non-dominant distal forearm. We segmented the bone region of interest (ROI) in a de-identified, multi-center dataset of these scans using a convolutional neural network (CNN), then extracted comprehensive radiomics features that capture dynamic descriptors. From complementary visual representations, we formed a 15-channel visual stack. The radiomics features were modeled using Random Forest and XGBoost, and the visual stack using a lightweight multi-channel CNN. We interpreted the models via TreeSHAP, Grad-CAM, and perturbation-based analyses. On the test set, XGBoost achieved an F1 score of 0.861 and Random Forest achieved 0.876. Our convergent importance analyses identified a compact, biologically plausible subset of features (orientation-sensitive texture, rotation-invariant shape, and multi-scale wavelet detail), localized by Grad-CAM. Morphology channels exhibited low mean saliency but caused large performance drops when perturbed, indicating a supportive role in preserving bone geometry and spatial context. Through these experiments, we demonstrate that routine DXA scans encode associative phenotypes related to schizophrenia that are recoverable with transparent, auditable pipelines.
Because covariates such as BMI, medication, smoking, and activity were missing, the results are hypothesis-generating. They motivate further validation and multimodal fusion with neuroimaging and behavioral data to assess generalizability and mechanistic relevance.