
RESEARCH ARTICLE

Predictive Analysis of Parkinson's Disease Using Machine Learning

The Open Bioinformatics Journal, 27 Mar 2025, DOI: 10.2174/0118750362361900250312054436

Abstract

Background

Parkinson's disease is a major neurological disorder that impairs physical movement and causes tremors, typically in individuals above 50 years of age. Affected individuals face difficulties with walking and speech. According to the Parkinson Foundation, about 0.3% of the global population is affected. The WHO has reported that Parkinson’s disease affects about 10 million people worldwide. Patients who do not receive timely medication can develop fatal neurological functional disorders. The number of people affected is gradually increasing, which is attributed to environmental changes, food habits, and lifestyle.

Aim

This study aimed to predict Parkinson’s disease from speech signals using various AI techniques, and to identify research gaps whose resolution could improve treatment efficiency with respect to detection rate and cost.

Objective

The objective of this study was to efficiently predict Parkinson’s disease under limited patient data using the Support Vector Machine (SVM) and XGBoost shallow learning algorithms and a Multilayer Perceptron Deep Neural Network (DNN), aided by the feature selection techniques Principal Component Analysis (PCA) and Analysis of Variance (ANOVA) to select the most distinguishing features.
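To illustrate how ANOVA can rank features by how well they separate classes, the following is a minimal sketch using only the Python standard library. It is not the implementation used in the study, which is not specified here. The function names `anova_f_score` and `select_top_features` and the toy data are assumptions made for illustration. In practice, a library routine such as scikit-learn's `f_classif` would likely be used.

```python
from collections import defaultdict
from statistics import fmean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n_samples, n_groups = len(values), len(groups)
    grand_mean = fmean(values)

    # Variation of the class means around the overall mean.
    ss_between = sum(
        len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the samples around their own class mean.
    ss_within = sum(
        (v - fmean(g)) ** 2 for g in groups.values() for v in g
    )

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_samples - n_groups)
    if ms_within == 0:
        # Degenerate case: no within-class spread at all.
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def select_top_features(rows, labels, top_k):
    """Return indices of the top_k features by F score, plus all scores."""
    n_features = len(rows[0])
    scores = [
        anova_f_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:top_k], scores


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
toy_rows = [[1.0, 5.0], [2.0, 4.0], [3.0, 6.0],
            [11.0, 5.0], [12.0, 4.0], [13.0, 6.0]]
toy_labels = [0, 0, 0, 1, 1, 1]
selected, f_scores = select_top_features(toy_rows, toy_labels, top_k=1)
```

A feature with a high F score has class means that are far apart relative to the spread within each class, which is why such features are treated as the most distinguishing.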

Methods

The speech dataset was obtained from the UCI repository and included samples from 188 individuals. Data preprocessing involved applying the Synthetic Minority Oversampling Technique (SMOTE), and a comparative study of PCA and ANOVA was carried out to select the optimal features. The SVM, XGBoost, and DNN algorithms were then applied.
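The core idea behind SMOTE can be sketched as follows: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified illustration using only the Python standard library, not the preprocessing code of the study. The function name `smote_oversample`, its parameters, and the toy points are assumptions. A maintained implementation such as the `SMOTE` class in imbalanced-learn would normally be preferred.

```python
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base sample
        # (squared Euclidean distance, excluding the base itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Random position along the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            [a + gap * (b - a) for a, b in zip(base, neighbour)]
        )
    return synthetic


# Hypothetical minority-class points with two features each.
toy_minority = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
new_points = smote_oversample(toy_minority, n_new=4, k=2, seed=1)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region already occupied by that class. Oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the evaluation data.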

Result

When PCA was used for dimensionality reduction, DNN and XGBoost achieved higher accuracy, whereas SVM had a lower runtime. When ANOVA was used instead, all three algorithms showed good accuracy, with DNN proving more efficient for a smaller number of features.

Conclusion

All three algorithms, combined with either dimensionality reduction technique, achieved an average accuracy of 97%. The ANOVA feature selection technique led to shorter training times than PCA. However, PCA yielded fewer optimal features for all three algorithms, revealing a trade-off between the number of optimal features and training time. Therefore, to make decision-making more efficient and improve disease detection, multimodal and multi-objective approaches need to be explored.

Keywords: Parkinson’s disease detection, Speech-based data analysis, Feature selection, Multi-modal analysis, Multi-objective analysis.