Cancer Type Driver Classification Accuracy Using Spark ML Technology
DOI:
https://doi.org/10.69511/ijdsaa.v3i3.84
Keywords:
Tumor; genes; CAMUR; Hadoop; Apache Spark; BIGBIOCL; MLlib; Thyroid cancer
Abstract
This paper analyzes genes extracted from the body that may act as tumor drivers and lead to different types of cancer, such as breast cancer. The work is motivated by BIGBIOCL. The Classifier with Alternative and Multiple Rule-based models (CAMUR) is the core algorithm applied here to analyze large datasets. To this end, Apache Spark and MLlib are used on top of a Hadoop stack in local mode. Experiments were run with the decision tree and the random forest classifiers separately. On the data used, random forest gave the better results in terms of F-measure and efficiency. To extract genes and other pertinent models, features are deleted iteratively using a modified version of the previously proposed CAMUR algorithm. Finally, the extracted results are provided to biologists so that they can assess whether the extracted genes are related to cancer or can be drivers of it.
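The iterative feature deletion described in the abstract can be illustrated with a minimal sketch. This is not the authors' Spark MLlib implementation: a toy single-gene threshold rule stands in for the decision tree and random forest classifiers, and the gene names, expression values, `min_f` threshold, and function names are all hypothetical. The loop fits a rule, records the gene it uses, deletes that gene from the feature set, and repeats until the F-measure falls below the threshold.

```python
def f_measure(y_true, y_pred, positive=1):
    """F-measure (harmonic mean of precision and recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_stump(rows, labels, features):
    """Find the best rule of the form 'expression > threshold => tumor'.

    Stand-in for a real classifier; returns (gene, threshold, f_measure).
    """
    best = None
    for feat in features:
        for thr in sorted({row[feat] for row in rows}):
            preds = [1 if row[feat] > thr else 0 for row in rows]
            score = f_measure(labels, preds)
            if best is None or score > best[2]:
                best = (feat, thr, score)
    return best


def iterative_feature_deletion(rows, labels, min_f=0.75, max_iter=10):
    """Repeatedly fit a rule, record its gene, then delete that gene."""
    remaining = sorted(rows[0].keys())
    models = []
    for _ in range(max_iter):
        if not remaining:
            break
        feat, thr, score = best_stump(rows, labels, remaining)
        if score < min_f:  # stop once the remaining genes classify poorly
            break
        models.append((feat, thr, score))
        remaining.remove(feat)  # the feature deletion step
    return models


# Hypothetical expression values; label 1 = tumor, 0 = normal.
rows = [
    {"GENE_A": 9.0, "GENE_B": 8.0, "GENE_C": 1.0},
    {"GENE_A": 8.5, "GENE_B": 7.5, "GENE_C": 5.0},
    {"GENE_A": 7.0, "GENE_B": 2.0, "GENE_C": 2.0},
    {"GENE_A": 2.0, "GENE_B": 3.0, "GENE_C": 4.0},
    {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 3.0},
    {"GENE_A": 1.5, "GENE_B": 2.5, "GENE_C": 6.0},
]
labels = [1, 1, 1, 0, 0, 0]

models = iterative_feature_deletion(rows, labels)
for gene, threshold, score in models:
    print(f"{gene} > {threshold}: F-measure = {score:.2f}")
```

On this toy data the loop keeps two candidate genes and stops at the third, whose best rule falls below the threshold. In a Spark setting, the stump would presumably be replaced by an MLlib decision tree or random forest, with the deleted features taken from the trained model.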
License

International Journal of Data Science and Advanced Analytics (IJDSAA) is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. This license allows users to copy, distribute, transmit, and adapt an article, as long as the author is attributed and the article is not used for commercial purposes.
The author(s) confirm that:
- The submitted manuscript has not been previously published, nor is it under consideration by another journal (or an explanation has been provided in Comments to the Editor).
- Permission has been obtained to reproduce any published materials used in the manuscript.