Unveiling Patterns in Breast Cancer Diagnosis: An Exploratory Analysis of Clinical Data from Maharashtra
DOI: https://doi.org/10.70135/seejph.vi.3095
Abstract
Breast cancer remains a leading cause of morbidity and mortality among women globally, emphasizing the need for early detection and accurate diagnostic methods. This study conducts a comprehensive exploratory data analysis (EDA) of a breast cancer dataset collected from hospitals in Maharashtra, India. The dataset comprises 1,006 patient records and 26 features, encompassing clinical, demographic, and tumor-specific attributes such as age at diagnosis, family history, hormonal usage, and tumor characteristics.
EDA techniques, including descriptive statistics, class distribution analysis, univariate analysis, correlation analysis, and feature distribution visualization, were employed to identify patterns, assess feature relationships, and understand data variability. Results revealed significant associations between clinical attributes, such as family history of breast cancer, and diagnosis outcomes. Visualizations, including heatmaps, scatterplots, and boxplots, highlighted key insights, such as differences in age distributions across diagnostic categories and correlations between biological features like age at menarche and menopause.
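As a rough illustration of the EDA steps named above (class distribution, per-class descriptive statistics, and correlation between two features), the following minimal sketch uses only the Python standard library. The records, the column names (`age`, `age_at_menarche`, `age_at_menopause`, `family_history`, `diagnosis`), and the `benign`/`malignant` labels are hypothetical stand-ins and are not taken from the study's dataset or its actual schema.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median

# Hypothetical toy records (NOT the study's data); column names are assumed.
records = [
    {"age": 52, "age_at_menarche": 13, "age_at_menopause": 50, "family_history": 1, "diagnosis": "malignant"},
    {"age": 58, "age_at_menarche": 12, "age_at_menopause": 47, "family_history": 0, "diagnosis": "benign"},
    {"age": 61, "age_at_menarche": 14, "age_at_menopause": 51, "family_history": 1, "diagnosis": "malignant"},
    {"age": 55, "age_at_menarche": 12, "age_at_menopause": 48, "family_history": 0, "diagnosis": "benign"},
    {"age": 67, "age_at_menarche": 15, "age_at_menopause": 52, "family_history": 0, "diagnosis": "malignant"},
    {"age": 54, "age_at_menarche": 11, "age_at_menopause": 46, "family_history": 0, "diagnosis": "benign"},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Class distribution: number of records per diagnostic category.
class_counts = Counter(r["diagnosis"] for r in records)

# Descriptive statistics of age within each diagnostic category.
ages_by_class = {}
for r in records:
    ages_by_class.setdefault(r["diagnosis"], []).append(r["age"])
age_summary = {
    label: {"mean": mean(ages), "median": median(ages)}
    for label, ages in ages_by_class.items()
}

# Correlation between two biological features.
r_menarche_menopause = pearson(
    [r["age_at_menarche"] for r in records],
    [r["age_at_menopause"] for r in records],
)

print(class_counts)
print(age_summary)
print(round(r_menarche_menopause, 3))
```

On a dataset of the size described here, a dataframe library would likely be more convenient; the hand-written `pearson` helper is used only to keep the sketch free of third-party dependencies.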
The findings underscore the importance of data-driven approaches for breast cancer diagnosis, particularly in preparing datasets for machine learning applications. By focusing on a region-specific cohort, this study bridges a gap in localized breast cancer research, offering foundational insights for developing predictive models aimed at enhancing diagnostic accuracy and supporting personalized treatment strategies.
License

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.