Goal: Identify datasets, preprocess the data, and perform exploratory data analysis (EDA).
Identify Relevant Datasets: Search for publicly available datasets on platforms such as:
▪ cBioPortal for Cancer Genomics (preferred)
▪ The Cancer Genome Atlas (TCGA)
▪ UCI Machine Learning Repository
▪ Kaggle
Choose datasets aligned with your research focus (e.g., genomic data, clinical
records).
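
Once a candidate file is downloaded, a quick structural check can help decide whether it fits the research focus before any real preprocessing begins. The following is a minimal sketch using only the Python standard library. It assumes a tab-delimited clinical table; the sample rows, the column names, the MISSING_TOKENS set, and the summarize_table helper are all hypothetical and do not come from any of the platforms listed above.

```python
import csv
import io

# Hypothetical toy table standing in for a downloaded clinical file.
# The column names and values are invented for illustration only.
SAMPLE_TSV = (
    "patient_id\tage\tstage\tmutation_count\n"
    "P001\t54\tII\t12\n"
    "P002\t\tIII\t7\n"
    "P003\t61\t\t\n"
    "P004\t47\tI\t3\n"
)

# Assumed markers for a missing value; adjust to the dataset's own convention.
MISSING_TOKENS = {"", "NA", "N/A", "nan"}


def summarize_table(text, delimiter="\t"):
    """Return row count, column names, and missing-value fraction per column."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    columns = list(reader.fieldnames or [])
    missing = {name: 0 for name in columns}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in columns:
            # Short rows yield None for absent fields; treat those as missing.
            value = (row.get(name) or "").strip()
            if value in MISSING_TOKENS:
                missing[name] += 1
    fractions = {
        name: (missing[name] / n_rows if n_rows else 0.0) for name in columns
    }
    return {"n_rows": n_rows, "columns": columns, "missing_fraction": fractions}


if __name__ == "__main__":
    summary = summarize_table(SAMPLE_TSV)
    print("rows:", summary["n_rows"])
    for name, frac in summary["missing_fraction"].items():
        print(f"{name}: {frac:.0%} missing")
```

To screen a real file, read its contents into a string and pass it to summarize_table in place of SAMPLE_TSV. Columns with a high missing fraction are candidates for imputation or removal during preprocessing.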