Hello,
I am participating in an ongoing Kaggle challenge to classify and segment pneumothorax. Here is the link to the challenge: https://www.kaggle.com/c/siim-acr-pneumothorax-segmentation
The dataset is highly imbalanced: only 28% of the samples have pneumothorax, and the rest do not. As a result, most of the proposed models overfit and suffer from high bias. I tried a U-Net with an EfficientNet encoder and got around 84% accuracy on the public LB, and I want my model to do much better than that. Is it OK to remove some portion of the non-pneumothorax samples before sending the training samples to the classifier? There are also 39 samples that are unlabeled; how should I handle those? Please suggest some techniques and methods to improve my model's performance in this competition.
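
To make the undersampling question concrete, here is a minimal sketch of what I have in mind. It is not tied to the competition data: the function name `undersample_negatives`, the `keep_fraction` parameter, and the toy label list are all hypothetical, and I am assuming each image has a label of 1 (pneumothorax), 0 (no pneumothorax), or None (unlabeled).

```python
import random

def undersample_negatives(labels, keep_fraction=0.5, seed=0):
    """Return sorted indices of all positives plus a random subset of negatives.

    labels: list with 1 (pneumothorax), 0 (no pneumothorax) or None (unlabeled).
    keep_fraction: share of negative samples to keep (hypothetical parameter).
    Unlabeled samples (None) are left out of the returned indices.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n_keep = int(round(len(negatives) * keep_fraction))
    kept_negatives = rng.sample(negatives, n_keep)
    return sorted(positives + kept_negatives)

# Toy labels that mimic the 28% / 72% split, plus a few unlabeled entries.
labels = [1] * 28 + [0] * 72 + [None] * 3
train_idx = undersample_negatives(labels, keep_fraction=0.5)
print(len(train_idx), "training samples kept")
```

In this sketch the validation split would stay untouched, so the reported score still reflects the original class balance; only the training indices are thinned out.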