Is there any formula to find the sample size needed to create machine learning or deep learning models for the detection, localization, segmentation, and classification of colon polyps?
I'm almost sure ML has no such formula. You can certainly find recommendations on how to split data for ML, but it is all situational and depends on how much data you can gather and what problem you are solving.
You can find some ideas online, e.g., the article "Evaluation of a decided sample size in machine learning applications".
Or check articles written for mathematicians; they love to test how sampling and batch size affect model quality.
I also mentioned how we split data in a few of my articles, but it wasn't a deep investigation, so I don't think it would help much. A healthy practice is to use around 70% of the data to train the model; how you split the rest between validation and test sets is up to you, and 15% / 15% is a simple starting point.
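To make the 70% / 15% / 15% idea concrete, here is a minimal sketch in plain Python. The function name `split_indices`, the seed, and the dataset size of 1000 are my own assumptions for illustration, not something taken from the articles mentioned above.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle indices 0..n_samples-1 and cut them into train/val/test lists.

    Hypothetical helper: whatever is left after train and validation
    becomes the test set (15% with the default fractions).
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is repeatable

    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# Assumed dataset size, purely for illustration.
train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

For polyp images it is probably worth splitting by patient or by video rather than by individual frame, so that near-identical frames do not end up in both the training and test sets. The sketch above does not do that.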
As for batch sizes, you need to experiment with them.
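By "experiment" I mean something like the loop below: train the same model with a few batch sizes and compare the validation loss. This is only a toy sketch on synthetic data with a one-variable linear model trained by mini-batch SGD. The data, learning rate, epoch count, and candidate batch sizes are all assumptions, and a real polyp model would need a deep learning framework instead.

```python
import random


def make_data(n, seed):
    """Synthetic data for illustration only: y = 3x + 0.5 plus small noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [3.0 * x + 0.5 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys


def mse(w, b, xs, ys):
    """Mean squared error of the line w*x + b on the given data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_sgd(xs, ys, batch_size, epochs=20, lr=0.1, seed=0):
    """Fit w and b with mini-batch SGD; the last batch of an epoch may be smaller."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            # Gradients of the batch-averaged squared error.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


train_x, train_y = make_data(700, seed=1)
val_x, val_y = make_data(150, seed=2)

# Candidate batch sizes are arbitrary; pick a range that fits your hardware.
results = {}
for batch_size in (8, 32, 128):
    w, b = train_sgd(train_x, train_y, batch_size)
    results[batch_size] = mse(w, b, val_x, val_y)
    print(f"batch_size={batch_size}: validation MSE={results[batch_size]:.4f}")

best_batch_size = min(results, key=results.get)
print("lowest validation loss at batch size", best_batch_size)
```

On a toy problem like this the batch sizes will likely all land close together. The point is only the shape of the experiment: same data, same model, one setting varied, and the comparison made on the validation set rather than the test set.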