There is a utility in OpenCV for data augmentation: opencv_createsamples. It is normally used to generate training data for methods like Haar cascades, but it can be used for other purposes too.
opencv_createsamples has many parameters, such as rotation angle, brightness, scale and skew factors, and it also has two different operating modes. I suggest you work through the tutorial below (even though it is a Haar tutorial) and read the documentation carefully: in its default mode the tool inserts background images into your samples, which is something you do not want for medical images.
DOCUMENTATION (all those examples below describe a different version of opencv_createsamples)
- https://docs.opencv.org/2.4/doc/user_guide/ug_traincascade.html
- https://docs.opencv.org/3.0-beta/doc/user_guide/ug_traincascade.html

Creating your own Haar Cascade OpenCV Python Tutorial:
- https://pythonprogramming.net/haar-cascade-object-detection-python-opencv-tutorial/

OBSERVATION
- MUST be copied to and run on the folder where the action occurs
- Supposes there is a series of folders and files as per the documentation above
Dear Filippo Pesapane, I want to use the dataset for classification with a CNN. I currently have 1000 annotated samples, and I would like to double or triple that number using augmentation techniques.
Hi Hunar, for medical data the standard approach is to apply slight rotations, translations, and perhaps some jittering. You should definitely also take a look at elastic transformations (see https://www.kaggle.com/bguberfain/elastic-transform-for-data-augmentation), which are particularly helpful for medical CNNs and have been shown to improve training significantly.
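To make the suggestions above concrete, here is a minimal sketch of slight rotation, translation, intensity jitter and an elastic deformation using NumPy and SciPy. It is not taken from the linked Kaggle kernel. The function names, the parameter values (max_angle, max_shift, alpha, sigma) and the random 64x64 stand-in image are assumptions for illustration only, and it assumes 2D grayscale images scaled to [0, 1].

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates, rotate, shift


def slight_affine(image, rng, max_angle=10.0, max_shift=5.0):
    """Rotate by a small random angle (degrees) and translate by a few pixels."""
    angle = rng.uniform(-max_angle, max_angle)
    offset = rng.uniform(-max_shift, max_shift, size=2)
    rotated = rotate(image, angle, reshape=False, order=1, mode="reflect")
    return shift(rotated, offset, order=1, mode="reflect")


def elastic_transform(image, rng, alpha=34.0, sigma=4.0):
    """Warp the image with a Gaussian-smoothed random displacement field.

    alpha scales the displacement, sigma controls how smooth the field is.
    Both values are illustrative and would need tuning for real data.
    """
    dx = gaussian_filter(rng.uniform(-1, 1, size=image.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, size=image.shape), sigma) * alpha
    rows, cols = np.meshgrid(
        np.arange(image.shape[0]), np.arange(image.shape[1]), indexing="ij"
    )
    coords = np.array([rows + dy, cols + dx])
    return map_coordinates(image, coords, order=1, mode="reflect")


def intensity_jitter(image, rng, max_gain=0.1, max_bias=0.05):
    """Apply a small random contrast (gain) and brightness (bias) change."""
    gain = 1.0 + rng.uniform(-max_gain, max_gain)
    bias = rng.uniform(-max_bias, max_bias)
    return np.clip(image * gain + bias, 0.0, 1.0)


def augment(image, rng):
    """Chain the three augmentations into one randomized variant."""
    out = slight_affine(image, rng)
    out = elastic_transform(out, rng)
    return intensity_jitter(out, rng)


rng = np.random.default_rng(0)
image = rng.random((64, 64))  # stand-in for one grayscale scan in [0, 1]

# Two augmented variants per original image would triple the dataset.
augmented = [augment(image, rng) for _ in range(2)]
```

For classification the label of each augmented copy stays the same as that of its source image. If the annotations are spatial (masks or bounding boxes), the same rotation, shift and displacement field would have to be applied to them as well, which this sketch does not do.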
The Division of Medical Image Computing at the German Cancer Research Center (DKFZ) has a Python repository for generating batches of data with various augmentations, see https://github.com/MIC-DKFZ/batchgenerators