We have already tried Faster R-CNN, and we will read about YOLO next.
Conference paper: "Object Detection in Images using Region based CNN"
The datasets used for this research are PASCAL VOC and ImageNet.
The dataset contains fully annotated images, with 5000 images tagged as the
train/validation set and 5000 images tagged as the testing set. Faster R-CNN based on
ResNet with 50 layers was trained on this dataset for 50 epochs.
All the experiments were run on the Amazon EC2 cloud-computing service, using a g2.8xlarge instance. This instance has 32 vCPUs, 60 GiB of memory, 240 GB (2 x 120) of SSD storage, and four NVIDIA GRID GPUs, each with 1,536 CUDA cores and 4 GB of video memory.
VGG-16 and ResNet-50 models were used, and the results were about 5 fps and 12 fps.
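
To put those frame rates in perspective, here is a minimal Python sketch that converts fps into per-frame latency and times an arbitrary detector over a batch of frames. The helper names (`frame_latency_ms`, `measure_fps`) and the 30 fps "real-time" target are my own assumptions for illustration; they do not come from the paper.

```python
import time


def frame_latency_ms(fps):
    """Convert throughput (frames per second) to per-frame latency in ms."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def measure_fps(detect, frames):
    """Run `detect` on every frame and return the average frames per second.

    `detect` is any callable taking one frame; it is a placeholder for a
    real model's inference call (hypothetical, not tied to any library).
    """
    start = time.perf_counter()
    count = 0
    for frame in frames:
        detect(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Assumed real-time target; the paper does not define one.
REALTIME_FPS = 30

if __name__ == "__main__":
    budget = frame_latency_ms(REALTIME_FPS)
    for fps in (5, 12):  # the two frame rates reported above
        latency = frame_latency_ms(fps)
        print(f"{fps} fps = {latency:.1f} ms/frame "
              f"(budget at {REALTIME_FPS} fps: {budget:.1f} ms)")
```

At 5 fps each frame takes 200 ms, and at 12 fps roughly 83 ms, so under the assumed 30 fps target (about 33 ms per frame) both configurations would need to be several times faster.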
Is the hardware requirement a bottleneck for real-time object detection?