Neural networks and deep learning algorithms are not appropriate for applications with high safety levels (e.g. ADAS/AD). The achievable test coverage is too low.
There are many approaches that are often used in high-safety-level applications, such as Advanced Driver-Assistance Systems (ADAS) & Autonomous Driving (AD), where the achievable test coverage of neural network models may be considered too low due to their "black box" nature. These approaches may require careful tuning and may not generalize as well as deep learning across very diverse and large-scale datasets. However, their greater transparency, lower computational requirements, and easier validation make them suitable for safety-critical applications where understanding and controlling the decision-making process is paramount.
Some of these approaches are:
- Template Matching: This method involves sliding a template image across the input image and detecting objects by comparing the template with the portion of the image it covers. It can be effective for recognizing objects with little variation in appearance (a minimal sketch follows this list).
- Feature-Based Methods: These detect key points, edges, or other significant features in images and use those features to recognize objects. Algorithms like SIFT, SURF, and ORB are examples of feature-based methods that rely less on massive amounts of training data and provide more interpretability (see the ORB sketch after this list).
- Geometric Shape Analysis: Some algorithms focus on identifying objects by their geometric properties, such as circles, rectangles, and polygons, using techniques like the Hough Transform for shape detection. This approach is particularly useful for objects with well-defined geometrical shapes (see the Hough sketch after this list).
- Decision Trees and Random Forests: These machine learning methods make decisions based on the features extracted from images. They can be more interpretable than deep learning models, as they decide via clear rules that split the input features (this and the next three items are combined in the classifier sketch at the end of this list).
- Support Vector Machines (SVM): SVMs can be used for object recognition by defining decision boundaries in the feature space that separate different object classes. They are particularly effective in high-dimensional spaces and in cases where the number of dimensions exceeds the number of samples.
- Classical Statistical Methods: Approaches like Bayesian classifiers can be used for object detection and recognition by modeling the probabilistic relationships between input features and object classes.
- Ensemble Methods: Combining multiple models or algorithms improves the robustness and accuracy of object recognition. Ensemble methods can leverage the strengths of different approaches to achieve better performance.
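To make the template-matching item concrete, here is a minimal OpenCV sketch. The file names and the 0.8 score threshold are placeholder assumptions for illustration, not values from this thread:

```python
import cv2

# Placeholder file names; substitute your own scene and template images.
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)

# Slide the template over the scene and score every position with
# normalized cross-correlation (robust to uniform brightness changes).
scores = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, best_score, _, best_loc = cv2.minMaxLoc(scores)

h, w = template.shape
if best_score > 0.8:  # acceptance threshold chosen arbitrarily here
    print(f"match at {best_loc}, size {w}x{h}, score {best_score:.2f}")
```

Because every position is scored by the same explicit formula, the decision can be audited pixel by pixel, which is part of the appeal for safety cases.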
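For the feature-based item, a minimal ORB sketch with OpenCV (again, the file names are assumptions):

```python
import cv2

img_obj = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)   # reference object
img_scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)  # scene to search

orb = cv2.ORB_create(nfeatures=500)
kp_obj, des_obj = orb.detectAndCompute(img_obj, None)
kp_scene, des_scene = orb.detectAndCompute(img_scene, None)

# Hamming distance suits ORB's binary descriptors; crossCheck keeps only
# mutually-best matches, a cheap substitute for a ratio test.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_obj, des_scene), key=lambda m: m.distance)
if matches:
    print(f"{len(matches)} correspondences; best distance {matches[0].distance}")
```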
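Likewise for geometric shape analysis, a sketch of circle detection (e.g. round traffic signs) with the Hough transform; all parameter values below are assumptions that would need tuning per camera and scene:

```python
import cv2
import numpy as np

img = cv2.imread("road.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
blurred = cv2.medianBlur(img, 5)  # suppress noise before voting

# Accumulate edge-gradient votes for circle centers and radii.
circles = cv2.HoughCircles(
    blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=40,
    param1=100, param2=30, minRadius=10, maxRadius=80)

if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        print(f"circle at ({x}, {y}), radius {r}")
```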
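Finally, the last four items (trees/forests, SVMs, Bayesian classifiers, ensembles) can all operate on the same feature vectors. A scikit-learn sketch, using random placeholder features purely to show the wiring:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: in practice X would hold per-image feature vectors
# (e.g. HOG or shape descriptors) and y the object-class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))
y = rng.integers(0, 2, size=200)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Soft voting averages class probabilities; SVC needs probability=True.
clf = VotingClassifier(estimators=[
    ("tree", DecisionTreeClassifier(max_depth=5)),
    ("forest", RandomForestClassifier(n_estimators=100)),
    ("svm", SVC(kernel="rbf", probability=True)),
    ("bayes", GaussianNB()),
], voting="soft")
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```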
An important area you may consider is metaheuristics, also known as nature-inspired algorithms. These soft-computing tools can help you build simpler algorithms. You may survey Advanced Driver-Assistance Systems (ADAS) & Autonomous Driving (AD) applications to see what has been used, and investigate newly developed algorithms that deserve to be tried.
In general, the most difficult part is finding a suitable feature vector to characterize your problem.
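As one illustration of such a feature vector, a HOG descriptor turns an image patch into a fixed-length vector the classifiers above can consume. A minimal sketch, assuming scikit-image; the patch crop and HOG parameters are arbitrary choices:

```python
from skimage import data
from skimage.feature import hog

# Any grayscale patch works; skimage's built-in test image is used here
# only so the sketch runs without external files.
patch = data.camera()[100:228, 100:228]  # 128x128 crop

# 9 orientation bins, 8x8-pixel cells, 2x2-cell blocks: one flat,
# fixed-length descriptor per patch, ready for an SVM or forest.
features = hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))
print(features.shape)
```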
Adding to Allena Venkata Sai Abhishek - there is a lot of value in segmentation. To me, segmentation is a testable foundation under everything related to image understanding. For segmentation, creating a depth map is one way to simplify recognition (a stereo-disparity sketch follows). We use stereopsis, optical flow, lighting, and other cues to assist in segmentation.
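A minimal sketch of the depth-map idea, assuming a rectified stereo pair (the file names and thresholds are placeholders):

```python
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching; numDisparities must be a multiple of 16.
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
disparity = stereo.compute(left, right).astype("float32") / 16.0  # fixed-point -> pixels

# Nearer surfaces produce larger disparity, so a simple threshold on the
# map already yields a crude foreground mask to seed segmentation.
foreground = disparity > 20.0
```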
Well, it is a pretty long pipeline covering a lot of areas. You can Google each specialty and get references from patent US8897596B1.

On segmentation, a lot depends on your incoming data stream. If possible, you really want depth information, whether from multi-camera stereopsis, focal stacking, or, worst case, single-camera motion and overlap to hint at depth. Without depth the problem is much more challenging, so it is really worth the trouble to obtain some form of depth info. (Keep in mind that transparency and blur can add complications; I use alpha coding to provide for multiple depths.)

Removal of lighting is a good second step, using, in part, the surface normals from the depth map. Then combining optical flow (or phase correlation) with depth is a reasonable first pass toward a cleanplate and an initial segmentation of moving objects (first remove global motion; see the sketch below).

Then, on the cleanplate: template matching, texture classification in Fourier space, atmospheric cueing, the horizon relationship, referral back to the now-removed lighting to derive shape, and resolution of motion-blur edges each get votes on PELs (pixels), with weighting based on your scene. This approach is non-AI, but multi-algorithmic fusion. Each step is relatively well defined; it just sounds like a lot. I wish I could tell you there is a fast and easy path, but things like fog, rain, lens flare, emergency vehicle lights, limited dynamic range, and dirt make a small percentage of the process more complex.
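As a sketch of the "first remove global motion" step above, phase correlation can estimate the dominant translation between frames. The file names and the residual threshold are assumptions, and real footage also needs rotation/zoom handling that this sketch omits:

```python
import cv2
import numpy as np

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Phase correlation finds the dominant (sub-pixel) translation between
# the frames, i.e., the global camera motion.
(dx, dy), response = cv2.phaseCorrelate(prev, curr)

# Warp the previous frame by that shift; depending on your direction
# convention the sign of (dx, dy) may need flipping.
M = np.float32([[1, 0, dx], [0, 1, dy]])
aligned = cv2.warpAffine(prev, M, (prev.shape[1], prev.shape[0]))

# After global motion is removed, large residuals flag independently
# moving objects -- the seed for the cleanplate segmentation.
moving = cv2.absdiff(aligned, curr) > 25  # threshold is scene-dependent
```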