I've worked on obstacle avoidance both with cameras and other sensors, including sonar, infrared, bumpers (whiskers), laser and even Kinect (camera-based, after all). The problem is basically the trade-off between computational load and the amount of information gathered. Cameras provide tons of information, but that information is usually harder to process. My advice would be:
-Check your environment (how big is it? is it highly dynamic? what kind of materials are around? is it hazardous? is it really 3D?)
-Check your needs (do you just need to avoid obstacles? are obstacles moving? fast? do you want to map? how fast are you moving?)
-Go for the computationally cheapest solution that meets those needs and that your hardware can carry.
Let me illustrate with an example: you need to move a wheelchair robot in a hospital environment. It can basically carry any off-the-shelf sensor that I can think of at the moment.
-You're not driving a race car, but you definitely need to detect things a bit ahead, so you need a range of at the very least 3-4 meters (if you go with, e.g., infrared, which only detects things within the centimeter range, imagine the effect of only starting to stop the vehicle once you detect something that close); see the quick stopping-distance sketch after this list.
-Your environment is going to be pretty 2D: no uneven terrain and, probably, no obstacles hanging from the walls or floating around. You can work with a sensor that operates in a plane.
-The environment will be dynamic. Furthermore, since the driver also contributes to control, it will be very unpredictable. You need fast response, i.e. reduced computational load if possible.
-Safety is of key importance, so the less uncertain the sensor, the better (e.g. sonar has a wide detection cone, so the bearing to an obstacle is uncertain).
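As a rough sanity check on that 3-4 meter figure, here is a minimal stopping-distance sketch. The speed, deceleration, and latency values are assumptions for a typical slow indoor vehicle, not measured numbers.

```python
# Rough stopping-distance check for a slow indoor vehicle (assumed values).
def stopping_distance(speed_mps, decel_mps2, latency_s):
    """Distance travelled during sensing/processing latency plus braking."""
    reaction = speed_mps * latency_s               # covered before braking starts
    braking = speed_mps ** 2 / (2 * decel_mps2)    # v^2 / (2a)
    return reaction + braking

# Assumed: ~1.5 m/s top speed, gentle 0.8 m/s^2 braking, 0.3 s total latency.
print(stopping_distance(1.5, 0.8, 0.3))  # ~1.9 m, so a 3-4 m sensing range leaves margin
```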
If you put it all together, you might want to go for a laser. HOWEVER....
If your hospital has glass doors around, keep in mind that they are INVISIBLE to the laser, so you either use another sensor, combine your laser with cheaper, less reliable ones to cope with this issue, or solve the problem via software (e.g. a map of the environment). Since you are indoors, a Kinect might work for you (with restrictions), but it has a minimum detection range, so you might need additional sensors as well ...
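To make the "combine your laser with cheaper sensors" idea concrete, here is a minimal sketch of one conservative fusion rule: for each bearing, trust whichever sensor reports the closer obstacle, so a sonar return from a glass door can override the laser seeing "through" it. The function and variable names are my own, and interpolating the sparse sonar cones onto the laser bearings is an assumption about how the two scans would be aligned.

```python
import numpy as np

def fuse_ranges(laser_ranges, laser_angles, sonar_ranges, sonar_angles):
    """Per-bearing minimum of laser and (interpolated) sonar ranges.

    A glass door is transparent to the laser but reflects sonar, so taking
    the minimum keeps the conservative (closer) estimate for planning.
    """
    # Interpolate the handful of sonar cones onto the dense laser bearings.
    sonar_on_laser = np.interp(laser_angles, sonar_angles, sonar_ranges)
    return np.minimum(laser_ranges, sonar_on_laser)

# Example: the laser sees 8 m "through" a glass door at 0 rad, sonar sees 1.2 m.
laser = np.array([8.0, 3.5, 2.0])
angles = np.array([0.0, 0.5, 1.0])
sonar = np.array([1.2, 3.6, 2.1])
print(fuse_ranges(laser, angles, sonar, angles))  # -> [1.2, 3.5, 2.0]
```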
Anyway, I would not go for computer vision in this particular case. Computer vision becomes more interesting when you want rich information, like the nature of the things around you. Think, for example, of a social robot. It probably needs to know whether the thing in front of it is a person or not: an obstacle must be avoided, but a person should be engaged in some activity.
Sorry about the length of the explanation. I hope it helps a bit ...
The most immediate answer is that different sensors have different drawbacks and advantages, so using several of them lets the robot offset the drawbacks of one with the advantages of another.
For example, a laser scanner will only detect the first obstacle along each beam, while a camera may detect an obstacle (e.g., a pedestrian) partially hidden behind another (e.g., a parked car). Similarly, odometers may provide bad measurements if the road is slippery, while image matching may still work well.
In my opinion, the main advantage of multiple sensors is that you can perform multi-modal data fusion on their measurements, with the goal of obtaining more informative and reliable data. I guess a quick search on robot navigation and data fusion will give you interesting references.
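As one concrete (and very simplified) example of such fusion: if two sensors give independent estimates of the same range with known noise levels, an inverse-variance weighted average yields a single estimate that is more reliable than either alone. The numbers below are made up purely for illustration.

```python
def fuse_estimates(z1, var1, z2, var2):
    """Inverse-variance weighted fusion of two independent measurements."""
    w1, w2 = 1.0 / var1, 1.0 / var2
    fused = (w1 * z1 + w2 * z2) / (w1 + w2)
    fused_var = 1.0 / (w1 + w2)   # always smaller than either input variance
    return fused, fused_var

# Laser range (low noise) fused with a camera-based estimate (higher noise).
print(fuse_estimates(2.05, 0.01, 1.80, 0.09))  # -> (~2.03 m, 0.009 variance)
```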
Computer Vision (CV) is not an alternative to sensors, since the very essence of the field is extracting information from images, and those images are themselves recorded by some sort of sensor (e.g. CCD, CMOS, etc.). What makes CV versatile is the whole-field information that can be gathered from a set or sequence of images, which reduces cost drastically. Besides, the areas it influences range from medicine to space technology... If you are a beginner, I think Forsyth and Ponce's Computer Vision: A Modern Approach is a good start.
Sensors are, for example, the eyes and ears; it is the brain that provides vision and hearing. Sensors without a signal processor (a brain or computer) will not enable the robot to accomplish anything.
I like to distinguish between machine vision and sensing (the goal of automating a process using any sensor available, ideally the simplest one) and computer vision (the pursuit of emulating the human vision system). Clearly evolution has selected the visible spectrum for us when it comes to vision, and has endowed us with far more senses than the basic five; proprioception is a good example. If you just want robots like a Roomba that can bump into walls and avoid falling down the stairs, then perhaps you don't need more. But if you want to augment human perception of the world, build interesting interactive applications, and build sophisticated robots that can do something more interesting, like play a round of golf with you and give you some pointers, then clearly you need more than simple sensors. As others have also pointed out, much of computer vision is the internal model and the intelligence to parse and understand the sensed environment. If nothing else, it is also far more interesting and entertaining to build devices we can better interact with thanks to a common perception. Perhaps you could let me know what you think of some of my introductory developer papers on this topic - e.g. https://www.ibm.com/developerworks/library/bd-interactive/
Yes, you are right, Akshay: there are other sensors, like Sharp IR sensors, which can be augmented with laser rangers for obstacle avoidance with a good degree of accuracy. On-board image processing can drain a robot's power source very quickly, particularly if you are relying on vision alone. In particular, for obstacle avoidance you need the distance from the robot to a given obstruction, which you can't get without stereo vision cameras. Again, this would increase the design complexity of the robot (you would also need a DSP). These robots are on the market, but they are very expensive. Simple robots can have omni-vision cameras augmented with IR, laser, or equivalent sensors, where we let the robot process only binary or greyscale images in discrete time steps. I have not done this and I don't have data on it, but I think it would be interesting to compare the computational complexity of obstacle avoidance algorithms using different combinations of sensors. I would be grateful if anyone can suggest sources on previously conducted studies.
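A minimal sketch of the kind of cheap binary-image check described above; the threshold, the bottom-of-frame region of interest, and the "obstacles show up as dark pixels" assumption are all illustrative placeholders, not a tested design.

```python
import numpy as np

def obstacle_ahead(gray_frame, threshold=60, roi_rows=40, min_pixels=200):
    """Cheap obstacle check on a greyscale frame (uint8, HxW).

    Looks only at the bottom rows of the image (assumed to correspond to the
    area just ahead of the robot) and counts dark pixels below a threshold.
    """
    roi = gray_frame[-roi_rows:, :]          # bottom strip of the frame
    binary = roi < threshold                 # binarise: True where "dark"
    return binary.sum() > min_pixels         # enough dark pixels => obstacle

# Example with a synthetic frame: mostly bright floor, one dark blob ahead.
frame = np.full((240, 320), 200, dtype=np.uint8)
frame[220:240, 140:180] = 20                 # simulated dark obstacle
print(obstacle_ahead(frame))                 # -> True
```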
Under simple circumstances one sensor is enough, but you have to prepare your platform (robot, UAV, etc.) for situations where simple is not the norm. For example, consider tracking an object that has just been hit by a projectile, where you need to verify whether the hit was successful or not. What you will need is a consensus on the processed output of several sensors and multiple criteria to determine success or failure (and you need to do this fast, before the target chooses to hit back :-).
Another challenging problem you may consider is the smoking-room cocktail problem (an extension of the cocktail-party problem to a dance floor or other really noisy environment where smoke, rapid dance movements, and multiple high-output sound sources are all present). Under these conditions, how would a single sensor fare?
In examples like these you will most likely need not just multiple sensors and consensus, but also the ability to interpolate and extrapolate based on the information provided.
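A minimal sketch of the multi-sensor consensus idea: each sensor pipeline reports an independent hit/miss verdict with a confidence, and a weighted vote decides. The sensor names and confidence values are placeholders, not a real system.

```python
def consensus(verdicts, threshold=0.5):
    """Weighted vote over per-sensor (verdict, confidence) pairs.

    verdicts: dict mapping sensor name -> (hit: bool, confidence in 0..1)
    Returns True if the confidence-weighted majority says "hit".
    """
    total = sum(conf for _, conf in verdicts.values())
    hits = sum(conf for hit, conf in verdicts.values() if hit)
    return total > 0 and hits / total > threshold

# Placeholder sensors: radar and thermal agree on a hit; the camera (blinded
# by smoke) disagrees, but with low confidence, so the consensus is "hit".
print(consensus({"radar": (True, 0.9), "thermal": (True, 0.7), "camera": (False, 0.2)}))
```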
Avinash Gautam - that's a nice answer, but as you said, don't you think omni-vision plus the extra sensors will draw more battery power? And on top of that, how would you get depth from an omni-vision camera? And yes, I think it would be good if there were more data on this.
Arturo Geigel - yes, for object tracking vision is very important. But as you said with the smoking-room cocktail example, do you think vision can track an object in smoke?
As Geigel suggests, in some cases a single sensor might not be enough, and for seeing in smoke you might need a different type of camera than you usually use. For tobacco smoke, an IR camera might be able to see through it. However, if you have hot smoke from a fire, you might again need something different from CMOS cameras, such as radar...
The world is complex and full of uncertainties. It entirely depends on the task you want to automate with robots. I don't remember the exact source, but I have read the statement that you can't have a single agent that survives in all contexts. A particular spider survives well in the Amazon rain forest, but it will die in a bathtub. It is then obvious that you would require data from different sensors, and you might need to fuse this sensor data to make the right inferences about the context in which the robot is operating.
Vision is a fast way to sense your surroundings. Maybe that's why nature chose to evolve eyes in complex life forms for sensing. Capturing light and "seeing" the world is the simplest mechanism for sensing the surroundings. For everything else, there's SONAR (bats, etc.).
Cameras (even depth cameras) are cheap, and "all we need to do" is find ways to interpret the scenes they capture; that is why computer vision is one of the hot topics in robot sensing.
Unless you want to create only dumb robots with no autonomy, computer vision is one of the best ways to sense the surroundings. Of course it's not enough on its own, and that's why you have sensor fusion.
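To illustrate the "interpret scenes from a depth camera" point above, here is a minimal sketch that flags an obstacle when enough depth pixels fall inside a stop zone; the 0.6 m stop distance and the pixel count are arbitrary illustrative values.

```python
import numpy as np

def obstacle_in_stop_zone(depth_m, stop_dist=0.6, min_pixels=500):
    """Flag an obstacle if enough depth pixels are closer than stop_dist.

    depth_m: HxW array of depths in metres (0 or NaN where the sensor
    returned nothing, as depth cameras often do on glass or dark surfaces).
    """
    valid = np.isfinite(depth_m) & (depth_m > 0)
    too_close = valid & (depth_m < stop_dist)
    return too_close.sum() > min_pixels

# Synthetic frame: far wall at 3 m, a box 0.4 m away in the middle of the view.
depth = np.full((480, 640), 3.0)
depth[200:280, 280:360] = 0.4
print(obstacle_in_stop_zone(depth))  # -> True
```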