A high speed tri-vision system for automotive applications

Marc Anthony Azzopardi & Ivan Grech & Jacques Leconte

Abstract

Purpose Cameras are an excellent means of non-invasively monitoring the interior and exterior of vehicles. In particular, high speed stereovision and multivision systems are important for transport applications such as driver eye tracking or collision avoidance. This paper addresses the synchronisation problem which arises when multivision camera systems are used to capture the high speed motion common in such applications.

Methods An experimental, high-speed tri-vision camera system intended for real-time driver eye-blink and saccade measurement was designed, developed, implemented and tested using prototype, ultra-high dynamic range, automotive-grade image sensors specifically developed by E2V (formerly Atmel) Grenoble SA as part of the European FP6 project SENSATION (advanced sensor development for attention, stress, vigilance and sleep/wakefulness monitoring).

Results The developed system can sustain frame rates of 59.8 Hz at the full stereovision resolution of 1280 × 480, but this can reach 750 Hz when a 10 kpixel Region of Interest (ROI) is used, with a maximum global shutter speed of 1/48000 s and a shutter efficiency of 99.7%. The data can be reliably transmitted uncompressed over standard copper Camera-Link cables over 5 metres. The synchronisation error between the left and right stereo images is less than 100 ps and this has been verified both electrically
and optically. Synchronisation is automatically established at boot-up and maintained during resolution changes. A third camera in the set can be configured independently. The dynamic range of the 10-bit sensors exceeds 123 dB with a spectral sensitivity extending well into the infra-red range.

Conclusion The system was subjected to a comprehensive testing protocol, which confirms that the salient requirements for the driver monitoring application are adequately met and, in some respects, exceeded. The synchronisation technique presented may also benefit several other automotive stereovision applications including near and far-field obstacle detection and collision avoidance, road condition monitoring and others.

Keywords Synchronisation . High-speed automotive multivision . Active safety . Driver monitoring . Sensors

1 Introduction

Over the coming years, one of the areas of greatest research and development potential will be that of automotive sensor systems and telematics [1, 2]. In particular, there is a steeply growing interest in the utilisation of multiple cameras within vehicles to augment vehicle Human-Machine Interfacing (HMI) for safety, comfort and security.

For external
monitoring applications, cameras are emerging as viable alternatives to systems such as Radio, Sound and Light/Laser Detection and Ranging (RADAR, SODAR, LADAR/LIDAR). The latter are typically rather costly and either have poor lateral resolution or require mechanical moving parts.

For vehicle cabin applications, cameras outshine other techniques with their ability to collect large amounts of information in a highly unobtrusive way. Moreover, cameras can be used to satisfy several applications at once by re-processing the same vision data in multiple ways, thereby reducing the total number of sensors required to achieve equivalent functionality. However, automotive vision still faces several open challenges in terms of optoelectronic performance, size, reliability, power consumption, sensitivity, multi-camera synchronisation, interfacing and cost.

In this paper, several of these problems
are addressed. As an example, driver head localisation, point of gaze detection and eye blink rate measurement are considered, for which the design of a dashboard-mountable automotive stereovision camera system is presented. This was developed as part of a large FP6 Integrated Project, SENSATION (Advanced Sensor Development for Attention, Stress, Vigilance and Sleep/Wakefulness Monitoring). The overarching goal of [...] extendable to multivision systems [5–8].

The camera system is built around a matched set of prototype, ultra-high dynamic range, automotive-grade image sensors specifically developed and fabricated by E2V Grenoble SA for this application. The sensor, which is a novelty in its own right, is the AT76C410ABA CMOS monochrome automotive image sensor. This sensor implements a global shutter to allow distortion-free capture of fast motion. It also incorporates an on-chip Multi-ROI feature with up to eight Regions Of Interest (ROI) and a pre-programming facility, and allows fast switching from one image to another. In this way, several real-time parallel image processing tasks can be carried out with one sensor. Each ROI is independently programmable on-the-fly with respect to integration time, gain, sub-sampling/binning, position, width and height.

A fairly comprehensive series of "bench tests" was conducted in order to test the validity of the new concepts and to initially verify the reliability of the system across various typical automotive operating conditions. Additional rigorous testing would of course be needed to guarantee a mean time between failures (MTBF) and to demonstrate the efficacy of the proposed design techniques over statistically significant production quantities.

2 Application background

The set of conceivable automotive camera applications is an ever-growing list, with some market research reports claiming that over 10 cameras will be required per vehicle [9]. An incomplete list includes occupant detection, occupant classification, driver recognition, driver vigilance and drowsiness monitoring [10], road surface condition monitoring, intersection assistance [11], lane-departure warning [12], blind spot warning, surround view, collision warning, mitigation or avoidance, headlamp control, accident recording, vehicle security, parking assistance, traffic sign detection [13], adaptive cruise control and night/synthetic vision (Fig. 1).

2.1 Cost considerations
The automotive sector is a very cost-sensitive one, and the monetary cost per subsystem remains an outstanding issue which could very well be the biggest hurdle in the way of full deployment of automotive vision. The supply-chain industry has been actively addressing the cost dilemma by introducing Field Programmable Gate Array (FPGA) vision processing and by moving towards inexpensive image sensors based on Complementary Metal Oxide Semiconductor (CMOS) technology [14]. Much has been borrowed from other very large embedded vision markets which are also highly cost-sensitive, namely mobile telephony and portable computing. However, automotive vision pushes the bar substantially higher in terms of performance requirements. The much wider dynamic range, higher speed, global shuttering, and excellent infra-red sensitivity are just a few of the characteristics that set most automotive vision
applications apart. This added complexity increases cost. However, as the production volume picks up, unit cost is expected to drop quite dramatically by leveraging the excellent economies of scale afforded by the CMOS manufacturing process.

Some groups have been actively developing and promoting ways of reducing the number of cameras required per vehicle. Some of these methods try to combine disparate applications to re-use the same cameras. Other techniques (and products) have emerged that trade off some accuracy and reliability to enable the use of monocular vision in scenarios which
traditionally required two or more cameras [10, 15, 16]. Distance estimation for 3D obstacle localisation is one such example. Such tactics will serve well to contain cost in the interim. However, it is expected that the cost of the imaging devices will eventually drop to a level where it will no longer be the determining factor in the overall cost of automotive vision systems. At this point, we argue that reliability, performance and accuracy considerations will again reach the forefront.

Fig. 1 Some automotive vision applications

In this paper the cost issue is addressed, but in a different way. Rather than discarding stereo- and multi-vision altogether, a low-cost (but still high-performance) technique for synchronously combining multiple cameras is presented. Cabling requirements are likewise shared, reducing the corresponding cost and the cable harness weight.

2.2 The role of high speed vision

A number of automotive vision applications require high frame-rate video capture. External applications involving high relative motion such as traffic sign, oncoming traffic or obstacle detection are obvious candidates. The need for high speed vision is perhaps less obvious in the interior of a vehicle. However, some driver monitoring applications can get quite demanding in this respect. Eye-blink and saccade measurement, for instance, is one of the techniques that may be employed to measure a driver's state of vigilance and to detect the onset of sleep [10, 16]. It so happens that these are also some of the fastest of all human motions, and accurate rate of change measurements may require frame rates running up to several hundred hertz. Other applications such as occupant detection and classification can be accommodated with much lower frame rates, but then the same cameras may occasionally be required to capture high speed motion for visual servoing, such as when modulating airbag release or seatbelt tensioning during a crash situation.

2.3 A continued case for stereovision/multivision

Several of the applications mentioned stand to benefit from the use
of stereovision or multivision sets of cameras operating in tandem. This may be necessary to extend the field of view, to increase diversity and ruggedness, or to allow accurate stereoscopic depth estimation [11]. Multivision is, of course, also one of the most effective ways of counteracting optical occlusions.

Monocular methods have established a clear role (alongside stereoscopy), but they rely on assumptions that may not always be true or consistently valid. Assumptions such as uniform parallel road marking, continuity of road texture, and operational vehicle head or tail lights are somewhat utopian, and real world variability serves to diminish reliability. Often, what is easily achievable with stereoscopy can prove to be substantially complex with monocular approaches [17]. The converse may also be true, because stereovision depends on the ability to unambiguously find corresponding features in multiple views. Stereovision additionally brings a few challenges of its own, such as the need for a large baseline camera separation, sensitivity to relative camera positioning and sensitivity to inter-camera synchronisation.

Not surprisingly, it has been shown that better performance (than any single method) can be obtained by combining the strengths of both techniques [18, 19]. As the cost issue fades away, monovision and multivision should therefore be viewed as complementary rather than competing techniques. This is yet another example of how vision data can be processed and interpreted in multiple ways to improve reliability and obtain additional information.

In this paper, the benefit of combining stereo and monocular methods is demonstrated at the hardware level. A tri-vision camera is presented that utilises a synchronised stereovision pair
of cameras for 3D head localisation and orientation measurement. Using this information, a third monocular high-speed camera can then be accurately controlled to rapidly track both eyes of the driver using the multi-ROI feature. Such a system greatly economises on bandwidth by limiting the high speed capture to very small and specific regions of interest. This compares favourably to the alternative method of running a stereovision system at high frame rate and at full resolution.

2.4 The importance of high synchronisation

One of the basic tenets of multivision systems is the accurate temporal correspondence between frames captured by the different cameras in the set. Even a slight frequency or phase difference between the image sampling processes of the cameras would lead to difficulties during transmission and post processing. Proper operation usually rests on the ability to achieve synchronised, low latency video capture between cameras in the same multivision set. Moreover, this requirement extends to the video transport mechanism, which must also ensure synchronous delivery to the central processing hubs. The need for synchronisation depends on the speed of the motion to be captured rather than the actual frame rate employed, but in general, applications which require high speed vision will often also require high synchronisation.

Interestingly, even preliminary road testing of automotive vision systems reveals another sticky problem: camera vibration. This is a problem that was already faced many years ago by the first optical systems to enter mainstream vehicle use [20]. The optical tracking mechanisms used in car-entertainment CDROM/DVD drives are severely affected by automotive vibration, and fairly complex (and fairly expensive) schemes are required to mitigate
these effects [21]. The inevitable vibration essentially converts nearly all mobile application scenarios into high speed vision problems, because even low amplitude camera motion translates into significant image motion. The problem gets worse as the subject distance and/or optical focal length increases.

Mounting the cameras more rigidly helps by reducing the vibration amplitude, but it also automatically increases the vibration frequency, which negates some of the gain. Active cancellation of vibration is no new topic [22]; however, this usually comes at a disproportionate cost. Thus, while high
frame rates may not be important in all situations, short aperture times and high synchronisation remain critically important to circumvent the vibration problem.

A small numerical example quickly puts the problem into perspective. Consider a forward looking camera for in-lane obstacle monitoring based on a [...] inch, 1024 × 512 image sensor array with an active area of 5.7 × 2.9 mm behind a 28 mm (focal length) lens. If such a system is subjected to a modest 10 mrad amplitude, sinusoidal, angular vibration at 100 Hz, simple geometric optics implies a peak pixel shift rate of around 32,000 pixels/sec: the peak angular velocity is 2π × 100 Hz × 10 mrad ≈ 6.3 rad/s, which the 28 mm lens converts into about 176 mm/s of image motion, or roughly 32,000 pixels/sec at a pixel pitch of about 5.6 μm (5.7 mm / 1024).

Thus, if the error in correspondence between left and right stereo frames is to be limited to a vertical shift comparable to one pixel, a stereovision system would require a frame synchronisation accuracy which is better than 30 microseconds. On the road, the levels of vibration can get significantly worse, and this does not yet take into account the additional high speed motion that may be present in the field of view. In summary, synchronization is a problem