Environmental sounds are highly diverse and include sounds generated in both indoor and outdoor environments. These sounds convey information about human and social activities. Examples include the cheering of an audience at a sports event, a gunshot in the street, and hasty steps in a nursing home. Such information is helpful in applications that analyze audio and video content. Alerting sounds such as emergency vehicle sirens, smoke alarms, and medical monitoring alarms are of special importance, as they are usually designed to warn people of hazardous situations, even when these are out of sight. Unfortunately, alerting sounds may be missed because of hearing impairment or simple distraction, leading to hazardous and even life-threatening situations.
Because alarm sounds are by nature intended to be easily detected, one might expect alarm sound detection to be a relatively simple task. However, the distinctive characteristics of alarm sounds are not formally defined. Moreover, such sounds are varied, and it is not obvious that they share common characteristics at all; listeners may instead learn them as the conjunction of a set of more special-purpose sound types [2]. The International Organization for Standardization has issued a recommendation for “auditory danger signals” (ISO 7731) [3]. However, this recommendation gives only rough guidelines for alarm sounds and is not widely adopted. Instead, most countries have their own siren standards. Moreover, many alarm sounds, such as those of alarm clocks, are not standardized at all. In an attempt to define common characteristics shared by alarm sounds, three types of siren are defined in …
Since building a general model of alarm sounds is difficult, most works try to detect only particular alarms, usually the sirens of emergency vehicles in a specific country. Many of these works do not perform well outside laboratory conditions, since they do not model ambient background sounds well enough and usually do not consider frequency shifts caused by the Doppler effect. For example, in [5], the authors try to detect a small set of pre-selected warning sounds in a simulated environment by cross-correlation. In [6], an artificial neural network was used to detect police vehicles in