Article Information

  • Title: Data structures for road condition AVI file video augmentation
  • Authors: Mihic, Srdan; Ivetic, Dragan
  • Journal: Annals of DAAAM & Proceedings
  • Print ISSN: 1726-9679
  • Year: 2008
  • Issue: January
  • Language: English
  • Publisher: DAAAM International Vienna
  • Keywords: Streaming media

Data structures for road condition AVI file video augmentation.


Mihic, Srdan ; Ivetic, Dragan


1. INTRODUCTION

The public institution Road Center of Vojvodina (Centar za puteve Vojvodine, CPV) uses the ROad Measurement and Data Acquisition System (ROMDAS) (Bennett et al., 2007) for road inspection and maintenance. The ROMDAS system consists of several measuring devices (gyroscope, GPS receivers, etc.), a video camera mounted on a vehicle, and software that processes the collected discrete data. The video camera captures video in the AVI format. The measuring devices capture discrete data about the physical characteristics of the road condition, such as road roughness, transverse profile and rut depths, and traffic density. After a survey run is completed, the measured data are processed and analyzed.

ROMDAS captures the visual road-condition state with the video camera mounted on the vehicle. This video is a valuable source of information for road managers and road engineers, since it provides visual feedback on the collected discrete data. Unfortunately, the video is stored separately from the discrete data acquired by the measuring devices. Road engineers therefore have to search the video manually to find the details of interest identified by data analysis, which is a tedious and error-prone task. Hence, our approach has been to integrate the data and the video in order to support more effective and comfortable information retrieval. The integration is carried out by encapsulating both the video and the data into one file in a way that facilitates data storage and communication. The data used for video augmentation are called augmented data.

The Augmented Video stream Framework (AVF) was designed to augment the ROMDAS road-condition video with the discrete data collected by the ROMDAS measurement devices. The AVF provides full search of the augmented data according to their properties. The implementation of the AVF is based on Microsoft DirectShow for synchronized playback of the basic video and the augmented data (Mihic, 2007).

The proposed AVF framework was created to solve one specific problem of road engineering. It was designed, however, to be extensible and applicable to a wide range of information systems.

This paper focuses on data structures that encapsulate the augmented data.

2. VIDEO AUGMENTATION

The problem we set out to solve belongs to the field of video augmentation. Studies have shown that a video stream augmented with data provides a deeper understanding of the captured reality and promotes active watching (Correia & Chambel, 1999). Additionally, augmented data give consumers the ability to acquire new knowledge and support content-oriented video access and retrieval.

One frequently used approach to video augmentation is annotated video, i.e. a video augmented with annotations. By annotations, Schroeter et al. (2007) mean descriptions, notes, subjective comments and various observations that can be attached to the video document without modifying the document itself. The variety of annotated information is limited by the annotator's knowledge, and it is subjective (Schroeter et al., 2007).

Another popular approach to video augmentation is augmented video. An augmented video is the result of augmenting a certain video with non-perceivable data captured at the time of the recording. Usually, 3D computer-generated objects are rendered on top of the video and merged into the video stream.

The ROMDAS system can be classified as an annotated video system with characteristics of augmented video. The measured discrete ROMDAS data are closely related to the video content and therefore inseparable from the video.

Agosti and Ferro abstracted the definition of annotation to include all forms of video data augmentation. According to their correlation with the video content, annotations are divided into content enrichment and stand-alone documents. The former regards annotations as closely related to the video content and therefore inseparable from the video. The latter regards annotations as real documents and autonomous entities that maintain some sort of connection with the video content (Agosti & Ferro, 2007). By this definition, the ROMDAS system can be classified as a stand-alone annotation system, whereas the nature of the augmented data implies that it should be a content enrichment annotation system.

3. AUGMENTED DATA STRUCTURES

Microsoft's Audio Video Interleave (AVI) was used as the implementation multimedia container format (MCF). AVI was chosen because the ROMDAS video camera captures video in the AVI format. Additionally, our comprehensive study of commercial MCFs (Mihic, 2007) showed that other MCFs offer very similar features, and almost all of them are suitable for augmented data encapsulation and storage. In the early stages of development we considered creating a new, specifically designed MCF, as in the ANNODEX system (Pfeiffer et al., 2003). Since our study showed that existing MCFs are suitable, that approach was abandoned.

For security reasons and to achieve effective data storage, it was required that access to the augmented data be restricted and that the augmented data be compressible, while compatibility with standard AVI players is maintained.

Two approaches for embedding augmented data into an MCF are common: interleaved and non-interleaved storage. The former is suitable for applications where video streaming is needed, and the latter for applications where a high compression ratio and effective data encryption are needed. The AVI MCF does not support interleaving of augmented data, and therefore the latter option was chosen.

An AVI file is organized into small pieces called chunks. Chunks are identified by their FOURCC name and can be nested. The specification provides a chunk named 'INFO' for additional data description and requires that AVI file parsers ignore any unknown 'INFO' sub-chunk (Microsoft Corp., 1991). Because of this rule, augmented data encapsulated in an 'INFO' sub-chunk maintain compatibility, so even consumers not authorized to access the augmented data can still watch the basic video. Since several sub-chunks are already defined by the specification, we created a new 'INFO' sub-chunk named 'AUGD'. All the augmented data are encapsulated into this chunk. The augmented data are compressed and encrypted before encapsulation; arbitrary compression and encryption algorithms can be used.
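The paper does not include the serialization code itself, so the following C++ fragment is only a minimal sketch, assuming a payload that has already been compressed and encrypted upstream: it lays out an 'AUGD' sub-chunk as a FOURCC, a 32-bit little-endian size field and the payload, padded to an even byte boundary as RIFF requires. The helper name make_riff_chunk is illustrative and not part of the published AVF implementation.

    // Minimal sketch (not the authors' implementation) of serializing an 'AUGD'
    // sub-chunk for insertion into the 'INFO' LIST of an AVI (RIFF) file.
    // The payload is assumed to be already compressed and encrypted upstream.
    #include <cstdint>
    #include <vector>

    // Serialize one RIFF sub-chunk: FOURCC, 32-bit little-endian size, payload,
    // padded to an even byte boundary as the RIFF format requires.
    std::vector<std::uint8_t> make_riff_chunk(const char fourcc[4],
                                              const std::vector<std::uint8_t>& payload) {
        std::vector<std::uint8_t> chunk;
        chunk.insert(chunk.end(), fourcc, fourcc + 4);
        std::uint32_t size = static_cast<std::uint32_t>(payload.size());
        for (int i = 0; i < 4; ++i)                       // little-endian size field
            chunk.push_back(static_cast<std::uint8_t>((size >> (8 * i)) & 0xFF));
        chunk.insert(chunk.end(), payload.begin(), payload.end());
        if (chunk.size() % 2 != 0)                        // RIFF word alignment
            chunk.push_back(0);
        return chunk;
    }

    int main() {
        // Hypothetical augmented-data payload, already compressed and encrypted.
        std::vector<std::uint8_t> augmented_payload = {/* ... opaque bytes ... */};

        // Build the 'AUGD' sub-chunk; a writer would append it to the existing
        // 'LIST'/'INFO' chunk and update the enclosing RIFF sizes accordingly.
        std::vector<std::uint8_t> augd = make_riff_chunk("AUGD", augmented_payload);
        (void)augd;
        return 0;
    }

Because compliant parsers skip unknown 'INFO' sub-chunks, an AVI file extended in this way remains playable in players that know nothing about 'AUGD'.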

The augmented data are structured in an object-oriented way. The data are described using a type system similar to those of the C++ and Java programming languages (Fig. 1). AVF defines bool, int, double, string, image, audio stream and video stream as atomic types, the same as in (Romero & Correia, 2003). All the data are encapsulated into classes, and the AVF type system supports class inheritance. The concrete value of the measured discrete ROMDAS data in a certain time interval is represented by the Object class. We chose time intervals instead of frames or frame intervals because of the nature of the discrete ROMDAS data: real-time video capturing constrains the achievable frame rate, while the time resolution of the measurement devices goes down to 10^-3 s. Had we used frames or frame intervals, valuable measured data could not have been described by our system.
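Figure 1 (omitted here) depicts the type system; the C++ sketch below only illustrates the concepts described above. The names ClassDescriptor, TimeInterval and AugmentedObject, the field layout and the tick unit are assumptions for illustration and do not come from the published AVF interface.

    // Illustrative sketch (not the published AVF API) of the type system
    // described above: atomic types, classes with inheritance, and Object
    // instances bound to time intervals rather than frame intervals.
    #include <cstdint>
    #include <map>
    #include <string>

    // Atomic types listed in the paper: bool, int, double, string, image,
    // audio stream and video stream.
    enum class AtomicType { Bool, Int, Double, String, Image, AudioStream, VideoStream };

    // A class describes named fields and may inherit from a base class.
    struct ClassDescriptor {
        std::string name;
        const ClassDescriptor* base;                  // single inheritance; null for root classes
        std::map<std::string, AtomicType> fields;
    };

    // Time interval expressed in ticks of the framework's resolution.
    struct TimeInterval {
        std::int64_t start_ticks;
        std::int64_t end_ticks;
    };

    // An Object holds concrete measured values for one time interval.
    struct AugmentedObject {
        const ClassDescriptor* type;
        TimeInterval interval;
        std::map<std::string, double> numeric_values; // simplified value store
    };

    int main() {
        // Hypothetical class describing one road-roughness measurement.
        ClassDescriptor roughness{"RoughnessSample", nullptr,
                                  {{"iri", AtomicType::Double}, {"lane", AtomicType::Int}}};

        // One measured value valid over a 1 ms interval (10^-3 s = 100000 ticks of 10^-8 s).
        AugmentedObject sample{&roughness, {0, 100000},
                               {{"iri", 3.2}, {"lane", 1.0}}};
        (void)sample;
        return 0;
    }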

The implemented time resolution of the AVF framework was set to 10^-8 s, the lowest value usable in commercial multimedia presentation frameworks.
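As a small worked example, and assuming the 10^-8 s resolution is represented as an integer tick count (an assumption, since the paper does not describe the internal representation), a millisecond-resolution ROMDAS timestamp maps onto framework ticks as follows; the constant and helper names are hypothetical.

    // Sketch of converting measurement timestamps to the assumed 10^-8 s tick
    // unit; constants and helper names are illustrative, not from AVF.
    #include <cstdint>
    #include <iostream>

    constexpr std::int64_t kTicksPerSecond = 100000000;   // 1 tick = 10^-8 s

    // Convert a timestamp given in milliseconds (10^-3 s, the finest resolution
    // of the ROMDAS measurement devices mentioned above) into framework ticks.
    constexpr std::int64_t milliseconds_to_ticks(std::int64_t ms) {
        return ms * (kTicksPerSecond / 1000);              // 100000 ticks per millisecond
    }

    int main() {
        // A measurement taken 2.5 s into the survey run, expressed in ticks.
        std::cout << milliseconds_to_ticks(2500) << " ticks\n";   // prints 250000000
        return 0;
    }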

Although the AVF type system was designed to describe discrete ROMDAS data, it can also describe arbitrary data structured in an object-oriented manner.

4. CONCLUSION

The ROad Measurement and Data Acquisition System (ROMDAS) collects and analyzes the road-condition state through videos and the discrete data acquired by specific measurement devices. Separate storage of the video and the data forces road engineers to search the video manually in order to find details of interest.

[FIGURE 1 OMITTED]

We concluded that the augmented video should be a self-contained entity allowing full search of the data according to their properties. A hybrid video augmentation system was designed: the Augmented Video stream Framework (AVF). The AVF enables creation, search and playback of self-contained augmented AVI files for effective road surveying. The AVF approach is valuable far beyond the application area of road maintenance.

This paper introduced the AVF data structures used for video data augmentation. The AVF uses a type system similar to those of the C++ and Java programming languages and offers encapsulation of arbitrary data in an object-oriented manner. The supported AVF atomic types are bool, int, double, string, image, audio stream and video stream. Time intervals are used as synchronization units between the video and the augmented data.

The ROMDAS video augmentation was carried out using only the discrete atomic types because the ROMDAS measurement devices capture only discrete data. In the future we plan to enhance ROMDAS videos with continuous media (e.g. a supplemental video of the road roughness state).

Further research will be conducted towards the creation of augmented videos based on ontologies of different application domains.

Acknowledgements. This research was supported by IT Project No. 13013, financed by the government of the Republic of Serbia.

5. REFERENCES

Agosti, M. & Ferro, N. (2007). A formal model of annotations of digital content. ACM Transactions on Information Systems (TOIS), Vol. 26, No. 1, (November 2007), Article No. 3, 57 pages, ISSN: 1046-8188

Bennett, C.R.; Chamorro, A.; Chen, C.; de Solminihac, H. & Flintsch, G.W. (2007). Data Collection Technologies for Road Management, The World Bank, East Asia Pacific Transport Unit, Washington, D.C.

Microsoft Corp. (1991). Microsoft Windows Multimedia Programmer's Reference, Microsoft Press, ISBN: 1-55615-389-9, Redmond, WA, USA

Correia, N. & Chambel, T. (1999). Active video watching using annotation. Proceedings of the Seventh ACM International Conference on Multimedia (Part 2), pp. 151-154, ISBN: 1-58113-239-5, Orlando, Florida, USA, October 1999, ACM, New York, NY, USA

Mihic, S. (2007). Augmented Video stream Framework. M.Sc. thesis (in Serbian), Faculty of Technical Sciences, Novi Sad, Serbia

Pfeiffer, S.; Parker, C. & Schremmer, C. (2003). Annodex: A Simple Architecture to Enable Hyperlinking. Proceedings of the 5th ACM SIGMM International Workshop on Multimedia Information Retrieval, pp. 87-93, ISBN: 1-58113-778-8, Berkeley, California, November 2003, ACM, New York, NY, USA

Romero, L. & Correia, N. (2003). HyperReal: A hypermedia model for mixed reality. Proceedings of the Fourteenth ACM Conference on Hypertext and Hypermedia, pp. 2-9, ISBN: 1-58113-704-4, Nottingham, UK, August 2003, ACM, New York, NY, USA

Schroeter, R.; Hunter, J. & Newman, A. (2007). Annotating Relationships between Multiple Mixed-Media Digital Objects by Extending Annotea. Lecture Notes in Computer Science, Vol. 4519/2007, (June 2007), pp. 533-548, ISSN: 0302-9743