TR#283: Video and Image Semantics: Advanced Tools for Telecommunications

Alex Pentland and Rosalind Picard

Article available in:
SPIE Paper 2185-05
Storage and Retrieval of Image
& Video Databases II,
San Jose, CA, Feb. 1994.

Within the next decade, the majority of data carried over telecommunications links is likely to be visual material. The biggest problem in delivering video and image services is that the technology for organizing, searching, and presenting images is still in its infancy. Consequently, the goal of the M.I.T. Media Laboratory's Advanced Tools for Telecommunications Project is to develop tools for automatically understanding and using the semantics of video and image materials. To support visual services, we must first be able to build multimedia databases quickly and cheaply. We must be able to extract and represent the content of the video clips and images sufficiently well that the computer can automatically select material that fulfills the needs of a wide range of users and purposes. Finally, the computer must be able to automatically assemble this material into a coherent presentation. The system we are building for this project takes in video and image material over ISDN lines, parses it into keyframes, subjects it to semantics-preserving image compression, and then stores it in an analogical database. This material can then be further annotated off-line. When users ask a question, the stored semantics and on-line similarity judgements are used to automatically assemble a multimedia presentation that can be sent back out over the telecommunications network.

