Multimedia Data
- Text
- Vector Graphics
- 3-D Vector Graphics
- Raster Graphics
- Digital Image
- Voxel
- Audio
- Digital Video
Text
There are three types of text used to produce pages of documents:
- Unformatted text
- Formatted text
- Hypertext

Unformatted text
- Also known as plaintext
- Enables pages to be created that comprise strings of fixed-size characters from a limited character set
ASCII character set
- American Standard Code for Information Interchange
- Each character is represented by 7 bits
- There are 128 alternative characters
- In addition to all the normal alphabetic, numeric and punctuation characters (printable characters), the set also includes a number of control characters
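The 7-bit layout above can be checked with a few lines of Python. This is a minimal sketch; the split used here (codes 0-31 and 127 as control characters, 32-126 as printable, with the space counted as printable) follows the usual ASCII convention rather than anything stated on the slide.

```python
# 7 bits per character give 2**7 = 128 code values.
ascii_codes = range(2 ** 7)

# Usual convention (an assumption, not from the slide):
# 0-31 and 127 are control characters, 32-126 are printable.
control = [c for c in ascii_codes if c < 32 or c == 127]
printable = [c for c in ascii_codes if 32 <= c < 127]

print(len(ascii_codes), len(control), len(printable))
print("".join(chr(c) for c in printable))
```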
Unicode
- Unicode is an industry standard designed to allow text and symbols from all of the writing systems of the world to be consistently represented and manipulated by computers
- The standard has been implemented in many recent technologies, including XML, the Java programming language, and modern operating systems
- Unicode covers almost all scripts (writing systems) in current use today, including: Arabic, Armenian, Bengali, Braille embossing patterns, Canadian Aboriginal Syllabics, Cherokee, Coptic, Cyrillic, Devanāgarī, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi (Punjabi), Han (Kanji, Hanja, Hanzi), Hangul (Korean), Hebrew, Hiragana and Katakana (Japanese), International Phonetic Alphabet (IPA), Khmer (Cambodian), Kannada, Lao, Latin, Malayalam, Mongolian, Myanmar (Burmese), Oriya, Syriac, Tamil, Telugu, Thai, Tibetan, Tifinagh, Yi, Zhuyin (Bopomofo)
UTF-8/16
Most popular encodings include:
- UTF-8: an 8-bit, variable-width encoding, compatible with ASCII
- UTF-16: a 16-bit, variable-width encoding (byte order matters: big-endian or little-endian!)
UTF-16 is the native internal representation of text in:
- the Microsoft Windows NT/Windows 2000/XP/CE, Mac OS X and Symbian operating systems
- the Java and .NET bytecode environments

Decimal   Hex    Description        UTF-16 (hex)   Character
122       7A     small z (Latin)    007A           z
27700     6C34   water (Chinese)    6C34           水

Formatted text
- Also known as rich text
- Enables pages and complete documents to be created that comprise strings of characters of different styles, sizes and shapes, with tables, graphics, and images inserted at appropriate points (example: Word)
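The two rows of the UTF-8/UTF-16 table above can be reproduced with Python's built-in codecs. This is only an illustrative sketch: the helper name `describe` is made up for this example, and the codec names `utf-16-be` and `utf-16-le` select the byte order explicitly so that no byte-order mark is added.

```python
def describe(ch):
    """Return the code point of one character and its encoded bytes as hex."""
    cp = ord(ch)
    return {
        "decimal": cp,
        "code_point": f"U+{cp:04X}",
        "utf8": ch.encode("utf-8").hex(),          # 1 to 4 bytes per character
        "utf16_be": ch.encode("utf-16-be").hex(),  # big-endian 16-bit units
        "utf16_le": ch.encode("utf-16-le").hex(),  # little-endian 16-bit units
    }

for ch in ("z", "水"):
    print(ch, describe(ch))
```

The Latin letter needs a single byte in UTF-8 (identical to its ASCII code), while the Chinese character needs three; in UTF-16 both fit in one 16-bit unit, and only the byte order differs between the two variants.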
Hypertext
- Enables an integrated set of documents (each comprising formatted text) to be created which have defined linkages between them (hyperlinks)

HTML
- HyperText Markup Language (HTML)
- An example of a more general set of mark-up languages
- These are used to describe how the contents of a document are to be presented on a printer or a display
- Other examples:
  - PostScript
  - SGML (Standard Generalized Markup Language), on which XML (Extensible Markup Language) and HTML are based
  - TeX, LaTeX
Directives
Page formatting commands:
- New paragraph: <P>
- Start and end of boldface: <B>text</B>
- Bulleted list: <UL><LI>list item</UL>
- Include an image: <IMG SRC="an image">
- Link to another page: <A HREF="URL">another page</A>

Graphics, Digital Images
Vector Graphics
- Use of geometrical primitives such as points, lines, curves, and polygons, which are all based upon mathematical equations, to represent images in computer graphics
- The term is used in contrast to raster graphics, which is the representation of images as a collection of pixels (dots)

Example
Consider a circle of radius r. The main pieces of information a program needs in order to draw this circle are:
- the radius r
- the location of the center point of the circle
- stroke line style and colour (possibly transparent)
- fill style and colour (possibly transparent)
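The circle description above can be sketched as a small record. The class name `Circle`, its field names, and the choice of SVG as output format are assumptions made for this illustration; the slides do not prescribe any particular file format.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Circle:
    """A circle stored as a handful of numbers and style names (hypothetical layout)."""
    cx: float             # x coordinate of the center point
    cy: float             # y coordinate of the center point
    r: float              # radius
    stroke: str = "black" # stroke colour
    fill: str = "none"    # "none" means a transparent fill in SVG

    def scaled(self, factor):
        """Zooming only multiplies the stored numbers."""
        return replace(self, cx=self.cx * factor, cy=self.cy * factor, r=self.r * factor)

    def to_svg(self):
        """Emit the circle as one SVG element."""
        return (f'<circle cx="{self.cx}" cy="{self.cy}" r="{self.r}" '
                f'stroke="{self.stroke}" fill="{self.fill}"/>')

circle = Circle(cx=50, cy=50, r=40, stroke="blue")
zoomed = circle.scaled(100)
print(circle.to_svg())
print(zoomed.to_svg())
```

Scaling the circle by a factor of 100 changes a few digits in the description but not the amount of data that has to be stored.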
Advantages
- This minimal amount of information translates to a much smaller file size compared to large raster images
- The size of the representation does not depend on the dimensions of the object
- One can zoom in indefinitely on, e.g., a circle arc, and it remains smooth

Example
Vector-based video game: Star Wars (Atari, 1983)
3-D Vector Graphics
- 3D computer graphics are works of graphic art that were created with the aid of digital computers and specialized 3D software (CAD, computer-aided design)
Raster Graphics
- A raster graphics image, digital image, or bitmap is a data file or structure representing a generally rectangular grid of pixels, or points of color, on a computer monitor, paper, or other display device

VGA
- VGA (video graphics array) is a type of display
- Consists of a matrix of 640 horizontal pixels by 480 vertical pixels
- With 8 bits per pixel, each pixel can take one of 256 different colors
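The memory needed for one such frame follows directly from the numbers above. A minimal sketch, assuming an uncompressed frame with no padding; the function name `frame_bytes` is chosen for this example only.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Memory for one uncompressed frame, in bytes (no padding assumed)."""
    return width * height * bits_per_pixel // 8

vga_bytes = frame_bytes(640, 480, 8)  # one byte per pixel
vga_colors = 2 ** 8                   # values an 8-bit pixel can take

print(vga_bytes, vga_bytes / 1024, vga_colors)
```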
Digital Image
- A digital image is a representation of a two-dimensional image as a finite set of digital values, called picture elements or pixels
- Typically, the pixels are stored in computer memory as a raster image or raster map, a two-dimensional array of small integers
- Digital images can be created by a variety of input devices and techniques, such as digital cameras and scanners

[Figure: Digital image acquisition]
Color Principle
- Additive color mixing (RGB): white is produced when all three primary colors (red, green, blue) are mixed
- Subtractive color mixing: black is produced when all three secondary colors (cyan, magenta, yellow) are mixed (example: painting)
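Both mixing principles can be illustrated with 8-bit RGB triples. This is an idealized sketch: lights are added per channel and clipped at 255, and each pigment is modelled as a perfect filter that passes only the light it reflects (per-channel minimum). Real inks and paints behave less cleanly, and the function names are made up for this example.

```python
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
# Secondary colors, given as the light each ideal pigment reflects.
CYAN, MAGENTA, YELLOW = (0, 255, 255), (255, 0, 255), (255, 255, 0)

def mix_additive(*lights):
    """Add light per channel, clipping at the maximum value 255."""
    return tuple(min(255, sum(channel)) for channel in zip(*lights))

def mix_subtractive(*pigments):
    """Only light reflected by every pigment survives (idealized filter model)."""
    return tuple(min(channel) for channel in zip(*pigments))

print(mix_additive(RED, GREEN, BLUE))           # white
print(mix_subtractive(CYAN, MAGENTA, YELLOW))   # black
print(mix_subtractive(CYAN, YELLOW))            # green
```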
- The color of each pixel is individually defined
- Images in the red, green, blue (RGB) color space consist of colored pixels defined by three numbers, one each for red, green and blue
- Less colorful images require less information per pixel
- An image with only black and white pixels requires only a single bit for each pixel

[Figure: Generation of a digital image]
Pixel Depth
- The number of bits per pixel, which determines the range of different colors that can be produced
- 12 bits (4 bits per primary color) yield 4096 different colors
- 24 bits (8 bits per primary color) yield about 16 million colors
- The eye cannot discriminate between such a range of colors
- A selected subset of this range is used
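The color counts above are powers of two, and a 24-bit pixel is simply three 8-bit values packed into one integer. A minimal sketch; the packing order (red in the highest byte) is one common convention and is an assumption here, as are the function names.

```python
def color_count(bits_per_pixel):
    """Number of distinct values a pixel of the given depth can take."""
    return 2 ** bits_per_pixel

def pack_rgb24(r, g, b):
    """Pack three 8-bit primaries into one 24-bit pixel (red in the top byte)."""
    return (r << 16) | (g << 8) | b

def unpack_rgb24(pixel):
    """Recover the three 8-bit primaries from a 24-bit pixel."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF

print(color_count(12), color_count(24))
orange = pack_rgb24(255, 128, 0)
print(hex(orange), unpack_rgb24(orange))
```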
Color look-up table (CLUT)
- The selected colors are stored in a table
- Each pixel value is used as an address to a location within the table
- For example, if each pixel is 8 bits and the CLUT contains 24-bit entries, this provides a subset of 256 different colors selected from the palette of 16 million colors
- The amount of memory required is reduced

Voxel
- A voxel (a portmanteau of the words volumetric and pixel) is a volume element, representing a value on a regular grid in three-dimensional space
- Analogous to a pixel, which represents 2D image data
- Voxels are frequently used in the visualisation and analysis of medical and scientific 3D data
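Returning to the color look-up table: the indexing step and the memory saving can be sketched as follows. The palette entries and the 640 x 480 image size are arbitrary values picked for the illustration, and the function names are hypothetical.

```python
# A tiny palette of 24-bit colors; a real 8-bit CLUT would hold 256 entries.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]

# Each stored pixel is an index (address) into the table.
indexed_pixels = [0, 2, 2, 1, 3]
rgb_pixels = [palette[i] for i in indexed_pixels]
print(rgb_pixels)

def direct_bytes(width, height):
    """Image size with 24 bits (3 bytes) stored per pixel."""
    return width * height * 3

def indexed_bytes(width, height, entries=256):
    """Image size with 8-bit indices plus a table of 24-bit entries."""
    return width * height + entries * 3

print(direct_bytes(640, 480), indexed_bytes(640, 480))
```

For this image size the indexed form needs roughly a third of the memory, at the cost of limiting the image to 256 distinct colors.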
Audio
Sampling - Audio
- In signal processing, sampling is the reduction of a continuous signal to a discrete signal
- Here: conversion of a sound wave (a continuous-time signal) to a sequence of samples (a discrete-time signal)
- The sampling frequency or sampling rate f_s is defined as the number of samples obtained in one second: f_s = 1 / T, where T is the sampling interval
- The sampling rate is measured in hertz (Hz) or in samples per second

Bandwidth
- Speech signals: 50 Hz - 10 kHz; sampling rate 2 * 10 kHz = 20k samples/s
- Music-quality audio: 15 Hz - 20 kHz; sampling rate 2 * 20 kHz = 40k samples/s
- The sampling rate must be at least twice the highest signal frequency (Nyquist rate)
- The number of bits per sample must be chosen so that the quantization noise generated by the sampling process is at an acceptable level when the signal is reconstructed
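Sampling and quantization can be sketched for a pure sine tone. The test tone (1 kHz), the sampling rate (8 kHz) and the function names are assumptions chosen for the example; rounding to signed integers is one simple way to quantize, not the only one.

```python
import math

def nyquist_rate(max_freq_hz):
    """Minimum sampling rate: twice the highest signal frequency."""
    return 2 * max_freq_hz

def sample_sine(freq_hz, fs_hz, n_samples, bits):
    """Sample a unit-amplitude sine every T = 1/fs seconds and quantize it."""
    T = 1.0 / fs_hz                   # sampling interval
    max_level = 2 ** (bits - 1) - 1   # largest signed value, e.g. 32767 for 16 bits
    return [round(math.sin(2 * math.pi * freq_hz * k * T) * max_level)
            for k in range(n_samples)]

print(nyquist_rate(10_000), nyquist_rate(20_000))  # speech, music
samples = sample_sine(freq_hz=1000, fs_hz=8000, n_samples=8, bits=16)
print(samples)
```

Each extra bit per sample doubles the number of quantization levels, which is why the choice of bits per sample controls the quantization noise.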
- 12 bits per sample for speech
- 16 bits per sample for music
- In addition, since most applications involving music require stereo, two such signals need to be digitized
- In practice, both the sampling rate used and the number of bits per sample are often lower

CD-quality audio
- Sampling rate: 44.1k samples/s
- 16 bits per sample
- Bit rate per channel = sampling rate * bits per sample = 44.1 * 10^3 * 16 = 705.6 kbit/s = 705600 bits/s
- Total bit rate (stereo, 2 channels) = 1.4112 Mbps = 1411200 bits/s
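The bit-rate arithmetic above can be written as a one-line function. A minimal sketch for uncompressed audio; the function name `pcm_bit_rate` is an assumption for this example.

```python
def pcm_bit_rate(sampling_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed audio in bits per second."""
    return sampling_rate_hz * bits_per_sample * channels

per_channel = pcm_bit_rate(44_100, 16)
stereo = pcm_bit_rate(44_100, 16, channels=2)
print(per_channel, stereo)
```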
Capacity of a CD
- Capacity for 60 min (3600 s): 1411200 bits/s * 3600 s = 5080320000 bits = 635040000 bytes = 605.62 MB (1 MB = 2^20 bytes)

Digital Video
- In multimedia databases the video signal needs to be in digital form, since only then can it be stored
- Made up of images and sound
- Example: 25 pictures per second, each 250 KB, gives 6.25 MB per second
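Both storage estimates above can be recomputed in a few lines. This sketch uses 1 MB = 2^20 bytes for the CD figure and decimal units (1 MB = 1000 KB) for the video figure, since those are the conventions that reproduce the numbers on the slide; the one-hour video figure is an extrapolation added here.

```python
CD_STEREO_BIT_RATE = 1_411_200  # bits per second, stereo, 16 bits at 44.1 kHz

def storage_bytes(bit_rate, seconds):
    """Bytes needed to store a constant-bit-rate stream."""
    return bit_rate * seconds // 8

cd_bytes = storage_bytes(CD_STEREO_BIT_RATE, 60 * 60)
cd_mb = cd_bytes / 2 ** 20
print(cd_bytes, round(cd_mb, 2))

# Uncompressed video: 25 pictures per second, 250 KB each.
video_mb_per_s = 25 * 250 / 1000
video_gb_per_hour = video_mb_per_s * 3600 / 1000
print(video_mb_per_s, video_gb_per_hour)
```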
Comparison

Object            Elements           Order            Size          Time dependent?
Text              Character set      Sequence (!)     >10KB         no
Vector Graphics   Vectors, numbers   Set (!)          >10KB         no
Digital Image     Pixels, numbers    Matrix, vector   >1MB          no
Audio             Numbers            Sequence (!)     600MB (CD)    yes
Video             Images, audio      Sequence (!)     >2GB          yes

Conclusion
- Multimedia objects are represented by digital information (numbers, vectors)
- Can we define a similarity criterion between multimedia objects?
  - Pattern recognition, neural networks
  - Not yet efficient... (database too big!)
  - What are the features?
- The size of a multimedia object may be huge (video)
  - Compression...
  - What is the information content of an object?