PASIG Digital Preservation Bootcamp
Best* Practices in Preserving Common Content Types: Audiovisual Files
Kara Van Malssen, AudioVisual Preservation Solutions
kara@avpreserve.com
www.avpreserve.com
May 22, 2013
             File 1    File 2    File 3
Format       PDF 1.2   PDF 1.2   PDF 1.4
Page Count   20        20000     40
Encryption   Yes       No        Yes
File Size    1 MB      120 MB    2 MB
Valid        No        Yes       No
Well-formed  Yes       Yes       Yes
             File 1    File 2    File 3
Format       .mov      .mov      .mp4
Containers and Encoded Essence
Container Essence
Types of Possible Data Within AV Containers
Video
Audio
Text
Metadata
Image
Captions
Subtitles
Chapters
Attachments
Timecode
Source: Matroska Container Specification
Formats: common video file containers
QuickTime (.mov)
DV (.dv)
MPEG-4 (.mp4)
MPEG-2 (.mpg)
AVI (.avi)
MXF (.mxf)
Matroska (.mkv)
Flash Video (.flv)
Formats: common video stream encodings
MPEG-2
MPEG-4 (H.264, AVCHD, etc.)
DV
ProRes
JPEG2000
FFV1
Theora
Sorenson
Uncompressed*
Formats: common audio file containers
WAV (.wav)
BWF (.wav)
MP3 (.mp3)
MP2 (.mp2)
AIFF (.aiff)
FLAC (.flac)
Formats: common audio stream encodings
PCM
MP3
AAC
Windows Media Audio
AC3
FLAC
ffmpeg -codecs
Codec: encoder/decoder
A codec is a device or computer program capable of encoding or decoding a digital data stream or signal. The word codec is a portmanteau of "coder-decoder" or, less commonly, "compressor-decompressor".
- http://en.wikipedia.org/wiki/codec
A Primer on Codecs for Moving Image and Sound Archives
Chris Lacinak
http://bit.ly/avps_codecs
More info on AV file formats/stream encodings:
http://digitalpreservation.gov/formats/fdd/descriptions.shtml
http://wiki.multimedia.cx/index.php?title=main_page
http://fileformats.archiveteam.org/wiki/video
Significant properties/characteristics?
Some additional important AV file characteristics...
General: data rate, duration, file size, *other embedded data
Video: color space, aspect ratio, frame size, frame rate, bit depth, pixel format, chroma subsampling
Audio: sampling rate, bit depth, channels
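Several of the video characteristics listed here combine into the data rate. As a rough sketch (not a substitute for inspecting a real file), the Python below estimates the data rate of uncompressed video from frame size, frame rate, bit depth and chroma subsampling. The function name and the lookup table are illustrative assumptions, and the estimate ignores audio, container overhead and any line padding a specific pixel format may add.

```python
# Average samples stored per pixel for common chroma subsampling schemes:
# one luma sample per pixel plus the chroma samples shared between pixels.
SAMPLES_PER_PIXEL = {
    "4:4:4": 3.0,
    "4:2:2": 2.0,
    "4:2:0": 1.5,
    "4:1:1": 1.5,
}


def uncompressed_video_bits_per_second(width, height, frame_rate,
                                       bit_depth, chroma_subsampling):
    """Estimate the picture data rate of uncompressed video in bits/second.

    Hypothetical helper for illustration only: it counts picture samples
    and nothing else (no audio, no container overhead, no padding).
    """
    samples = SAMPLES_PER_PIXEL[chroma_subsampling]
    return width * height * samples * bit_depth * frame_rate


# Assumed example: 10-bit 4:2:2 standard definition at 30000/1001 fps.
sd_rate = uncompressed_video_bits_per_second(720, 486, 30000 / 1001, 10, "4:2:2")
print(f"{sd_rate / 1e6:.1f} Mb/s")  # about 209.7 Mb/s
```

Under these assumptions the result rounds to the 210 Mb/s figure used on the Data Rate & File Size slide.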
MPEG-4 Encoding Data rate = 3500 Kbps
MPEG-4 Encoding Data rate = 1000 Kbps
MPEG-4 Encoding Data rate = 250 Kbps
Data Rate & File Size
Type    Duration   Data rate   File size
VIDEO   1 hour     210 Mb/s    92 GB
VIDEO   1 hour     50 Mb/s     22 GB
VIDEO   1 hour     25 Mb/s     11 GB
VIDEO   1 hour     1.5 Mb/s    1 GB
AUDIO   1 hour     4.6 Mb/s    2 GB
AUDIO   1 hour     128 Kb/s    56 MB
Data Rate, File Size & Usage
Type    Duration   Data rate   File size   Usage
VIDEO   1 hour     210 Mb/s    92 GB       Preservation Master
VIDEO   1 hour     50 Mb/s     22 GB       Preservation Master or Mezzanine
VIDEO   1 hour     25 Mb/s     11 GB       Preservation Master or Mezzanine
VIDEO   1 hour     1.5 Mb/s    1 GB        Access Copy
AUDIO   1 hour     4.6 Mb/s    2 GB        Preservation Master
AUDIO   1 hour     128 Kb/s    56 MB       Access Copy
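The relationship behind these figures is simple arithmetic: file size is data rate multiplied by duration, divided by 8 bits per byte. The sketch below applies it to the rates on the slide, assuming a constant data rate for the whole hour. The helper names are illustrative, and the results are shown under both the decimal (10^9 bytes) and binary (2^30 bytes) meanings of "gigabyte", since the slide's rounded values fall between the two.

```python
ONE_HOUR = 3600  # seconds

# Data rates from the slide, expressed in bits per second.
SLIDE_RATES = {
    "video 210 Mb/s": 210e6,
    "video 50 Mb/s": 50e6,
    "video 25 Mb/s": 25e6,
    "video 1.5 Mb/s": 1.5e6,
    "audio 4.6 Mb/s": 4.6e6,
    "audio 128 Kb/s": 128e3,
}


def file_size_bytes(bits_per_second, duration_seconds):
    """Size of a constant-rate stream: rate x duration, 8 bits per byte."""
    return bits_per_second * duration_seconds / 8


def size_table(rates, duration_seconds=ONE_HOUR):
    """Return (label, decimal GB, binary GiB) for each data rate."""
    rows = []
    for label, rate in rates.items():
        size = file_size_bytes(rate, duration_seconds)
        rows.append((label, size / 10**9, size / 2**30))
    return rows


for label, gb, gib in size_table(SLIDE_RATES):
    print(f"{label:>15}: {gb:8.3f} GB (10^9 bytes)  {gib:8.3f} GiB (2^30 bytes)")
```

For one hour at 210 Mb/s this gives 94.5 GB in decimal units and about 88 GiB in binary units, which brackets the 92 GB on the slide.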
Digital AV File Types for Archival Purposes
Preservation Master = highest possible data rate, open and interoperable encoding, appropriate container format
Mezzanine (access / edit master) = medium data rate, open and interoperable encoding, appropriate container format
Proxy (access copy) = low data rate, any encoding, any container
mediainfo [filename]
ffmpeg -f lavfi -i mandelbrot -t 10 -c:v v210 v210.avi
command-spiration: ffmpeg4archivists, Dave Rice & Misty De Meo, AMIA 2012
mediainfo [filename]
http://samples.mplayerhq.hu
ffprobe -show_format -show_streams -show_data -show_error -show_versions [filename]
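To act on an ffprobe report in a script rather than read it by eye, one option is to ask for JSON output and pull out the container and the stream encodings separately, mirroring the container/essence distinction from earlier in the deck. The sketch below is one possible approach, not a recommended workflow: `probe` and `summarize` are hypothetical helper names, `probe` assumes ffprobe is installed and on the PATH, and `SAMPLE_REPORT` is a hand-written stand-in for real output, not the result of probing an actual file.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe on a file and return its JSON report as a dict.

    Assumes ffprobe is installed and on the PATH.
    """
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarize(report):
    """Return (container name, [(stream type, stream encoding), ...])."""
    container = report.get("format", {}).get("format_name", "unknown")
    streams = [
        (s.get("codec_type", "unknown"), s.get("codec_name", "unknown"))
        for s in report.get("streams", [])
    ]
    return container, streams


# Hand-written sample shaped like ffprobe's JSON output (illustrative only).
SAMPLE_REPORT = {
    "format": {"format_name": "mov,mp4,m4a,3gp,3g2,mj2", "duration": "10.000000"},
    "streams": [
        {"index": 0, "codec_type": "video", "codec_name": "h264"},
        {"index": 1, "codec_type": "audio", "codec_name": "aac"},
    ],
}

container, streams = summarize(SAMPLE_REPORT)
print("container:", container)
for stream_type, encoding in streams:
    print(f"  {stream_type}: {encoding}")
```

With a real file, `summarize(probe("example.mov"))` would return the same kind of tuple, where `example.mov` is a placeholder path.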
exiftool [filename]
Preservation format?
AUDIO reformatting = linear PCM / BWF (WAV) ~ min 24 bit / 48 kHz
IASA, EBU, US-FADGI
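For linear PCM the data rate follows directly from the audio characteristics: bit depth times sampling rate times number of channels. The sketch below works this out for the minimum recommended here, assuming two channels (the channel count is an assumption, not part of the recommendation), and for 24 bit / 96 kHz stereo as a comparison. The function names are illustrative, and header and metadata overhead in the WAV/BWF file is ignored.

```python
def pcm_bits_per_second(bit_depth, sample_rate_hz, channels):
    """Data rate of linear PCM audio in bits per second."""
    return bit_depth * sample_rate_hz * channels


def pcm_bytes_per_hour(bit_depth, sample_rate_hz, channels):
    """Audio payload size for one hour, ignoring file header overhead."""
    return pcm_bits_per_second(bit_depth, sample_rate_hz, channels) * 3600 // 8


# Assumed stereo examples (2 channels).
for depth, rate in [(24, 48_000), (24, 96_000)]:
    bps = pcm_bits_per_second(depth, rate, 2)
    size = pcm_bytes_per_hour(depth, rate, 2)
    print(f"{depth} bit / {rate // 1000} kHz stereo: "
          f"{bps / 1e6:.3f} Mb/s, {size / 1e9:.2f} GB per hour")
```

Under the stereo assumption, 24 bit / 48 kHz comes to 2.304 Mb/s, and 24 bit / 96 kHz comes to 4.608 Mb/s, which lines up with the 4.6 Mb/s audio row on the Data Rate & File Size slide.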
ANALOG MIGRATION (grossly oversimplified)
composite or S-Video | component
Uncompressed, DV, ProRes...
Uncompressed, DV, MPEG-2
Uncompressed, DV, DNx
DIGITAL MIGRATION (fairly oversimplified) firewire DV
H.264
DV XDCam HD MPEG2
But what format should I use? Budget? Infrastructure? Access requirements? Scale? Mission & Goals? Sustainability?
...It depends. If there were a standard or best practice, I would point you to it.
Recommendations
1. Know your files well
2. Understand your options
3. Make informed decisions
4. Test before you transcode
5. Sustainability criteria are important! http://www.digitalpreservation.gov/formats/sustain/sustain.shtml