Scrubbing Metadata is Not Easy

Wed Sep 12 14:59:03 UTC 2012

Take some random MP3 file (literally: about six seconds of noise from /dev/urandom, encoded with LAME).

$ dd if=/dev/urandom bs=1k count=1024 | lame -r -s 44.1 -m s - test.mp3
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.18162 s, 5.8 MB/s
Assuming raw pcm input file
LAME 3.99.5 64bits (http://lame.sf.net)
Using polyphase lowpass filter, transition band: 16538 Hz - 17071 Hz
Encoding <stdin> to test.mp3
Encoding as 44.1 kHz stereo MPEG-1 Layer III (11x) 128 kbps qval=3

Append eight fake ID3v1 tags to it, numbered 8 down to 1, so that tag number 1 is the last thing in the file:

$ perl -e 'printf "TAGThis is tag number %-105d\xFF", $_ for reverse 1..8' >> test.mp3
$ id3v2 -l test.mp3 
id3v1 tag info for test.mp3:
Title  : This is tag number 1            Artist:                               
Album  :                                 Year:     , Genre: Unknown (255)
Comment:                               
test.mp3: No ID3v2 tag
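
Why that one-liner produces valid tags: an ID3v1 tag is a fixed 128-byte block at the end of the file, laid out as "TAG" (3 bytes), title (30), artist (30), album (30), year (4), comment (30) and genre (1). The Perl format string writes 3 + 19 + 105 + 1 = 128 bytes per record, so the text lands in the title field, the padding spaces fill the remaining fields, and 0xFF becomes the genre byte. A minimal sketch to check the arithmetic; the helper name id3v1_record is made up for illustration:

```shell
# Hypothetical helper: emit one 128-byte record shaped like the ones
# written by the Perl one-liner above.
id3v1_record() {
    # "TAG" (3) + "This is tag number " (19) + number padded to 105 + 0xFF (1)
    printf 'TAGThis is tag number %-105d\377' "$1"
}

# Size of a single record, in bytes.
id3v1_record 1 | wc -c

# Size of eight stacked records, written 8 down to 1 as above.
for n in 8 7 6 5 4 3 2 1; do id3v1_record "$n"; done | wc -c
```

The two counts should come out as 128 and 1024. Only the last 128 bytes of a file are in the position where an ID3v1 reader looks, which is why id3v2 reports tag number 1 and nothing else.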

Remux it with ffmpeg, telling ffmpeg to drop all metadata with -map_metadata -1 (see the manpage):

$ ffmpeg -i test.mp3 -map_metadata -1 -c:a copy test-2.mp3 
ffmpeg version 0.11.1 Copyright (c) 2000-2012 the FFmpeg developers
  built on Jun  9 2012 13:50:13 with gcc 4.7.0 20120505 (prerelease)
  configuration: --prefix=/usr --enable-libmp3lame --enable-libvorbis --enable-libxvid --enable-libx264 --enable-libvpx --enable-libtheora --enable-libgsm --enable-libspeex --enable-postproc --enable-shared --enable-x11grab --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libschroedinger --enable-libopenjpeg --enable-librtmp --enable-libpulse --enable-libv4l2 --enable-gpl --enable-version3 --enable-runtime-cpudetect --disable-debug --disable-static
  libavutil      51. 54.100 / 51. 54.100
  libavcodec     54. 23.100 / 54. 23.100
  libavformat    54.  6.100 / 54.  6.100
  libavdevice    54.  0.100 / 54.  0.100
  libavfilter     2. 77.100 /  2. 77.100
  libswscale      2.  1.100 /  2.  1.100
  libswresample   0. 15.100 /  0. 15.100
  libpostproc    52.  0.100 / 52.  0.100
[mp3 @ 0x237a100] max_analyze_duration 5000000 reached at 5015510
Input #0, mp3, from 'test.mp3':
  Metadata:
    title           : This is tag number 1          
    artist          :                               
    album           :                               
    date            :     
    comment         :                               
  Duration: 00:00:05.98, start: 0.000000, bitrate: 129 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16, 128 kb/s
Output #0, mp3, to 'test-2.mp3':
  Metadata:
    TSSE            : Lavf54.6.100
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, 128 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=      95kB time=00:00:06.00 bitrate= 129.2kbits/s    
video:0kB audio:94kB global headers:0kB muxing overhead 0.466835%

The metadata is in the input but not in the output. Looks like it worked, right? To make absolutely sure, let's do it again:

$ ffmpeg -i test-2.mp3 -map_metadata -1 -c:a copy test-3.mp3 
ffmpeg version 0.11.1 Copyright (c) 2000-2012 the FFmpeg developers
  built on Jun  9 2012 13:50:13 with gcc 4.7.0 20120505 (prerelease)
  configuration: --prefix=/usr --enable-libmp3lame --enable-libvorbis --enable-libxvid --enable-libx264 --enable-libvpx --enable-libtheora --enable-libgsm --enable-libspeex --enable-postproc --enable-shared --enable-x11grab --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libschroedinger --enable-libopenjpeg --enable-librtmp --enable-libpulse --enable-libv4l2 --enable-gpl --enable-version3 --enable-runtime-cpudetect --disable-debug --disable-static
  libavutil      51. 54.100 / 51. 54.100
  libavcodec     54. 23.100 / 54. 23.100
  libavformat    54.  6.100 / 54.  6.100
  libavdevice    54.  0.100 / 54.  0.100
  libavfilter     2. 77.100 /  2. 77.100
  libswscale      2.  1.100 /  2.  1.100
  libswresample   0. 15.100 /  0. 15.100
  libpostproc    52.  0.100 / 52.  0.100
[mp3 @ 0x259c100] max_analyze_duration 5000000 reached at 5015510
Input #0, mp3, from 'test-2.mp3':
  Metadata:
    encoder         : Lavf54.6.100
  Duration: 00:00:06.00, start: 0.000000, bitrate: 129 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16, 128 kb/s
Output #0, mp3, to 'test-3.mp3':
  Metadata:
    TSSE            : Lavf54.6.100
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, 128 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=      95kB time=00:00:06.00 bitrate= 129.1kbits/s    
video:0kB audio:94kB global headers:0kB muxing overhead 0.467454%

Indeed, ffmpeg no longer displays my Title tag. The output of this second pass should be twice as clean, right? Let's check with ffprobe:

$ ffprobe test-3.mp3 
ffprobe version 0.11.1 Copyright (c) 2007-2012 the FFmpeg developers
  built on Jun  9 2012 13:50:13 with gcc 4.7.0 20120505 (prerelease)
  configuration: --prefix=/usr --enable-libmp3lame --enable-libvorbis --enable-libxvid --enable-libx264 --enable-libvpx --enable-libtheora --enable-libgsm --enable-libspeex --enable-postproc --enable-shared --enable-x11grab --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libschroedinger --enable-libopenjpeg --enable-librtmp --enable-libpulse --enable-libv4l2 --enable-gpl --enable-version3 --enable-runtime-cpudetect --disable-debug --disable-static
  libavutil      51. 54.100 / 51. 54.100
  libavcodec     54. 23.100 / 54. 23.100
  libavformat    54.  6.100 / 54.  6.100
  libavdevice    54.  0.100 / 54.  0.100
  libavfilter     2. 77.100 /  2. 77.100
  libswscale      2.  1.100 /  2.  1.100
  libswresample   0. 15.100 /  0. 15.100
  libpostproc    52.  0.100 / 52.  0.100
[mp3 @ 0x12df240] max_analyze_duration 5000000 reached at 5015510
Input #0, mp3, from 'test-3.mp3':
  Metadata:
    encoder         : Lavf54.6.100
  Duration: 00:00:06.00, start: 0.000000, bitrate: 129 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16, 128 kb/s

Clean indeed.

$ id3v2 -l test-3.mp3 
id3v1 tag info for test-3.mp3:
Title  : This is tag number 3            Artist:                               
Album  :                                 Year:     , Genre: Unknown (255)
Comment:                               
test-3.mp3: No ID3v2 tag

Trust no one.
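
One plausible reading of the result, which nothing in the ffmpeg output confirms: each pass recognised only the final 128-byte block as an ID3v1 tag and dropped it, while the earlier "TAG" blocks travelled along with the copied audio data, so every remux simply exposed the next tag down. Under that assumption, a rough way to check a file yourself is to walk backwards from the end in 128-byte steps for as long as each block starts with "TAG". The sketch below does that with dd, od and head; the function names are hypothetical, it knows nothing about ID3v2, APE or Lyrics3 tags, and it can be fooled by audio data that happens to contain "TAG" at the right offset.

```shell
# Hypothetical helpers, not part of ffmpeg or id3v2.

# Print how many 128-byte blocks starting with "TAG" are stacked at the
# end of the file given as $1.
count_trailing_id3v1() {
    file=$1
    size=$(wc -c < "$file" | tr -d ' ')
    count=0
    while [ "$size" -ge 128 ]; do
        # Read the 3 bytes at offset size-128 and hex-dump them.
        marker=$(dd if="$file" bs=1 skip="$((size - 128))" count=3 2>/dev/null \
                 | od -An -tx1 | tr -d ' \n')
        # 54 41 47 is "TAG" in ASCII.
        [ "$marker" = "544147" ] || break
        count=$((count + 1))
        size=$((size - 128))
    done
    echo "$count"
}

# Copy $1 to $2 without any of the trailing blocks counted above.
strip_trailing_id3v1() {
    in=$1
    out=$2
    n=$(count_trailing_id3v1 "$in")
    size=$(wc -c < "$in" | tr -d ' ')
    head -c "$((size - n * 128))" "$in" > "$out"
}
```

With the functions loaded into the shell, usage might look like this (test-clean.mp3 is just an example name):

$ count_trailing_id3v1 test-3.mp3
$ strip_trailing_id3v1 test-3.mp3 test-clean.mp3
$ id3v2 -l test-clean.mp3

Counting before stripping is deliberate: a count above one is exactly the situation that a single "remove metadata" pass can miss.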


Posted by OpBaI | File under: fileformats