wavlevel-0.2-4mdk.i586.rpm

WAVLEVEL is a tool to adapt the volume of a set of WAV
files (maybe from different CDs) so the listener hears no
annoying change in loudness from song to song. This is
useful when you compile a sampler with many different
artists.

I.) THEORY

There are 4 different strategies:
 
 1. Use maximum dynamic range for each song (NORMalize)
 (call wavlevel -mn)
 Every song is scaled so that its data
 stretches from -32768 to +32767; thus the
 maximum loudness of each song is (roughly) the same.
 This method produces good results if the ratio
 (peak_volume/average_volume) is roughly the same in all
 songs, but gives unsatisfying results if you
 combine a song that has only one loud part with one that
 is uniformly loud.
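
 A minimal Python sketch of the idea behind this method, not
 wavlevel's actual code. The names peak_normalize and mean_abs
 are made up for illustration, samples are assumed to be signed
 16-bit integers, and the offset correction that wavlevel prints
 in its output is left out. The two toy "songs" show the weakness
 described above: after normalizing, both reach full scale, but
 their average levels are far apart.

```python
FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample


def peak_normalize(samples):
    """Scale samples so the loudest one reaches full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    factor = FULL_SCALE / peak
    return [round(s * factor) for s in samples]


def mean_abs(samples):
    """Crude stand-in for 'average volume': mean absolute sample value."""
    return sum(abs(s) for s in samples) / len(samples)


spiky = [1000] * 99 + [20000]   # quiet song with one loud moment
steady = [15000] * 100          # uniformly loud song

for name, song in (("spiky", spiky), ("steady", steady)):
    scaled = peak_normalize(song)
    print(name, "peak:", max(scaled), "mean level:", mean_abs(scaled))
```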

 2. Use average peak-to-peak as a norm value (AMPLitude)
 (call wavlevel -ma)
 Not only the loudest part is considered, but an average
 of all peak-to-peak values of the song is calculated.
 The song with maximum loudness will get the full
 available range, the other songs are scaled so that
 their average peak-to-peak values match.
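
 The averaging step could look roughly like the Python sketch
 below. How wavlevel delimits a "cycle" is not stated here, so
 the sketch ASSUMES a cycle runs from one upward zero crossing
 to the next and takes max - min inside it as the peak-to-peak
 value. All function names are hypothetical, and samples is
 assumed to be a non-empty list.

```python
def peak_to_peak_values(samples):
    """Return max - min for each cycle (assumed: cycles start at upward zero crossings)."""
    values = []
    lo = hi = samples[0]
    for prev, cur in zip(samples, samples[1:]):
        if prev < 0 <= cur:           # upward zero crossing: close the cycle
            values.append(hi - lo)
            lo = hi = cur
        else:
            lo = min(lo, cur)
            hi = max(hi, cur)
    values.append(hi - lo)            # close the last (possibly partial) cycle
    return values


def average_pp(samples):
    """Average peak-to-peak value over all cycles of a song."""
    pps = peak_to_peak_values(samples)
    return sum(pps) / len(pps)


def gain_for(samples, normvalue):
    """Factor that brings a song's average peak-to-peak to normvalue."""
    return normvalue / average_pp(samples)


song = [0, 100, 0, -100, 0, 50, 0, -50]
print(peak_to_peak_values(song), average_pp(song), gain_for(song, 300))
```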

 3. Use logarithmic peak-to-peak as a norm value
 (LogAMPLitude) (call wavlevel -ml)
 This works just like the peak-to-peak method, but
 the average is taken over log10(amplitude).
 This takes into account that the human ear has a
 logarithmic characteristic. Theoretically. ;-)
 In practice the attenuation/amplification effect is too weak.
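
 One way to see how weak the effect can get, using the
 avr. pp and avr. lpp figures printed in the sample output in
 this document. The sketch ASSUMES the scale factor is the plain
 ratio of the norm values; the text does not say how wavlevel
 derives the factor in -ml mode, so treat this as an
 illustration only.

```python
# Values copied from the sample output (loudest and quietest song by avr. pp).
sinatra = {"avr_pp": 8630, "avr_lpp": 3.637568}
specials = {"avr_pp": 4011, "avr_lpp": 3.392512}

# Spread between the two songs under each measure.
pp_ratio = sinatra["avr_pp"] / specials["avr_pp"]
lpp_ratio = sinatra["avr_lpp"] / specials["avr_lpp"]

print(f"avr. pp  ratio: {pp_ratio:.2f}")
print(f"avr. lpp ratio: {lpp_ratio:.2f}")
```

 The logarithm compresses a spread of more than a factor of two
 into a few percent, so a correction based on that ratio barely
 changes the level.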

 4. Use average energy in wave as a norm value (POWER)
 (call wavlevel -mp)
 The concept is the same as before, but every song will 
 have the same average energy per wave.
 I approximate the energy as frequency*amplitude.
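
 A small Python sketch of the frequency*amplitude approximation.
 It ASSUMES the frequency of a cycle is sample_rate divided by
 the cycle length in samples and that amplitude means the
 peak-to-peak value. The units wavlevel uses for its "en"
 figures are not documented here, so absolute numbers will not
 match the tool's output. Function names are hypothetical.

```python
def cycle_energies(cycles, sample_rate=44100):
    """cycles: (peak_to_peak, length_in_samples) pairs -> pp * f per cycle."""
    return [pp * (sample_rate / length) for pp, length in cycles]


def average_energy(cycles, sample_rate=44100):
    """Mean of the per-cycle energy approximations."""
    energies = cycle_energies(cycles, sample_rate)
    return sum(energies) / len(energies)


# Same amplitude, different pitch: a 100 Hz cycle and a 1 Hz cycle.
cycles = [(1000, 441), (1000, 44100)]
print(cycle_energies(cycles))
print(average_energy(cycles))
```

 With this measure a high-pitched cycle counts as more energetic
 than a low-pitched one of the same amplitude.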


The method that has proven to deliver the best results is the
amplitude average (-ma). Here is the output from

wavlevel -c -v -ma *.wav :

processing bob_marley.jamming.wav
  analyzing volume statistics...
    min. sample=-29691, max. sample=30579
    avr. pp=5231, max. pp=43300 in 214404 cycles
    avr. lpp=3.453030, max. lpp=4.636488
    avr. en=343.923632, max. en=5587.500000
processing frank_sinatra.new_york_new_york.wav
  analyzing volume statistics...
    min. sample=-32669, max. sample=32676
    avr. pp=8630, max. pp=59473 in 316207 cycles
    avr. lpp=3.637568, max. lpp=4.774320
    avr. en=465.781139, max. en=9445.000000
processing gary_moore.friday_on_my_mind.wav
  analyzing volume statistics...
    min. sample=-27956, max. sample=25091
    avr. pp=7135, max. pp=39081 in 574770 cycles
    avr. lpp=3.686575, max. lpp=4.591966
    avr. en=479.042950, max. en=3672.625000
processing pink_floyd.high_hopes.wav
  analyzing volume statistics...
    min. sample=-32768, max. sample=31910
    avr. pp=5769, max. pp=54135 in 636927 cycles
    avr. lpp=3.272179, max. lpp=4.733478
    avr. en=255.377495, max. en=6459.000000
processing the_police.msg_in_a_bottle.wav
  analyzing volume statistics...
    min. sample=-32768, max. sample=32767
    avr. pp=8027, max. pp=61773 in 495523 cycles
    avr. lpp=3.728423, max. lpp=4.790799
    avr. en=853.328151, max. en=13311.000000
processing the_specials.monkey_man.wav
  analyzing volume statistics...
    min. sample=-24045, max. sample=20398
    avr. pp=4011, max. pp=31682 in 299900 cycles
    avr. lpp=3.392512, max. lpp=4.500813
    avr. en=314.747845, max. en=8877.000000

global maximum dynamics ratio is 11.522
at normvalue 5231.000 and value range span = 60270.000.
global normvalue calculated to 5687.704.

processing bob_marley.jamming.wav
    offset is -444 
    amplification is 1.087307
    raising avr energy 343.924 -> 373.951, avr pp. 5231 -> 5687.704
processing frank_sinatra.new_york_new_york.wav
    offset is -3 
    amplification is 0.659062
    raising avr energy 465.781 -> 306.979, avr pp. 8630 -> 5687.704
processing gary_moore.friday_on_my_mind.wav
    offset is 1432 
    amplification is 0.797155
    raising avr energy 479.043 -> 381.872, avr pp. 7135 -> 5687.704
processing pink_floyd.high_hopes.wav
    offset is 429 
    amplification is 0.985908
    raising avr energy 255.377 -> 251.779, avr pp. 5769 -> 5687.704
processing the_police.msg_in_a_bottle.wav
    offset is 0 
    amplification is 0.708572
    raising avr energy 853.328 -> 604.644, avr pp. 8027 -> 5687.704
processing the_specials.monkey_man.wav
    offset is 1823 
    amplification is 1.418026
    raising avr energy 314.748 -> 446.321, avr pp. 4011 -> 5687.704


This is how to read the output:
 min. sample : minimum value found in data
 max. sample : maximum value found in data
 avr. pp     : average peak to peak value of song
 max. pp     : maximum peak to peak value of song
 avr. lpp    : average log10(pp) of song
 max. lpp    : maximum log10(pp) of song
 avr. en     : average wave energy of song (pp*f)
 max. en     : maximum wave energy of song

In this run, the loudest song (Frank Sinatra) was attenuated
by a factor of 0.659 and the quietest song (The Specials) was
amplified by a factor of 1.418.
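
The printed figures can be cross-checked with a little
arithmetic. The relations below are inferred from the numbers
in the output above, not taken from wavlevel's source: the
offset matches the negated midpoint of min. and max. sample
(truncated toward zero), the dynamics ratio matches
(max - min) / avr. pp, and the amplification matches
global normvalue / avr. pp. Function names are hypothetical.

```python
# (min. sample, max. sample, avr. pp) copied from the output above
songs = {
    "bob_marley": (-29691, 30579, 5231),
    "frank_sinatra": (-32669, 32676, 8630),
    "gary_moore": (-27956, 25091, 7135),
    "pink_floyd": (-32768, 31910, 5769),
    "the_police": (-32768, 32767, 8027),
    "the_specials": (-24045, 20398, 4011),
}
GLOBAL_NORMVALUE = 5687.704  # "global normvalue calculated to ..."


def offset(mn, mx):
    """Shift that centers the waveform around zero (inferred relation)."""
    return int(-(mn + mx) / 2)


def dynamics_ratio(mn, mx, avr_pp):
    """Value range span divided by average peak-to-peak (inferred relation)."""
    return (mx - mn) / avr_pp


def amplification(avr_pp, normvalue=GLOBAL_NORMVALUE):
    """Gain that brings avr. pp to the global normvalue (inferred relation)."""
    return normvalue / avr_pp


for name, (mn, mx, pp) in songs.items():
    print(f"{name}: offset {offset(mn, mx)}, "
          f"amplification {amplification(pp):.6f}")

worst = max(dynamics_ratio(*s) for s in songs.values())
print(f"global maximum dynamics ratio {worst:.3f}")
```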

II.) EXAMPLES

 In practice, when you have a directory full of WAVs you
 want to compile to a CD, just call

 wavlevel -d -v -ma *.wav

 It pays to have the -v option (verbosity on) there
 because then you can write down the normvalue used.
 (In the above output it would be 5231.000.)
 This is useful when you want to add another file to the
 collection. You needn't process all the others again (this
 wouldn't be optimal for the sound quality anyway) but
 simply run:

 wavlevel -d -v -ma -n 5231.000 my_new_song.wav

 And voilà, the new song sounds just as loud as the others!
 (Of course, you must not change norming methods between
 the two runs. Normvalues are different for each method.)