A place to cache linked articles (think custom and personal wayback machine)
  1. title: 24/192 Music Downloads
  2. url: https://www.xiph.org/~xiphmont/demo/neil-young.html
  3. hash_url: 383aee31d355fe1d52369314e49ffd40
  4. <div id="toc_intro">
  5. <img src="https://www.xiph.org/~xiphmont/demo/players2-small.jpg"/>
  6. <p class="aside">
  7. Also see Xiph.Org's new
  8. video, <a href="https://video.xiph.org/vid2.shtml">Digital Show
  9. &amp; Tell</a>, for detailed demonstrations of digital sampling
  10. in action on real equipment!
  11. </p>
  12. <p>Articles last month revealed that musician Neil
  13. Young and Apple's Steve Jobs discussed offering
  14. digital music downloads of 'uncompromised studio quality'.
  15. Much of the press and user commentary was particularly
  16. enthusiastic about the prospect of uncompressed 24 bit 192kHz
  17. downloads. 24/192 featured prominently in my own
  18. conversations with Mr. Young's group several months ago.</p>
  19. <p>Unfortunately, there is no point to distributing music in
  20. 24-bit/192kHz format. Its playback fidelity is slightly
  21. inferior to 16/44.1 or 16/48, and it takes up 6 times the
  22. space.</p>
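<p>The size figure is easy to check. The following is a minimal sketch, assuming uncompressed stereo linear PCM; the helper name <code>pcm_bytes_per_second</code> is illustrative and not from any library.</p>

```python
# Data rate of uncompressed PCM; stereo is an assumption made for this example.
def pcm_bytes_per_second(bits, rate_hz, channels=2):
    return bits // 8 * rate_hz * channels

hi_res = pcm_bytes_per_second(24, 192_000)
cd = pcm_bytes_per_second(16, 44_100)
dat = pcm_bytes_per_second(16, 48_000)

ratio_vs_cd = hi_res / cd     # roughly 6.5
ratio_vs_48k = hi_res / dat   # 6.0
print(f"24/192 vs 16/44.1: {ratio_vs_cd:.2f}x")
print(f"24/192 vs 16/48:   {ratio_vs_48k:.2f}x")
```

<p>The channel count cancels out of the ratio, so the result is the same for mono or surround material.</p>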
  23. <p>There are a few real problems with the audio quality and
  24. 'experience' of digitally distributed music today. 24/192
  25. solves none of them. While everyone fixates on 24/192 as a
  26. magic bullet, we're not going to see any actual
  27. improvement.</p>
  28. </div>
  29. <div id="toc_ftbn">
  30. <h2>First, the bad news</h2>
  31. <p>In the past few weeks, I've had conversations with
  32. intelligent, scientifically minded individuals who believe
  33. in 24/192 downloads and want to know how anyone could
  34. possibly disagree. They asked good questions that deserve
  35. detailed answers.</p>
  36. <p>I was also interested in what motivated high-rate digital
  37. audio advocacy. Responses indicate that few people
  38. understand basic signal theory or the
  39. <a href="http://en.wikipedia.org/wiki/Sampling_theorem">sampling
  40. theorem</a>, which is hardly surprising. Misunderstandings
  41. of the mathematics, technology, and physiology arose in most
  42. of the conversations, often asserted by professionals who
  43. otherwise possessed significant audio expertise. Some even
  44. argued that the sampling theorem doesn't really explain how
  45. digital audio actually works [<a href="#foot1">1</a>].</p>
  46. <p>Misinformation and superstition only serve charlatans. So,
  47. let's cover some of the basics of why 24/192 distribution
  48. makes no sense before suggesting some improvements that
  49. actually do.</p>
  50. </div>
  51. <div id="toc_gmye">
  52. <h3>Gentlemen, meet your ears</h3>
  53. <p>The ear hears via hair cells that sit on the resonant
  54. basilar membrane in the cochlea. Each hair cell is
  55. effectively tuned to a narrow frequency band determined by
  56. its position on the membrane. Sensitivity peaks in the
  57. middle of the band and falls off to either side in a
  58. lopsided cone shape overlapping the bands of other nearby
  59. hair cells. A sound is inaudible if there are no hair cells
  60. tuned to hear it.</p>
  61. <img src="https://www.xiph.org/~xiphmont/demo/cochlea-and-responses.png"/>
  62. <div class="caption">
  63. <p>Above left: anatomical cutaway drawing of a human cochlea with the
  64. basilar membrane colored in beige. The membrane is
  65. tuned to resonate at different frequencies along its length,
  66. with higher frequencies near the base and lower frequencies
  67. at the apex. Approximate locations of several frequencies
  68. are marked.</p>
  69. <p>Above right: schematic diagram representing hair cell response
  70. along the basilar membrane as a bank of overlapping filters.</p>
  71. </div>
  72. <p>This is similar to an analog radio that picks up the
  73. frequency of a strong station near where the tuner is
  74. actually set. The farther off the station's frequency is,
  75. the weaker and more distorted it gets until it disappears
  76. completely, no matter how strong. There is an upper (and
  77. lower) audible frequency limit, past which the sensitivity
  78. of the last hair cells drops to zero, and hearing ends.</p>
  79. </div>
  80. <div id="toc_sratas">
  81. <h3>Sampling rate and the audible spectrum</h3>
  82. <p>I'm sure you've heard this many, many times: The human
  83. hearing range spans 20Hz to 20kHz. It's important to know
  84. how researchers arrive at those specific numbers.</p>
  85. <p>First, we measure the 'absolute threshold of hearing'
  86. across the entire audio range for a group of listeners.
  87. This gives us a curve representing the very quietest sound
  88. the human ear can perceive for any given frequency as
  89. measured in ideal circumstances on healthy ears. Anechoic
  90. surroundings, precision calibrated playback equipment, and
  91. rigorous statistical analysis are the easy part. Ears and
  92. auditory concentration both fatigue quickly, so testing must
  93. be done when a listener is fresh. That means lots of breaks
  94. and pauses. Testing takes anywhere from many hours to many
  95. days depending on the methodology.</p>
  96. <p>Then we collect data for the opposite extreme, the
  97. 'threshold of pain'. This is the point where the audio
  98. amplitude is so high that the ear's physical and neural
  99. hardware is not only completely overwhelmed by the input,
  100. but experiences physical pain. Collecting this data is
  101. trickier. You don't want to permanently damage anyone's
  102. hearing in the process.</p>
  103. <img src="https://www.xiph.org/~xiphmont/demo/ath-top.png"/>
  104. <div class="caption">
  105. <p>Above: Approximate equal loudness curves derived from
  106. Fletcher and Munson (1933) plus modern sources for
  107. frequencies &gt; 16kHz. The absolute threshold of hearing and
  108. threshold of pain curves are marked in red. Subsequent
  109. researchers refined these readings, culminating in the Phon
  110. scale and the ISO 226 standard equal loudness curves. Modern
  111. data indicates that the ear is significantly less sensitive
  112. to low frequencies than Fletcher and Munson's results. </p>
  113. </div>
  114. <p>The upper limit of the human audio range is defined to be
  115. where the absolute threshold of hearing curve crosses the
  116. threshold of pain. To even faintly perceive the audio at
  117. that point (or beyond), it must simultaneously be unbearably
  118. loud.</p>
  119. <p>At low frequencies, the cochlea works like a bass reflex
  120. cabinet. The <em>helicotrema</em> is an opening at the apex
  121. of the basilar membrane that acts as a port tuned to
  122. somewhere between 40Hz and 65Hz depending on the
  123. individual. Response rolls off steeply below this
  124. frequency.</p>
  125. <p>Thus, 20Hz - 20kHz is a generous range. It thoroughly
  126. covers the audible spectrum, an assertion backed by nearly a
  127. century of experimental data.</p>
  128. </div>
  129. <div id="toc_ggage">
  130. <h3>Genetic gifts and golden ears</h3>
  131. <p>Based on my correspondence, many people believe in
  132. individuals with extraordinary gifts of hearing. Do such
  133. 'golden ears' really exist?</p>
  134. <p>It depends on what you call a golden ear.</p>
  135. <p>Young, healthy ears hear better than old or damaged ears.
  136. Some people are exceptionally well trained to hear nuances
  137. in sound and music most people don't even know exist. There
  138. was a time in the 1990s when I could identify every major
  139. mp3 encoder by sound (back when they were all pretty bad),
  140. and could demonstrate this reliably in double-blind testing
  141. [<a href="#foot2">2</a>].</p>
  142. <p>When healthy ears combine with highly trained
  143. discrimination abilities, I would call that person a golden
  144. ear. Even so, below-average hearing can also be trained to
  145. notice details that escape untrained listeners. Golden ears
  146. are more about training than hearing beyond the physical
  147. ability of average mortals.</p>
  148. <p>Auditory researchers would love to find, test, and document
  149. individuals with truly exceptional hearing, such as a
  150. greatly extended hearing range. Normal people are nice and
  151. all, but everyone wants to find a genetic freak for a really
  152. juicy paper. We haven't found any such people in the
  153. past 100 years of testing, so they probably don't exist.
  154. Sorry. We'll keep looking.</p>
  155. </div>
  156. <div id="toc_s">
  157. <h3>Spectrophiles</h3>
  158. <p>Perhaps you're skeptical about everything I've just
  159. written; it certainly goes against most marketing material.
  160. Instead, let's consider a hypothetical Wide Spectrum Video
  161. craze that doesn't carry preexisting audiophile baggage.</p>
  162. <img src="https://www.xiph.org/~xiphmont/demo/visspec.png"/>
  163. <div class="caption">
  164. <p>Above: The approximate log scale response of the human
  165. eye's rods and cones, superimposed on the visible
  166. spectrum. These sensory organs respond to light in
  167. overlapping spectral bands, just as the ear's hair cells
  168. are tuned to respond to overlapping bands of sound
  169. frequencies.</p>
  170. </div>
  171. <p>The human eye sees a limited range of frequencies of
  172. light, aka, the visible spectrum. This is directly
  173. analogous to the audible spectrum of sound waves. Like the
  174. ear, the eye has sensory cells (rods and cones) that detect
  175. light in different but overlapping frequency bands.</p>
  176. <p>The visible spectrum extends from about 400THz (deep red)
  177. to 850THz (deep violet) [<a href="#foot3">3</a>].
  178. Perception falls off steeply at the edges. Beyond these
  179. approximate limits, the light power needed for the slightest
  180. perception can fry your retinas. Thus, this is a generous
  181. span even for young, healthy, genetically gifted
  182. individuals, analogous to the generous limits of the audible
  183. spectrum.</p>
  184. <p>In our hypothetical Wide Spectrum Video craze, consider a
  185. fervent group of Spectrophiles who believe these limits
  186. aren't generous enough. They propose that video represent
  187. not only the visible spectrum, but also infrared and
  188. ultraviolet. Continuing the comparison, there's an even
  189. more hardcore [and proud of it!] faction that insists this
  190. expanded range is yet insufficient, and that video feels so
  191. much more natural when it also includes microwaves and some
  192. of the X-ray spectrum. To a Golden Eye, they insist, the
  193. difference is night and day!</p>
  194. <p>Of course this is ludicrous.</p>
  195. <p>No one can see X-rays (or infrared, or ultraviolet, or
  196. microwaves). It doesn't matter how much a person believes
  197. he can. Retinas simply don't have the sensory hardware.</p>
  198. <p>Here's an experiment anyone can do: Go get your Apple IR
  199. remote. The LED emits at 980nm, or about 306THz, in the
  200. near-IR spectrum. This is not far outside of the visible
  201. range. Take the remote into the basement, or the darkest
  202. room in your house, in the middle of the night, with the
  203. lights off. Let your eyes adjust to the blackness.</p>
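<p>The conversion from the LED's wavelength to a frequency can be sketched as follows. This is a minimal example, assuming the vacuum speed of light; the function name <code>wavelength_nm_to_thz</code> is illustrative.</p>

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre

def wavelength_nm_to_thz(wavelength_nm):
    """Convert a vacuum wavelength in nanometres to a frequency in terahertz."""
    frequency_hz = SPEED_OF_LIGHT_M_PER_S / (wavelength_nm * 1e-9)
    return frequency_hz / 1e12

remote_thz = wavelength_nm_to_thz(980)
print(f"980 nm is about {remote_thz:.0f} THz")
```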
  204. <img src="https://www.xiph.org/~xiphmont/demo/apple-ir.jpg"/>
  205. <div class="caption">
  206. <p>Above: Apple IR remote photographed using a digital
  207. camera. Though the emitter is quite bright and the
  208. frequency emitted is not far past the red portion of
  209. the visible spectrum, it's completely invisible to the
  210. eye.</p>
  211. </div>
  212. <p>Can you see the Apple Remote's LED flash when you press a
  213. button [<a href="#foot4">4</a>]? No? Not even the tiniest
  214. amount? Try a few other IR remotes; many use an IR
  215. wavelength a bit closer to the visible band, around
  216. 310-350THz. You won't be able to see them either. The rest
  217. emit right at the edge of visibility from 350-380THz and
  218. may be just barely visible in complete blackness with
  219. dark-adjusted eyes [<a href="#foot5">5</a>]. All would be
  220. blindingly, painfully bright if they were well inside the
  221. visible spectrum.</p>
  222. <p>These near-IR LEDs emit from the visible boundary to at most
  223. 20% beyond the visible frequency limit. 192kHz audio
  224. extends to 400% of the audible limit. Lest I be accused of
  225. comparing apples and oranges, auditory and visual perception
  226. drop off similarly toward the edges.</p>
  227. </div>
  228. <div id="toc_1ch">
  229. <h3>192kHz considered harmful</h3>
  230. <p>192kHz digital music files offer no benefits. They're not
  231. quite neutral either; practical fidelity is slightly worse.
  232. The ultrasonics are a liability during playback. </p>
  233. <p>Neither audio transducers nor power amplifiers are free of
  234. distortion, and distortion tends to increase rapidly at the
  235. lowest and highest frequencies. If the same transducer
  236. reproduces ultrasonics along with audible content, any
  237. nonlinearity will shift some of the ultrasonic content down
  238. into the audible range as an uncontrolled spray of
  239. intermodulation distortion products covering the entire
  240. audible spectrum. Nonlinearity in a power amplifier will
  241. produce the same effect. The effect is very slight, but
  242. listening tests have confirmed that both effects can be
  243. audible.</p>
  244. <img src="https://www.xiph.org/~xiphmont/demo/intermod.png"/>
  245. <div class="caption">
  246. <p>Above: Illustration of distortion products resulting
  247. from intermodulation of a 30kHz and a 33kHz tone in a
  248. theoretical amplifier with a nonvarying total harmonic
  249. distortion (THD) of about .09%. Distortion products
  250. appear throughout the spectrum, including at frequencies
  251. lower than either tone.</p>
  252. <p>Inaudible ultrasonics contribute to intermodulation
  253. distortion in the audible range (light blue area).
  254. Systems not designed to reproduce ultrasonics typically
  255. have much higher levels of distortion above 20kHz, further
  256. contributing to intermodulation. Widening a design's
  257. frequency range to account for ultrasonics requires
  258. compromises that decrease noise and distortion performance
  259. within the audible spectrum. Either way, unnecessary
  260. reproduction of ultrasonic content diminishes
  261. performance.</p>
  262. </div>
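<p>To see where such products land, the sketch below lists the intermodulation frequencies of two ultrasonic tones. It is a simplified model: it only enumerates the frequencies <code>|m*f1 + n*f2|</code> up to a chosen order and says nothing about their levels, which depend on the actual nonlinearity. The function name and the order limit of 6 are assumptions made for illustration.</p>

```python
def intermod_products(f1, f2, max_order=3):
    """Map each intermodulation frequency |m*f1 + n*f2| to its lowest order |m|+|n|.

    Pure harmonics (m == 0 or n == 0) are excluded.
    """
    products = {}
    for m in range(-max_order, max_order + 1):
        for n in range(-max_order, max_order + 1):
            order = abs(m) + abs(n)
            if m == 0 or n == 0 or order > max_order:
                continue
            freq = abs(m * f1 + n * f2)
            if freq not in products or order < products[freq]:
                products[freq] = order
    return products

products = intermod_products(30_000, 33_000, max_order=6)
# Keep only the products that fall inside the nominal 20Hz - 20kHz audible range.
audible = sorted(f for f in products if 20 <= f <= 20_000)
for f in audible:
    print(f"{f} Hz (order {products[f]})")
```

<p>With these two tones, the difference product at 3kHz and its multiples fall squarely in the audible band even though neither source tone does.</p>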
  263. <p>There are a few ways to avoid the extra distortion:</p>
  264. <ol>
  265. <li>
  266. <p>A dedicated ultrasonic-only speaker, amplifier, and
  267. crossover stage to separate and independently reproduce
  268. the ultrasonics you can't hear, just so they don't mess
  269. up the sounds you can.</p>
  270. </li>
  271. <li><p>Amplifiers and transducers designed for wider
  272. frequency reproduction, so ultrasonics don't cause audible
  273. intermodulation. Given equal expense and complexity, this
  274. additional frequency range must come at the cost of some
  275. performance reduction in the audible portion of the
  276. spectrum.</p></li>
  277. <li>
  278. <p>Speakers and amplifiers carefully designed not to reproduce
  279. ultrasonics anyway.</p>
  280. </li>
  281. <li>
  282. <p>Not encoding such a wide frequency range to begin with. You can't
  283. and won't have ultrasonic intermodulation distortion in the audible
  284. band if there's no ultrasonic content.</p>
  285. </li>
  286. </ol>
  287. <p>They all amount to the same thing, but only 4) makes any sense.</p>
  288. <p>If you're curious about the performance of your own system,
  289. the following samples contain a 30kHz and a 33kHz tone in a
  290. 24/96 WAV file, a longer version in a FLAC, some tri-tone
  291. warbles, and a normal song clip shifted up by 24kHz so that
  292. it's entirely in the ultrasonic range from 24kHz to 46kHz:</p>
  293. <p>Assuming your system is actually capable of full 96kHz
  294. playback [<a href="#foot6">6</a>], the above files should be
  295. completely silent with no audible noises, tones, whistles,
  296. clicks, or other sounds. If you hear anything, your system
  297. has a nonlinearity causing audible intermodulation of the
  298. ultrasonics. Be careful when increasing volume; running into
  299. digital or analog clipping, even soft clipping, will suddenly
  300. cause loud intermodulation tones.</p>
  301. <p>In summary, it's not certain that intermodulation from
  302. ultrasonics will be audible on a given system. The added
  303. distortion could be insignificant or it could be noticeable.
  304. Either way, ultrasonic content is never a benefit, and on
  305. plenty of systems it will audibly hurt fidelity. On the
  306. systems it doesn't hurt, the cost and complexity of handling
  307. ultrasonics could have been saved, or spent on improved audible range
  308. performance instead.</p>
  309. </div>
  310. <div id="toc_sfam">
  311. <h3>Sampling fallacies and misconceptions</h3>
  312. <p>Sampling theory is often unintuitive without a signal processing
  313. background. It's not surprising most people, even brilliant PhDs in
  314. other fields, routinely misunderstand it. It's also not
  315. surprising many people don't even realize they have it wrong.</p>
  316. <img src="https://www.xiph.org/~xiphmont/demo/jaggy.png"/>
  317. <div class="caption">
  318. <p>Above: Sampled signals are often depicted as a rough
  319. stairstep (red) that seems a poor approximation of the
  320. original signal. However, the representation is
  321. mathematically exact and the signal recovers the exact
  322. smooth shape of the original (blue) when converted back to
  323. analog.</p>
  324. </div>
  325. <p>The most common misconception is that sampling is
  326. fundamentally rough and lossy. A sampled signal is often
  327. depicted as a jagged, hard-cornered stair-step facsimile of
  328. the original perfectly smooth waveform. If this is how you
  329. envision sampling working, you may believe that the faster
  330. the sampling rate (and more bits per sample), the finer the
  331. stair-step and the closer the approximation will be. The
  332. digital signal would sound closer and closer to the original
  333. analog signal as sampling rate approaches infinity.</p>
  334. <p>Similarly, many non-DSP people would look at the following:</p>
  335. <img src="https://www.xiph.org/~xiphmont/demo/jaggy2.png"/>
  336. <p>And say, "Ugh!" It might appear that a sampled
  337. signal represents higher frequency analog waveforms
  338. badly. Or, that as audio frequency increases, the sampled
  339. quality falls and frequency response falls off, or becomes
  340. sensitive to input phase.</p>
  341. <p>Looks are deceiving. These beliefs are incorrect!</p>
  342. <p class="aside">
  343. <span class="caption">added 2013-04-04:</span><br/> As a
  344. followup to all the mail I got about digital waveforms and
  345. stairsteps, I demonstrate actual digital behavior on real
  346. equipment in our
  347. video <a href="https://video.xiph.org/vid2.shtml">Digital
  348. Show &amp; Tell</a> so you need not simply take me at my
  349. word here!
  350. </p>
  351. <p>All signals with content entirely below the Nyquist
  352. frequency (half the sampling rate) are captured perfectly
  353. and completely by sampling; an infinite sampling rate is not
  354. required. Sampling doesn't affect frequency response or
  355. phase. The analog signal can be reconstructed losslessly,
  356. smoothly, and with the exact timing of the original analog
  357. signal.</p>
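<p>A small numerical experiment illustrates the claim. The sketch below samples a 20kHz tone at 48kHz, then evaluates the band-limited interpolant <em>between</em> the sample points and compares it with the original waveform. It is an idealization, not a model of a real DAC: it assumes a periodic signal that fits a whole number of cycles into the block, which lets a plain DFT stand in for the reconstruction filter. All names and constants are illustrative.</p>

```python
import cmath
import math

FS = 48_000        # sampling rate in Hz
N = 48             # one millisecond of samples
TONE_HZ = 20_000   # below the 24kHz Nyquist frequency; 20 whole cycles fit in N samples
PHASE = 0.7        # arbitrary starting phase in radians

def original(t_seconds):
    return math.sin(2 * math.pi * TONE_HZ * t_seconds + PHASE)

samples = [original(k / FS) for k in range(N)]

def dft(x):
    """Normalized discrete Fourier transform (direct O(n^2) form)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n)) / n
            for m in range(n)]

def reconstruct(coeffs, position):
    """Evaluate the band-limited interpolant at a (possibly fractional) sample position."""
    n = len(coeffs)
    total = 0j
    for m, c in enumerate(coeffs):
        signed = m if m < n / 2 else m - n   # upper bins represent negative frequencies
        total += c * cmath.exp(2j * math.pi * signed * position / n)
    return total.real

coeffs = dft(samples)
positions = [k + 0.37 for k in range(N)]   # points that fall between the samples
max_error = max(abs(reconstruct(coeffs, p) - original(p / FS)) for p in positions)
print(f"largest reconstruction error between samples: {max_error:.2e}")
```

<p>The remaining error is floating-point rounding, not a stairstep: the samples pin down the smooth waveform, including its phase.</p>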
  358. <p>So the math is ideal, but what of real world complications?
  359. The most notorious is the band-limiting requirement. Signals
  360. with content over the Nyquist frequency must be lowpassed
  361. before sampling to avoid aliasing distortion; this analog
  362. lowpass is the infamous antialiasing filter. Antialiasing
  363. can't be ideal in practice, but modern techniques bring it
  364. very close. ...and with that we come to oversampling.</p>
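<p>The aliasing that the lowpass prevents can be sketched in a few lines. This example assumes ideal sampling with no antialiasing filter at all; <code>alias_frequency</code> is an illustrative helper, not a library function.</p>

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Apparent frequency of a tone at f_hz after sampling at fs_hz with no lowpass."""
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))

FS = 48_000
aliased = alias_frequency(30_000, FS)     # above Nyquist: folds back into the band
untouched = alias_frequency(20_000, FS)   # below Nyquist: unchanged

# The sampled values of the 30kHz tone match those of its alias, sample for sample.
worst_mismatch = max(
    abs(math.cos(2 * math.pi * 30_000 * k / FS) - math.cos(2 * math.pi * aliased * k / FS))
    for k in range(480)
)
print(aliased, untouched, f"{worst_mismatch:.1e}")
```

<p>Once sampled, the two tones are indistinguishable, which is why content above the Nyquist frequency has to be removed beforehand.</p>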
  365. </div>
  366. <div id="toc_o">
  367. <h3>Oversampling</h3>
  368. <p>Sampling rates over 48kHz are irrelevant to high fidelity
  369. audio data, but they are internally essential to several
  370. modern digital audio techniques. <em>Oversampling</em> is the
  371. most relevant example [<a href="#foot7">7</a>].</p>
  372. <p>Oversampling is simple and clever. You may recall from my
  373. <a href="http://www.xiph.org/video/vid1.shtml">A Digital
  374. Media Primer for Geeks</a> that high sampling rates
  375. provide a great deal more space between the highest
  376. frequency audio we care about (20kHz) and the Nyquist
  377. frequency (half the sampling rate).
  378. <a href="http://www.xiph.org/video/vid1.shtml?time=678.1">
  379. This allows for simpler, smoother, more reliable analog
  380. anti-aliasing filters, and thus higher fidelity</a>. This
  381. extra space between 20kHz and the Nyquist frequency is
  382. essentially just spectral padding for the analog
  383. filter.</p>
  384. <img src="https://www.xiph.org/~xiphmont/demo/filters.png"/>
  385. <div class="caption">
  386. <p>Above: Whiteboard diagram from <u>A Digital Media
  387. Primer for Geeks</u> illustrating the transition band
  388. width available for a 48kHz ADC/DAC (left) and a 96kHz
  389. ADC/DAC (right).</p>
  390. </div>
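<p>The amount of padding can be put in numbers. The sketch below computes the room available for the filter's transition band between 20kHz and the Nyquist frequency at a few sampling rates; the function name and the choice of rates are illustrative.</p>

```python
import math

AUDIBLE_LIMIT_HZ = 20_000

def transition_band(fs_hz):
    """Return (width in Hz, width in octaves) between 20kHz and the Nyquist frequency."""
    nyquist = fs_hz / 2
    width_hz = nyquist - AUDIBLE_LIMIT_HZ
    octaves = math.log2(nyquist / AUDIBLE_LIMIT_HZ)
    return width_hz, octaves

for fs in (48_000, 96_000, 192_000):
    width_hz, octaves = transition_band(fs)
    print(f"{fs} Hz: {width_hz:.0f} Hz wide, {octaves:.2f} octaves")
```

<p>At 48kHz the filter has about a quarter of an octave to work with; at 96kHz it has more than a full octave, which is what makes a gentle analog filter practical.</p>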
  391. <p>That's only half the story. Because digital filters have
  392. few of the practical limitations of an analog filter, we can
  393. complete the anti-aliasing process with greater efficiency
  394. and precision digitally. The very high rate raw digital
  395. signal passes through a digital anti-aliasing filter, which
  396. has no trouble fitting a transition band into a tight space.
  397. After this further digital anti-aliasing, the extra padding
  398. samples are simply thrown away. Oversampled playback
  399. approximately works in reverse.</p>
  400. <p>This means we can use low rate 44.1kHz or 48kHz audio with
  401. all the fidelity benefits of 192kHz or higher sampling
  402. (smooth frequency response, low aliasing) and none of the
  403. drawbacks (ultrasonics that cause intermodulation
  404. distortion, wasted space). Nearly all of today's
  405. analog-to-digital converters (ADCs) and digital-to-analog
  406. converters (DACs) oversample at very high rates. Few people
  407. realize this is happening because it's completely automatic
  408. and hidden.</p>
  409. <p>ADCs and DACs didn't always transparently
  410. oversample. Thirty years ago, some recording consoles
  411. recorded at high sampling rates using only analog filters,
  412. and production and mastering simply used that high rate
  413. signal. The digital anti-aliasing and decimation steps
  414. (resampling to a lower rate for CDs or DAT) happened in the
  415. final stages of mastering. This may well be one of the
  416. early reasons 96kHz and 192kHz became associated with
  417. professional music production [<a href="#foot8">8</a>].</p>
  418. </div>
  419. <div id="toc_1bv2b">
  420. <h3>16 bit vs 24 bit</h3>
  421. <p>OK, so 192kHz music files make no sense. Covered, done. What about
  422. 16 bit vs. 24 bit audio?</p>
  423. <p>It's true that 16 bit linear PCM audio does not quite cover
  424. the entire theoretical dynamic range of the human ear in
  425. ideal conditions. Also, there are (and always will be)
  426. reasons to use more than 16 bits in recording and
  427. production.</p>
  428. <p>None of that is relevant to playback; here 24 bit audio is
  429. as useless as 192kHz sampling. The good news is that at
  430. least 24 bit depth doesn't harm fidelity. It just doesn't
  431. help, and also wastes space.</p>
  432. </div>
  433. <div id="toc_rye">
  434. <h3>Revisiting your ears</h3>
  435. <p>We've discussed the frequency range of the ear, but what
  436. about the dynamic range from the softest possible sound to
  437. the loudest possible sound?</p>
  438. <p>One way to define absolute dynamic range would be to look
  439. again at the absolute threshold of hearing and threshold of
  440. pain curves. The distance between the highest point on the
  441. threshold of pain curve and the lowest point on the absolute
  442. threshold of hearing curve is about 140 decibels for a
  443. young, healthy listener. That wouldn't last long though;
  444. +130dB is loud enough to damage hearing permanently in
  445. seconds to minutes. For reference purposes, a jackhammer at
  446. one meter is only about 100-110dB.</p>
  447. <p>The absolute threshold of hearing increases with age and
  448. hearing loss. Interestingly, the threshold of pain decreases
  449. with age rather than increasing. The hair cells of the cochlea
  450. themselves possess only a fraction of the ear's 140dB range;
  451. musculature in the ear continuously adjusts the amount of sound
  452. reaching the cochlea by shifting the ossicles, much as the iris
  453. regulates the amount of light entering the eye
  454. [<a href="#foot9">9</a>]. This mechanism stiffens with age,
  455. limiting the ear's dynamic range and reducing the effectiveness
  456. of its protection mechanisms [<a href="#foot10">10</a>].</p>
  457. </div>
  458. <div id="toc_en">
  459. <h3>Environmental noise</h3>
  460. <p>Few people realize how quiet the absolute threshold of
  461. hearing really is.</p>
  462. <p>The very quietest perceptible sound is about -8dBSPL
  463. [<a href="#foot11">11</a>]. Using an A-weighted scale, the
  464. hum from a 100 watt incandescent light bulb one meter away
  465. is about 10dBSPL, so about 18dB louder. The bulb will be
  466. much louder on a dimmer.</p>
  467. <p>20dBSPL (or 28dB louder than the quietest audible sound) is
  468. often quoted for an empty broadcasting/recording studio or
  469. sound isolation room. This is the baseline for an
  470. exceptionally quiet environment, and one reason you've
  471. probably never noticed hearing a light bulb.</p>
  472. </div>
  473. <div id="toc_tdro1b">
  474. <h3>The dynamic range of 16 bits</h3>
  475. <p>16 bit linear PCM has a dynamic range of 96dB according to
  476. the most common definition, which calculates dynamic range
  477. as (6*bits)dB. Many believe that 16 bit audio cannot
  478. represent arbitrary sounds quieter than -96dB. This is
  479. incorrect.</p>
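<p>The (6*bits)dB rule of thumb comes from the ratio between full scale and one quantization step. A minimal sketch, with an illustrative function name:</p>

```python
import math

def pcm_dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits} bit: {pcm_dynamic_range_db(bits):.2f} dB (rule of thumb: {6 * bits} dB)")
```

<p>Each bit contributes about 6.02dB, so the rule of thumb slightly understates the exact figure.</p>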
  480. <p>I have linked to two 16 bit audio files here; one contains
  481. a 1kHz tone at 0dB (where 0dB is the loudest possible tone)
  482. and the other a 1kHz tone at -105dB.</p>
  483. <img src="https://www.xiph.org/~xiphmont/demo/-105dB.png"/>
  484. <div class="caption">
  485. <p>Above: Spectral analysis of a -105dB tone encoded as 16
  486. bit / 48kHz PCM. 16 bit PCM is clearly deeper than 96dB,
  487. else a -105dB tone could not be represented, nor would
  488. it be audible.</p>
  489. </div>
  490. <p>How is it possible to encode this signal, encode it with no
  491. distortion, and encode it well above the noise floor, when
  492. its peak amplitude is one third of a bit?</p>
  493. <p>Part of this puzzle is solved by proper dither, which
  494. renders quantization noise independent of the input signal.
  495. By implication, this means that dithered quantization
  496. introduces no distortion, just uncorrelated noise. That in
  497. turn implies that we can encode signals of arbitrary depth,
  498. even those with peak amplitudes much smaller than one bit
  499. [<a href="#foot12">12</a>]. However, dither doesn't change
  500. the fact that once a signal sinks below the noise floor, it
  501. should effectively disappear. How is the -105dB tone
  502. still clearly audible above a -96dB noise floor?</p>
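<p>The effect of dither on a sub-bit signal can be sketched numerically. The code below is not how the linked files were produced. It assumes plain TPDF dither (the difference of two uniform random values, two steps peak to peak), and it recovers the tone by correlating against a known sine, which stands in for a spectrum analyzer. All names and constants are illustrative.</p>

```python
import math
import random

RATE = 48000        # samples per second
FREQ = 1000         # tone frequency in Hz
N = 96000           # two seconds, a whole number of tone periods
FULL_SCALE = 32768  # 16 bit full scale, measured in quantization steps


def tone_amplitude(level_db):
    """Peak amplitude, in 16 bit steps, of a tone at level_db relative to full scale."""
    return FULL_SCALE * 10 ** (level_db / 20)


def quantize(x, rng=None):
    """Round to the nearest step; if rng is given, add TPDF dither first."""
    dither = (rng.random() - rng.random()) if rng else 0.0
    return math.floor(x + dither + 0.5)


def recovered_amplitude(level_db, rng=None):
    """Quantize the tone, then correlate with the sine to estimate its amplitude."""
    amp = tone_amplitude(level_db)
    w = 2 * math.pi * FREQ / RATE
    acc = 0.0
    for n in range(N):
        s = math.sin(w * n)
        acc += quantize(amp * s, rng) * s
    return 2 * acc / N


plain = recovered_amplitude(-105)                         # no dither
dithered = recovered_amplitude(-105, random.Random(1))    # seeded for repeatability
print(f"true amplitude:   {tone_amplitude(-105):.3f} steps")
print(f"without dither:   {plain:.3f} steps")
print(f"with TPDF dither: {dithered:.3f} steps")
```

<p>Without dither, every sample rounds to zero and the tone is simply gone. With dither, the quantized samples are noisy, but their average still follows the tone, so the estimate lands close to the true amplitude.</p>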
  503. <p>The answer: Our -96dB noise floor figure is effectively
  504. wrong; we're using an inappropriate definition of dynamic
  505. range. (6*bits)dB gives us the RMS noise of the entire
  506. broadband signal, but each hair cell in the ear is sensitive
  507. to only a narrow fraction of the total bandwidth. As each
  508. hair cell hears only a fraction of the total noise floor
  509. energy, the noise floor at that hair cell will be much lower
  510. than the broadband figure of -96dB.</p>
  511. <p>Thus, 16 bit audio can go considerably deeper than 96dB.
  512. With use of shaped dither, which moves quantization noise
  513. energy into frequencies where it's harder to hear, the
  514. effective dynamic range of 16 bit audio reaches 120dB in
  515. practice [<a href="#foot13">13</a>], more than fifteen times
  516. deeper than the 96dB claim.</p>
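<p>A rough calculation shows the size of the narrow-band effect. The sketch assumes flat (unshaped) noise spread evenly from 0 to 24kHz and a listening band 100Hz wide. Both figures are illustrative assumptions. The ear's real bandwidths vary with frequency.</p>

```python
import math


def band_noise_db(broadband_db, band_hz, total_hz):
    """Noise level inside one band when flat noise is spread evenly over total_hz."""
    return broadband_db + 10 * math.log10(band_hz / total_hz)


def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to an amplitude ratio."""
    return 10 ** (db / 20)


# Assumed: -96dB broadband noise over 24kHz, heard through a 100Hz-wide band
print(f"noise in band: {band_noise_db(-96, 100, 24000):.1f}dB")
# The gap between the 120dB and 96dB figures, as an amplitude ratio
print(f"120dB vs 96dB: {db_to_amplitude_ratio(120 - 96):.1f}x")
```

<p>Under these assumptions the in-band noise sits near -120dB before any noise shaping is applied, and the 24dB gap between the two figures is an amplitude ratio of a little under sixteen.</p>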
  517. <p>120dB is greater than the difference between a mosquito
  518. somewhere in the same room and a jackhammer a foot
  519. away.... or the difference between a deserted 'soundproof'
  520. room and a sound loud enough to cause hearing damage in
  521. seconds.</p>
  522. <p>16 bits is enough to store all we can hear, and will
  523. be enough forever.</p>
  524. </div>
  525. <div id="toc_stnr">
  526. <h3>Signal-to-noise ratio</h3>
  527. <p>It's worth mentioning briefly that the ear's S/N ratio is
  528. smaller than its absolute dynamic range. Within a given
  529. critical band, typical S/N is estimated to only be about 30dB.
  530. Relative S/N does not reach the full dynamic range even when
  531. considering widely spaced bands. This ensures that linear
  532. 16 bit PCM offers higher resolution than is actually
  533. required.</p>
  534. <p>It is also worth mentioning that increasing the bit depth
  535. of the audio representation from 16 to 24 bits does not
  536. increase the perceptible resolution or 'fineness' of the
  537. audio. It only increases the dynamic range, the range
  538. between the softest possible and the loudest possible sound,
  539. by lowering the noise floor. However, a 16-bit noise floor is
  540. already below what we can hear.</p>
  541. </div>
  542. <div id="toc_wd2bm">
  543. <h3>When does 24 bit matter?</h3>
  544. <p>Professionals use 24 bit samples in recording and
  545. production [<a href="#foot14">14</a>] for headroom, noise
  546. floor, and convenience reasons.</p>
  547. <p>16 bits is enough to span the real hearing range with room
  548. to spare. It does not span the entire possible signal range
  549. of audio equipment. The primary reason to use 24 bits when
  550. recording is to prevent mistakes; rather than being careful
  551. to center 16 bit recording-- risking clipping if you guess
  552. too high and adding noise if you guess too low-- 24 bits
  553. allows an operator to set an approximate level and not worry
  554. too much about it. Missing the optimal gain setting by a
  555. few bits has no consequences, and effects that dynamically
  556. compress the recorded range have a deep floor to work
  557. with.</p>
  558. <p>An engineer also requires more than 16 bits during mixing
  559. and mastering. Modern workflows may involve literally
  560. thousands of effects and operations. The quantization noise
  561. and noise floor of a 16 bit sample may be undetectable
  562. during playback, but multiplying that noise by a few
  563. thousand times eventually becomes noticeable. 24 bits keeps
  564. the accumulated noise at a very low level. Once the music
  565. is ready to distribute, there's no reason to keep more than
  566. 16 bits.</p>
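<p>The accumulation argument can be put in numbers under a simplifying assumption: every operation requantizes to the working bit depth and adds the same amount of uncorrelated noise. Real mixing chains are messier, so treat this as a sketch. The function name and the count of 1000 operations are illustrative.</p>

```python
import math


def accumulated_noise_db(per_step_db, steps):
    """Uncorrelated noise adds in power, so the floor rises by 10*log10(steps)."""
    return per_step_db + 10 * math.log10(steps)


for bits, floor_db in ((16, -96), (24, -144)):
    total = accumulated_noise_db(floor_db, 1000)
    print(f"{bits} bit, 1000 operations: noise floor near {total:.0f}dB")
```

<p>With these assumptions, a thousand operations at 16 bits raise the floor by 30dB, while the same chain at 24 bits still ends up well below the 16 bit starting point.</p>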
  567. </div>
  568. <div id="toc_lt">
  569. <h3>Listening tests</h3>
  570. <p>Understanding is where theory and reality meet. A matter is
  571. settled only when the two agree.</p>
  572. <p>Empirical evidence from listening tests backs up the
  573. assertion that 44.1kHz/16 bit provides highest-possible
  574. fidelity playback. There are numerous controlled tests
  575. confirming this, but I'll plug a recent paper,
  576. <a href="http://www.aes.org/e-lib/browse.cfm?elib=14195"><u>Audibility
  577. of a CD-Standard A/D/A Loop Inserted into High-Resolution
  578. Audio Playback</u></a>, done by local folks here at the
  579. <a href="http://www.bostonaudiosociety.org/">Boston Audio
  580. Society</a>.</p>
  581. <p>Unfortunately, downloading the full paper requires an AES
  582. membership. However, it's been discussed widely in articles
  583. and on forums, with the authors joining in. Here's a few
  584. links:</p>
  585. <p>This paper presented listeners with a choice between
  586. high-rate DVD-A/SACD content, chosen by high-definition
  587. audio advocates to show off high-def's superiority, and that
  588. same content resampled on the spot down to 16-bit / 44.1kHz
  589. Compact Disc rate. The listeners were challenged to
  590. identify any difference whatsoever between the two using an
  591. ABX methodology. BAS conducted the test using high-end
  592. professional equipment in noise-isolated studio listening
  593. environments with both amateur and trained professional
  594. listeners.</p>
  595. <p>In 554 trials, listeners chose correctly 49.8% of the
  596. time. In other words, they were guessing. Not one listener
  597. throughout the entire test was able to identify which was
  598. 16/44.1 and which was high rate [<a href="#foot15">15</a>],
  599. and the 16-bit signal wasn't even dithered!</p>
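<p>To see why a score like this counts as guessing, one can compute how often pure coin-flipping would do at least as well. The sketch below uses an exact binomial sum. The count of 276 correct answers is an assumption inferred from the rounded 49.8% figure, not a number taken from the paper.</p>

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by guessing."""
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


trials = 554
correct = round(0.498 * trials)  # assumed: reconstructed from the rounded percentage
print(f"{correct}/{trials} correct, p = {abx_p_value(correct, trials):.2f}")
```

<p>A p-value above one half means a coin would match or beat this score more often than not.</p>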
  600. <p>Another recent study [<a href="#foot16">16</a>] investigated
  601. the possibility that ultrasonics were audible, as earlier studies had
  602. suggested. The test was constructed to maximize the possibility of
  603. detection by placing the intermodulation products where they'd be most
  604. audible. It found that the ultrasonic tones were not audible... but
  605. the intermodulation distortion products introduced by the loudspeakers
  606. could be.</p>
  607. <p>This paper inspired a great deal of further research, much
  608. of it with mixed results. Some of the ambiguity is
  609. explained by finding that ultrasonics can induce more
  610. intermodulation distortion than expected in power amplifiers
  611. as well. For
  612. example, <a href="http://www.davidgriesinger.com/intermod.ppt">David
  613. Griesinger reproduced this experiment</a>
  614. [<a href="#foot17">17</a>] and found that his loudspeaker
  615. setup did not introduce audible intermodulation distortion
  616. from ultrasonics, but his stereo amplifier did.</p>
  617. </div>
  618. <div id="toc_cl">
  619. <h3>Caveat Lector</h3>
  620. <p>It's important not to cherry-pick individual papers or
  621. 'expert commentary' out of context or from self-interested
  622. sources. Not all papers agree completely with these results
  623. (and a few disagree in large part), so it's easy to find
  624. minority opinions that appear to vindicate every imaginable
  625. conclusion.
  626. <em>Regardless, the papers and links above are
  627. representative of the vast weight and breadth of the
  628. experimental record.</em> No peer-reviewed paper that has
  629. stood the test of time disagrees substantially with these
  630. results. Controversy exists only within the consumer and
  631. enthusiast audiophile communities.</p>
  632. <p>If anything, the number of ambiguous, inconclusive, and
  633. outright invalid experimental results available through
  634. Google highlights how tricky it is to construct an accurate,
  635. objective test. The differences researchers look for are
  636. minute; they require rigorous statistical analysis to spot
  637. subconscious choices that escape test subjects' awareness.
  638. That we're likely trying to 'prove' something that doesn't
  639. exist makes it even more difficult. Proving a null
  640. hypothesis is akin to proving the halting problem; you
  641. can't. You can only collect evidence that lends overwhelming
  642. weight.</p>
  643. <p>Despite this, papers that confirm the null hypothesis are
  644. especially strong evidence; confirming inaudibility is far
  645. more experimentally difficult than disputing
  646. it. Undiscovered mistakes in test methodologies and
  647. equipment nearly always produce false positive results (by
  648. accidentally introducing audible differences) rather than
  649. false negatives.</p>
  650. <p>If professional researchers have such a hard time properly
  651. testing for minute, isolated audible differences, you can
  652. imagine how hard it is for amateurs.</p>
  653. </div>
  654. <div id="toc_htisualc">
  655. <h3>How to [inadvertently] screw up a listening comparison</h3>
  656. <p>The number one comment I heard from believers in super high
  657. rate audio was [paraphrasing]: <i>"I've listened to high
  658. rate audio myself and the improvement is obvious. Are you
  659. seriously telling me not to trust my own ears?"</i></p>
  660. <p>Of course you can trust your ears. It's brains that are
  661. gullible. I don't mean that flippantly; as human beings,
  662. we're all wired that way.</p>
  663. </div>
  664. <div id="toc_cbtpeadb">
  665. <h3>Confirmation bias, the placebo effect, and double-blind</h3>
  666. <p>In any test where a listener can tell two choices apart via
  667. any means apart from listening, the results will usually be
  668. what the listener expected in advance; this is
  669. called <a href="http://en.wikipedia.org/wiki/Confirmation_bias">
  670. confirmation bias</a> and it's similar to
  671. the <a href="http://en.wikipedia.org/wiki/Placebo_effect">placebo
  672. effect</a>. It means people 'hear' differences because of
  673. subconscious cues and preferences that have nothing to do
  674. with the audio, like preferring a more expensive (or more
  675. attractive) amplifier over a cheaper option.</p>
  676. <p> The human brain is designed to notice patterns and
  677. differences, even where none exist. This tendency can't just
  678. be turned off when a person is asked to make objective
  679. decisions; it's completely subconscious. Nor can a bias be
  680. defeated by mere skepticism. <em>Controlled experimentation
  681. shows that awareness of confirmation bias can increase
  682. rather than decrease the effect!</em> A test that doesn't
  683. carefully eliminate confirmation bias is worthless
  684. [<a href="#foot18">18</a>].</p>
  685. <p>In <em>single-blind</em> testing, a listener knows nothing
  686. in advance about the test choices, and receives no feedback
  687. during the course of the test. Single-blind testing is
  688. better than casual comparison, but it does not eliminate
  689. the <a href="http://en.wikipedia.org/wiki/Experimenter%27s_bias">
  690. experimenter's bias</a>. The test administrator can easily
  691. inadvertently influence the test or transfer his own
  692. subconscious bias to the listener through inadvertent cues
  693. (e.g., "Are you sure that's what you're hearing?", body
  694. language indicating a 'wrong' choice, hesitating
  695. inadvertently, etc). An experimenter's bias has also been
  696. experimentally proven to influence a test subject's
  697. results.</p>
  698. <p><em>Double-blind</em> listening tests are the gold
  699. standard; in these tests neither the test administrator nor
  700. the testee has any knowledge of the test contents or
  701. ongoing results. Computer-run ABX tests are the most famous
  702. example, and there are freely available tools for performing
  703. ABX tests on your own computer [<a href="#foot19">19</a>].
  704. ABX is considered a minimum bar for a listening test to be
  705. meaningful; reputable audio forums such
  706. as <a href="http://www.hydrogenaudio.org/">Hydrogen
  707. Audio</a>
  708. often <a href="http://www.hydrogenaudio.org/forums/index.php?showtopic=3974#entry149481">do
  709. not even allow discussion of listening results unless they
  710. meet this minimum objectivity requirement</a>
  711. [<a href="#foot20">20</a>].</p>
  712. <img src="https://www.xiph.org/~xiphmont/demo/squishyball.png"/>
  713. <div class="caption">
  714. <p>Above: Squishyball, a simple command-line ABX tool, running in an xterm.</p>
  715. </div>
  716. <p>I personally don't do any quality comparison tests during
  717. development, no matter how casual, without an ABX
  718. tool. Science is science, no slacking.</p>
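<p>The core of a computer-run ABX test is small: the program secretly assigns X to A or B on each trial, and only the final tally is reported. The sketch below shows that bookkeeping with simulated listeners. It is not how Squishyball or any other real tool is implemented, and every name in it is hypothetical.</p>

```python
import random


def run_abx(trials, identify, rng):
    """Score a listener over a number of ABX trials.

    identify(x) stands in for the listener: it is handed the hidden
    sample and answers 'A' or 'B'. Nobody sees the assignment itself.
    """
    correct = 0
    for _ in range(trials):
        x = rng.choice("AB")  # hidden assignment made by the computer
        if identify(x) == x:
            correct += 1
    return correct


def guessing_listener(rng):
    """A listener who hears no difference and can only guess."""
    return lambda x: rng.choice("AB")


def perfect_listener():
    """A listener who always identifies X correctly."""
    return lambda x: x


print(run_abx(20, perfect_listener(), random.Random(0)), "of 20 (perfect)")
print(run_abx(20, guessing_listener(random.Random(1)), random.Random(2)), "of 20 (guessing)")
```

<p>Because the assignment comes from the random number generator and is never displayed, neither the listener nor anyone else in the room can leak it.</p>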
  719. </div>
  720. <div id="toc_lt">
  721. <h3>Loudness tricks</h3>
  722. <p>The human ear can consciously discriminate amplitude
  723. differences of about 1dB, and experiments show subconscious
  724. awareness of amplitude differences under .2dB. Humans
  725. almost universally consider louder audio to sound better,
  726. and .2dB is enough to establish this preference. Any
  727. comparison that fails to carefully amplitude-match the
  728. choices will see the louder choice preferred, even if the
  729. amplitude difference is too small to consciously notice.
  730. Stereo salesmen have known this trick for a long time.</p>
  731. <p>The professional testing standard is to match sources to
  732. within .1dB or better. This often requires use of an
  733. oscilloscope or signal analyzer. Guessing by turning the
  734. knobs until two sources sound about the same is not good
  735. enough.</p>
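<p>For purely digital comparisons, level matching can be done numerically instead of with an oscilloscope. The sketch below compares RMS levels of two sample lists and rescales one to match the other. It assumes floating-point samples that are not silent. The function names and test signals are made up for the example.</p>

```python
import math


def rms_db(samples):
    """RMS level of a list of samples, in dB relative to an RMS of 1.0."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)


def level_difference_db(a, b):
    """How much louder a is than b, in dB."""
    return rms_db(a) - rms_db(b)


def match_level(a, b):
    """Return a copy of b scaled so that its RMS level equals that of a."""
    gain = 10 ** (level_difference_db(a, b) / 20)
    return [s * gain for s in b]


# Illustrative signals: a 1kHz tone at 48kHz and a copy that is 0.2dB louder
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(4800)]
louder = [s * 10 ** (0.2 / 20) for s in tone]

print(f"before: {level_difference_db(louder, tone):+.2f}dB")
matched = match_level(tone, louder)
print(f"after:  {level_difference_db(matched, tone):+.2f}dB")
```

<p>RMS is only one way to define level. For program material with very different spectra, a perceptual loudness measure may be a better match target.</p>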
  736. </div>
  737. <div id="toc_c">
  738. <h3>Clipping</h3>
  739. <p>Clipping is another easy mistake, sometimes obvious only in
  740. retrospect. Even a few clipped samples or their aftereffects
  741. are easy to hear compared to an unclipped signal.</p>
  742. <p>The danger of clipping is especially pernicious in tests
  743. that create, resample, or otherwise manipulate digital signals
  744. on the fly. Suppose we want to compare the fidelity of 48kHz
  745. sampling to a 192kHz source sample. A typical way is to
  746. downsample from 192kHz to 48kHz, upsample it back to 192kHz,
  747. and then compare it to the original 192kHz sample in an ABX
  748. test [<a href="#foot21">21</a>]. This arrangement allows us
  749. to eliminate any possibility of equipment variation or sample
  750. switching influencing the results; we can use the same DAC to
  751. play both samples and switch between without any hardware mode
  752. changes.</p>
  753. <p>Unfortunately, most samples are mastered to use the full
  754. digital range. Naive resampling can and often will clip
  755. occasionally. It is necessary to either monitor for clipping
  756. (and discard clipped audio) or avoid clipping via some other
  757. means such as attenuation.</p>
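<p>Both approaches are easy to sketch for floating-point samples where full scale is 1.0. The sample values below are invented to stand in for resampler output, and the 0.1dB safety margin is an arbitrary choice.</p>

```python
import math


def count_clipped(samples, full_scale=1.0):
    """Number of samples whose magnitude exceeds full scale."""
    return sum(1 for s in samples if abs(s) > full_scale)


def attenuation_needed_db(samples, full_scale=1.0):
    """Attenuation in dB that brings the peak down to full scale (0 if none needed)."""
    peak = max(abs(s) for s in samples)
    if peak <= full_scale:
        return 0.0
    return 20 * math.log10(peak / full_scale)


def attenuate(samples, db):
    """Scale samples down by db decibels."""
    gain = 10 ** (-db / 20)
    return [s * gain for s in samples]


# Invented stand-in for resampled audio that overshoots full scale
resampled = [0.2, 0.99, 1.07, -1.02, 0.5]
needed = attenuation_needed_db(resampled)
safe = attenuate(resampled, needed + 0.1)  # 0.1dB margin, chosen arbitrarily

print(f"clipped samples before: {count_clipped(resampled)}")
print(f"attenuation needed: {needed:.2f}dB")
print(f"clipped samples after: {count_clipped(safe)}")
```

<p>If attenuation is used in a comparison, the same attenuation has to be applied to both choices, or the test becomes a loudness comparison instead.</p>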
  758. </div>
  759. <div id="toc_dmdm">
  760. <h3>Different media, different master</h3>
  761. <p>I've run across a few articles and blog posts that declare
  762. the virtues of 24 bit or 96/192kHz by comparing a CD to an
  763. audio DVD (or SACD) of the 'same' recording. This comparison
  764. is invalid; the masters are usually different.</p>
  765. </div>
  766. <div id="toc_ic">
  767. <h3>Inadvertent cues</h3>
  768. <p>Inadvertent audible cues are almost inescapable in older
  769. analog and hybrid digital/analog testing setups. Purely
  770. digital testing setups can completely eliminate the problem in
  771. some forms of testing, but also multiply the potential of
  772. complex software bugs. Such limitations and bugs have a long
  773. history of causing false-positive results in testing
  774. [<a href="#foot22">22</a>].</p>
  775. <p><a href="http://www.bostonaudiosociety.org/bas_speaker/abx_testing2.htm"><u>The
  776. Digital Challenge - More on ABX Testing</u></a> tells a
  777. fascinating story of a specific listening test conducted in
  778. 1984 to rebut audiophile authorities of the time who asserted
  779. that CDs were inherently inferior to vinyl. The article is
  780. not concerned so much with the results of the test (which I
  781. suspect you'll be able to guess), but the processes and
  782. real-world messiness involved in conducting such a test. For
  783. example, an error on the part of the testers inadvertently
  784. revealed that an invited audiophile expert had not been making
  785. choices based on audio fidelity, but rather by listening to
  786. the slightly different clicks produced by the ABX switch's
  787. analog relays!</p>
  788. <p>Anecdotes do not replace data, but this story is
  789. instructive of the ease with which undiscovered flaws can bias
  790. listening tests. Some of the audiophile beliefs discussed
  791. within are also highly entertaining; one hopes that some
  792. modern examples are considered just as silly 20 years from
  793. now.</p>
  794. </div>
  795. <div id="toc_ftgn">
  796. <h2>Finally, the good news</h2>
  797. <p>What actually works to improve the quality of the digital
  798. audio to which we're listening?</p>
  799. </div>
  800. <div id="toc_bh">
  801. <h3>Better headphones</h3>
  802. <p>The easiest fix isn't digital. The most dramatic possible
  803. fidelity improvement for the cost comes from a good pair of
  804. headphones. Over-ear, in ear, open or closed, it doesn't much
  805. matter. They don't even need to be expensive, though expensive
  806. headphones can be worth the money.</p>
  807. <p>Keep in mind that some headphones are expensive because
  808. they're well made, durable and sound great. Others are
  809. expensive because they're $20 headphones under a several
  810. hundred dollar layer of styling, brand name, and marketing. I
  811. won't make specific recommendations here, but I will say you're
  812. not likely to find good headphones in a big box store, even if
  813. it specializes in electronics or music. As in all other
  814. aspects of consumer hi-fi, do your research (and caveat
  815. emptor).</p>
  816. </div>
  817. <div id="toc_lf">
  818. <h3>Lossless formats</h3>
  819. <p>It's true enough that a properly encoded Ogg file (or MP3,
  820. or AAC file) will be indistinguishable from the original at a
  821. moderate bitrate.</p>
  822. <p>But what of badly encoded files?</p>
  823. <p>Twenty years ago, all mp3 encoders were really bad by
  824. today's standards. Plenty of these old, bad encoders are
  825. still in use, presumably because the licenses are cheaper and
  826. most people can't tell or don't care about the difference
  827. anyway. Why would any company spend money to fix what it's
  828. completely unaware is broken?</p>
  829. <p>Moving to a newer format
  830. like <a href="http://www.vorbis.com">Vorbis</a> or AAC doesn't
  831. necessarily help. For example, many companies and individuals
  832. used (and still
  833. use) <a href="http://xiphmont.livejournal.com/51160.html">FFmpeg's
  834. very-low-quality built-in Vorbis encoder</a> because it was
  835. the default in FFmpeg and they were unaware how bad it
  836. was. AAC has an even longer history of widely-deployed,
  837. low-quality encoders; all mainstream lossy formats do.</p>
  838. <p>Lossless formats
  839. like <a href="http://flac.sourceforge.net/">FLAC</a> avoid any
  840. possibility of damaging audio fidelity
  841. [<a href="#foot23">23</a>] with a poor quality lossy encoder,
  842. or even by a good lossy encoder used incorrectly.</p>
  843. <p>A second reason to distribute lossless formats is to avoid
  844. generational loss. Each reencode or transcode loses more
  845. data; even if the first encoding is transparent, it's very
  846. possible the second will have audible artifacts. This matters
  847. to anyone who might want to remix or sample from downloads. It
  848. especially matters to us codec researchers; we need clean
  849. audio to work with. </p>
  850. </div>
  851. <div id="toc_bm">
  852. <h3>Better masters</h3>
  853. <p>The <a href="http://www.aes.org/e-lib/browse.cfm?elib=14195">
  854. BAS test I linked earlier</a> mentions as an aside that the
  855. SACD version of a recording <em>can</em> sound substantially
  856. better than the CD release. It's not because of increased
  857. sample rate or depth but because the SACD used a
  858. higher-quality master. When bounced to a CD-R, the SACD
  859. version still sounds as good as the original SACD and
  860. better than the CD release because the original audio used to
  861. make the SACD was better. Good production and mastering
  862. obviously contribute to the final quality of the music
  863. [<a href="#foot24">24</a>].</p>
  864. <p>The recent coverage of 'Mastered for iTunes' and similar
  865. initiatives from other industry labels is somewhat
  866. encouraging. What remains to be seen is whether or not Apple
  867. and the others actually 'get it' or if this is merely a hook
  868. for selling consumers yet another, more expensive copy of
  869. music they already own.</p>
  870. </div>
  871. <div id="toc_su">
  872. <h3>Surround</h3>
  873. <p>Another possible 'sales hook', one I'd enthusiastically buy
  874. into myself, is surround recordings. Unfortunately, there's
  875. some technical peril here.</p>
  876. <p>Old-style discrete surround with many channels (5.1, 7.1,
  877. etc) is a technical relic dating back to the theaters of the
  878. 1960s. It is inefficient, using more channels than competing
  879. systems. The surround image is limited, and tends to collapse
  880. toward the nearer speakers when a listener sits or shifts out of
  881. position.</p>
  882. <p>We can represent and encode excellent and robust
  883. localization with systems like Ambisonics. The problems are
  884. the cost of equipment for reproduction and the fact that
  885. something encoded for a natural soundfield both sounds bad
  886. when mixed down to stereo, and can't be created artificially
  887. in a convincing way. It's hard to fake ambisonics or
  888. holographic audio, sort of like how 3D video always seems to
  889. degenerate into a gaudy gimmick that reliably makes 5% of
  890. the population motion sick.</p>
  891. <p>Binaural audio is similarly difficult. You can't simulate
  892. it because it works slightly differently in every person.
  893. It's a learned skill tuned to the self-assembling system of
  894. the pinnae, ear canals, and neural processing, and it never
  895. assembles exactly the same way in any two individuals.
  896. People also subconsciously shift their heads to enhance
  897. localization, and can't localize well unless they do.
  898. That's something that can't be captured in a binaural
  899. recording, though it can to an extent in fixed surround.</p>
  900. <p>These are hardly impossible technical hurdles. Discrete
  901. surround has a proven following in the marketplace, and I'm
  902. personally especially excited by the possibilities offered
  903. by Ambisonics.</p>
  904. </div>
  905. <div id="toc_outro">
  906. <h2>Outro</h2>
  907. <blockquote>
  908. "I never did care for music much.<br/>
  909. It's the high fidelity!"<br/>
  910.      —Flanders &amp; Swann, <u>A Song of Reproduction</u>
  911. </blockquote>
  912. <p>The point is enjoying the music, right? Modern playback
  913. fidelity is incomprehensibly better than the already excellent
  914. analog systems available a generation ago. Is the logical
  915. extreme any more than just
  916. another <a href="http://www.youtube.com/watch?v=M3w1_E1V46M">first
  917. world problem</a>? Perhaps, but bad mixes and
  918. encodings <em>do</em> bother me; they distract me from the
  919. music, and I'm probably not alone.</p>
  920. <p>Why push back against 24/192? Because it's a solution to a
  921. problem that doesn't exist, a business model based on
  922. willful ignorance and scamming people. The more that
  923. pseudoscience goes unchecked in the world at large, the
  924. harder it is for truth to overcome truthiness... even if
  925. this is a small and relatively insignificant example.</p>
  926. <blockquote>
  927. "For me, it is far better to grasp the Universe as it really
  928. is than to persist in delusion, however satisfying and
  929. reassuring."
  930. <br/>     —Carl Sagan
  931. </blockquote>
  932. </div>
  933. <div id="toc_more">
  934. <h2>Further reading</h2>
  935. <p>Readers have alerted me to a pair of excellent papers of
  936. which I wasn't aware before beginning my own article. They
  937. tackle many of the same points I do in greater detail.</p>
  938. <ul>
  939. <li>
  940. <p><a href="http://www.meridian.co.uk/ara/coding2.pdf"><u>Coding
  941. High Quality Digital Audio</u></a> by Bob Stuart
  942. of Meridian Audio is beautifully concise despite
  943. its greater length. Our conclusions differ
  944. somewhat (he takes as given the need for a
  945. slightly wider frequency range and bit depth
  946. without much justification), but the presentation
  947. is clear and easy to follow. <i>[Edit: I may not
  948. agree with many of Mr. Stuart's other articles,
  949. but I like this one a lot.]</i></p></li>
  950. <li><p><a href="http://lavryengineering.com/pdfs/lavry-sampling-theory.pdf">
  951. <u>Sampling Theory For Digital Audio</u></a> [Updated link 2012-10-04] by Dan
  952. Lavry of Lavry Engineering is another article that several
  953. readers pointed out. It expands my two pages or so about
  954. sampling, oversampling, and filtering into a more detailed
  955. 27 page treatment. Worry not, there are plenty of graphs,
  956. examples and references.</p></li>
  957. </ul>
  958. <p>Stephane Pigeon
  959. of <a href="http://www.audiocheck.net/">audiocheck.net</a>
  960. wrote to plug the browser-based listening tests featured on
  961. his web site. The set of tests is relatively small as yet,
  962. but several were directly relevant in the context of this
  963. article. They worked well and I found the quality to be
  964. quite good.</p>
  965. </div>
  966. <div id="toc_fn">
  967. <h2>Footnotes</h2>
  968. <ol>
  969. <li id="foot2">
  970. <p>If it wasn't the most boring
  971. party trick ever, it was pretty close.
  972. </p></li>
  973. <li id="foot3">
  974. <p>It's more typical to speak of
  975. visible light as wavelengths measured in nanometers or
  976. angstroms. I'm using frequency to be consistent with
  977. sound. They're equivalent, as frequency is just the
  978. speed of light divided by wavelength.</p>
  979. </li>
  980. <li id="foot4">
  981. <p>The LED experiment doesn't work
  982. with 'ultraviolet' LEDs, mainly because they're not really
  983. ultraviolet. They're deep enough violet to cause a little
  984. bit of fluorescence, but they're still well within the
  985. visible range. Real ultraviolet LEDs cost anywhere from
  986. $100-$1000 apiece and would cause eye damage if used for
  987. this test. Consumer grade not-really-UV LEDs also emit
  988. some faint white light in order to appear brighter, so
  989. you'd be able to see them even if the emission peak really
  990. was in the ultraviolet.</p>
  991. </li>
  992. <li id="foot5">
  993. <p>The original version of this article stated that IR
  994. LEDs operate from 300-325THz (about 920-980nm),
  995. wavelengths that are invisible. Quite a few readers wrote
  996. to say that they could in fact just barely see the LEDs in
  997. some (or all) of their remotes. Several were kind enough
  998. to let me know which remotes these were, and I was able to
  999. test several on a spectrometer. Lo and behold, these
  1000. remotes were using higher-frequency LEDs operating from
  1001. 350-380THz (800-850nm), just overlapping the extreme
  1002. edge of the visible range.
  1003. </p>
  1004. </li>
  1005. <li id="foot6">
  1006. <p>Many systems that cannot play back 96kHz samples will
  1007. silently downsample to 48kHz, rather than refuse to play
  1008. the file. In this case, the tones will not be played at
  1009. all and playback would be silent no matter how nonlinear
  1010. the system is.</p>
  1011. </li>
  1012. <li id="foot7">
  1013. <p>Oversampling is not the only
  1014. application for high sampling rates in signal
  1015. processing. There are a few theoretical advantages to
  1016. producing band-limited audio at a high sampling rate
  1017. eschewing decimation, even if it is to be downsampled
  1018. for distribution. It's not clear what if any are used
  1019. in practice, as the workings of most professional
  1020. consoles are trade secrets.</p>
  1021. </li>
  1022. <li id="foot8">
  1023. <p>Historical reasoning or not,
  1024. there's no question that many professionals today use high
  1025. rates because they mistakenly assume that retaining
  1026. content beyond 20kHz sounds better, just as consumers
  1027. do.</p>
  1028. </li>
  1029. <li id="foot9">
  1030. <p>The sensation of eardrums
  1031. 'uncringing' after turning off loud music is quite
  1032. real!</p>
  1033. </li>
  1034. <li id="foot10">
  1035. <p>Some nice diagrams can be found
  1036. at the HyperPhysics site:<br/>
  1037. <a href="http://hyperphysics.phy-astr.gsu.edu/hbase/sound/protect.html#c1">http://hyperphysics.phy-astr.gsu.edu/hbase/sound/protect.html#c1</a></p>
  1038. </li>
  1039. <li id="foot11">
  1040. <p>20µPa is commonly defined to be
  1041. 0dB for auditory measurement purposes; it is approximately
  1042. equal to the threshold of hearing at 1kHz. The ear is as
  1043. much as 8dB more sensitive between 2 and 4kHz however.</p>
  1044. </li>
  1045. <li id="foot12">
  1046. <p>The following paper has the best explanation of dither
  1047. that I've run across. Although it's about image dither,
  1048. the first half covers the theory and practice of dither in
  1049. audio before extending its use into images:</p>
  1050. <p>Cameron Nicklaus Christou,
  1051. <a href="http://uwspace.uwaterloo.ca/bitstream/10012/3867/1/thesis.pdf">
  1052. <u>Optimal Dither and Noise Shaping
  1053. in Image Processing</u></a>
  1054. </p>
  1055. </li>
  1056. <li id="foot13">
  1057. <p>DSP engineers may point out, as
  1058. one of my own smart-alec compatriots did, that 16 bit
  1059. audio has a theoretically infinite dynamic range for a
  1060. pure tone if you're allowed to use an infinite Fourier
  1061. transform to extract it; this concept is very important to
  1062. radio astronomy.</p>
  1063. <p>Although the ear works not entirely unlike a Fourier transform, its
  1064. resolution is relatively limited. This places a limit on the maximum
  1065. practical dynamic depth of 16 bit audio signals.</p>
  1066. </li>
  1067. <li id="foot14">
  1068. <p>Production increasingly uses 32
  1069. bit float, both because it's very convenient on modern
  1070. processors, and because it completely eliminates the
  1071. possibility of accidental clipping at any point going
  1072. undiscovered and ruining a mix.</p>
  1073. </li>
  1074. <li id="foot15">
  1075. <p>Several readers have wanted to know how, if ultrasonics
  1076. can cause audible intermodulation distortion, the Meyer
  1077. and Moran 2007 test could have produced a null result.</p>
  1078. <p>It should be obvious that 'can' and 'sometimes' are not
  1079. the same as 'will' and 'always'. Intermodulation
  1080. distortion from ultrasonics is a possibility, not a
  1081. certainty, in any given system for a given set of
  1082. material. The Meyer and Moran null result indicates that
  1083. intermodulation distortion was inaudible on the systems
  1084. used during the course of their testing.</p>
  1085. <p>Readers are invited to <a href="#toc_intermod">try the
  1086. simple ultrasonic intermodulation distortion test
  1087. above</a> for a quick check of the intermodulation
  1088. potential of their own equipment.</p>
  1089. </li>
  1090. <li id="foot16">
  1091. <p>Ashihara and Kiryu, <u>Detection
  1092. threshold for tones above 22kHz</u> (2001). Convention paper
  1093. 5401 presented at the 110th AES Convention, May 12-15 2001,
  1094. Amsterdam.</p>
  1095. </li>
  1096. <li id="foot17">
  1097. <p>Griesinger, <a href="http://www.davidgriesinger.com/intermod.ppt"><u>Perception
  1098. of mid-frequency and high-frequency intermodulation
  1099. distortion in loudspeakers, and its relationship to
  1100. high definition audio</u></a></p>
  1101. </li>
  1102. <li id="foot18">
  1103. <p>Since publication, several commentators wrote to me with
  1104. similar versions of the same anecdote [paraphrased]: "I
  1105. once listened to some headphones / amps / recordings
  1106. expecting result [A] but was totally surprised to find
  1107. [B] instead! Confirmation bias is hooey!"</p>
  1108. <p>I offer two thoughts.</p>
  1109. <p>First, confirmation bias does not replace all correct
  1110. results with incorrect results. It skews the results in
  1111. some uncontrolled direction by an unknown amount. How
  1112. can you tell right or wrong <em>for sure</em> if the
  1113. test is rigged by your own subconscious? Let's say you
  1114. expected to hear a large difference but were shocked to
  1115. hear a small difference. What if there was actually no
  1116. difference at all? Or, maybe there <em>was</em> a
  1117. difference and, being aware of a potential bias, your
  1118. well meaning skepticism overcompensated? Or maybe you
  1119. were completely right? Objective testing, such as ABX,
  1120. eliminates all this uncertainty.</p>
  1121. <p>Second, "So you think you're not biased? Great!
  1122. Prove it!" The value of an objective test lies not only
  1123. in its ability to inform one's own understanding, but
  1124. also to convince others. Claims require proof.
  1125. Extraordinary claims require extraordinary proof.</p>
  1126. </li>
  1127. <li id="foot19">
  1128. <p>The easiest tools to use for ABX testing are
  1129. probably:</p>
  1130. </li>
  1131. <li id="foot20">
  1132. <p>At Hydrogen Audio, the objective testing requirement is
  1133. abbreviated <em>TOS8</em> as it's the eighth item in the
  1134. Terms Of Service.</p>
  1135. </li>
  1136. <li id="foot21">
  1137. <p>It is commonly assumed that resampling irreparably
  1138. damages a signal; this isn't the case. Unless one makes
  1139. an obvious mistake, such as causing clipping, the
  1140. downsampled and then upsampled signal will be audibly
  1141. indistinguishable from the original. This is the usual
  1142. test used to establish that higher sampling rates are
  1143. unnecessary.</p>
  1144. </li>
  1145. <li id="foot22">
  1146. <p>It may not be strictly audio related,
  1147. but... faster-than-light neutrinos, anyone?</p>
  1148. </li>
  1149. <li id="foot23">
  1150. <p><a href="http://www.wired.com/gadgetlab/2012/02/why-neil-young-hates-mp3-and-what-you-can-do-about-it/">Wired
  1151. magazine implies that lossless formats like FLAC
  1152. are not always completely lossless</a>:</p>
  1153. <blockquote>
  1154. "Some purists will tell you to skip FLACs altogether
  1155. and just buy WAVs. [...] By buying WAVs, you can avoid
  1156. the potential data loss incurred when the file is
  1157. compressed into a FLAC. This data loss is rare, but it
  1158. happens."
  1159. </blockquote>
  1160. <p>This is false. A lossless compression process never
  1161. alters the original data in any way, and FLAC is no
  1162. exception.</p>
  1163. <p>In the event that Wired was referring to hardware
  1164. corruption of data files (disk failure, memory failure,
  1165. sunspots), FLAC and WAV would both be affected. A FLAC
  1166. file, however, is checksummed, so a decoder would detect the
  1167. corruption. The FLAC file is also smaller than the WAV,
  1168. and so a random corruption would be less likely because
  1169. there's less data that could be affected.</p>
  1170. </li>
  1171. <li id="foot24">
  1172. <p>
  1173. The <a href="http://en.wikipedia.org/wiki/Loudness_war">'Loudness
  1174. War'</a> is a commonly cited example of bad mastering
  1175. practices in the industry today, though it's not the
  1176. only one. Loudness is also an older phenomenon than the
  1177. Wikipedia article leads the reader to believe; as early
  1178. as the 1950s, artists and producers pushed for the
  1179. loudest possible recordings. Equipment vendors
  1180. increasingly researched and marketed new technology to
  1181. allow hotter and hotter masters. Advanced vinyl
  1182. mastering equipment in the 1970s and 1980s, for example,
  1183. tracked and nested groove envelopes when possible in
  1184. order to allow higher amplitudes than the groove spacing
  1185. would normally permit.</p>
  1186. <p>Today's digital technology has allowed loudness to be
  1187. pumped up to an absurd level. It's also provided a
  1188. plethora of automatic, highly complex, proprietary DAW
  1189. plugins that are deployed en masse without a wide
  1190. understanding of how they work or what they're really
  1191. doing.</p>
  1192. </li>
  1193. </ol>
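<p>Footnote 13's point, that a longer Fourier transform can pull a pure tone out from under the quantization noise, can be illustrated with a small sketch. It is not taken from the article: the function names, the 0.1 LSB tone amplitude (roughly -110 dBFS at 16 bits), the TPDF dither, and the two block lengths are all assumptions chosen for the demonstration. It quantizes a dithered sine to integer steps and then measures a single DFT bin at the tone frequency alongside a few neighbouring empty bins.</p>

```python
import math
import random


def dithered_tone(n, cycles, amp_lsb, rng):
    """Quantize a sine of peak amplitude `amp_lsb` (in LSBs) to integer steps with TPDF dither."""
    out = []
    for i in range(n):
        x = amp_lsb * math.sin(2.0 * math.pi * cycles * i / n)
        dither = rng.random() - rng.random()      # triangular PDF spanning +/-1 LSB
        out.append(math.floor(x + dither + 0.5))  # round to the nearest integer step
    return out


def bin_amplitude(samples, cycles):
    """Amplitude of one DFT bin: a sinusoid completing `cycles` periods over the block."""
    n = len(samples)
    w = 2.0 * math.pi * cycles / n
    re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def measure(n, amp_lsb=0.1, seed=1):
    """Return (tone amplitude, mean amplitude of eight empty bins) for a block of n samples."""
    rng = random.Random(seed)
    cycles = n // 16 + 1
    samples = dithered_tone(n, cycles, amp_lsb, rng)
    tone = bin_amplitude(samples, cycles)
    empty = [cycles + k for k in range(3, 11)]
    floor = sum(bin_amplitude(samples, b) for b in empty) / len(empty)
    return tone, floor


for n in (4096, 65536):
    tone, floor = measure(n)
    print(f"{n:6d} samples: tone {tone:.4f} LSB, noise floor per bin {floor:.4f} LSB")
```

<p>The tone never exceeds a tenth of one quantization step, yet the longer block should report it close to 0.1 LSB, and the per-bin noise floor should fall by roughly a factor of four (about 12dB) when the block grows sixteenfold. The ear gets no such arbitrarily long analysis window, which is the limit the footnote describes.</p>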
  1194. </div>
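<p>Footnote 23's two claims, that lossless compression returns the original data bit for bit and that a checksummed file exposes corruption, can be sketched as follows. This is an illustration under stated assumptions, not FLAC: zlib stands in for the lossless codec, a SHA-256 digest stands in for the MD5 signature that FLAC stores in its stream header, and every function name here is hypothetical.</p>

```python
import hashlib
import math
import struct
import zlib


def make_pcm(num_samples=4000):
    """16-bit little-endian mono PCM: a 441 Hz sine sampled at 44.1 kHz."""
    return b"".join(
        struct.pack("<h", round(12000 * math.sin(2 * math.pi * 441 * i / 44100)))
        for i in range(num_samples)
    )


def encode(pcm):
    """Compress losslessly and record a digest of the original audio bytes."""
    return {"digest": hashlib.sha256(pcm).hexdigest(),
            "payload": zlib.compress(pcm, 9)}


def decode(blob):
    """Return the original PCM bytes, or raise ValueError if damage is detected."""
    try:
        pcm = zlib.decompress(blob["payload"])
    except zlib.error as exc:
        raise ValueError("compressed stream is corrupt") from exc
    if hashlib.sha256(pcm).hexdigest() != blob["digest"]:
        raise ValueError("decoded audio does not match the stored digest")
    return pcm


def flip_byte(data, index):
    """Simulate storage corruption by inverting one byte."""
    damaged = bytearray(data)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def corruption_detected(blob):
    """True if decoding the blob fails its integrity checks."""
    try:
        decode(blob)
    except ValueError:
        return True
    return False


pcm = make_pcm()
blob = encode(pcm)
print("round trip identical:", decode(blob) == pcm)
print("sizes (raw, compressed):", len(pcm), len(blob["payload"]))

damaged = dict(blob, payload=flip_byte(blob["payload"], len(blob["payload"]) // 2))
print("corruption detected:", corruption_detected(damaged))
```

<p>The same flipped byte in the raw PCM would simply play back as a wrong sample, since a plain WAV file carries no checksum that a player could test against.</p>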
  1195. <hr/>
  1196. <div class="author">
  1197. <address>—Monty
  1198. (<a href="mailto:monty@xiph.org">monty@xiph.org</a>) March
  1199. 1, 2012
  1200. <br/><i>last revised March 25, 2012 to add improvements
  1201. suggested by readers.
  1202. <br/>[Edits and corrections made after this date are marked inline, except for spelling errors
  1203. <br/>spotted on Dec 30, 2012 and March 15, 2014, and an extra 'is' removed on April 1, 2013]
  1204. </i></address>
  1205. </div>
  1206. <div class="et">
  1207. <div class="etleft">
  1208. <div class="etcontent">
  1209. <a href="http://et.redhat.com/"><img src="https://www.xiph.org/~xiphmont/demo/et.png"/></a>
  1210. </div>
  1211. </div>
  1212. <div class="etcenter">
  1213. <div class="etcontent">
  1214. <p>Monty's articles and demo work are sponsored by Red Hat Emerging Technologies.
  1215. <br/>(C) Copyright 2012 Red Hat Inc. and Xiph.Org
  1216. <br/>Special thanks to Gregory Maxwell for technical
  1217. contributions to this article</p>
  1218. </div>
  1219. </div>
  1220. </div>
  1221. <div>
  1222. <img src="https://www.xiph.org/~xiphmont/demo/brick-redhat.jpg"/>
  1223. </div>