MP3: Difference between revisions

Content deleted Content added
Vecr (talk | contribs)
Going public: "encode and playback" -> "encode and play back"
m unpiped links using script
 
(7 intermediate revisions by 7 users not shown)
Line 22:
| latest release version = ISO/IEC 13818-3:1998
| latest release date = {{Start date and age|1998|04|df=y}}
| type = [[Lossy compression|Lossy]] [[Audio file format|audio]]
| container for =
| contained by = [[MPEG-ES]]
Line 33:
}}
 
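'''MP3''' (formally '''MPEG-1 Audio Layer III''' or '''MPEG-2 Audio Layer III''')<ref name="rfc5219" /> is a [[audio coding format|coding format]] for [[digital audio]] developed largely by the [[Fraunhofer Society]] in Germany under the lead of [[Karlheinz Brandenburg]],<ref>{{Cite web|url=https://www.youtube.com/watch?v=cuU16whZ-Fs|title=73. "Father" of the MP3, Karlheinz Brandenburg|date=13 July 2015 |via=www.youtube.com|access-date=2 January 2023|archive-date=2 January 2023|archive-url=https://web.archive.org/web/20230102160404/https://www.youtube.com/watch?v=cuU16whZ-Fs|url-status=live}}</ref><ref>{{Cite web|url=https://www.internethistorypodcast.com/2015/07/on-the-20th-birthday-of-the-mp3-an-interview-with-the-father-of-the-mp3-karlheinz-brandenburg/|title=On the 20th Birthday of the MP3, An Interview With The "Father" of the MP3, Karlheinz Brandenburg|access-date=2 January 2023|archive-date=2 January 2023|archive-url=https://web.archive.org/web/20230102160403/https://www.internethistorypodcast.com/2015/07/on-the-20th-birthday-of-the-mp3-an-interview-with-the-father-of-the-mp3-karlheinz-brandenburg/|url-status=live}}</ref> with support from other digital scientists in other countries. Originally defined as the third audio format of the [[MPEG-1]] standard, it was retained and further extended (defining additional bit rates and support for more [[surround channels|audio channels]]) as the third audio format of the subsequent [[MPEG-2]] standard. A third version, known as MPEG-2.5 (extended to better support lower [[bit rate]]s), is commonly implemented but is not a recognized standard.

'''MP3''' (formally '''MPEG-1 Audio Layer III''' or '''MPEG-2 Audio Layer III''')<ref name="rfc5219" /> is a [[audio coding format|coding format]] for [[digital audio]] developed largely by the [[Fraunhofer Society]] in Germany under the lead of [[Karlheinz Brandenburg]].<ref>{{Cite web|url=https://www.youtube.com/watch?v=cuU16whZ-Fs|title=73. "Father" of the MP3, Karlheinz Brandenburg|date=13 July 2015 |via=www.youtube.com|access-date=2 January 2023|archive-date=2 January 2023|archive-url=https://web.archive.org/web/20230102160404/https://www.youtube.com/watch?v=cuU16whZ-Fs|url-status=live}}</ref><ref>{{Cite web|url=https://www.internethistorypodcast.com/2015/07/on-the-20th-birthday-of-the-mp3-an-interview-with-the-father-of-the-mp3-karlheinz-brandenburg/|title=On the 20th Birthday of the MP3, An Interview With The "Father" of the MP3, Karlheinz Brandenburg|access-date=2 January 2023|archive-date=2 January 2023|archive-url=https://web.archive.org/web/20230102160403/https://www.internethistorypodcast.com/2015/07/on-the-20th-birthday-of-the-mp3-an-interview-with-the-father-of-the-mp3-karlheinz-brandenburg/|url-status=live}}</ref> It was designed to greatly reduce the amount of data required to represent audio, yet still sound like a faithful reproduction of the original uncompressed audio to most listeners; for example, compared to [[Compact Disc Digital Audio|CD-quality digital audio]], MP3 compression can commonly achieve a 75–95% reduction in size, depending on the [[bit rate]].<ref>{{cite web |date=27 July 2017 |title=MP3 (MPEG Layer III Audio Encoding) |url=https://www.loc.gov/preservation/digital/formats/fdd/fdd000012.shtml |url-status=live |archive-url=https://web.archive.org/web/20170814015755/https://www.loc.gov/preservation/digital/formats/fdd/fdd000012.shtml |archive-date=14 August 2017 |access-date=9 November 2017 |publisher=The Library of Congress}}</ref> In popular usage, ''MP3'' often refers to [[Computer file|files]] of sound or music recordings stored in the MP3 [[file format]] (.mp3) on consumer electronic devices.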
 
Originally defined in 1991 as the third audio format of the [[MPEG-1]] standard, it was retained and further extended (defining additional bit rates and support for more [[surround channels|audio channels]]) as the third audio format of the subsequent [[MPEG-2]] standard. MP3 as a [[file format]] commonly designates files containing an [[elementary stream]] of MPEG-1 Audio or MPEG-2 Audio encoded data, without other complexities of the MP3 standard. Concerning [[audio compression (data)|audio compression]], which is its most apparent element to end-users, MP3 uses [[lossy compression]] to encode data using inexact approximations and the partial discarding of data, allowing for a large reduction in [[File size|file sizes]] when compared to uncompressed audio. The combination of small size and acceptable fidelity led to a boom in the distribution of music over the [[Internet]] in the mid-to-late 1990s, with MP3 serving as an enabling technology at a time when [[Bandwidth (computing)|bandwidth]] and storage were still at a premium. The MP3 format soon became associated with controversies surrounding [[copyright infringement]], [[music piracy]], and the file-[[ripping]] and [[file sharing|sharing]] services [[MP3.com#Original_version|MP3.com]] and [[Napster]], among others. With the advent of [[portable media player]]s (including "MP3 players"), a product category also including [[smartphones]], MP3 support remains near-universal and a [[De facto standard|''de facto'' standard]] for digital audio.
'''MP3''' (or '''mp3''') as a [[file format]] commonly designates files containing an [[elementary stream]] of MPEG-1 Audio or MPEG-2 Audio encoded data, without other complexities of the MP3 standard.
 
Concerning [[audio compression (data)|audio compression]] (the aspect of the standard most apparent to end-users and for which it is best known), MP3 uses [[lossy compression]] to encode data using inexact approximations and the partial discarding of data. This allows a large reduction in file sizes when compared to uncompressed audio. The combination of small size and acceptable fidelity led to a boom in the distribution of music over the Internet in the mid-to-late 1990s, with MP3 serving as an enabling technology at a time when bandwidth and storage were still at a premium. The MP3 format soon became associated with controversies surrounding [[copyright infringement]], [[music piracy]], and the file-[[ripping]] and [[file sharing|sharing]] services [[MP3.com#Original_version|MP3.com]] and [[Napster]], among others. With the advent of [[portable media player]]s, a product category also including [[smartphones]], MP3 support remains near-universal.
 
MP3 compression works by reducing the accuracy of certain components of sound that are considered (by psychoacoustic analysis) to be beyond the [[Hearing range#Humans|hearing capabilities]] of most humans. This method is commonly referred to as perceptual coding or [[psychoacoustic]] modeling.<ref name="Jayant1993" /> The remaining audio information is then recorded in a space-efficient manner using [[MDCT]] and [[FFT]] algorithms. Compared to [[Compact Disc Digital Audio|CD-quality digital audio]], MP3 compression can commonly achieve a 75–95% reduction in size. For example, an MP3 encoded at a constant bit rate of 128&nbsp;kbit/s would result in a file approximately 9% the size of the original CD audio.<ref>{{cite web |url= https://www.loc.gov/preservation/digital/formats/fdd/fdd000012.shtml |title= MP3 (MPEG Layer III Audio Encoding) |date= 27 July 2017 |publisher= The Library of Congress |access-date= 9 November 2017 |archive-date= 14 August 2017 |archive-url= https://web.archive.org/web/20170814015755/https://www.loc.gov/preservation/digital/formats/fdd/fdd000012.shtml |url-status= live }}</ref> In the early 2000s, compact disc players increasingly adopted support for playback of MP3 files on data CDs.
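The 9% figure above can be checked with a short calculation. The following is a minimal sketch, assuming the usual CD audio parameters (44.1 kHz sampling, 16 bits per sample, two channels); the function names are illustrative only and do not come from any standard or library.

```python
# Assumed CD audio parameters (not stated in the text above).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def cd_bit_rate() -> int:
    """Bit rate of uncompressed CD audio in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def size_ratio(mp3_bit_rate: int) -> float:
    """Size of a constant-bit-rate MP3 relative to the CD audio it encodes."""
    return mp3_bit_rate / cd_bit_rate()


if __name__ == "__main__":
    print(f"CD audio bit rate: {cd_bit_rate()} bit/s")
    print(f"128 kbit/s MP3 relative size: {size_ratio(128_000):.1%}")
```

Under these assumptions the CD bit rate is 1,411,200 bit/s, so a 128 kbit/s stream is roughly 9% of the original size, which falls inside the 75–95% reduction range quoted above.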
 
== History ==
The [[Moving Picture Experts Group]] (MPEG) designed MP3 as part of its [[MPEG-1]], and later [[MPEG-2]], standards. MPEG-1 Audio (MPEG-1 Part 3), which included MPEG-1 Audio Layer I, II, and III, was approved as a committee draft for an [[ISO]]/[[IEC]] standard in 1991,<ref name="cd-1991" /><ref name="neuron2-cd-1991" /> finalized in 1992,<ref name="dis-1992" /> and published in 1993 as ISO/IEC 11172-3:1993.<ref name="11172-3" /> An MPEG-2 Audio (MPEG-2 Part 3) extension with lower sample and bit rates was published in 1995 as ISO/IEC 13818-3:1995.<ref name="13818-3" /><ref name="mpeg-audio-faq-bc" /> It requires only minimal modifications to existing MPEG-1 decoders (recognition of the MPEG-2 bit in the header and addition of the new lower sample and bit rates).
 
=== Background ===
{{See|Linear predictive coding|Modified discrete cosine transform}}
Line 49 ⟶ 45:
The MP3 [[lossy compression]] algorithm takes advantage of a perceptual limitation of human hearing called [[auditory masking]]. In 1894, the American physicist [[Alfred M. Mayer]] reported that a tone could be rendered inaudible by another tone of lower frequency.<ref name="Mayer1894" /> In 1959, Richard Ehmer described a complete set of auditory curves regarding this phenomenon.<ref name="Ehmer1959" /> Between 1967 and 1974, [[Eberhard Zwicker]] did work in the areas of tuning and masking of critical frequency-bands,<ref name="Zwicker" /><ref name="Eberhard" /> which in turn built on the fundamental research in the area from [[Harvey Fletcher]] and his collaborators at [[Bell Labs]].<ref name="Fletcher" />
 
Perceptual coding was first used for [[speech coding]] compression with [[linear predictive coding]] (LPC),<ref name="Schroeder2014">{{cite book |last1= Schroeder |first1= Manfred R. |title= Acoustics, Information, and Communication: Memorial Volume in Honor of Manfred R. Schroeder |date= 2014 |publisher= Springer |isbn= 978-3-319-05660-9 |chapter= Bell Laboratories |page= 388 |chapter-url= https://books.google.com/books?id=d9IkBAAAQBAJ&pg=PA388}}</ref> which has origins in the work of [[Fumitada Itakura]] ([[Nagoya University]]) and Shuzo Saito ([[Nippon Telegraph and Telephone]]) in 1966.<ref>{{cite journal |last1= Gray |first1= Robert M. |title= A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol |journal= Found. Trends Signal Process. |date= 2010 |volume= 3 |issue= 4 |pages= 203–303 |doi= 10.1561/2000000036 |url= https://ee.stanford.edu/~gray/lpcip.pdf |issn= 1932-8346 |doi-access= free |access-date= 14 July 2019 |archive-date= 9 October 2022 |archive-url= https://ghostarchive.org/archive/20221009/https://ee.stanford.edu/~gray/lpcip.pdf |url-status= live }}</ref> In 1978, [[Bishnu S. Atal]] and [[Manfred R. Schroeder]] at Bell Labs proposed an LPC speech [[codec]], called [[adaptive predictive coding]], that used a [[psychoacoustic]] coding-algorithm exploiting the masking properties of the human ear.<ref name="Schroeder2014"/><ref>{{cite book |last1= Atal |first1= B. |last2= Schroeder |first2= M. |title= ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing |chapter= Predictive coding of speech signals and subjective error criteria |date= 1978 |volume= 3 |pages= 573–576 |doi= 10.1109/ICASSP.1978.1170564}}</ref> Further optimization by Schroeder and Atal with J.L. Hall was later reported in a 1979 paper.<ref name="Schroeder1979"/> That same year, a psychoacoustic masking codec was also proposed by M. A. Krasner,<ref name="Krasner" /> who published and produced hardware for speech (not usable as music bit-compression), but the publication of his results in a relatively obscure [[Lincoln Laboratory]] Technical Report<ref>{{cite web|last1= Krasner|first1= M. A.|title= Digital Encoding of Speech Based on the Perceptual Requirement of the Auditory System (Technical Report 535)|url= https://apps.dtic.mil/dtic/tr/fulltext/u2/a077355.pdf|ref= Lincoln Laboratory, MIT|date= 18 June 1979|url-status= live|archive-url= https://web.archive.org/web/20170903070321/https://www.dtic.mil/dtic/tr/fulltext/u2/a077355.pdf|archive-date= 3 September 2017}}</ref> did not immediately influence the mainstream of psychoacoustic codec-development.
 
The [[discrete cosine transform]] (DCT), a type of [[transform coding]] for lossy compression, proposed by [[N. Ahmed|Nasir Ahmed]] in 1972, was developed by Ahmed with T. Natarajan and [[K. R. Rao]] in 1973; they published their results in 1974.<ref name="Ahmed">{{cite journal |last= Ahmed |first= Nasir |author-link= N. Ahmed |title= How I Came Up With the Discrete Cosine Transform |journal= [[Digital Signal Processing (journal)| Digital Signal Processing]] |date= January 1991 |volume= 1 |issue= 1 |pages= 4–5 |doi= 10.1016/1051-2004(91)90086-Z |bibcode= 1991DSP.....1....4A |url= https://www.scribd.com/doc/52879771/DCT-History-How-I-Came-Up-with-the-Discrete-Cosine-Transform |access-date= 19 November 2019 |archive-date= 10 June 2016 |archive-url= https://web.archive.org/web/20160610013109/https://www.scribd.com/doc/52879771/DCT-History-How-I-Came-Up-with-the-Discrete-Cosine-Transform |url-status= live }}</ref><ref name="pubDCT">{{Citation |first1= Nasir |last1= Ahmed |author1-link= N. Ahmed |first2= T. |last2= Natarajan |first3= K. R. |last3= Rao |title= Discrete Cosine Transform |journal= IEEE Transactions on Computers |date= January 1974 |volume= C-23 |issue= 1 |pages= 90–93 |doi= 10.1109/T-C.1974.223784|s2cid= 149806273 }}</ref><ref name="pubRaoYip">{{Citation |last1= Rao |first1= K. R. |author-link1= K. R. Rao |last2= Yip |first2= P. |title= Discrete Cosine Transform: Algorithms, Advantages, Applications |publisher= Academic Press |location= Boston |year= 1990 |isbn= 978-0-12-580203-1}}</ref> This led to the development of the [[modified discrete cosine transform]] (MDCT), proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987,<ref>J. P. Princen, A. W. Johnson and A. B. Bradley: ''Subband/transform coding using filter bank designs based on time domain aliasing cancellation'', IEEE Proc. Intl. Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2161–2164, 1987</ref> following earlier work by Princen and Bradley in 1986.<ref>John P. Princen, Alan B. Bradley: ''Analysis/synthesis filter bank design based on time domain aliasing cancellation'', IEEE Trans. Acoust. Speech Signal Processing, ''ASSP-34'' (5), 1153–1161, 1986</ref> The MDCT later became a core part of the MP3 algorithm.<ref name="Guckert">{{cite web |last1= Guckert |first1= John |title= The Use of FFT and MDCT in MP3 Audio Compression |url= http://www.math.utah.edu/~gustafso/s2012/2270/web-projects/Guckert-audio-compression-svd-mdct-MP3.pdf |website= [[University of Utah]] |date= Spring 2012 |access-date= 14 July 2019 |archive-date= 12 February 2021 |archive-url= https://web.archive.org/web/20210212022237/http://www.math.utah.edu/~gustafso/s2012/2270/web-projects/Guckert-audio-compression-svd-mdct-MP3.pdf |url-status= live }}</ref>
Line 150 ⟶ 146:
 
=== Encoding and decoding ===
In short, MP3 compression works by reducing the accuracy of certain components of sound that are considered (by psychoacoustic analysis) to be beyond the [[Hearing range#Humans|hearing capabilities]] of most humans. This method is commonly referred to as perceptual coding or [[psychoacoustic]] modeling.<ref name="Jayant1993" /> The remaining audio information is then recorded in a space-efficient manner using [[MDCT]] and [[FFT]] algorithms.
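To illustrate the MDCT mentioned above, the following is a minimal sketch that evaluates the textbook MDCT definition directly, mapping a block of 2N samples to N coefficients. It is not the windowed, overlapped and optimized transform an actual MP3 encoder would use, and the function name <code>mdct</code> is chosen here for illustration only.

```python
import math
from typing import List, Sequence


def mdct(block: Sequence[float]) -> List[float]:
    """Direct O(N^2) evaluation of the MDCT of one block.

    Takes 2N input samples and returns N coefficients using
    X[k] = sum_n x[n] * cos(pi/N * (n + 1/2 + N/2) * (k + 1/2)).
    No window is applied; a real codec would window and overlap blocks.
    """
    if len(block) == 0 or len(block) % 2 != 0:
        raise ValueError("block length must be a positive even number")
    n_coeffs = len(block) // 2
    coeffs = []
    for k in range(n_coeffs):
        total = 0.0
        for n, sample in enumerate(block):
            angle = math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5)
            total += sample * math.cos(angle)
        coeffs.append(total)
    return coeffs


if __name__ == "__main__":
    # A block of 8 samples yields 4 coefficients.
    samples = [math.sin(2 * math.pi * i / 8) for i in range(8)]
    print(mdct(samples))
```

The halving of the sample count per block is what makes the transform suited to overlapping blocks: consecutive blocks share half of their input, so the total number of coefficients matches the number of input samples.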
 
The MP3 encoding algorithm is generally split into four parts. Part 1 divides the audio signal into smaller pieces, called frames, and an MDCT filter is then applied to the output. Part 2 passes the sample into a 1024-point [[fast Fourier transform]] (FFT), then the [[psychoacoustic]] model is applied and another MDCT filter is applied to the output. Part 3 quantizes and encodes each sample, a step known as noise allocation, which adjusts itself to meet the bit rate and [[sound masking]] requirements. Part 4 formats the [[bitstream]], called an audio frame, which is made up of four parts: the [[Header (computing)|header]], [[Error checking|error check]], [[audio data]], and [[#Ancillary data|ancillary data]].<ref name="Guckert"/>
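The frame layout described in Part 4 can be made concrete by decoding the 4-byte header that starts each frame. The sketch below handles only MPEG-1 Layer III and relies on the commonly documented header bit layout and bit rate table, which are assumptions not given in the text above; the names <code>FrameHeader</code> and <code>parse_header</code> are hypothetical.

```python
from dataclasses import dataclass

# Assumed MPEG-1 Layer III lookup tables, indexed by the header fields.
# None marks the "free format" and invalid entries, which this sketch rejects.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


@dataclass
class FrameHeader:
    bitrate_kbps: int
    sample_rate_hz: int
    padding: bool
    has_crc: bool  # True when a 16-bit error check follows the header

    @property
    def frame_length_bytes(self) -> int:
        """Total frame size, header included, for MPEG-1 Layer III."""
        return 144 * self.bitrate_kbps * 1000 // self.sample_rate_hz + int(self.padding)


def parse_header(data: bytes) -> FrameHeader:
    """Decode the first four bytes of an MPEG-1 Layer III frame."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes")
    word = int.from_bytes(data[:4], "big")
    if word >> 21 != 0x7FF:  # 11-bit frame sync, all ones
        raise ValueError("missing frame sync")
    version = (word >> 19) & 0b11  # 0b11 = MPEG-1
    layer = (word >> 17) & 0b11    # 0b01 = Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("this sketch only handles MPEG-1 Layer III")
    bitrate = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate = SAMPLE_RATES_HZ[(word >> 10) & 0b11]
    if bitrate is None or sample_rate is None:
        raise ValueError("unsupported bit rate or sample rate index")
    return FrameHeader(
        bitrate_kbps=bitrate,
        sample_rate_hz=sample_rate,
        padding=bool((word >> 9) & 1),
        has_crc=((word >> 16) & 1) == 0,  # protection bit 0 means CRC present
    )


if __name__ == "__main__":
    header = parse_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
    print(header, header.frame_length_bytes)
```

The remaining header bits (channel mode, copyright flags, emphasis) are ignored here to keep the example short. The audio data and any ancillary data occupy the rest of the frame after the header and the optional error check.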
 
Line 227 ⟶ 225:
| –
|-
| n/a
 
| 144
| –
Line 367 ⟶ 366:
 
{{reflist|30em|refs=
<ref name="mp3-name">{{cite web | url = http://www.businesswire.com/news/home/20050712005686/en/Fraunhofer-IIS-Happy-Birthday-MP3! | title = Happy Birthday MP3! | publisher = Fraunhofer IIS | date = 12 July 2005 | access-date = 18 July 2010 | archive-date = 11 December 2014 | archive-url = https://web.archive.org/web/20141211110033/http://www.businesswire.com/news/home/20050712005686/en/Fraunhofer-IIS-Happy-Birthday-MP3! | url-status = dead }}</ref>
<ref name="audio/mpeg">{{cite journal | url = http://tools.ietf.org/html/rfc3003 | title = The audio/mpeg Media Type&nbsp;— RFC 3003 | publisher = IETF | date = November 2000 | doi = 10.17487/RFC3003 | access-date = 7 December 2009 | last1 = Nilsson | first1 = M. | archive-date = 13 April 2012 | archive-url = https://web.archive.org/web/20120413074234/http://tools.ietf.org/html/rfc3003 | url-status = live }}</ref>
<ref name="RTP">{{cite journal | url = http://tools.ietf.org/html/rfc3555#page-24 | title = MIME Type Registration of RTP Payload Formats&nbsp;— RFC 3555 | publisher = IETF | date = July 2003 | doi = 10.17487/RFC3555 | access-date = 7 December 2009 | last1 = Casner | first1 = S. | last2 = Hoschka | first2 = P. | archive-date = 14 January 2012 | archive-url = https://web.archive.org/web/20120114154203/http://tools.ietf.org/html/rfc3555#page-24 | url-status = live }}</ref>
Line 474 ⟶ 473:
 
If there are already suitable links, propose additions or replacements on
the article's talk page, or submit your link to the relevant category at
the Open Directory Project (dmoz.org) and link there using {{Dmoz}}.
 
-->
* [https://www.mp3-history.com/ MP3-history.com] {{Webarchive|url=https://web.archive.org/web/20200211201051/https://www.mp3-history.com/ |date=11 February 2020 }}, The Story of MP3: How MP3 was invented, by Fraunhofer IIS.<!-- https://web.archive.org/web/20070610231859/http://www.iis.fraunhofer.de/EN/bf/amm/mp3history/mp3history03.jsp -->
* {{Curlie|Computers/Multimedia/Music_and_Audio/Audio_Formats/MP3/}}
* [https://www.mp3newswire.net/sect/archive.htm MP3 News Archive]. {{Webarchive|url=https://web.archive.org/web/20190303201456/https://www.mp3newswire.net/sect/archive.htm |date=3 March 2019 }} – over 1000 articles from 1999 to 2011 focused on MP3 and digital audio.
* [https://www.mpeg.chiariglione.org/ MPEG.chiariglione.org] {{Webarchive|url=https://web.archive.org/web/20240410005848/https://mpeg.chiariglione.org/ |date=10 April 2024 }} – MPEG official website