MP3: Difference between revisions

Citation bot (talk | contribs)
Add: bibcode, date. | Use this bot. Report bugs. | Suggested by Abductive | #UCB_webform 752/3850
m unpiped links using script
 
(10 intermediate revisions by 9 users not shown)
Line 22:
| latest release version = ISO/IEC 13818-3:1998
| latest release date = {{Start date and age|1998|04|df=y}}
| type = [[Lossy compression|Lossy]] [[Audio file format|audio]]
| container for =
| contained by = [[Elementary stream|MPEG-ES]]
| extended from =
| extended to =
Line 33:
}}
 
'''MP3''' (formally '''MPEG-1 Audio Layer III''' or '''MPEG-2 Audio Layer III''')<ref name="rfc5219" /> is a [[audio coding format|coding format]] for [[digital audio]] developed largely by the [[Fraunhofer Society]] in Germany under the lead of [[Karlheinz Brandenburg]],<ref>{{Cite web|url=https://www.youtube.com/watch?v=cuU16whZ-Fs|title=73. "Father" of the MP3, Karlheinz Brandenburg|date=13 July 2015 |via=www.youtube.com|access-date=2 January 2023|archive-date=2 January 2023|archive-url=https://web.archive.org/web/20230102160404/https://www.youtube.com/watch?v=cuU16whZ-Fs|url-status=live}}</ref><ref>{{Cite web|url=https://www.internethistorypodcast.com/2015/07/on-the-20th-birthday-of-the-mp3-an-interview-with-the-father-of-the-mp3-karlheinz-brandenburg/|title=On the 20th Birthday of the MP3, An Interview With The "Father" of the MP3, Karlheinz Brandenburg|access-date=2 January 2023|archive-date=2 January 2023|archive-url=https://web.archive.org/web/20230102160403/https://www.internethistorypodcast.com/2015/07/on-the-20th-birthday-of-the-mp3-an-interview-with-the-father-of-the-mp3-karlheinz-brandenburg/|url-status=live}}</ref> with support from other digital scientists in other countries. It was designed to greatly reduce the amount of data required to represent audio, yet still sound like a faithful reproduction of the original [[uncompressed]] audio to most listeners; for example, compared to [[Compact Disc Digital Audio|CD-quality digital audio]], MP3 compression can commonly achieve a 75–95% reduction in size, depending on the [[bit rate]].<ref>{{cite web |date=27 July 2017 |title=MP3 (MPEG Layer III Audio Encoding) |url=https://www.loc.gov/preservation/digital/formats/fdd/fdd000012.shtml |url-status=live |archive-url=https://web.archive.org/web/20170814015755/https://www.loc.gov/preservation/digital/formats/fdd/fdd000012.shtml |archive-date=14 August 2017 |access-date=9 November 2017 |publisher=The Library of Congress}}</ref> In popular usage, ''MP3'' often refers to [[Computer file|files]] of sound or music recordings stored in the MP3 [[file format]] (.mp3) on consumer electronic devices.

Originally defined in 1991 as the third audio format of the [[MPEG-1]] standard, it was retained and further extended (defining additional bit rates and support for more [[surround channels|audio channels]]) as the third audio format of the subsequent [[MPEG-2]] standard. A third version, known as MPEG-2.5, extended to better support lower [[bit rate]]s, is commonly implemented but is not a recognized standard. MP3 as a [[file format]] commonly designates files containing an [[elementary stream]] of MPEG-1 Audio or MPEG-2 Audio encoded data, without other complexities of the MP3 standard. Concerning [[audio compression (data)|audio compression]], which is its most apparent element to end-users, MP3 uses [[lossy compression]] to encode data using inexact approximations and the partial discarding of data, allowing for a large reduction in [[File size|file sizes]] when compared to uncompressed audio. The combination of small size and acceptable fidelity led to a boom in the distribution of music over the [[Internet]] in the mid-to-late 1990s, with MP3 serving as an enabling technology at a time when [[Bandwidth (computing)|bandwidth]] and storage were still at a premium. The MP3 format soon became associated with controversies surrounding [[copyright infringement]], [[music piracy]], and the file-[[ripping]] and [[file sharing|sharing]] services [[MP3.com#Original_version|MP3.com]] and [[Napster]], among others. With the advent of [[portable media player]]s (including "MP3 players"), a product category also including [[smartphones]], MP3 support remains near-universal and a [[De facto standard|''de facto'' standard]] for digital audio.
 
== History ==
The [[Moving Picture Experts Group]] (MPEG) designed MP3 as part of its [[MPEG-1]], and later [[MPEG-2]], standards. MPEG-1 Audio (MPEG-1 Part 3), which included MPEG-1 Audio Layer I, II, and III, was approved as a committee draft for an [[International Organization for Standardization|ISO]]/[[International Electrotechnical Commission|IEC]] standard in 1991,<ref name="cd-1991" /><ref name="neuron2-cd-1991" /> finalized in 1992,<ref name="dis-1992" /> and published in 1993 as ISO/IEC 11172-3:1993.<ref name="11172-3" /> An MPEG-2 Audio (MPEG-2 Part 3) extension with lower sample and bit rates was published in 1995 as ISO/IEC 13818-3:1995.<ref name="13818-3" /><ref name="mpeg-audio-faq-bc" /> It requires only minimal modifications to existing MPEG-1 decoders (recognition of the MPEG-2 bit in the header and addition of the new lower sample and bit rates).
 
MP3 compression works by reducing the accuracy of certain components of sound that are considered (by psychoacoustic analysis) to be beyond the [[Hearing range#Humans|hearing capabilities]] of most humans. This method is commonly referred to as perceptual coding or [[psychoacoustics|psychoacoustic]] modeling.<ref name="Jayant1993" /> The remaining audio information is then recorded in a space-efficient manner using [[Modified discrete cosine transform|MDCT]] and [[Fast Fourier transform|FFT]] algorithms. Compared to [[Compact Disc Digital Audio|CD-quality digital audio]], MP3 compression can commonly achieve a 75–95% reduction in size. For example, an MP3 encoded at a constant bit rate of 128&nbsp;kbit/s would result in a file approximately 9% the size of the original CD audio.<ref>{{cite web |url= https://www.loc.gov/preservation/digital/formats/fdd/fdd000012.shtml |title= MP3 (MPEG Layer III Audio Encoding) |date= 27 July 2017 |publisher= The Library of Congress |access-date= 9 November 2017 |archive-date= 14 August 2017 |archive-url= https://web.archive.org/web/20170814015755/https://www.loc.gov/preservation/digital/formats/fdd/fdd000012.shtml |url-status= live }}</ref> In the early 2000s, compact disc players increasingly adopted support for playback of MP3 files on data CDs.
 
=== Background ===
{{See|Linear predictive coding|Modified discrete cosine transform}}
 
The MP3 [[Audio compression (data)|lossy compression]] algorithm takes advantage of a perceptual limitation of human hearing called [[auditory masking]]. In 1894, the American physicist [[Alfred M. Mayer]] reported that a tone could be rendered inaudible by another tone of lower frequency.<ref name="Mayer1894" /> In 1959, Richard Ehmer described a complete set of auditory curves regarding this phenomenon.<ref name="Ehmer1959" /> Between 1967 and 1974, [[Eberhard Zwicker]] did work in the areas of tuning and masking of critical frequency-bands,<ref name="Zwicker" /><ref name="Eberhard" /> which in turn built on the fundamental research in the area from [[Harvey Fletcher]] and his collaborators at [[Bell Labs]].<ref name="Fletcher" />
 
Perceptual coding was first used for [[speech coding]] compression with [[linear predictive coding]] (LPC),<ref name="Schroeder2014">{{cite book |last1= Schroeder |first1= Manfred R. |title= Acoustics, Information, and Communication: Memorial Volume in Honor of Manfred R. Schroeder |date= 2014 |publisher= Springer |isbn= 978-3-319-05660-9 |chapter= Bell Laboratories |page= 388 |chapter-url= https://books.google.com/books?id=d9IkBAAAQBAJ&pg=PA388}}</ref> which has origins in the work of [[Fumitada Itakura]] ([[Nagoya University]]) and Shuzo Saito ([[Nippon Telegraph and Telephone]]) in 1966.<ref>{{cite journal |last1= Gray |first1= Robert M. |title= A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol |journal= Found. Trends Signal Process. |date= 2010 |volume= 3 |issue= 4 |pages= 203–303 |doi= 10.1561/2000000036 |url= https://ee.stanford.edu/~gray/lpcip.pdf |issn= 1932-8346 |doi-access= free |access-date= 14 July 2019 |archive-date= 9 October 2022 |archive-url= https://ghostarchive.org/archive/20221009/https://ee.stanford.edu/~gray/lpcip.pdf |url-status= live }}</ref> In 1978, [[Bishnu S. Atal]] and [[Manfred R. Schroeder]] at Bell Labs proposed an LPC speech [[codec]], called [[adaptive predictive coding]], that used a [[psychoacoustic]] coding-algorithm exploiting the masking properties of the human ear.<ref name="Schroeder2014"/><ref>{{cite book |last1= Atal |first1= B. |last2= Schroeder |first2= M. |title= ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing |chapter= Predictive coding of speech signals and subjective error criteria |date= 1978 |volume= 3 |pages= 573–576 |doi= 10.1109/ICASSP.1978.1170564}}</ref> Further optimization by Schroeder and Atal with J.L. Hall was later reported in a 1979 paper.<ref name="Schroeder1979"/> That same year, a psychoacoustic masking codec was also proposed by M. A. 
Krasner,<ref name="Krasner" /> who published and produced hardware for speech (not usable as music bit-compression), but the publication of his results in a relatively obscure [[Lincoln Laboratory]] Technical Report<ref>{{cite web|last1= Krasner|first1= M. A.|title= Digital Encoding of Speech Based on the Perceptual Requirement of the Auditory System (Technical Report 535)|url= https://apps.dtic.mil/dtic/tr/fulltext/u2/a077355.pdf|ref= Lincoln Laboratory, MIT|date= 18 June 1979|url-status= live|archive-url= https://web.archive.org/web/20170903070321/https://www.dtic.mil/dtic/tr/fulltext/u2/a077355.pdf|archive-date= 3 September 2017}}</ref> did not immediately influence the mainstream of psychoacoustic codec-development.
 
The [[discrete cosine transform]] (DCT), a type of [[transform coding]] for [[lossy compression]], proposed by [[N. Ahmed|Nasir Ahmed]] in 1972, was developed by Ahmed with T. Natarajan and [[K. R. Rao]] in 1973; they published their results in 1974.<ref name="Ahmed">{{cite journal |last= Ahmed |first= Nasir |author-link= N. Ahmed |title= How I Came Up With the Discrete Cosine Transform |journal= [[Digital Signal Processing (journal)| Digital Signal Processing]] |date= January 1991 |volume= 1 |issue= 1 |pages= 4–5 |doi= 10.1016/1051-2004(91)90086-Z |bibcode= 1991DSP.....1....4A |url= https://www.scribd.com/doc/52879771/DCT-History-How-I-Came-Up-with-the-Discrete-Cosine-Transform |access-date= 19 November 2019 |archive-date= 10 June 2016 |archive-url= https://web.archive.org/web/20160610013109/https://www.scribd.com/doc/52879771/DCT-History-How-I-Came-Up-with-the-Discrete-Cosine-Transform |url-status= live }}</ref><ref name="pubDCT">{{Citation |first1= Nasir |last1= Ahmed |author1-link= N. Ahmed |first2= T. |last2= Natarajan |first3= K. R. |last3= Rao |title= Discrete Cosine Transform |journal= IEEE Transactions on Computers |date= January 1974 |volume= C-23 |issue= 1 |pages= 90–93 |doi= 10.1109/T-C.1974.223784|s2cid= 149806273 }}</ref><ref name="pubRaoYip">{{Citation |last1= Rao |first1= K. R. |author-link1= K. R. Rao |last2= Yip |first2= P. |title= Discrete Cosine Transform: Algorithms, Advantages, Applications |publisher= Academic Press |location= Boston |year= 1990 |isbn= 978-0-12-580203-1}}</ref> This led to the development of the [[modified discrete cosine transform]] (MDCT), proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987,<ref>J. P. Princen, A. W. Johnson und A. B. Bradley: ''Subband/transform coding using filter bank designs based on time domain aliasing cancellation'', IEEE Proc. Intl. Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2161–2164, 1987</ref> following earlier work by Princen and Bradley in 1986.<ref>John P. 
Princen, Alan B. Bradley: ''Analysis/synthesis filter bank design based on time domain aliasing cancellation'', IEEE Trans. Acoust. Speech Signal Processing, ''ASSP-34'' (5), 1153–1161, 1986</ref> The MDCT later became a core part of the MP3 algorithm.<ref name="Guckert">{{cite web |last1= Guckert |first1= John |title= The Use of FFT and MDCT in MP3 Audio Compression |url= http://www.math.utah.edu/~gustafso/s2012/2270/web-projects/Guckert-audio-compression-svd-mdct-MP3.pdf |website= [[University of Utah]] |date= Spring 2012 |access-date= 14 July 2019 |archive-date= 12 February 2021 |archive-url= https://web.archive.org/web/20210212022237/http://www.math.utah.edu/~gustafso/s2012/2270/web-projects/Guckert-audio-compression-svd-mdct-MP3.pdf |url-status= live }}</ref>
 
Ernst Terhardt and other collaborators constructed an algorithm describing auditory masking with high accuracy in 1982.<ref name="Terhardt1982" /> This work added to a variety of reports from authors dating back to Fletcher, and to the work that initially determined critical ratios and critical bandwidths.
Line 77 ⟶ 73:
A [[working group]] consisting of van de Kerkhof, Stoll, [[Leonardo Chiariglione]] ([[CSELT]] VP for Media), Yves-François Dehery, Karlheinz Brandenburg (Germany) and James D. Johnston (United States) took ideas from ASPEC, integrated the filter bank from Layer II, added some of their ideas such as the joint stereo coding of MUSICAM and created the MP3 format, which was designed to achieve the same quality at 128&nbsp;[[kbit/s]] as [[MPEG-1 Audio Layer II|MP2]] at 192&nbsp;kbit/s.
 
The algorithms for MPEG-1 Audio Layer I, II and III were approved in 1991<ref name="cd-1991" /><ref name="neuron2-cd-1991" /> and finalized in 1992<ref name="dis-1992" /> as part of [[MPEG-1]], the first standard suite by [[MPEG]], which resulted in the international standard '''[[International Organization for Standardization|ISO]]/[[International Electrotechnical Commission|IEC]] 11172-3''' (a.k.a. ''MPEG-1 Audio'' or ''MPEG-1 Part 3''), published in 1993.<ref name = "11172-3" /> Files or data streams conforming to this standard must handle sample rates of 48&nbsp;kHz, 44.1&nbsp;kHz, and 32&nbsp;kHz, and they continue to be supported by current [[MP3 player]]s and decoders. Thus the first generation of MP3 defined {{math|14 × 3 {{=}} 42}} interpretations of MP3 frame data structures and size layouts (14 bit rates at each of the 3 sample rates).
 
The compression efficiency of encoders is typically defined by the bit rate because the compression ratio depends on the [[audio bit depth|bit depth]] and [[sampling rate]] of the input signal. Nevertheless, compression ratios are often published. They may use the [[compact disc]] (CD) parameters as references (44.1 [[kHz]], 2 channels at 16 bits per channel or 2×16 bit), or sometimes the [[Digital Audio Tape]] (DAT) SP parameters (48&nbsp;kHz, 2×16 bit). Compression ratios with this latter reference are higher, which demonstrates the problem with the use of the term ''compression ratio'' for lossy encoders.
 
Karlheinz Brandenburg used a CD recording of [[Suzanne Vega]]'s song "[[Tom's Diner]]" to assess and refine the MP3 [[compression algorithm]].<ref>{{cite web |title=The MP3: A History Of Innovation And Betrayal |url=https://www.npr.org/sections/therecord/2011/03/23/134622940/the-mp3-a-history-of-innovation-and-betrayal |website=NPR |access-date=3 August 2023 |date=2011-03-23 |archive-date=3 August 2023 |archive-url=https://web.archive.org/web/20230803092021/https://www.npr.org/sections/therecord/2011/03/23/134622940/the-mp3-a-history-of-innovation-and-betrayal |url-status=live }}</ref> This song was chosen because of its nearly [[Monaural|monophonic]] nature and wide spectral content, which make it easier to hear imperfections in the compression format during playback. In this particular track the two channels are almost, but not completely, the same, so binaural masking level depression causes spatial unmasking of noise artifacts unless the encoder properly recognizes the situation and applies corrections similar to those detailed in the MPEG-2 AAC psychoacoustic model. Some more critical audio excerpts ([[glockenspiel]], triangle, [[accordion]], etc.) were taken from the [[European Broadcasting Union|EBU]] V3/SQAM reference compact disc and have been used by professional sound engineers to assess the subjective quality of the MPEG Audio formats.{{cn|date=August 2023}}
 
=== Going public ===
A reference simulation software implementation, written in the C language and later known as ''ISO 11172-5'', was developed (in 1991–1996) by the members of the ISO MPEG Audio committee to produce bit-compliant MPEG Audio files (Layer 1, Layer 2, Layer 3). It was approved as a committee draft of the ISO/IEC technical report in March 1994 and printed as document CD 11172-5 in April 1994.<ref name="paris_press" /> It was approved as a draft technical report (DTR/DIS) in November 1994,<ref name="singapore_press" /> finalized in 1996 and published as international standard ISO/IEC TR 11172-5:1998 in 1998.<ref name="ISO/IEC TR 11172-5:1998" /> The [[Reference implementation (computing)|reference software]] in C language was later published as a freely available ISO standard.<ref name="Software_Simulation.zip" /> Working in non-real time on several operating systems, it was able to demonstrate the first real-time hardware decoding (DSP based) of compressed audio. Some other real-time implementations of MPEG Audio encoders and decoders<ref>{{Cite book|title=A high-quality sound coding standard for broadcasting, telecommunications and multimedia systems.|last=Dehery |first=Yves-Francois|publisher=Elsevier Science BV |year=1994|isbn= 978-0-444-81580-4 |location=The Netherlands |pages=53–64|quote= This article refers to a Musicam (MPEG Audio Layer II) compressed digital audio workstation implemented on a microcomputer used not only as a professional editing station but also as a server on Ethernet for a compressed digital audio library, therefore anticipating the future MP3 on Internet }}</ref> were available for digital broadcasting (radio [[Digital audio broadcasting|DAB]], television [[Digital Video Broadcasting|DVB]]) towards consumer receivers and set-top boxes.
 
On 7 July 1994, the Fraunhofer Society released the first software MP3 encoder, called [[l3enc]].<ref name="MP3_Todays_Technology" /> The [[filename extension]] ''.mp3'' was chosen by the Fraunhofer team on 14 July 1995 (previously, the files had been named ''.bit'').<ref name="mp3-name" /> With the first real-time software MP3 player [[WinPlay3]] (released 9 September 1995) many people were able to encode and play back MP3 files on their PCs. Because of the relatively small [[hard drive]]s of the era (≈500–1000 [[megabyte|MB]]), lossy compression was essential to store multiple albums' worth of music on a home computer as full recordings (as opposed to [[Musical Instrument Digital Interface|MIDI]] notation, or [[Tracker (music software)|tracker]] files which combined notation with short recordings of instruments playing single notes).
 
==== Fraunhofer example implementation ====
Line 133 ⟶ 129:
In November 1997, the website [[mp3.com]] was offering thousands of MP3s created by independent artists for free.<ref name="seattlepi" /> The small size of MP3 files enabled widespread peer-to-peer [[file sharing]] of music [[Ripping|ripped]] from CDs, which would have previously been nearly impossible. The first large [[peer-to-peer]] filesharing network, [[Napster]], was launched in 1999. The ease of creating and sharing MP3s resulted in widespread [[copyright infringement]]. Major record companies argued that this free sharing of music reduced sales, and called it "[[music piracy]]". They reacted by pursuing lawsuits against [[Napster]], which was eventually shut down and later sold, and against individual users who engaged in file sharing.<ref name="Giesler" />
 
Unauthorized MP3 file sharing continues on next-generation [[peer-to-peer file sharing|peer-to-peer networks]]. Some authorized services, such as [[Beatport]], [[Bleep.com|Bleep]], [[Juno Records]], [[eMusic]], [[Zune Marketplace]], [[Wal-Mart|Walmart.com]], [[Rhapsody (online music service)|Rhapsody]], the recording-industry-approved reincarnation of [[Napster (pay service)|Napster]], and [[Amazon.com]], sell unrestricted music in the MP3 format.
 
== Design ==
Line 145 ⟶ 141:
}}
 
An MP3 file is made up of MP3 frames, each of which consists of a header and a data block. This sequence of frames is called an [[elementary stream]]. Due to the "bit reservoir", frames are not independent items and cannot usually be extracted on arbitrary frame boundaries. The MP3 data blocks contain the (compressed) audio information in terms of frequencies and amplitudes. The diagram shows that the MP3 header begins with a [[sync word]], which is used to identify the beginning of a valid frame. This is followed by a bit indicating that this is the [[MPEG]] standard and two bits that indicate that layer 3 is used; hence MPEG-1 Audio Layer 3 or MP3. After this, the values will differ, depending on the MP3 file. ''[[International Organization for Standardization|ISO]]/[[International Electrotechnical Commission|IEC]] 11172-3'' defines the range of values for each section of the header along with the specification of the header. Most MP3 files today contain [[ID3]] [[metadata]], which precedes or follows the MP3 frames, as noted in the diagram. The data stream can contain an optional [[checksum]].
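
The header layout described above can be illustrated with a short sketch. The field positions and lookup tables below follow the commonly documented 32-bit MPEG-1 Layer III header; the function and variable names are illustrative assumptions, and free-format streams, MPEG-2 and MPEG-2.5 headers are deliberately not handled.

```python
# Illustrative decoder for the fixed 32-bit header of an MPEG-1 Layer III frame.
# Names are hypothetical; only MPEG-1 Layer III with a listed bit rate is handled.

BITRATES_KBPS = (None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None)
SAMPLE_RATES_HZ = (44100, 48000, 32000, None)


def parse_mpeg1_layer3_header(header: bytes) -> dict:
    if len(header) < 4:
        raise ValueError("need at least 4 header bytes")
    word = int.from_bytes(header[:4], "big")

    if (word >> 20) != 0xFFF:             # 12-bit sync word, all ones
        raise ValueError("sync word not found")
    if ((word >> 19) & 0b1) != 1:         # ID bit: 1 means MPEG-1
        raise ValueError("not an MPEG-1 frame")
    if ((word >> 17) & 0b11) != 0b01:     # layer field: 01 means Layer III
        raise ValueError("not a Layer III frame")

    bitrate_kbps = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate_hz = SAMPLE_RATES_HZ[(word >> 10) & 0b11]
    if bitrate_kbps is None or sample_rate_hz is None:
        raise ValueError("free-format or reserved value, not handled here")

    padding = (word >> 9) & 0b1
    return {
        "crc_present": ((word >> 16) & 0b1) == 0,   # protection bit 0 = checksum follows
        "bitrate_kbps": bitrate_kbps,
        "sample_rate_hz": sample_rate_hz,
        "padding": padding,
        # Frame length in bytes, header included (integer division).
        "frame_bytes": 144 * bitrate_kbps * 1000 // sample_rate_hz + padding,
    }


info = parse_mpeg1_layer3_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
print(info)
```

In this sketch the example bytes decode to a 128&nbsp;kbit/s, 44.1&nbsp;kHz frame without a checksum. Because of the bit reservoir, the computed frame length locates the next header but does not guarantee that the frame's audio data is self-contained.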
 
[[Joint stereo]] is done only on a frame-to-frame basis.<ref name="Limitations"/>
 
=== Encoding and decoding ===
In short, MP3 compression works by reducing the accuracy of certain components of sound that are considered (by psychoacoustic analysis) to be beyond the [[Hearing range#Humans|hearing capabilities]] of most humans. This method is commonly referred to as perceptual coding or [[psychoacoustic]] modeling.<ref name="Jayant1993" /> The remaining audio information is then recorded in a space-efficient manner using [[MDCT]] and [[FFT]] algorithms.
 
The MP3 encoding algorithm is generally split into four parts. Part 1 divides the audio signal into smaller pieces, called frames, and an MDCT filter is then applied to the output. Part 2 passes the samples into a 1024-point [[fast Fourier transform]] (FFT), then the [[psychoacoustic]] model is applied and another MDCT filter is applied to the output. Part 3 quantizes and encodes each sample, a step known as noise allocation, which adjusts itself to meet the bit rate and [[sound masking]] requirements. Part 4 formats the [[bitstream]] into an audio frame, which is made up of four parts: the [[Header (computing)|header]], [[Error checking|error check]], [[audio data]], and [[#Ancillary data|ancillary data]].<ref name="Guckert"/>
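
The MDCT step can be illustrated in isolation. The following is a minimal sketch of a sine-windowed MDCT with 50% overlapping blocks and its inverse, showing that overlap-add cancels the time-domain aliasing. It is not the MP3 filter bank: the block size of 8 coefficients, the function names and the test signal are assumptions chosen for illustration, and no psychoacoustic model or quantization is involved.

```python
import math


def sine_window(block_len):
    # Sine window; satisfies w[n]^2 + w[n + block_len/2]^2 = 1 (Princen-Bradley).
    return [math.sin(math.pi * (n + 0.5) / block_len) for n in range(block_len)]


def mdct(block):
    # 2N real samples in, N coefficients out.
    n_coeffs = len(block) // 2
    shift = 0.5 + n_coeffs / 2
    return [
        sum(x * math.cos(math.pi / n_coeffs * (n + shift) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_coeffs)
    ]


def imdct(coeffs):
    # N coefficients in, 2N time-aliased samples out (2/N scaling for the windowed case).
    n_coeffs = len(coeffs)
    shift = 0.5 + n_coeffs / 2
    return [
        (2.0 / n_coeffs) * sum(c * math.cos(math.pi / n_coeffs * (n + shift) * (k + 0.5))
                               for k, c in enumerate(coeffs))
        for n in range(2 * n_coeffs)
    ]


def roundtrip(signal, n_coeffs=8):
    """Analyse and resynthesise `signal` using overlapping windowed blocks."""
    if len(signal) % n_coeffs:
        raise ValueError("signal length must be a multiple of n_coeffs")
    window = sine_window(2 * n_coeffs)
    padded = [0.0] * n_coeffs + list(signal) + [0.0] * n_coeffs
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_coeffs + 1, n_coeffs):
        block = [padded[start + n] * window[n] for n in range(2 * n_coeffs)]
        rebuilt = imdct(mdct(block))
        for n, value in enumerate(rebuilt):
            output[start + n] += value * window[n]   # overlap-add cancels the aliasing
    return output[n_coeffs:n_coeffs + len(signal)]


signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(32)]
max_error = max(abs(a - b) for a, b in zip(signal, roundtrip(signal)))
print(max_error)
```

Each block of 16 samples yields only 8 coefficients, yet the signal is recovered up to floating-point error once neighbouring blocks are added together. In an encoder the loss comes from quantizing the coefficients, not from the transform itself.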
 
Line 227 ⟶ 225:
| –
|-
| n/a
 
| 144
| –
Line 304 ⟶ 303:
MPEG-1 frames contain the most detail in 320&nbsp;kbit/s mode, the highest allowable bit rate setting,<ref>{{cite web |title=Sound Quality Comparison of Hi-Res Audio vs. CD vs. MP3 |url=https://www.sony.com/electronics/hi-res-audio-mp3-cd-sound-quality-comparison |website=www.sony.com |publisher=[[Sony]] |access-date=11 August 2020 |language=en |archive-date=14 September 2020 |archive-url=https://web.archive.org/web/20200914005253/https://www.sony.com/electronics/hi-res-audio-mp3-cd-sound-quality-comparison |url-status=live }}</ref> with silence and simple tones still requiring 32&nbsp;kbit/s. MPEG-2 frames can reproduce sound up to 12&nbsp;kHz and need bit rates of up to 160&nbsp;kbit/s. MP3 files made with MPEG-2 do not have 20&nbsp;kHz bandwidth because of the [[Nyquist–Shannon sampling theorem]]. Frequency reproduction is always strictly less than half of the sampling rate, and imperfect filters require a larger margin for error (noise level versus sharpness of filter), so an 8&nbsp;kHz sampling rate limits the maximum frequency to 4&nbsp;kHz, while a 48&nbsp;kHz sampling rate limits an MP3 to a maximum of 24&nbsp;kHz. MPEG-2 uses half, and MPEG-2.5 only a quarter, of the MPEG-1 sample rates.
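
The relationship between the three families of sample rates and their theoretical bandwidth limits can be tabulated with a few lines of code. This is a sketch only: the names are illustrative, and the limit shown is the ideal Nyquist bound, which practical filters do not reach.

```python
# MPEG-1 sample rates; MPEG-2 halves them and MPEG-2.5 quarters them.
MPEG1_SAMPLE_RATES_HZ = (32000, 44100, 48000)
RATE_DIVISORS = {"MPEG-1": 1, "MPEG-2": 2, "MPEG-2.5": 4}


def nyquist_limit_hz(sample_rate_hz):
    # Upper bound on reproducible frequency: half the sampling rate.
    return sample_rate_hz / 2


table = {
    version: [(rate // divisor, nyquist_limit_hz(rate // divisor))
              for rate in MPEG1_SAMPLE_RATES_HZ]
    for version, divisor in RATE_DIVISORS.items()
}

for version, rows in table.items():
    for rate, limit in rows:
        print(f"{version:8s} {rate:6d} Hz sampling -> below {limit:7.1f} Hz")
```

The lowest MPEG-2.5 rate of 8,000&nbsp;Hz gives the 4&nbsp;kHz limit mentioned above, and the highest MPEG-1 rate of 48,000&nbsp;Hz gives 24&nbsp;kHz.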
 
For the general field of human speech reproduction, a bandwidth of 5,512&nbsp;Hz is sufficient to produce excellent results (for voice), using a sampling rate of 11,025&nbsp;Hz and VBR encoding from a standard 44,100&nbsp;Hz WAV file. English speakers average 41–42&nbsp;kbit/s with the -V 9.6 setting, but this may vary with the amount of silence recorded or the rate of delivery (wpm). Resampling to 12,000&nbsp;Hz (6&nbsp;kHz bandwidth) is selected by the LAME parameter -V 9.4. Likewise, -V 9.2 selects a 16,000&nbsp;Hz sample rate and a resulting 8&nbsp;kHz lowpass filter. Older versions of LAME and FFmpeg only support integer arguments for the variable bit rate quality selection parameter. The n.nnn quality parameter (-V) is documented at lame.sourceforge.net but is only supported in LAME with the new-style variable bit rate (VBR) quality selector, not with average bit rate (ABR).
 
A sample rate of 44.1&nbsp;kHz is commonly used for music reproduction because this is also used for [[Red Book (audio CD standard)|CD audio]], the main source used for creating MP3 files. A great variety of bit rates are used on the Internet. A bit rate of 128&nbsp;kbit/s is commonly used,<ref name="Woon-Seng" /> at a compression ratio of 11:1, offering adequate audio quality in a relatively small space. As Internet [[bandwidth (computing)|bandwidth]] availability and hard drive sizes have increased, higher bit rates up to 320&nbsp;kbit/s are widespread. Uncompressed audio as stored on an audio-CD has a bit rate of 1,411.2&nbsp;kbit/s (16 bit/sample × 44,100 samples/second × 2 channels / 1,000 bits/kilobit), so the bit rates 128, 160, and 192&nbsp;kbit/s represent [[Data compression ratio|compression ratios]] of approximately 11:1, 9:1 and 7:1 respectively.
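
The arithmetic in the preceding paragraph can be reproduced directly. The function name and default arguments below are illustrative assumptions; the defaults correspond to the CD parameters given above.

```python
def pcm_bit_rate_kbps(bits_per_sample=16, sample_rate_hz=44100, channels=2):
    # Uncompressed PCM bit rate in kbit/s (1 kbit = 1,000 bits).
    return bits_per_sample * sample_rate_hz * channels / 1000


cd_rate = pcm_bit_rate_kbps()
ratios = {mp3_rate: cd_rate / mp3_rate for mp3_rate in (128, 160, 192)}

print(f"CD audio: {cd_rate} kbit/s")
for mp3_rate, ratio in ratios.items():
    print(f"{mp3_rate} kbit/s -> about {ratio:.1f}:1, "
          f"{100 / ratio:.1f}% of the original size")
```

Passing other parameters, for example a 48,000&nbsp;Hz sample rate, changes the reference bit rate and therefore the ratio, which is why the bit rate is the less ambiguous figure for a lossy encoder.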
Line 320 ⟶ 319:
{{main|ID3|APEv2 tag}}
 
A "tag" in an audio file is a section of the file that contains [[metadata]] such as the title, artist, album, track number, or other information about the file's contents. The MP3 standards do not define tag formats for MP3 files, nor is there a standard [[container format (digital)|container format]] that would support metadata and obviate the need for tags. However, several ''de facto'' standards for tag formats exist. As of 2010, the most widespread are [[ID3|ID3v1 and ID3v2]], and the more recently introduced [[APEv2 tag|APEv2]]. These tags are normally embedded at the beginning or end of MP3 files, separate from the actual MP3 frame data. MP3 decoders either extract information from the tags or just treat them as ignorable, non-MP3 junk data.
 
Playing and editing software often contains tag editing functionality, but there are also [[tag editor]] applications dedicated to the purpose. Aside from metadata about the audio content, tags may also be used for [[Digital rights management|DRM]].<ref name="Rae" /> [[ReplayGain]] is a standard for measuring and storing the loudness of an MP3 file ([[audio normalization]]) in its metadata tag, enabling a ReplayGain-compliant player to automatically adjust the overall playback volume for each file. [[MP3Gain]] may be used to reversibly modify files based on ReplayGain measurements so that adjusted playback can be achieved on players without ReplayGain capability.
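
Because ID3 tags sit before or after the frame data, a program can separate them from the audio without decoding anything. The sketch below assumes the widely documented layouts: an ID3v2 tag starts with the bytes <code>ID3</code> followed by a 10-byte header whose last four bytes hold a "synchsafe" size, and an ID3v1 tag occupies the final 128 bytes and starts with <code>TAG</code>. The function names are illustrative, and ID3v2 footers and APEv2 tags are not handled.

```python
def synchsafe_to_int(four_bytes: bytes) -> int:
    # ID3v2 sizes use only the low 7 bits of each of the 4 bytes.
    value = 0
    for byte in four_bytes:
        value = (value << 7) | (byte & 0x7F)
    return value


def split_id3_tags(data: bytes):
    """Return (audio_bytes, found_tags) with leading ID3v2 and trailing ID3v1 removed."""
    start, end = 0, len(data)
    found = []
    if len(data) >= 10 and data[:3] == b"ID3":
        start = 10 + synchsafe_to_int(data[6:10])   # header plus tag body
        found.append("ID3v2")
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128                                   # fixed-size ID3v1 block
        found.append("ID3v1")
    return data[start:end], found


# Synthetic example: 20-byte ID3v2 body, one dummy 417-byte frame, ID3v1 trailer.
frames = bytes([0xFF, 0xFB, 0x90, 0x00]) + bytes(413)
data = (b"ID3\x03\x00\x00" + bytes([0, 0, 0, 20]) + bytes(20)
        + frames + b"TAG" + bytes(125))
audio, found = split_id3_tags(data)
print(found, len(audio))
```

A decoder that takes the other approach mentioned above, treating tags as junk, would instead scan forward until it finds a valid frame sync word.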
Line 330 ⟶ 329:
The initial near-complete MPEG-1 standard (parts 1, 2, and 3) was publicly available on 6 December 1991 as ISO CD 11172.<ref name="Patel" /><ref name="mpegfa31.txt" /> In most countries, patents cannot be filed after [[prior art]] has been made public, and patents expire 20 years after the initial filing date, which can be up to 12 months later for filings in other countries. As a result, patents required to implement MP3 expired in most countries by December 2012, 21 years after the publication of ISO CD 11172.
 
An exception is the United States, where patents in force but filed before 8 June 1995 expire after the later of 17 years from the issue date or 20 years from the priority date. A lengthy patent prosecution process may result in a patent issued much later than normally expected (see [[submarine patent]]s). The various MP3-related patents expired on dates ranging from 2007 to 2017 in the United States.<ref name="big-list" /> Patents for anything disclosed in ISO CD 11172 filed a year or more after its publication are questionable. If only the known MP3 patents filed by December 1992 are considered, then MP3 decoding has been patent-free in the US since 22 September 2015, when {{US patent|5812672}}, which had a PCT filing in October 1992, expired.<ref name="Cogliati" /><ref name="US5812672" /><ref name="Patent expiration" /> If the longest-running patent mentioned in the aforementioned references is taken as a measure, then the MP3 technology became patent-free in the United States on 16 April 2017, when {{US patent|6009399}}, held<ref>{{Cite web|url=https://patents.google.com/patent/US6009399/en|title=Method and apparatus for encoding digital signals employing bit allocation using combinations of different threshold models to achieve desired bit rates|access-date=21 January 2023|archive-date=21 January 2023|archive-url=https://web.archive.org/web/20230121152351/https://patents.google.com/patent/US6009399/en|url-status=live}}</ref> and administered by [[Technicolor SA|Technicolor]],<ref>{{cite web|url=http://mp3licensing.com/patents/index.html|title=mp3licensing.com – Patents|work=mp3licensing.com|access-date=10 May 2008|archive-date=9 May 2008|archive-url=https://web.archive.org/web/20080509182032/http://www.mp3licensing.com/patents/index.html|url-status=live}}</ref> expired. 
As a result, many [[free and open-source software]] projects, such as the [[Fedora (operating system)|Fedora operating system]], have begun shipping MP3 support by default, so users no longer have to resort to installing unofficial packages maintained by third-party software repositories for MP3 playback or encoding.<ref>{{Cite web| url=https://fedoramagazine.org/full-mp3-support-coming-soon-to-fedora/| title=Full MP3 support coming soon to Fedora| date=2017-05-05| access-date=17 June 2017| archive-date=27 June 2017| archive-url=https://web.archive.org/web/20170627062915/https://fedoramagazine.org/full-MP3-support-coming-soon-to-fedora/| url-status=live}}</ref>
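
The United States rule described above, for patents filed before 8 June 1995 (the later of 17 years from the issue date or 20 years from the priority date), can be sketched as a small calculation. This is a minimal illustration only: the function names are hypothetical, the dates in the usage note are illustrative rather than taken from any patent record, and term adjustments, extensions, and lapses for unpaid fees are ignored.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (hypothetical helper)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February landing in a non-leap year: fall back to 28 February.
        return d.replace(year=d.year + years, day=28)


def us_pre_1995_expiry(priority: date, issued: date) -> date:
    """Later of 17 years from issue or 20 years from priority.

    Assumption: the patent was filed before 8 June 1995 and no
    adjustments, extensions, or early lapses apply.
    """
    return max(add_years(issued, 17), add_years(priority, 20))
```

For an illustrative patent with a priority date of 1 October 1992 that was issued on 22 September 1998, <code>us_pre_1995_expiry(date(1992, 10, 1), date(1998, 9, 22))</code> takes the later of 1 October 2012 and 22 September 2015, so the issue-date term governs. A long prosecution therefore pushes expiry well past 20 years from priority.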
 
[[Technicolor SA|Technicolor]] (formerly called Thomson Consumer Electronics) claimed to control MP3 licensing of the Layer 3 patents in many countries, including the United States, Japan, Canada, and EU countries.<ref name="ffii" /> Technicolor had been actively enforcing these patents.<ref name="Technicolor" /> MP3 license revenues from Technicolor's administration generated about €100 million for the Fraunhofer Society in 2005.<ref name="Kistenfeger" /> In September 1998, the Fraunhofer Institute sent a letter to several developers of MP3 software stating that a license was required to "distribute and/or sell decoders and/or encoders". The letter claimed that unlicensed products "infringe the patent rights of Fraunhofer and Thomson. To make, sell or distribute products using the [MPEG Layer-3] standard and thus our patents, you need to obtain a license under these patents from us."<ref name="chillingeffects" /> As a result, the [[LAME]] MP3 encoder project could not offer its users official binaries that could run on their computers. The project's position was that, as source code, LAME was simply a description of how an MP3 encoder ''could'' be implemented. Unofficially, compiled binaries were available from other sources.
In September 2006, German officials seized MP3 players from [[SanDisk]]'s booth at the [[IFA show]] in Berlin after an Italian patent firm won an injunction on behalf of Sisvel against SanDisk in a dispute over licensing rights. The injunction was later reversed by a Berlin judge,<ref name="SanDisk MP3 seizure" /> but that reversal was in turn blocked the same day by another judge from the same court, "bringing the Patent Wild West to Germany" in the words of one commentator.<ref name="Patent Wild West" /> In February 2007, Texas MP3 Technologies sued Apple, Samsung Electronics and SanDisk in [[United States District Court for the Eastern District of Texas|eastern Texas federal court]], claiming infringement of a portable MP3 player patent that Texas MP3 said it had been assigned. Apple, Samsung, and SanDisk all settled the claims against them in January 2009.<ref name="law360" /><ref name="Kelly" />
 
[[Alcatel-Lucent]] has asserted several MP3 coding and compression patents, allegedly inherited from AT&T-Bell Labs, in litigation of its own. In November 2006, before the companies' merger, [[Alcatel-Lucent v. Microsoft|Alcatel sued Microsoft]] for allegedly infringing seven patents. On 23 February 2007, a San Diego jury awarded Alcatel-Lucent US$1.52 billion in damages for infringement of two of them.<ref name="MP3 payout" /> The court subsequently revoked the award, however, finding that one patent had not been infringed and that the other was not owned by Alcatel-Lucent; it was co-owned by [[AT&T Inc.|AT&T]] and Fraunhofer, who had licensed it to [[Microsoft]], the judge ruled.<ref name="Microsoft wins reversal" /> That defense judgment was upheld on appeal in 2008.<ref name="Alcatel-Lucent" />
 
== Alternative technologies ==
{{Main|List of codecs}}
 
Other lossy formats exist. Among these, [[Advanced Audio Coding]] (AAC) is the most widely used, and was designed to be the successor to MP3. There also exist other lossy formats such as [[mp3PRO]] and [[MPEG-1 Audio Layer II|MP2]]. They are members of the same technological family as MP3 and depend on roughly similar [[Psychoacoustics|psychoacoustic models]] and MDCT algorithms. Whereas MP3 uses a hybrid coding approach that is part MDCT and part [[Fast Fourier transform|FFT]], AAC is purely MDCT, significantly improving compression efficiency.<ref name="brandenburg"/> Many of the basic [[patent]]s underlying these formats are held by the Fraunhofer Society, Alcatel-Lucent, [[Technicolor SA|Thomson Consumer Electronics]],<ref name="brandenburg" /> [[Bell Labs|Bell]], [[Dolby Laboratories|Dolby]], [[LG Electronics]], [[NEC]], [[NTT Docomo]], [[Panasonic]], [[Sony Corporation]],<ref name="businesswire">{{cite web |title=Via Licensing Announces Updated AAC Joint Patent License |url=https://www.businesswire.com/news/home/20090105005026/en/Licensing-Announces-Updated-AAC-Joint-Patent-License |website=[[Business Wire]] |access-date=18 June 2019 |date=5 January 2009 |archive-date=18 June 2019 |archive-url=https://web.archive.org/web/20190618122721/https://www.businesswire.com/news/home/20090105005026/en/Licensing-Announces-Updated-AAC-Joint-Patent-License |url-status=live }}</ref> [[Electronics and Telecommunications Research Institute|ETRI]], [[JVC Kenwood]], [[Philips]], [[Microsoft]], and [[Nippon Telegraph and Telephone|NTT]].<ref name="aac-licensors">{{cite web |title=AAC Licensors |url=http://www.via-corp.com/us/en/licensing/aac/licensors.html |website=Via Corp |access-date=6 July 2019 |archive-date=28 June 2019 |archive-url=https://web.archive.org/web/20190628173314/http://www.via-corp.com/us/en/licensing/aac/licensors.html }}</ref>
 
When the digital audio player market was taking off, MP3 was widely adopted as the standard, hence the popular name "MP3 player". Sony was an exception and used its own [[ATRAC]] codec, taken from its [[MiniDisc]] format, which Sony claimed was better.<ref>{{Cite news|url=https://www.nytimes.com/1999/09/30/technology/news-watch-new-player-from-sony-will-give-a-nod-to-mp3.html|title=NEWS WATCH; New Player from Sony Will Give a Nod to MP3|newspaper=The New York Times|date=30 September 1999|last1=Marriott|first1=Michel|access-date=24 September 2020|archive-date=3 July 2021|archive-url=https://web.archive.org/web/20210703065644/https://www.nytimes.com/1999/09/30/technology/news-watch-new-player-from-sony-will-give-a-nod-to-mp3.html|url-status=live}}</ref> Following criticism and lower-than-expected [[Walkman]] sales, Sony introduced native MP3 support to its Walkman players for the first time in 2004.<ref>{{Cite web|url=https://www.cnet.com/reviews/sony-nw-e100-review/|title=Sony NW-E105 Network Walkman|access-date=24 September 2020|archive-date=31 October 2020|archive-url=https://web.archive.org/web/20201031221331/https://www.cnet.com/reviews/sony-nw-e100-review/|url-status=live}}</ref>
 
{{reflist|30em|refs=
<ref name="mp3-name">{{cite web | url = http://www.businesswire.com/news/home/20050712005686/en/Fraunhofer-IIS-Happy-Birthday-MP3! | title = Happy Birthday MP3! | publisher = Fraunhofer IIS | date = 12 July 2005 | access-date = 18 July 2010 | archive-date = 11 December 2014 | archive-url = https://web.archive.org/web/20141211110033/http://www.businesswire.com/news/home/20050712005686/en/Fraunhofer-IIS-Happy-Birthday-MP3! | url-status = dead }}</ref>
<ref name="audio/mpeg">{{cite journal | url = http://tools.ietf.org/html/rfc3003 | title = The audio/mpeg Media Type&nbsp;— RFC 3003 | publisher = IETF | date = November 2000 | doi = 10.17487/RFC3003 | access-date = 7 December 2009 | last1 = Nilsson | first1 = M. | archive-date = 13 April 2012 | archive-url = https://web.archive.org/web/20120413074234/http://tools.ietf.org/html/rfc3003 | url-status = live }}</ref>
<ref name="RTP">{{cite journal | url = http://tools.ietf.org/html/rfc3555#page-24 | title = MIME Type Registration of RTP Payload Formats&nbsp;— RFC 3555 | publisher = IETF | date = July 2003 | doi = 10.17487/RFC3555 | access-date = 7 December 2009 | last1 = Casner | first1 = S. | last2 = Hoschka | first2 = P. | archive-date = 14 January 2012 | archive-url = https://web.archive.org/web/20120114154203/http://tools.ietf.org/html/rfc3555#page-24 | url-status = live }}</ref>
 
If there are already suitable links, propose additions or replacements on
the article's talk page, or submit your link to the relevant category at
the Open Directory Project (dmoz.org) and link there using {{Dmoz}}.
 
-->
* {{Curlie|Computers/Multimedia/Music_and_Audio/Audio_Formats/MP3/}}
* [https://www.mp3-history.com/ MP3-history.com] {{Webarchive|url=https://web.archive.org/web/20200211201051/https://www.mp3-history.com/ |date=11 February 2020 }}, The Story of MP3: How MP3 was invented, by Fraunhofer IIS.
* [https://www.mp3newswire.net/sect/archive.htm MP3 News Archive]. {{Webarchive|url=https://web.archive.org/web/20190303201456/https://www.mp3newswire.net/sect/archive.htm |date=3 March 2019 }} – over 1000 articles from 1999 to 2011 focused on MP3 and digital audio.
* [https://www.mpeg.chiariglione.org/ MPEG.chiariglione.org] {{Webarchive|url=https://web.archive.org/web/20240410005848/https://mpeg.chiariglione.org/ |date=10 April 2024 }} – MPEG official website