<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <title>high.definition.x264.standards.revision.3.1.addendum.1.nfo</title>
    <style type="text/css">
        @font-face {
            font-family: nfo;
            font-style:  normal;
            font-weight: normal;
            src: url(nfo.eot);
        }
        .nfo {
            padding: 12px;
            font-family: nfo, courier new;
            font-size: 11px;
            line-height: 1em;
        }
    </style>
</head>
<body>
    <pre class="nfo">
  High.Definition.x264.Standards.Revision.3.1.Addendum.1-HDX
  
  As there have lately been several pres with the wrong number of reference frames,
  we thought we&#39;d explain the reasoning behind the rule set.
  
  On various x264 pages on the internet the following can be read about reference frames:
  
  &#34;Selects the maximum number of reference frames that can be used. Referenced frames
  are frames that refer to other frames (eg. if both frames are similar). Having a high
  referenced frame will improve quality but slow up encoding. For typical content,
  a reference frame of 3 to 5 is recommended. For content with a lot of repetition
  (eg. animation), a reference frame of 8 to 10 can be used.&#34;
  
  So, more reference frames means higher quality. Why do we then enforce a maximum of 4
  reference frames on high resolution video? It has to do with hardware players.
  The Popcorn Hour, TViX, WD etc. all support Level 4.1 (L4.1) of the ITU-T H.264
  specification. All graphics cards that have DXVA also support L4.1. So let&#39;s see
  what L4.1 says about reference frames.
  
  In table &#39;A-1 Level limits&#39; on page 283 (pdf 305) of the ITU-T specification it
  says that MaxDPB for L4.1 is 12288 KiB. MaxDPB is the Maximum Decoded Picture Buffer,
  which is the largest allowed size of the decoded picture buffer when decoding a video.
  By supporting L4.1, the hardware players must have at least 12 MiB of buffer for
  storing the decoded pictures. This means that a video that requires a buffer of
  13 MiB is not guaranteed to work on one of these players.
  
  As macroblocks of 16 * 16 pixels are used, all resolutions need to be mod16. For ease of
  reading, the maths to make them mod16 is not included below.
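  
  For reference, the mod16 step could look something like the Python sketch below. It
  assumes a dimension is rounded up to the next multiple of 16, so that it covers a whole
  number of macroblocks. The name ceil_mod16 is made up for illustration and is not part
  of x264 or the specification.

```python
def ceil_mod16(pixels):
    # Round up to the next multiple of 16, i.e. a whole number of macroblocks.
    return (pixels + 15) // 16 * 16


print(ceil_mod16(1080))  # 1088
print(ceil_mod16(1920))  # 1920
print(ceil_mod16(800))   # 800
```

  A dimension that is already mod16, like 1920 or 800, comes out unchanged.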
  
  The DPB in KiB is calculated as follows:
  
  DPB = vertical resolution * horizontal resolution * 1.5 * reference frames / 1024
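  
  As a rough illustration, the formula above can be written as a small Python function.
  This is only a sketch: the name dpb_kib is hypothetical, and the inputs are assumed to
  be mod16 already (so 1080 is entered as 1088).

```python
MAX_DPB_L41_KIB = 12288  # MaxDPB for L4.1, as quoted from table A-1


def dpb_kib(width, height, ref_frames):
    # Decoded picture buffer size in KiB for mod16 dimensions,
    # at 1.5 bytes per pixel.
    return width * height * 1.5 * ref_frames / 1024


print(dpb_kib(1920, 1088, 4))  # 12240.0
print(dpb_kib(1920, 1088, 5))  # 15300.0
```

  With 4 reference frames a 1920 * 1088 video stays under the 12288 KiB limit, with 5 it
  does not.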
  
  If we rearrange this formula to solve for the reference frames instead, we get:
  
  ref = 12288 * 1024 / (vertical resolution * horizontal resolution * 1.5)
  
  We of course can&#39;t use partial frames for referencing, and thus the number of
  reference frames should be rounded down to the closest integer. We can also rearrange
  this to get the maximum vertical resolution for a specific number of reference frames.
  Here we need to round the vertical resolution down to mod16:
  
  vertical res = 12288 * 1024 / (horizontal resolution * 1.5 * reference frames)
  
  With the above formula we can conclude that the highest vertical resolution that we
  can have ref 5 on and still be L4.1 compliant, at a horizontal resolution of 1920,
  is 864 pixels.
  
  873.813333 = 12288 * 1024 / ( 1920 * 1.5 * 5 )
  864 = floor( 873.813333 / 16 ) * 16
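  
  The two rearranged formulas can be sketched in Python as below. The function names are
  hypothetical, the dimensions are assumed to be mod16 already, and integer arithmetic
  (multiplying by 2 and dividing by 3 instead of dividing by 1.5) is our own choice to
  avoid float rounding.

```python
MAX_DPB_L41_KIB = 12288  # MaxDPB for L4.1, as quoted from table A-1


def max_ref_frames(width, height):
    # Largest whole number of reference frames that fits in the L4.1 buffer.
    return (MAX_DPB_L41_KIB * 1024 * 2) // (width * height * 3)


def max_vertical_res(width, ref_frames):
    # Largest mod16 vertical resolution that fits in the L4.1 buffer.
    exact = (MAX_DPB_L41_KIB * 1024 * 2) // (width * 3 * ref_frames)
    return exact // 16 * 16


print(max_ref_frames(1920, 1088))  # 4
print(max_ref_frames(1280, 720))   # 9
print(max_vertical_res(1920, 5))   # 864
print(max_vertical_res(1920, 4))   # 1088
```

  The third line reproduces the 864 worked out above, and the last line shows why ref 4
  still fits a 1920 * 1088 video.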
  
  The 1.5 in the calculations above comes from the YV12 colourspace, which needs 12 bits
  to store 1 pixel. In other words, 1.5 bytes per pixel.
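  
  Where the 12 bits come from can be shown with a few lines of Python. This sketch
  assumes 8 bit samples and the usual 4:2:0 layout of YV12, where every pixel has its own
  luma sample and each block of 2 * 2 pixels shares one sample from each of the two
  chroma planes. The variable names are ours.

```python
luma_bits = 8            # one 8 bit luma sample per pixel
chroma_bits = 2 * 8 / 4  # two 8 bit chroma samples shared by 4 pixels

bits_per_pixel = luma_bits + chroma_bits
bytes_per_pixel = bits_per_pixel / 8

print(bits_per_pixel)   # 12.0
print(bytes_per_pixel)  # 1.5
```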
  
  So, to conclude, the reason we set ref 4 as the maximum for movies with a vertical
  resolution greater than 864 in the rules is not that we want to encode releases faster.
  It&#39;s because we want releases to be L4.1 compliant and thus playable on the
  Popcorn Hour, TViX and other hardware players. And we require at least ref 5 on all
  videos where it&#39;s possible while still respecting L4.1, in order to ensure high quality.
  
  ITU-T specification:
  http://www.itu.int/rec/dologin_pub.asp?lang=e&#38;id=T-REC-H.264-200711-I!!PDF-E&#38;type=items
  
  12 bits per pixel for YV12:
  http://msdn.microsoft.com/en-us/library/aa904813.aspx#yuvformats_420formats_12bitsperpixel</pre>
</body>
</html>