DataStorageInAviSynth

Part 1: Actual memory storage

Warning: technical. Most users will only need to read (and understand) Part 2.

In AviSynth, we process pixels of video data. These are stored in memory as data arrays. As noted elsewhere (for example in the AviSynthFAQ), there are two colorspaces in v1.0x/2.0x, RGB and YUY2, with a third, YV12, added in v2.5. These colorspaces store their data in different ways.

RGB (also known as RGB24): Every pixel is associated with three bytes of data: one red, one green, and one blue. They are interleaved (mixed) in memory like this: RGBRGBRGBRGB. Since data is most easily read in words of four bytes, most functions actually process 4 pixels at a time so that they can read 3 whole words in a row. Memory is usually allocated aligned, which means that it is always allocated in words of 4 bytes. Since the 3-to-4 relation isn't easy to map, RGB24 is very difficult to work with and in general isn't recommended.

RGBA (also known as RGB32): This is an extension of RGB where a fourth channel has been added. This is called the alpha channel, and it defines the transparency of the pixel. For an explanation of how the alpha channel is used, see Mask and Layer. In general, however, you shouldn't rely on filters processing alpha correctly - in some cases filters will even produce garbage in this channel.

RGBA also makes memory access a lot easier: it requires four bytes per pixel, so each memory access of one word corresponds to exactly one pixel. In fact, RGB24 will only be used if you have a source that returns RGB24 or if you explicitly use ConvertToRGB24. ConvertToRGB will by default create RGBA. This is the recommended format for RGB data.
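
The difference between three and four bytes per pixel can be made concrete with a little arithmetic. The following is only a sketch: the function names are made up for illustration, the 4-byte alignment is the one described above (a real build may align rows to a larger boundary), and the order of the channels inside a pixel is not modelled.

```python
def row_pitch(width, bytes_per_pixel, align=4):
    """Bytes per row when each row is padded up to a multiple of `align`."""
    row_bytes = width * bytes_per_pixel
    return (row_bytes + align - 1) // align * align


def pixel_offset(x, y, width, bytes_per_pixel, align=4):
    """Byte offset of pixel (x, y) from the start of the frame buffer."""
    return y * row_pitch(width, bytes_per_pixel, align) + x * bytes_per_pixel


def word_aligned_pixels(width, bytes_per_pixel, align=4):
    """x positions in a row whose first byte sits on a word boundary."""
    return [x for x in range(width) if (x * bytes_per_pixel) % align == 0]


# A 5 pixel wide RGB24 row holds 15 bytes of data and needs 1 byte of padding;
# the same row in RGB32 holds 20 bytes and needs none.
print(row_pitch(5, 3), row_pitch(5, 4))

# In RGB24 only every fourth pixel starts on a word boundary,
# in RGB32 every pixel does.
print(word_aligned_pixels(8, 3), word_aligned_pixels(8, 4))
```

This is why an RGB24 function tends to work on groups of 4 pixels (12 bytes, 3 words), while an RGB32 function can simply step one word per pixel.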

YUY2: In this colorspace each pair of pixels shares its color data, and the data is interleaved like this: YUYV|YUYV|YUYV|YUYV. A memory access of one word gets all the data for two pixels. A pair should never be split, as this can create strange artifacts: almost all YUY2 functions assume that they should read a full word and process that. Even if they return the right width when they are done, they will most likely have used invalid data when processing valid pixels.
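
To show what "sharing" means in practice, here is a small sketch that unpacks a YUY2 row into per-pixel (Y, U, V) values. The function name is hypothetical and the code is not taken from AviSynth; it only follows the YUYV byte order given above.

```python
def unpack_yuy2_row(row):
    """Turn a row of YUYV bytes into a list of (Y, U, V) tuples, one per pixel."""
    if len(row) % 4 != 0:
        # A trailing half pair would leave one pixel without its chroma.
        raise ValueError("YUY2 row must hold whole 4-byte pixel pairs")
    pixels = []
    for i in range(0, len(row), 4):
        y0, u, y1, v = row[i:i + 4]
        pixels.append((y0, u, v))  # left pixel of the pair
        pixels.append((y1, u, v))  # right pixel reuses the same U and V
    return pixels


# One word (4 bytes) describes two pixels with different luma but identical chroma.
print(unpack_yuy2_row(bytes([16, 128, 235, 128])))
```

A row with an odd number of pixels cannot be expressed in whole words, which is what splitting a pair amounts to.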

YV12: Now the real fun begins, as this is a PlanarImageFormat. This means that the data is not interleaved but stored separately for each color channel. For filter writers this means that they can write one simple function that is called three times, once for each color channel, assuming that the operations are channel-independent (which is not always the case). Also, 4 pixels share the color data. These four pixels form a 2x2 matrix, i.e. two pairs on adjacent lines in the same field. As the fields themselves are interleaved, this means that for fieldbased video lines 1&3 share color data, as do lines 2&4. Framebased video has the more intuitive way of sharing color: lines 1&2 share color data, just as lines 3&4 do. Again, using aligned memory for both the color and luma channels allows easy memory access, and again, a 2x2 block should never be split, as this may create strange artifacts.
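
The line sharing described above can be written down as a mapping from a luma line to the chroma line it uses. This is only a sketch with a hypothetical function name; it counts lines from 0 rather than from 1, and it assumes that for fieldbased video the chroma lines alternate between the two fields in the same way the luma lines do.

```python
def chroma_row(luma_row, fieldbased=False):
    """Return the 0-based chroma line used by a 0-based luma line in YV12."""
    if not fieldbased:
        # Framebased: lines 0&1 share, lines 2&3 share, and so on.
        return luma_row // 2
    field = luma_row % 2          # 0 for lines 0, 2, 4, ...; 1 for lines 1, 3, 5, ...
    row_in_field = luma_row // 2  # position of the line inside its own field
    # Two adjacent lines of the same field share one chroma line.
    return (row_in_field // 2) * 2 + field


# Framebased: neighbouring lines share chroma.
print([chroma_row(r) for r in range(4)])
# Fieldbased: lines 0&2 share, lines 1&3 share, then 4&6 and 5&7.
print([chroma_row(r, fieldbased=True) for r in range(8)])
```

The fieldbased pattern repeats every 4 lines instead of every 2, which is where the mod-4 height requirement for interlaced YV12 comes from.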

Part 2: How does this affect me as a user?

As long as you don't split a color-sharing unit, it's up to the filter writers to take care of problems with memory reading at the end of lines.
Interlaced (fieldbased) video requires double the height modulo of progressive video.
Required modulos in the different colorspaces:
- RGB(A)
  - width mod-1 (no restriction)
  - height mod-1 (no restriction) if progressive
  - height mod-2 (even values) if interlaced
- YUY2
  - width mod-2 (even values)
  - height mod-1 (no restriction) if progressive
  - height mod-2 (even values) if interlaced
- YV12
  - width mod-2 (even values)
  - height mod-2 (even values) if progressive
  - height mod-4 if interlaced
Examples of valid Crops with a 320x240 progressive input:
- RGB(A)
  - Crop(1,7,-32,-19)
  - Crop(2,4,300,196)
- YUY2
  - Crop(2,7,-32,-19)
  - Crop(2,4,300,196)
- YV12
  - Crop(2,8,-32,-18)
  - Crop(2,4,300,196)
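
The modulo table and the Crop examples above can be checked mechanically. The sketch below is not part of AviSynth and all of its names are hypothetical. It assumes that width and height values of zero or less are offsets from the right and bottom edges, and, going by the examples, that the left and top offsets must follow the same modulo as the width and height.

```python
# (width mod, progressive height mod) per colorspace, taken from the table above.
MODS = {"RGB": (1, 1), "YUY2": (2, 1), "YV12": (2, 2)}


def resolve_crop(src_w, src_h, left, top, width, height):
    """Convert Crop arguments into an explicit (left, top, width, height)."""
    if width <= 0:   # assumed: non-positive width is an offset from the right edge
        width = src_w - left + width
    if height <= 0:  # assumed: non-positive height is an offset from the bottom edge
        height = src_h - top + height
    return left, top, width, height


def crop_is_valid(colorspace, src_w, src_h, crop, interlaced=False):
    """Check a Crop against the modulo rules without splitting a color-sharing unit."""
    wmod, hmod = MODS[colorspace]
    if interlaced:
        hmod *= 2  # interlaced video doubles the height modulo
    left, top, width, height = resolve_crop(src_w, src_h, *crop)
    if left < 0 or top < 0 or width <= 0 or height <= 0:
        return False
    if left + width > src_w or top + height > src_h:
        return False
    checks = ((left, wmod), (width, wmod), (top, hmod), (height, hmod))
    return all(value % mod == 0 for value, mod in checks)


# The RGB example crop is fine in RGB but splits color-sharing units in YV12.
print(crop_is_valid("RGB", 320, 240, (1, 7, -32, -19)))
print(crop_is_valid("YV12", 320, 240, (1, 7, -32, -19)))
```

Passing interlaced=True shows the stricter rule: Crop(2,8,-32,-18) leaves a height of 214, which is mod-2 but not mod-4, so it is rejected for interlaced YV12.
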
Note that the final video may have other restrictions; most MPEG-n implementations, for example, want mod-16 on all resolutions.

More information

See more about ColorSpaces.

See a general introduction to WorkingWithImages.

This page is an edited summary of [this thread at Doom9's forum]