DataAlignment

(Until something more AviSynth-related can be written, this is taken from the AMDOptimizationGuide.)

Memory alignment

In general, avoid misaligned data references. Data whose size is a power of two is considered aligned if it is naturally aligned, that is, if its address is a multiple of its size.

For example:

  • Word accesses are aligned if they access an address divisible by two.
  • Doubleword accesses are aligned if they access an address divisible by four.
  • Quadword accesses are aligned if they access an address divisible by eight.
  • TBYTE (10-byte) accesses are aligned if they access an address divisible by eight.
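The divisibility rules above can be checked at run time with a simple mask test. The following is a minimal sketch in C, not code from AviSynth or the AMD guide; the helper name `is_aligned` is hypothetical, and it assumes the alignment is a power of two (2, 4, 8, 16).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is aligned to `alignment` bytes,
   0 otherwise. `alignment` must be a power of two (2, 4, 8, 16, ...). */
int is_aligned(const void *p, size_t alignment)
{
    /* A power of two has exactly one bit set. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* An address is divisible by a power of two exactly when its low bits are zero. */
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

For a quadword access, for instance, a caller would test `is_aligned(ptr, 8)` before using an instruction that expects aligned data.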

In AviSynth, each frame usually starts at an address divisible by 16. The exception to this rule is when crop() has been used. Each line also usually starts at an address divisible by 16 (or 8 for planar chroma).
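Because the alignment is usual rather than guaranteed (for example after crop()), a filter that depends on it may want to check before choosing an aligned code path. The sketch below is an illustration in C under that assumption. Neither `align_up` nor `lines_are_aligned16` is part of the AviSynth API; both names are hypothetical, and the pointer and pitch are assumed to come from whatever frame access the filter already uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rounds n up to the next multiple of `alignment`,
   which must be a power of two. Useful for picking a line pitch. */
size_t align_up(size_t n, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (n + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: returns 1 if every line of a plane starts on a
   16-byte boundary. That holds when the first line is aligned and the
   pitch (bytes from one line start to the next) is a multiple of 16. */
int lines_are_aligned16(const void *first_line, size_t pitch)
{
    return ((uintptr_t)first_line % 16 == 0) && (pitch % 16 == 0);
}
```

A line of 721 bytes, for example, would get a pitch of `align_up(721, 16)`, which is 736. If `lines_are_aligned16` returns 0, the filter can fall back to a path that uses unaligned loads.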

Store to Load Forwarding

Whenever data is written to memory, it is placed in the cache. The processor can nevertheless read that data back very soon after the store, before the write has completed, by forwarding it directly from the pending store to the load. This is called store-to-load forwarding.

Avoid memory-size mismatches when different instructions operate on the same data. When an instruction stores and another instruction reloads the same data, keep their operands aligned and keep the loads/stores of each operand the same size.

The following examples show access patterns that result in a store-to-load forwarding (STLF) stall, each followed by a preferred alternative:

Example (avoid): two 32-bit stores followed by a 64-bit load of the same data

MOV [FOO], EAX
MOV [FOO+4], EDX
...
MOVQ MM0, [FOO]

Example (preferred): the two halves are loaded separately, matching the stores

MOV [FOO], EAX
MOV [FOO+4], EDX
...
MOVD MM0, [FOO]
PUNPCKLDQ MM0, [FOO+4]

Example (avoid): a 64-bit store followed by 32-bit loads of the same data

MOVQ [foo], MM0
...
MOV EAX, [foo]
MOV EDX, [foo+4]

Example (preferred): the two halves are stored separately, matching the loads

MOVD [foo], MM0
PSWAPD MM0, MM0
MOVD [foo+4], MM0
PSWAPD MM0, MM0
...
MOV EAX, [foo]
MOV EDX, [foo+4]
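The same rule can be followed from C by not routing a value through memory with mismatched sizes in the first place. The sketch below is an illustration, not part of the AMD guide; `join_halves` and `split_halves` are hypothetical names. It assumes the compiler keeps the values in registers, which is typical but not guaranteed, and whether a stall would otherwise occur depends on the processor.

```c
#include <stdint.h>

/* Builds a 64-bit value from two 32-bit halves with a shift and an OR,
   instead of storing the halves to memory and reloading them as one
   64-bit value (narrow stores followed by a wide load). */
uint64_t join_halves(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Splits a 64-bit value into its halves with a shift and truncation,
   instead of storing it as 64 bits and reloading it as two 32-bit
   values (a wide store followed by narrow loads). */
void split_halves(uint64_t v, uint32_t *lo, uint32_t *hi)
{
    *lo = (uint32_t)v;
    *hi = (uint32_t)(v >> 32);
}
```

Writing through a union or a casted pointer to change the access size is the C-level pattern that most directly mirrors the "avoid" examples above.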

Back to AssemblerOptimizing

Document last modified Sat, 15 Mar 2003 14:47:57