Monday, October 21, 2013

An efficient scanline blitting algorithm

For the bitbox kernel, we need to have a fast graphical objects blitting library, since we will often have the same needs. Blitting from low Z to high Z of a fixed tiled background + overdrawing small sprites (to keep the 5k cycles budget) only goes so far, because as we are overdrawing pixels, we will be hitting the 5k cycles / lines pretty soon.

We will sum up the needs of such a library and propose an algorithm.




The needs envisioned for such a lib are :

- bitbox kernel compatible (no frame buffer, frame and line-based callbacks, 640x480 x2bytes  32k colors on limited CPU), line by line blitting.
- be able to adapt to several decompression algorithms and pixel storage format (palettes, tiles) , some of them not allowing direct access to pixels (RLE encoding), and several algorithms (plain sprites, lines, polygons)
- allow opaque as well as transparent (also maybe partially transparent) sprites
- since we're relatively high def (640 pixels x 2bytes = 1280B per line), avoid as much as possible overdraw (ie drawing parts of a sprite that is under another) - will allow large sprites / layered backgrounds : imagine we have 5 layers background + 50 big sprites overlapping by 80%, for a given line we're avoiding many many pixels. and blitting a pixel can be several cycles (read, writes, ...)
- not using dynamic memory (games can be using it, but preferably not the kernel).

The proposed 2D scanline algorithm is the following :




  • Keep a list of objects with (x,y,z position and height dimension h+object callbacks)
  • Maintain an active objects list (ie in a line, the list of currently active objects)
    • The object list will be kept sorted along y so that each line we will only test if the next one is becoming visible
    • The active object list will be sorted along Z order, top first.
  •  For Each frame
    • reset the active list
    • call each object frame() callback to update graphical state (advance frame, reset line counters, ...)
  • For Each line
    • Make new objects active if needed 
    • Scan the active list to discard objects past their y+h dimension
    • Keep the active object list sorted along Z axis, first elements being on top
    • Maintain a x sorted list of opaque regions, ie regions that are fully blit, reset each line.
    • For each object from top to bottom, call the line() callback
  •  The object line callback calls the pre_draw(x1,x2,is_opaque) function. This predraw function will maintain a list of opaque blits. is_opaque can specifyif the blit is totally opaque or transparent.
    • If the element is opaque, we will limit the blit asked by the object to the non already painted parts, since the blit list contains the higher Z blits that are hiding the parts we've already blit - and put to the blit list.
      • Each part will either expand an already existing blit, or create a new one.
      • The object blit(x1,x2) will be called to effectively paint pixels there.
    •   If the element is transparent, we will remember the non-hidden parts but keep them in a stack, that will be drawn after (discarding hidden parts), but not blit them now, since we need what's below to paint over (we're not opaque).


Notice how we did not blit C nor background under B, and how the background pre-blit is cut in three blits by the blitter : before B, between B and E, and after E.  if an object has "holes", it will call the pre_blit several times.

Missing parts of the algorithm are :
 
- Allocation and management of object lists (with pool of polymorphic C objects) + optimized, in place sorting algorithms (we're creating / using/ discarding the blit list 31000 times/seconds where missing a period is a distorded image).
- Actually writing the different blit (plain, transparent, gradient, tiled, bitblt, palette based...) and line callbacks (rectangles, RLE sprites, semitransparent, lines, polygons, ...)
- Fast forwarding an object lines before the line zero

No comments:

Post a Comment