This matches the Context3D docs. Calling 'present' swaps
the buffers.
I wasn't certain if we actually need a double-buffered depth
texture, but I included one just to be safe.
Now that most of the complicated Context3D methods have been
implemented, we can simplify the overall design. Instead of queueing
up commands and having `present` execute them in a loop, we
can execute each command immediately. The key insight is that
a `RenderPass` is only needed for `DrawTriangles`, so we don't
have to store it in `Context3D` and deal with complicated lifetime
issues.
The old behavior gave us implicit double-buffering behavior,
since nothing would get rendered until a 'present' call.
Now that a 'drawTriangles' call will immediately submit
a draw command, we need to implement actual double buffering.
This is done in the next commit.
When rendering to an offscreen texture for `Bitmapdata.draw`,
we first render to a temporary frame buffer, and then copy the contents
of the frame buffer back to the target texture. However, this results
in blend modes being incorrectly applied - for example, rendering with
BlendMode.SUBTRACT will subtract against the framebuffer (which starts
with each pixel as 0x00000000), instead of the previous BitmapData
contents.
To fix this, we now use our texture target as the frame buffer
when performing `render_offscreen`. This ensure that we blend
over existing pixels (taking into account the `blendMode` provided
in the `BitmapData.draw` call).
When multisampling is enabled, we use a copy pipeline to copy
the existing contents of our texture to a fresh multisampled frame
buffer (the non-multisampled texture target becomes our resolve buffer).
* Take two: Delay reading image back from render backend using `SyncHandle`
This allows us to avoid blocking immediately after a `BitmapData.draw` call.
Instead, we only attempt to use the `SyncHandle` when performing an operation
that requires the CPU-side pixels (e.g. BitmapData.getPixel or BitmapData.setPixel).
In the best case, the SWF will never explicitly access the pixels of
the target BitmapData, removing the need to ever copy back the render backend
image to our BitmapData. If the SWF doesn't require access to the pixels immediately,
we can delay copying the pixels until they're actually needed, hopefully allowing
the render backend to finish processing the BitmapData.draw operation in
the backenground before we need the result.
Now that the CPU and GPU pixels can be intentionally out of sync with
each other, we need to ensure that we don't accidentally expose 'stale'
CPU-side pixels to ActionScript (which needs to remain unaware of
our internal laziness). We now use a wrapper type `BitmapDataWrapper`
to enforce that the `SyncHandle` is consumed before accessing the
underlying `BitmapData.
* core: Skip GPU->CPU sync for source and target BitmapData during draw
* Introduce DirtyState enum