The `gc_arena` dependency was only used to manipulate the `GcCell`s
containing the vertex and fragment shaders; replacing these with a
reference to a plain old `Cell` means that the Context3D traits and
types do not need to interact with GC'd objects anymore.
As a knock-on effect, we can also remove the `Activation` parameter
from most of the `Context3DObject` methods.
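
A minimal sketch of the idea, not the actual Ruffle API: `ShaderHandle`,
`Context3DLike` and `DummyContext` are hypothetical names. It only shows
that a plain `Cell` can be updated through a shared reference, so no
GC mutation context has to be passed in.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a compiled shader owned by the backend.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ShaderHandle(pub u32);

/// Hypothetical trait: the backend fills in the shader slots through
/// `&Cell`, without needing any garbage-collector context.
pub trait Context3DLike {
    fn upload_shaders(
        &mut self,
        vertex: &Cell<Option<ShaderHandle>>,
        fragment: &Cell<Option<ShaderHandle>>,
    );
}

pub struct DummyContext {
    next_id: u32,
}

impl DummyContext {
    pub fn new() -> Self {
        Self { next_id: 1 }
    }
}

impl Context3DLike for DummyContext {
    fn upload_shaders(
        &mut self,
        vertex: &Cell<Option<ShaderHandle>>,
        fragment: &Cell<Option<ShaderHandle>>,
    ) {
        // `Cell::set` works through a shared reference.
        vertex.set(Some(ShaderHandle(self.next_id)));
        fragment.set(Some(ShaderHandle(self.next_id + 1)));
        self.next_id += 2;
    }
}
```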
The bind group layout only depends on the texture registers
(and 2D/cubemap type) accessed by the fragment shader, not on
the runtime texture bound with Context3D. This means that we can
build and cache it when we compile the AGAL program to a Naga
module.
Since the bind group layout is used for the overall pipeline, I've
refactored the shader caching code into `ShaderPairAgal`, which
holds both the vertex and fragment shader bytecode, and compiles
both in the `compile` function.
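
A rough sketch of this caching pattern, under stated assumptions:
`ShaderPair`, `LayoutDesc` and `CompiledPair` are hypothetical stand-ins
(the real code works with Naga and wgpu types), and the set of sampled
registers is passed in directly rather than parsed from AGAL bytecode.

```rust
use std::collections::BTreeMap;

/// Kind of texture a fragment-shader register samples from.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SamplerKind {
    Texture2D,
    Cube,
}

/// Hypothetical stand-in for a bind group layout: one entry per register.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct LayoutDesc {
    pub entries: Vec<(u8, SamplerKind)>,
}

/// Hypothetical stand-in for the result of compiling both shaders.
pub struct CompiledPair {
    pub layout: LayoutDesc,
}

pub struct ShaderPair {
    pub vertex_bytecode: Vec<u8>,
    pub fragment_bytecode: Vec<u8>,
    /// Texture registers sampled by the fragment shader (assumed input).
    pub sampled: BTreeMap<u8, SamplerKind>,
    compiled: Option<CompiledPair>,
    pub compile_count: u32,
}

impl ShaderPair {
    pub fn new(
        vertex_bytecode: Vec<u8>,
        fragment_bytecode: Vec<u8>,
        sampled: BTreeMap<u8, SamplerKind>,
    ) -> Self {
        Self { vertex_bytecode, fragment_bytecode, sampled, compiled: None, compile_count: 0 }
    }

    /// Compiles both shaders once and caches the layout alongside them.
    /// The layout depends only on the sampled registers, never on the
    /// textures bound at draw time.
    pub fn compile(&mut self) -> &CompiledPair {
        if self.compiled.is_none() {
            self.compile_count += 1;
            let entries = self.sampled.iter().map(|(reg, kind)| (*reg, *kind)).collect();
            self.compiled = Some(CompiledPair { layout: LayoutDesc { entries } });
        }
        self.compiled.as_ref().unwrap()
    }
}
```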
* wgpu: Initial implementation of PixelBender shader execution
The implementation is split across four crates:
* `ruffle_render` now holds the main PixelBender bytecode parsing
implementation (previously, this was in `ruffle_core`).
* `ruffle_core` holds some helper functions for converting between
AVM2 `Value`s and the PixelBender vector types.
* `naga-pixelbender` (newly created) constructs a Naga `Module`
from parsed PixelBender bytecode.
* `ruffle_render_wgpu` sets up the render pipeline for the shader
constructed by `naga-pixelbender`, and actually executes the shader.
The ActionScript-side shader parameters are passed in through uniforms.
This allows us to cache the compiled `naga::Module` and associated
wgpu types inside `ShaderData` when the `ShaderData` is first created.
Each invocation of a `ShaderJob` only needs to create a bind group and
render pass.
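
To illustrate why uniforms make the compiled module reusable, here is a
small sketch that packs per-job parameter values into a byte buffer.
`pack_uniforms` is a hypothetical helper, and the layout (each parameter
padded to four little-endian `f32`s) is an assumption chosen for
simplicity, not necessarily the layout used by `naga-pixelbender`.

```rust
/// Packs each parameter (1 to 4 floats) into a 16-byte slot, zero-padded.
/// Only this buffer changes between jobs; the compiled shader does not.
pub fn pack_uniforms(params: &[Vec<f32>]) -> Vec<u8> {
    let mut out = Vec::with_capacity(params.len() * 16);
    for param in params {
        assert!(param.len() <= 4, "matrix types are not handled in this sketch");
        let mut slot = [0.0f32; 4];
        slot[..param.len()].copy_from_slice(param);
        for component in slot.iter() {
            out.extend_from_slice(&component.to_le_bytes());
        }
    }
    out
}
```

Two jobs with different parameter values would produce two different
buffers (and bind groups) while sharing the same cached module.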
Limitations:
* Only a few of the PixelBender opcodes are implemented - however, this is
enough to get Stemlands cannon rotation working, as well as a cool
"donut" shader that I found and included as a test.
* PixelBender matrix types are not supported.
* Only BitmapData is supported as an input/output type - Flash Player
also supports using Vector and ByteArray.
* ShaderJob execution is always synchronous.
* Adjust comments
* Address review comments
We use an `lru::LruCache` inside `ShaderModuleAgal`. This automatically
gives us the proper garbage-collection behavior (when the Flash
Program3D instance is garbage collected, we'll drop the
`ShaderModuleAgal` and the cache).
The cache is keyed on the data needed to compile the shader (vertex
attributes and sampler overrides). This lets us avoid shader
recompilations when a Stage3D program repeatedly uses the same
Program3D with different sampler overrides / vertex attribute formats.
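
A simplified sketch of keying the cache on the compile inputs. To stay
dependency-free it uses a `std::collections::HashMap` instead of
`lru::LruCache`, so it never evicts entries; `CompileKey`,
`VertexFormat` and `ProgramCache` are hypothetical names, and the
"compiled module" is just a string.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum VertexFormat {
    Float1,
    Float2,
    Float3,
    Float4,
    Bytes4,
}

/// Everything that influences the compiled output (assumed shape).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct CompileKey {
    pub vertex_formats: Vec<Option<VertexFormat>>,
    pub sampler_overrides: Vec<Option<u8>>,
}

/// Owned by the program object, so dropping the program drops the cache.
pub struct ProgramCache {
    compiled: HashMap<CompileKey, String>,
    pub compile_count: u32,
}

impl ProgramCache {
    pub fn new() -> Self {
        Self { compiled: HashMap::new(), compile_count: 0 }
    }

    /// Returns the cached module for `key`, compiling only on a miss.
    pub fn get_or_compile(&mut self, key: &CompileKey) -> &str {
        if !self.compiled.contains_key(key) {
            self.compile_count += 1;
            let module = format!("module #{}", self.compile_count);
            self.compiled.insert(key.clone(), module);
        }
        &self.compiled[key]
    }
}
```

Alternating between two sampler configurations then costs two
compilations in total rather than one per draw call.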
In a previous PR, I introduced an optimization that used
`copy_texture_to_texture` to copy directly from a BitmapData GPU
texture to a Stage3D GPU texture.
Unfortunately, this optimization is incorrect. A BitmapData GPU
texture can be modified at any time by normal AVM2 code - in
particular, it might be modified before we submit the encoded
`copy_texture_to_texture` command. This shows up in Sniper Team,
which re-uses BitmapData objects for multiple distinct textures.
The previous 'optimization' resulted in the wrong BitmapData contents
getting uploaded to a texture (since it was changed before the copy
command was submitted).
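
The hazard can be modelled without any GPU code. In this sketch (all
names hypothetical) a recorded command either keeps a reference to the
source and reads it at submit time, or carries a snapshot taken at
record time. Snapshotting is shown as one way to avoid the problem, not
necessarily the exact fix applied here.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Stand-in for a BitmapData texture that other code may mutate.
pub type Shared = Rc<RefCell<Vec<u8>>>;

pub enum Command {
    /// Reads the source only when the command is submitted.
    CopyFromSource(Shared),
    /// Carries the bytes as they were when the command was recorded.
    UploadSnapshot(Vec<u8>),
}

/// "Executes" a command and returns the bytes that reach the texture.
pub fn submit(cmd: Command) -> Vec<u8> {
    match cmd {
        Command::CopyFromSource(src) => src.borrow().clone(),
        Command::UploadSnapshot(data) => data,
    }
}
```

If the source is overwritten between recording and submission, the
deferred copy sees the new contents while the snapshot keeps the
intended ones.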