Instead of binding every supported sampler combination
and selecting the correct index in our AGAL builder,
we now determine the correct sampler on the wgpu side,
and create one sampler binding per texture slot.
After some testing, and looking at OpenFL, I believe I've
determined the correct behavior for AGAL sampling:
Each time a Context3D.setProgram or Context3D.setSamplerStateAt
call is made, the sampler config for the used texture slot(s)
is updated with the new wrapping/filter behavior. For setProgram,
this comes from all of the 'tex' opcodes used within the program.
However, when the 'ignoresampler' flag is set in a 'tex' opcode,
the setProgram call does *not* override the existing sampler config.
As a result, that program will sample with the behavior determined
by the most recent setSamplerStateAt or setProgram call involving
the used texture slot(s).
Previously, we were always overriding the opcode sampler config
with the values from Context3D.setSamplerStateAt. However, I didn't
realize that the order of the calls matters, so none of my tests ended
up observing the effect of 'ignoresampler'.
We now need to process AGAL bytecode twice - a quick initial
parse to determine the sampler configs (which need to be updated
when we call 'setProgram'), and a second time to build the
Naga module (which needs to wait until we have the vertex attributes
available, which can be changed by ActionScript after setting
the program).
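
The sampler rule described above can be sketched roughly as follows. This
is a simplified illustration, not the actual Ruffle code: `SamplerConfig`,
`TexOp`, `SamplerState`, the slot count, and the default config are all
hypothetical names and values chosen for the example. `TexOp` stands in for
whatever the quick initial parse extracts from each 'tex' opcode.

```rust
/// Simplified wrapping/filter settings for one texture slot (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SamplerConfig {
    pub repeat: bool, // true = repeat wrapping, false = clamp
    pub linear: bool, // true = linear filtering, false = nearest
}

/// What a quick first parse might record for one 'tex' opcode (hypothetical).
#[derive(Clone, Copy, Debug)]
pub struct TexOp {
    pub slot: usize,
    pub config: SamplerConfig,
    pub ignore_sampler: bool,
}

/// Assumed number of texture slots for this sketch.
pub const MAX_SLOTS: usize = 8;

/// Per-slot sampler config, updated in call order.
pub struct SamplerState {
    slots: [SamplerConfig; MAX_SLOTS],
}

impl SamplerState {
    pub fn new() -> Self {
        // Assumed default for the sketch: clamp + nearest.
        let default = SamplerConfig { repeat: false, linear: false };
        Self { slots: [default; MAX_SLOTS] }
    }

    /// Models Context3D.setSamplerStateAt: always overwrites the slot.
    pub fn set_sampler_state_at(&mut self, slot: usize, config: SamplerConfig) {
        self.slots[slot] = config;
    }

    /// Models Context3D.setProgram: each 'tex' opcode overwrites its slot,
    /// unless the opcode has 'ignoresampler' set.
    pub fn set_program(&mut self, tex_ops: &[TexOp]) {
        for op in tex_ops {
            if !op.ignore_sampler {
                self.slots[op.slot] = op.config;
            }
        }
    }

    pub fn config(&self, slot: usize) -> SamplerConfig {
        self.slots[slot]
    }
}

/// Returns the slot 0 config after a setProgram with 'ignoresampler' set,
/// and then after a setProgram without it.
pub fn order_matters() -> (SamplerConfig, SamplerConfig) {
    let repeat_linear = SamplerConfig { repeat: true, linear: true };
    let clamp_nearest = SamplerConfig { repeat: false, linear: false };
    let mut state = SamplerState::new();

    state.set_sampler_state_at(0, repeat_linear);
    state.set_program(&[TexOp { slot: 0, config: clamp_nearest, ignore_sampler: true }]);
    let with_ignore = state.config(0); // earlier setSamplerStateAt survives

    state.set_program(&[TexOp { slot: 0, config: clamp_nearest, ignore_sampler: false }]);
    let without_ignore = state.config(0); // opcode config takes over

    (with_ignore, without_ignore)
}
```

The point of the sketch is only that the result depends on call order,
which is why a test that never sets 'ignoresampler' before a later
setProgram call would not observe the flag's effect.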
* wgpu: Initial implementation of PixelBender shader execution
The implementation is split across four crates:
* `ruffle_render` now holds the main PixelBender bytecode parsing
implementation (previously, this was in `ruffle_core`).
* `ruffle_core` holds some helper functions for converting between
AVM2 `Value`s and the PixelBender vector types.
* `naga-pixelbender` (newly created) constructs a Naga `Module`
from parsed PixelBender bytecode.
* `ruffle_render_wgpu` sets up the render pipeline for the shader
constructed by `naga-pixelbender`, and actually executes the shader.
The ActionScript-side shader parameters are passed in through uniforms.
This allows us to cache the compiled `naga::Module` and associated
wgpu types inside `ShaderData`, when it's first created. Each invocation
of a `ShaderJob` only needs to create a bind group and render pass.
Limitations:
* Only a few of the PixelBender opcodes are implemented - however, this is
enough to get Stemlands cannon rotation working, as well as a cool
"donut" shader that I found and included as a test.
* PixelBender matrix types are not supported.
* Only BitmapData is supported as an input/output type - Flash Player
also supports using Vector and ByteArray.
* ShaderJob execution is always synchronous.
* Adjust comments
* Address review comments
We use an `lru::LruCache` inside `ShaderModuleAgal`. This automatically
gives us the proper garbage-collection behavior (when the Flash
Program3D instance is garbage collected, we'll drop the
`ShaderModuleAgal` and the cache).
The cache is keyed on the data needed to compile the shader (vertex
attributes and sampler overrides). This lets us avoid shader
recompilations when a Stage3D program repeatedly uses the same
Program3D with different sampler overrides / vertex attribute formats.
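
As a rough illustration of keying the cache on the compile inputs, here is
a std-only sketch. It deliberately uses a small hand-rolled LRU built on
`VecDeque` as a stand-in for `lru::LruCache`, so that it has no
dependencies. `CompileKey`, `CompiledShaderCache`, and the `u8` format ids
are hypothetical and do not reflect the real types.

```rust
use std::collections::VecDeque;

/// Hypothetical cache key: everything that affects shader compilation.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CompileKey {
    pub vertex_attributes: Vec<Option<u8>>, // assumed: format id per attribute
    pub sampler_overrides: Vec<Option<u8>>, // assumed: override id per slot
}

/// Tiny LRU cache; the front of the deque is the most recently used entry.
pub struct CompiledShaderCache<V> {
    capacity: usize,
    entries: VecDeque<(CompileKey, V)>,
}

impl<V> CompiledShaderCache<V> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, entries: VecDeque::new() }
    }

    /// Returns the cached value for `key`, calling `compile` only on a miss.
    pub fn get_or_compile(
        &mut self,
        key: &CompileKey,
        compile: impl FnOnce(&CompileKey) -> V,
    ) -> &V {
        let entry = match self.entries.iter().position(|(k, _)| k == key) {
            // Hit: take the entry out so it can be moved to the front.
            Some(idx) => self.entries.remove(idx).expect("index came from position"),
            // Miss: evict the least recently used entry if full, then compile.
            None => {
                if self.entries.len() == self.capacity {
                    self.entries.pop_back();
                }
                (key.clone(), compile(key))
            }
        };
        self.entries.push_front(entry);
        &self.entries.front().expect("entry was just pushed").1
    }
}

/// Alternates between two keys and counts how often `compile` runs.
pub fn count_compiles(capacity: usize) -> usize {
    let a = CompileKey { vertex_attributes: vec![Some(3)], sampler_overrides: vec![None] };
    let b = CompileKey { vertex_attributes: vec![Some(4)], sampler_overrides: vec![None] };
    let mut cache: CompiledShaderCache<String> = CompiledShaderCache::new(capacity);
    let mut compiles = 0;
    for key in [&a, &b, &a, &b, &a] {
        cache.get_or_compile(key, |_| {
            compiles += 1;
            String::from("compiled module placeholder")
        });
    }
    compiles
}
```

With room for both keys, alternating between them compiles each variant
once. With a capacity of one, every switch evicts the other variant and
forces a recompile, which is the case the cache is meant to avoid.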
Indirect register loads are poorly documented, but from looking at OpenFL
and AGALMiniAssembler, they allow performing loads of the
form `vc[va0.x + offset]` - that is, computing a dynamic register
number, instead of using the register number present in the opcode.
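
A minimal sketch of what such a load computes, assuming registers are
modelled as `[f32; 4]` and the constant file as a slice. `load_indirect` is
a hypothetical helper. Truncating the float index toward zero and returning
`None` for out-of-range indices are assumptions of the sketch, not
confirmed Flash Player behavior.

```rust
/// Resolves a load of the form `vc[va0.x + offset]`.
///
/// * `constants` - the constant register file (`vc0`, `vc1`, ...)
/// * `index_register` - the register supplying the dynamic index (e.g. `va0`)
/// * `component` - which component of it to read (0 = x, 1 = y, 2 = z, 3 = w)
/// * `offset` - the constant offset encoded in the opcode
pub fn load_indirect(
    constants: &[[f32; 4]],
    index_register: [f32; 4],
    component: usize,
    offset: usize,
) -> Option<[f32; 4]> {
    let dynamic = *index_register.get(component)?;
    // Reject NaN, infinities, and negative indices (assumption of this sketch).
    if !dynamic.is_finite() || dynamic < 0.0 {
        return None;
    }
    // Truncate toward zero, then add the opcode's offset without overflowing.
    let index = (dynamic as usize).checked_add(offset)?;
    constants.get(index).copied()
}
```

For example, with `va0.x == 1.0` and an offset of 2, the load reads `vc3`
rather than the register number written in the opcode.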