This feature is disabled by default. When enabled, you can use
`ruffle_render::renderdoc::begin_frame_capture` and
`ruffle_render::renderdoc::end_frame_capture` to manually trigger
a RenderDoc frame capture (if Ruffle wasn't launched by RenderDoc,
this logs an error).
This is very useful when debugging Stage3D/PixelBender bugs, as you
can produce a capture containing only the relevant graphics calls.
* wpgu: Initial implementation of PixelBender shader execution
The implementation is split across four crates:
* `ruffle_render` now holds the main PixelBender bytecode parsing
implementation (previously, this was in `ruffle_core`).
* `ruffle_core` holds some helper functions for converting between
AVM2 `Value`s and the PixelBender vector types.
* `naga-pixelbender` (newly created) constructs a Naga `Module`
from parsed PixelBender bytecode
* `ruffle_render_wgpu` sets up the render pipeline for the shader
constructed by `naga-pixelbender`, and actually executes the shader.
The Actionscript-side shader parameters are passed in through uniforms.
This allows us to cache the compiled `naga::Module` and associated
wgpu types inside `ShaderData`, when it's first created. Each invocation
of a `ShaderJob` only needs to create a bind group and render pass.
Limitations:
* Only a few of the PixelBender opcodes are implemented - however, this is
enough to get Stemlands cannon rotation working, as well as a cool
"donut" shader that I found and included as a test.
* PixelBender matrix types are not supported.
* Only BitmapData is supported as an input/output type - Flash Player
also supports using Vector and ByteArray
* ShaderJob execution is always synchronous.
* Adjust comments
* Address review comments
We use an `lru::LruCache` inside `ShaderModuleAgal`. This automatically
gives us the proper garbage-collection behavior (when the Flash
Program3D instance is garbage collected, we'll drop the
`ShaderModuleAgal` and the cache).
The cache is keyed on the data needed to compile the shader (vertex
attributes and sampler overrides). This lets us avoid shader
recompilations when a Stage3D program repeatedly uses the same
Program3D with different sampler overrides / vertex attribute formats.
These are poorly documented, but from looking at OpenFL
and AGALMiniAssembler, they allow performing loads of the
form `vc[va0.x + offset]` - that is, computing a dynamic register
number, instead of using the register number present in the opcode.
In a previous PR, I introduced an optimization that used
`copy_texture_to_texture` to copy directly from a BitmapData GPU
texture to a Stage3D GPU texture.
Unfortunately, this optimization is incorrect. A BitmapData GPU
texture can be modified at any time by normal AVM2 code - in
particular, in might be modified before we submit the encoded
`copy_texture_to_texture` command. This shows up in Sniper Team,
which re-uses BitmapData objects for multiple distinct textures.
The previous 'optimization' resulted in the wrong BitmapData contents
getting uploaded to a texture (since it was changed before the copy
command was submitted).
When multisampling is enabled, we should create a new multisampled texture,
and use the existing texture as the resolve buffer. We also need to
call `update_has_depth_texture` to keep our pipeline aware of whether
or not we currently have a depth buffer attached.
Makes progress on #10641 (it has a stack overflow after
this PR, due to an unrelated issue).
wgpu requires buffer copy sizes and offsets to be 4-byte aligned.
Unfortunately, ActionScript can perform 2-byte aligned uploads
into an IndexBuffer3D.
To support this, we now keep a copy of the IndexBuffer3D on the CPU.
When performing an upload to the buffer, we round the offset down
and the size up to the nearest 4-byte aligned value. The cpu buffer
is used to fill out the write with existing data, so that we don't
corrupt the contents of the GPU buffer.
To avoid introducing a new RefCell, I've changed IndexBuffer3D
to use a `Box` instead of an `Rc` to store the trait object.
This allows us to pass a mutable reference down to the backend.
Generally, when transforming a difference between two points, `p1`
and `p2`, with a matrix `m`, we would like the following property
to hold:
```
m * (p1 - p2) == m * p1 - m * p2
```
Unfortunately, it wasn't like this before, because matrices have a
translation component, which is non-linear. In `m * p1 - m * p2`,
the translations of `m * p1` and `m * p2` are the same and therefore
cancel out each other. However, in `m * (p1 - p2)` the translation
stays.
In order to preserve this property, introduce a new `PointDelta`
type which is not subject to translation when transformed by a matrix.
For now, the following operations are supported:
* `Point - Point -> PointDelta`
* `Point + PointDelta -> Point`
* `Point += PointDelta`
* `Point - PointDelta -> Point`
* `Point -= PointDelta`
As a consequence, the expression `position + global_to_local_matrix * mouse_delta`
in `update_drag()` now ignores translation, which fixes#817.