Switch to a single render pass for the whole frame, as opposed to
a render pass per draw call. This should improve performance
significantly. It currently requires some unsafe code to work around
a self-reference between RenderPass and CommandEncoder in Frame;
this could eventually be cleaned up by changing RenderBackend
to return a Frame object instead of using begin_frame/end_frame
pairs.
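
A rough sketch of what the Frame-returning shape could look like,
using hypothetical stand-in types (Encoder, Pass, Backend and Frame
here are illustrative only, not the real wgpu or renderer types). The
encoder stays owned by the backend and the frame only borrows it, so
no self-referential struct is needed, and every draw goes into the
one pass:

```rust
// Hypothetical stand-ins; none of these are the real wgpu or renderer types.
struct Encoder {
    commands: Vec<String>,
}

/// Borrows the encoder for as long as the pass is open.
struct Pass<'e> {
    encoder: &'e mut Encoder,
}

struct Backend {
    encoder: Encoder,
    submitted: Vec<Vec<String>>,
}

/// Exists only while the backend is mutably borrowed.
struct Frame<'b> {
    pass: Pass<'b>,
    submitted: &'b mut Vec<Vec<String>>,
}

impl Backend {
    fn new() -> Self {
        Backend {
            encoder: Encoder { commands: Vec::new() },
            submitted: Vec::new(),
        }
    }

    /// Opens the single pass for this frame and hands back a Frame.
    fn begin_frame(&mut self) -> Frame<'_> {
        let Backend { encoder, submitted } = self;
        encoder.commands.push("begin pass".to_string());
        Frame { pass: Pass { encoder }, submitted }
    }
}

impl Frame<'_> {
    /// Records a draw into the one pass shared by the whole frame.
    fn draw(&mut self, what: &str) {
        self.pass.encoder.commands.push(format!("draw {}", what));
    }

    /// Closes the pass and submits the recorded commands.
    fn finish(self) {
        let Frame { pass, submitted } = self;
        let encoder = pass.encoder;
        encoder.commands.push("end pass".to_string());
        submitted.push(std::mem::take(&mut encoder.commands));
    }
}
```

Because finish takes the Frame by value, the borrow of the backend
ends there and the borrow checker enforces the begin/end pairing.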
Also switch to using push constants for the transform/color
uniforms.
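
For illustration, per-draw data of this kind might be packed into the
byte slice that a push-constant upload takes roughly as follows. The
struct name, field names and layout are assumptions made for this
sketch, not the actual layout used by the change:

```rust
/// Hypothetical per-draw data; the field names and layout are assumptions.
#[repr(C)]
#[derive(Clone, Copy)]
struct DrawConstants {
    world_matrix: [[f32; 4]; 4],
    mult_color: [f32; 4],
    add_color: [f32; 4],
}

impl DrawConstants {
    /// Flattens all fields, in declaration order, into native-endian bytes.
    fn to_bytes(&self) -> Vec<u8> {
        self.world_matrix
            .iter()
            .flatten()
            .chain(self.mult_color.iter())
            .chain(self.add_color.iter())
            .flat_map(|v| v.to_ne_bytes())
            .collect()
    }
}
```

With this layout the data is 96 bytes, which fits within the 128
bytes of push constant space that Vulkan guarantees as a minimum.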
Change the usage of the stencil buffer to avoid running out of
stencil bits when too many nested masks are active.
This also cleans things up on wgpu, which requires us to create
pipeline states in advance; now we only need a few stencil states
for masking as opposed to hundreds.
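
One common scheme that fits this description treats the stencil
value as a nesting counter rather than one bit per mask. This is an
assumption made for illustration and the change itself may differ in
detail. A CPU-side simulation with hypothetical names:

```rust
/// One stencil value per pixel, simulated on the CPU.
/// StencilSim and its methods are hypothetical names for this sketch.
struct StencilSim {
    stencil: Vec<u8>,
    depth: u8, // number of currently active nested masks
}

impl StencilSim {
    fn new(pixels: usize) -> Self {
        StencilSim { stencil: vec![0; pixels], depth: 0 }
    }

    /// Draw a mask shape: test "equal to depth", op "increment".
    fn push_mask(&mut self, shape: &[bool]) {
        assert!(self.depth < u8::MAX, "8-bit stencil allows 255 levels");
        for (s, &covered) in self.stencil.iter_mut().zip(shape) {
            if covered && *s == self.depth {
                *s += 1;
            }
        }
        self.depth += 1;
    }

    /// Masked content passes the stencil test where stencil == depth.
    fn visible(&self, pixel: usize) -> bool {
        self.stencil[pixel] == self.depth
    }

    /// Redraw the same shape: test "equal to depth", op "decrement".
    fn pop_mask(&mut self, shape: &[bool]) {
        assert!(self.depth > 0, "no mask to pop");
        for (s, &covered) in self.stencil.iter_mut().zip(shape) {
            if covered && *s == self.depth {
                *s -= 1;
            }
        }
        self.depth -= 1;
    }
}
```

Under this scheme the compare function and stencil operation are
fixed for each phase (write mask, draw masked content, clear mask),
and only the reference value changes with nesting depth, so a handful
of stencil states is enough no matter how deeply masks are nested.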