I found two issues, and fixing them brought the time from 3.1ms per frame down to 0.4ms per frame.
1.)
It locks and unlocks the vertex buffer many times per frame. Currently, it does:
Code:
lock,fill,unlock,render,lock,fill,unlock,render,..
It can be changed to do:
Code:
lock,fill,fill,fill,unlock,render,render,render
This requires another vector that stores the texture and the vertex offset/size for each draw call. It made a modest improvement.
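To make the extra vector concrete, here is a rough sketch of how the bookkeeping could look. All of the names (QuadVertex, TextureId, DrawCall, QuadBatcher) are made up for illustration and are not the renderer's real types, and merging consecutive quads that share a texture into one draw call is my assumption about how the existing batching behaves.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the renderer's real types.
struct QuadVertex { float x, y, z; unsigned int diffuse; float u, v; };
using TextureId = int;

// One entry per draw call: which texture to bind and which slice of
// the shared vertex buffer to draw.
struct DrawCall {
    TextureId texture;
    std::size_t vertexOffset;
    std::size_t vertexCount;
};

class QuadBatcher {
public:
    // "fill": append one quad (6 vertices); no lock/unlock happens here.
    void addQuad(TextureId texture, const QuadVertex (&verts)[6]) {
        // Assumption: consecutive quads with the same texture share a draw call.
        if (!d_calls.empty() && d_calls.back().texture == texture) {
            d_calls.back().vertexCount += 6;
        } else {
            d_calls.push_back({texture, d_vertices.size(), 6});
        }
        d_vertices.insert(d_vertices.end(), verts, verts + 6);
    }

    const std::vector<QuadVertex>& vertices() const { return d_vertices; }
    const std::vector<DrawCall>& drawCalls() const { return d_calls; }

    // Reset for the next frame.
    void clear() { d_vertices.clear(); d_calls.clear(); }

private:
    std::vector<QuadVertex> d_vertices;  // everything queued this frame
    std::vector<DrawCall> d_calls;       // texture + offset/size per draw
};
```

At the end of the frame the idea would be to lock the real vertex buffer once, copy vertices() into it, unlock once, and then walk drawCalls(), binding each texture and drawing its offset/size range.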
2.)
In one case, it was swapping textures 32 times per frame. The cause was very simple: images with an alpha of 0 were still being rendered. In particular, a series of StaticImages gets broken up by invisible TL frames, so the texture keeps changing back and forth.
This is trivial to fix by returning early from the addQuad function.
Code:
// Skip quads that are fully transparent at all four corners.
if (0.0f == colours.d_top_left.getAlpha() &&
    0.0f == colours.d_top_right.getAlpha() &&
    0.0f == colours.d_bottom_left.getAlpha() &&
    0.0f == colours.d_bottom_right.getAlpha())
{
    return;
}
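To show why the check above cuts the texture swaps, here is a small standalone model of the quad queue. The Quad struct and both functions are hypothetical and much simpler than the real renderer; it only counts how often the bound texture would change.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified quad: a texture id and four corner alphas.
struct Quad {
    int texture;
    float alpha[4];  // top-left, top-right, bottom-left, bottom-right
};

// True when every corner is fully transparent, mirroring the check above.
bool isInvisible(const Quad& q) {
    for (float a : q.alpha) {
        if (a != 0.0f) return false;
    }
    return true;
}

// Counts how many times the bound texture changes while walking the
// queue in order. With skipInvisible set, fully transparent quads are
// dropped before they can force a texture change.
std::size_t countTextureSwitches(const std::vector<Quad>& queue,
                                 bool skipInvisible) {
    std::size_t switches = 0;
    bool haveTexture = false;
    int current = 0;
    for (const Quad& q : queue) {
        if (skipInvisible && isInvisible(q)) continue;
        if (!haveTexture || q.texture != current) {
            current = q.texture;
            haveTexture = true;
            ++switches;
        }
    }
    return switches;
}
```

For a queue of five quads that alternates a visible quad on texture 1 with an invisible quad on texture 2, this counts 5 texture changes without the skip and 1 with it.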
When it comes to a patch, I'm open to suggestions. I can get a fresh copy and apply one or both of these changes to it, to make a clean patch. Or I can do a multi-patch that has all 4 changes. Or I can do a patch on top of the previous rotation patch. Or lastly, I can sit back and do nothing.