Stumbled on an ST app note AN4841 today and noticed an interesting section:
Apologies for the weird small image, I'm struggling with this forum app; the original PDF is here. Obviously it's tough to draw conclusions from random benchmarks. What struck me was that F32 performance seems to be roughly on par with Q31 performance, if not slightly faster. This also gives a rough sense of how much additional headroom can be expected when moving from an F4 to an F7.
As mentioned elsewhere, I think the H7 is where we want to be looking going forward: it bumps the clock rate up to 480 MHz while maintaining pin compatibility with the F7. According to ST, it won't be available until Q3 2019.
As an aside: in one of my own projects I haven't been able to convince myself that the extra special ARM DSP F32 routines are actually faster in practice than the vanilla ARM GCC math library; I've mainly looked at sqrt. Would be interested to hear others' experiences there.