I posted a breakdown of fuzz/overdrive/distortion modelling here
Proof-of-concept firmware has been developed and published in the "spilink" branch on GitHub, but I haven't written a setup guide, and the firmware still lacks the diagnostic reporting that would help with setting it up. I'm currently having trouble finding the time to finish this effort. Stay tuned...
haha "sold" 
It's a working draft, which is why it is listed as under development and is a patch containing an embedded object rather than a library object. Configurable grain size and grain count are certainly missing. The code currently exploits dependencies between grain size, grain count and audio buffer size to maximize performance:
With 128 grains of 2048 samples and all grain phases distributed evenly, exactly one grain needs a new random position per audio buffer. In the same way, a grain duration of 1024, 4096 or 8192 samples can be done, with respectively two position updates per buffer, one position update every two buffers, or one position update every four buffers.
Variable pitch is not implemented, and would increase the DSP load significantly. The left and right outputs each contain 64 hard-panned grains; adding random panning would increase the DSP load too.
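To make the relation between grain length and update rate concrete, here is a small sketch of the arithmetic. It assumes an audio buffer of 16 samples, which is not stated above but is the value that reproduces the figures given; the function name is made up for illustration and is not part of the patch.

```python
from fractions import Fraction

BUFFER_SIZE = 16   # ASSUMPTION: audio buffer length in samples (not stated in the post)
NUM_GRAINS = 128   # grain count from the post

def position_updates_per_buffer(grain_len, num_grains=NUM_GRAINS,
                                buffer_size=BUFFER_SIZE):
    """Average number of grains that restart per audio buffer.

    With evenly distributed phases, num_grains restarts happen every
    grain_len samples, so a buffer of buffer_size samples sees
    num_grains * buffer_size / grain_len restarts on average.
    """
    return Fraction(num_grains * buffer_size, grain_len)

if __name__ == "__main__":
    for grain_len in (1024, 2048, 4096, 8192):
        print(grain_len, position_updates_per_buffer(grain_len))
```

Under that assumption, 2048-sample grains give exactly one update per buffer, 1024 gives two, and 4096 and 8192 give one half and one quarter, meaning one update every two or four buffers.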
In the big picture, I got a bit fascinated by tonewheel/drawbar approaches: rather than doing voice allocation, just running a full set of oscillators. Cross-breeding tonewheels with granular would require a massive number of grains. When limiting to 64 tonewheels and two grains per tonewheel, this requires... 128 simultaneous grains. This proves it is possible, with room to spare to increase to 80 or 96 tonewheels... The ability to access a grain table of 8MB means this table could be split into 64 segments, one per semitone, each containing source material for that semitone. That's roughly 1.3 seconds per semitone. Or the table could contain a single glissando of 64 semitones in 83.3 seconds, avoiding the need for pitching grains. Even with only two grains per tonewheel, a single key can mix tonewheels with the drawbars. I think the result could be rich and expressive, and totally alias-free.
The overlapping grain offsets could also be synchronized to pitch rather than chosen at random.
Or perhaps, skipping the tonewheel idea, a polyphonic synth could be made that uses 32 grains per voice (from a table containing segments for every semitone), so that 4 voices can be played at once, with the number of grains per voice reduced progressively when more than 4 keys are playing...
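A rough sketch of the table arithmetic, for anyone who wants to play with the numbers. The sample format is my assumption here: 16-bit mono at 48 kHz, with "8MB" read as 8,000,000 bytes. All names are made up for illustration.

```python
SAMPLE_RATE = 48_000      # ASSUMPTION: sample rate in Hz
BYTES_PER_SAMPLE = 2      # ASSUMPTION: 16-bit mono samples
TABLE_BYTES = 8_000_000   # ASSUMPTION: "8MB" read as decimal megabytes
TONEWHEELS = 64           # from the post
GRAINS_PER_TONEWHEEL = 2  # from the post

def table_samples():
    """Total number of samples that fit in the grain table."""
    return TABLE_BYTES // BYTES_PER_SAMPLE

def table_seconds():
    """Duration of the whole table, e.g. one long glissando."""
    return table_samples() / SAMPLE_RATE

def segment_bounds(index, segments=TONEWHEELS):
    """(start, end) sample offsets of one per-semitone segment."""
    seg_len = table_samples() // segments
    return index * seg_len, (index + 1) * seg_len

if __name__ == "__main__":
    print("simultaneous grains:", TONEWHEELS * GRAINS_PER_TONEWHEEL)
    print("whole table: %.1f s" % table_seconds())
    start, end = segment_bounds(0)
    print("per semitone: %.2f s" % ((end - start) / SAMPLE_RATE))
```

With these assumptions the whole table lasts about 83.3 seconds and each of the 64 segments is 62,500 samples, or about 1.3 seconds. A different sample format or a binary reading of "8MB" shifts both figures a little.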
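One possible way to reduce the grains per voice progressively is to share the 128 grains equally among the keys being held, capped at 32 per voice. This is only a sketch of that policy; the equal-share rule and the function name are my assumptions, nothing like it exists in the patch yet.

```python
TOTAL_GRAINS = 128         # simultaneous grains available, from the post
MAX_GRAINS_PER_VOICE = 32  # grains per voice at up to 4 voices, from the post

def grains_per_voice(active_voices):
    """Grains assigned to each voice when active_voices keys are held.

    ASSUMPTION: an equal-share policy. Up to 4 voices each get the full
    32 grains; beyond that the 128 grains are divided evenly, rounding
    down, so the total never exceeds TOTAL_GRAINS.
    """
    if active_voices <= 0:
        return 0
    return min(MAX_GRAINS_PER_VOICE, TOTAL_GRAINS // active_voices)

if __name__ == "__main__":
    for voices in (1, 4, 5, 8, 16):
        print(voices, "voices:", grains_per_voice(voices), "grains each")
```

With this rule, 5 keys get 25 grains each, 8 keys get 16 each and 16 keys get 8 each. Other policies, such as giving the newest key more grains than older ones, would fit the same budget.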