I can't test, as the subpatch example includes objects which are not in the community library.
the only way to see why this is more 'efficient' is to take a look at the generated source code, in xpatch.cpp. this should tell you what is different about the code... it could simply be 'code repetition'...
you have to remember Axoloti optimises for runtime performance, NOT space, so it inlines a lot of code.
yes, because it's the 'wrong question'... if you are combining objects, you don't need to 'connect them'.
i.e. if I want to combine objects A and B into an object called C... then C contains the code from A and B. I'm not connecting A and B... I take all the code that creates the output for A and connect it directly to the code in B.
now here's the point... you can't do this 'blindly', you need to do it knowing what A and B do
take a couple of silly examples
A has an int32 outlet, B has an int32 inlet:
A code
outlet_o = inlet_i >> 1;
B code
outlet_o = inlet_i >> 2;
so to combine them into C (ignoring for now that we can do it better)
C code
int32_t oa = inlet_i >> 1; // (outlet code from A, kept in a local variable)
outlet_o = oa >> 2; // (outlet code from B)
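as for the 'we can do it better' remark: once you know what A and B actually do, the two shifts collapse into one. here is a minimal sketch in plain C++ rather than real object code. the function names c_two_steps and c_merged are made up for illustration, and it assumes the usual arithmetic right shift on signed values.

```cpp
#include <cstdint>

// Literal combination: A's code followed by B's code, as in the C code above.
int32_t c_two_steps(int32_t inlet_i) {
    int32_t oa = inlet_i >> 1;  // code from A
    return oa >> 2;             // code from B
}

// Hand-merged version: shifting by 1 and then by 2 is a single shift by 3.
int32_t c_merged(int32_t inlet_i) {
    return inlet_i >> 3;
}
```

a decent compiler may well do this particular merge for you, but it only gets the chance once both pieces of code sit next to each other.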
easy... but that's because it's a simple type...
but if you do it with audio buffers, then it's an array (as are strings), so you have to iterate over it.
now with audio buffers you can often optimise: by default you have to iterate over the buffer twice (once per object), but when you look at the code you can often see a way to combine it and iterate only once, i.e. do the 'merging' of the code at sample level, not at buffer level (which the inlets/outlets force)
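a rough sketch of the difference, again in plain C++ rather than actual generated code. Buffer, kBufSize, a_then_b and merged are names invented for this example, the block size of 16 is an assumption, and the per-sample work just reuses the shifts from the silly example above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBufSize = 16;            // assumed block size, for illustration
using Buffer = std::array<int32_t, kBufSize>;   // stand-in for an audio buffer

// Unmerged: A writes a whole intermediate buffer, then B loops over it again.
Buffer a_then_b(const Buffer& in) {
    Buffer a_out{};
    for (std::size_t i = 0; i < kBufSize; ++i) a_out[i] = in[i] >> 1;   // A
    Buffer out{};
    for (std::size_t i = 0; i < kBufSize; ++i) out[i] = a_out[i] >> 2;  // B
    return out;
}

// Merged at sample level: one loop and no intermediate buffer.
Buffer merged(const Buffer& in) {
    Buffer out{};
    for (std::size_t i = 0; i < kBufSize; ++i) {
        int32_t oa = in[i] >> 1;  // A, kept in a local instead of a buffer
        out[i] = oa >> 2;         // B
    }
    return out;
}
```

both versions produce the same samples. the merged one simply touches the data once and needs no intermediate buffer.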
similarly, when you look at 2 objects to combine, you may find that they mess about scaling the data to conform to the unipolar/bipolar conventions. you may not have to do this when you connect the outlet code directly to the inlet code, and thereby save processing.
really, it's worth investing the time to understand the code.
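to make the scaling point concrete, here is a hypothetical pair of conversions. the helper names and the constant kHalf are invented for this sketch and are not taken from any factory object. it assumes A works in bipolar internally and converts to unipolar for its outlet, while B converts straight back.

```cpp
#include <cstdint>

constexpr int32_t kHalf = 1 << 26;  // assumed midpoint of the unipolar range

// Hypothetical tail of A: internal bipolar value scaled to unipolar for the outlet.
int32_t a_to_unipolar(int32_t bipolar) { return bipolar / 2 + kHalf; }

// Hypothetical head of B: unipolar inlet value scaled back to bipolar.
int32_t b_to_bipolar(int32_t unipolar) { return (unipolar - kHalf) * 2; }

// Connected as separate objects: two conversions that cancel each other out.
int32_t via_outlet_and_inlet(int32_t x) { return b_to_bipolar(a_to_unipolar(x)); }

// Combined by hand: pass the bipolar value straight through.
int32_t combined(int32_t x) { return x; }
```

in this sketch the round trip is not only wasted work, it also drops the lowest bit of odd values, so the combined version is both cheaper and slightly more precise.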
one last point... bear in mind that when you take a 'snapshot' of the code from a couple of objects, you will not get the benefit if we later change the factory object (e.g. fix a bug, make it more efficient).
your point about efficiency is important. Axoloti's goal is that most users do not need to code objects... as requiring that would dramatically cut the user base. so if combining objects brings benefits, we really need to see why that is, and whether the code generation can be improved.... from your examples, I don't see any fundamental reason why the code generator in Axoloti cannot output code as efficient as what you're doing manually.
anyway, sorry, it's a massive topic really... and pretty much begs as many questions as it answers