You're way overcomplicating how it could be done. The argument is that training takes more energy:
Typically, if you have a single up-front cost associated with a service, you amortize that cost over the life of the service. So you take the total energy consumption of training and divide it by the total number of user-hours spent doing inference, then compare that to the energy cost of a single user running inference for an hour (which they can estimate by dividing their global inference energy consumption for an hour by the number of user-hours in that hour).
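To make the arithmetic concrete, here's a rough sketch of that comparison. Every number in it is a made-up placeholder, not a measurement from any real provider, and the function names are just mine for illustration; only the two divisions matter.

```python
def amortized_training_kwh_per_user_hour(training_energy_kwh, lifetime_user_hours):
    """Spread the one-time training energy over every user-hour of inference served."""
    return training_energy_kwh / lifetime_user_hours


def inference_kwh_per_user_hour(inference_energy_kwh_in_hour, user_hours_in_hour):
    """Energy one user-hour of inference costs, estimated from one hour of global usage."""
    return inference_energy_kwh_in_hour / user_hours_in_hour


# Placeholder inputs (assumptions for illustration only).
training_share = amortized_training_kwh_per_user_hour(
    training_energy_kwh=1_000_000.0,    # hypothetical total training energy
    lifetime_user_hours=500_000_000.0,  # hypothetical user-hours over the model's life
)
inference_share = inference_kwh_per_user_hour(
    inference_energy_kwh_in_hour=2_000.0,  # hypothetical global inference energy in one hour
    user_hours_in_hour=100_000.0,          # hypothetical user-hours served in that hour
)

print(f"training, amortized: {training_share} kWh per user-hour")
print(f"inference:           {inference_share} kWh per user-hour")
```

Both results come out in kWh per user-hour, so they can be compared directly. Which one is bigger depends entirely on the real inputs, which only the provider has.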
If these are "apples to oranges" comparisons, then why do people defending AI usage (and you) keep making the comparison?
But even if it were true that training is significantly more expensive than inference, or that they're inherently incomparable, that doesn't actually change the underlying observation that inference is still quite energy intensive, or the implicit value statement that the energy spent isn't worth the effect on society.
You're absolutely right that the environmental impact depends on the source of the energy and, less obviously, on the displaced demand that now has to seek energy from less clean sources. Ideally we'd have lots of clean energy, but unfortunately we often don't, and even when AI uses clean sources, it's often just forcing preexisting load elsewhere. If we could start investing in power infrastructure projects at the national (or state/province) level then maybe it wouldn't be so bad, but it never happens at the scale we need.
I think the argument isn't about the environmental impact alone, it's a judgement about the net benefit once you weigh the environmental impact against the product produced. I think the statement is "we spent all this power, and for what? Some cats with tits and an absolutely destroyed labour market. Not worth the cost."
Especially because it's a cost that the users of AI are forcing everyone to pay. Privatize profits, socialize losses, and all that.