So from what I understand if we make an ai and we use reward and punishment as a way of teaching it to do things it will either resist being shut down due to that ceasing any and all rewards or essentially becoming suicidal and wanting to be shut down bc we offer that big of a reward for it.

Plus there is a fun aspect of us not really knowing what the AI's goal is, it can be aligned with what we want but to what extent, maybe by teaching it to solve mazes the AI's goal is to reach a black square and not actually the exit.

Lastly the way we make things will change the end result, if you make a "slingshot" using a CNC vs a lathe the outcomes will vary dramatically. Same thing applies to AI's and of we use that reward structure then we end up in the 2 examples mentioned above