I'm not really sure how to describe it other than when I read a function to determine what it does then go to the next part of the code I've already forgotten how the function transforms the data
This sounds to me like you could benefit from mentally using the information hiding principle for your functions. In other words: Outside of the function, the only thing that matters is "what goes in?" and "what comes out?". The implementation details should not be important once you're working on code outside of that function.
To achieve this, maybe you could write a short comment right at the start of every function: one or two sentences detailing only the inputs and output of that function, e.g. "Accepts an image and a color and returns a mask that shows where that color is present." If you later forget what the function does, all you need to do is read that one sentence to remember. If the function is too convoluted to describe in one or two sentences, it is likely trying to achieve too much at once and could (arguably "should") be split up.
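To make that concrete, here's a rough sketch in Python of what such a comment could look like for the image/color example. Everything in it is made up for illustration: the name `color_mask`, the `Pixel` and `Image` aliases, and the idea of an image as nested lists of RGB tuples are just my assumptions, not how any real image library works.

```python
from typing import List, Tuple

# Hypothetical types for illustration: a pixel is an (R, G, B) tuple,
# and an image is a list of rows of pixels.
Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def color_mask(image: Image, color: Pixel) -> List[List[bool]]:
    """Accepts an image and a color; returns a mask that is True where that color is present."""
    # The details below are exactly what you should be able to forget
    # once you are working on code outside this function.
    return [[pixel == color for pixel in row] for row in image]


RED = (255, 0, 0)
BLUE = (0, 0, 255)

image = [
    [RED, BLUE],
    [BLUE, RED],
]

mask = color_mask(image, RED)
print(mask)
```

When you come back to a call like `color_mask(image, RED)` later, reading the one-line docstring (or running `help(color_mask)`) should be enough. You don't need to re-read the body.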
Also, on a different note: Don't sell your ability to "cludge something together" short. If you ever plan to do this professionally or in an educational setting, you will, sadly, inevitably run into situations where you have no choice but to deliver a quick and dirty solution over a clean and well-thought-out one.
Edit: typos