I read one of the papers. On your specific question: given a string of bits s, they choose to associate s with its empirical distribution, as if s were generated by an iid Bernoulli process. So if s has 10 zero bits and 30 one bits, its associated empirical distribution is Ber(3/4). That's the distribution whose entropy they're calculating. I have no idea on what basis they're making this choice.
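To make that concrete, here's a rough sketch of the calculation as I understand it. The function name `empirical_entropy` is mine, not the paper's, and I'm assuming they use the standard Shannon entropy in base 2; I haven't checked this against their actual code or numbers.

```python
from collections import Counter
from math import log2


def empirical_entropy(bits: str) -> float:
    """Shannon entropy (bits per symbol) of the empirical distribution of a 0/1 string.

    Hypothetical helper for illustration: it treats the string as iid draws
    from Ber(p), where p is just the fraction of '1' characters.
    """
    counts = Counter(bits)
    n = len(bits)
    # Sum -p * log2(p) over the symbols that actually occur in the string.
    return sum(-(c / n) * log2(c / n) for c in counts.values())


# The example above: 10 zero bits and 30 one bits, i.e. Ber(3/4).
s = "0" * 10 + "1" * 30
print(empirical_entropy(s))  # roughly 0.811 bits per symbol
```

Note that this only depends on the counts, so "0011" and "0101" get the same entropy. Any structure in the ordering of the bits is thrown away, which is part of why the choice looks arbitrary to me.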
The rest of the paper didn't make sense to me: they somehow assign a number N of "information states" which can change over time as the memory cells fail. I honestly have no idea what it's supposed to mean and kinda suspect the whole thing is rubbish.
Edit: after reading the author's quotes from the associated hype article I'm 100% sure it's rubbish. It's also really funny that they didn't manage to catch the COVID-19 research hype train so they've pivoted to the simulation hypothesis.
Wait I know nothing about chemistry but I'm curious now, what are the footguns?