this post was submitted on 26 Sep 2024
543 points (99.3% liked)


Here is the text of the NIST SP 800-63B Digital Identity Guidelines.

[–] General_Effort@lemmy.world 6 points 1 month ago (1 children)

You should accept Unicode; if doing so, you must count each code point as one char.

Hmm. I wonder about this one. Different ways to encode the same character. Different ways to calculate the length. No obvious max byte size.
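To make the "different ways to calculate the length" point concrete, here is a small Python sketch. The helper name `describe` is made up for illustration. It counts the same visible character as code points, UTF-8 bytes, and UTF-16 bytes, for both a precomposed and a decomposed spelling of "é".

```python
def describe(text: str) -> dict:
    """Return three common notions of 'length' for the same string."""
    return {
        "code_points": len(text),                      # Python str length counts code points
        "utf8_bytes": len(text.encode("utf-8")),       # what a byte-limited hash would see
        "utf16_bytes": len(text.encode("utf-16-le")),  # UTF-16 without a byte-order mark
    }

composed = "\u00e9"      # "é" as a single precomposed code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent
emoji = "\U0001F600"     # a code point outside the Basic Multilingual Plane

for label, value in [("composed", composed), ("decomposed", decomposed), ("emoji", emoji)]:
    print(label, describe(value))
```

The two spellings of "é" render the same but differ in every count, which is why a rule that says "count code points" still leaves the maximum byte size open.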

[–] dual_sport_dork@lemmy.world 10 points 1 month ago (2 children)

Who cares? It's going to be hashed anyway. If the same user can generate the same input, it will result in the same hash. If another user can't generate the same input, well, that's really rather the point. And I can't think of a single backend, language, or framework that doesn't treat a single Unicode character as one character. Byte length of the character is irrelevant as long as you're not doing something ridiculous like intentionally parsing your input in binary and blithely assuming that every character must be 8 bits in length.

[–] frezik@midwest.social 5 points 1 month ago

It matters for bcrypt. It has a 72-byte input limit. Not characters, bytes. (scrypt doesn't have that limit.)

That said, I also think it doesn't matter much. Reasonable-length passphrases that could be covered by the old Latin-1 charset can easily fit in that. If you're talking about CJK languages, then each character is often a whole word or morpheme, and you're packing a lot of entropy into one character. 72 bytes is already beyond what's needed for security; it's diminishing returns at that point.
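A rough way to see the byte budget, sketched in Python with only the standard library. No real bcrypt call is made here: the truncation in `bcrypt_effective_input` is an assumption that mimics implementations which silently ignore input past 72 bytes (others reject over-long input instead), and both helper names are hypothetical.

```python
BCRYPT_MAX_BYTES = 72  # the input limit discussed above

def utf8_len(password: str) -> int:
    """Length of the password in UTF-8 bytes, not characters."""
    return len(password.encode("utf-8"))

def bcrypt_effective_input(password: str) -> bytes:
    """Assumed behaviour: only the first 72 bytes take part in the hash."""
    return password.encode("utf-8")[:BCRYPT_MAX_BYTES]

latin = "correct horse battery staple"  # ASCII: 1 byte per character
cjk = "密码" * 13                        # 26 characters, 3 bytes each in UTF-8

print(utf8_len(latin), utf8_len(latin) <= BCRYPT_MAX_BYTES)
print(utf8_len(cjk), utf8_len(cjk) <= BCRYPT_MAX_BYTES)

# Two passwords that differ only past byte 72 collapse to the same input.
print(bcrypt_effective_input(cjk + "x") == bcrypt_effective_input(cjk + "y"))
```

So at 3 bytes per character, the limit is reached after 24 CJK characters, while a typical Latin passphrase stays well under it.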

[–] General_Effort@lemmy.world 1 points 1 month ago

If the same user can generate the same input, it will result in the same hash.

Yes, if. I don't know if you can guarantee that. It's all fun and games as long as you're doing English. In other languages, you get characters that can be encoded in more than 1 way. User at home has a localized keyboard with a dedicated key for such a character. User travels across the border and has a different language keyboard and uses a different way to create the character. Euro problems.

https://en.wikipedia.org/wiki/Unicode_equivalence
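A minimal sketch of the problem and one common mitigation, in Python. SHA-256 is only a stand-in for a real password hash, and the choice of NFKC normalization is an assumption for illustration; `digest` is a hypothetical helper.

```python
import hashlib
import unicodedata

def digest(password: str, normalize: bool) -> str:
    """Hash the UTF-8 bytes, optionally normalizing to NFKC first."""
    if normalize:
        password = unicodedata.normalize("NFKC", password)
    # SHA-256 stands in for a proper password hash in this sketch.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

typed_at_home = "caf\u00e9"   # dedicated key: precomposed "é"
typed_abroad = "cafe\u0301"   # dead key or compose: "e" + combining accent

# Same text on screen, different bytes, different hash.
print(digest(typed_at_home, normalize=False) == digest(typed_abroad, normalize=False))

# After normalization both inputs are the same code point sequence.
print(digest(typed_at_home, normalize=True) == digest(typed_abroad, normalize=True))
```

Normalization has to be applied consistently at both registration and login, otherwise it creates the very mismatch it is meant to prevent.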

Byte length of the character is irrelevant as long as you’re not doing something ridiculous like intentionally parsing your input in binary and blithely assuming that every character must be 8 bits in length.

There is always some son-of-a-bitch who doesn’t get the word.

  • John F. Kennedy