this post was submitted on 05 Jul 2024
392 points (95.8% liked)

Programmer Humor

[–] dactylotheca@suppo.fi 17 points 4 months ago (3 children)

Oh yeah, they definitely have uses, but there's a real tendency for people to go a bit crazy with them. Complex regexen aren't exactly readable, there are all kinds of fun performance gotchas, sometimes other tools/algorithms are more suitable for the task, and sometimes people try to use them to e.g. parse HTML because they don't know that it is literally impossible to use regular expressions to parse languages that aren't regular.
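
To make the nesting point concrete, here's a rough Python sketch. The sample markup and the `DepthTracker` name are made up for illustration. A plain, non-recursive pattern pairs the first `<div>` with the first `</div>` it finds, while the standard library's `html.parser` keeps track of depth just fine:

```python
import re
from html.parser import HTMLParser

# Hypothetical sample input: one div nested inside another.
SAMPLE = "<div>outer <div>inner</div> tail</div>"

# A non-recursive pattern cannot count nesting depth, so the lazy match
# stops at the first closing tag it encounters.
naive_match = re.search(r"<div>(.*?)</div>", SAMPLE).group(1)


class DepthTracker(HTMLParser):
    """Records the maximum nesting depth of <div> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1


tracker = DepthTracker()
tracker.feed(SAMPLE)
tracker.close()

print(repr(naive_match))  # 'outer <div>inner' (cut off at the inner closing tag)
print(tracker.max_depth)  # 2
```

The regex "works" right up until the input nests, which is exactly the part a regular language can't describe.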

[–] frezik@midwest.social 12 points 4 months ago (1 children)

It's entirely possible to parse HTML in PCRE. You shouldn't, but it is possible. PCRE's pattern language stopped being strictly regular a long time ago and is entirely capable of doing it.

https://stackoverflow.com/a/4234491/830741

[–] dactylotheca@suppo.fi 7 points 4 months ago* (last edited 4 months ago)

Oh yeah, extensions which make them non-regular definitely can make it possible, but just because it's now somewhat possible with some regex engines doesn't mean it's a good idea

[–] FooBarrington@lemmy.world 5 points 4 months ago (1 children)

I once wrote a JS decompiler (de-bundler?) using ~150 regexes for step-wise transformations. Worked surprisingly well!
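
For anyone wondering what "step-wise transformations" could look like, here's a tiny Python sketch of the general idea. To be clear, these three rules and the `apply_rules` helper are invented for illustration and are not the ones from the actual tool. Each rule undoes one common minifier shorthand, and the rules run in order over the whole source:

```python
import re

# Hypothetical rewrite rules (pattern, replacement), applied in order.
# The lookahead avoids touching things like "!0.5" or "!0x1".
RULES = [
    (re.compile(r"!0(?![\w.])"), "true"),
    (re.compile(r"!1(?![\w.])"), "false"),
    (re.compile(r"\bvoid 0(?![\w.])"), "undefined"),
]


def apply_rules(source, rules=RULES):
    """Run every rule over the source, feeding each result into the next."""
    for pattern, replacement in rules:
        source = pattern.sub(replacement, source)
    return source


minified = "var a=!0,b=!1,c=void 0;"
print(apply_rules(minified))  # var a=true,b=false,c=undefined;
```

A sketch like this ignores string literals and comments entirely, so a real pipeline would need far more care (or far more rules).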

[–] Azzk1kr@feddit.nl 4 points 4 months ago (1 children)

What eldritch beast was summoned as a result?

[–] FooBarrington@lemmy.world 2 points 4 months ago

Well... No new ones, at least? Though it was around that time that I started hearing whispers in the night... "You can use WASM to ship Client-Side PHP"

[–] bleistift2@sopuli.xyz 3 points 4 months ago

> it is literally impossible to use regular expressions to parse languages that aren’t regular

It’s impossible to parse the whole syntax tree, but that doesn’t mean you can’t get the subset you’re interested in.
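
As a rough illustration of "getting the subset", here's a Python sketch that pulls `href` values out of anchor tags without building any tree. The sample markup, `HREF_RE`, and `extract_hrefs` are all made up for this example, and it assumes quoted attribute values; commented-out markup or unquoted attributes would trip it up:

```python
import re

# Hypothetical sample input.
SAMPLE = '''
<ul>
  <li><a href="https://example.com/a">A</a></li>
  <li><a class="x" href='/relative/b'>B <b>bold</b></a></li>
</ul>
'''

# Match an opening <a ...> tag and capture the quoted href value.
# Group 1 is the quote character, group 2 is the URL.
HREF_RE = re.compile(
    r"""<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE | re.DOTALL,
)


def extract_hrefs(html):
    """Return every quoted href found on an <a> tag, in document order."""
    return [m.group(2) for m in HREF_RE.finditer(html)]


print(extract_hrefs(SAMPLE))  # ['https://example.com/a', '/relative/b']
```

This never needs to know how the tags nest, which is why it can get away with a regex at all.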