this post was submitted on 22 Jul 2023
128 points (93.8% liked)
Lemmy.World Announcements
Load balancing applications is significantly more complex than most people anticipate. A naive implementation typically increases database load and reduces site performance. Balancing static content is trivial, and Cloudflare will do that by default, but balancing the dynamic, stateful requests requires careful software development to keep a naive implementation from bringing down the database. Sticky sessions are just the beginning.
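For readers unfamiliar with the term, sticky sessions mean the balancer keeps sending a given client to the same app server. Here is a minimal sketch of one common way to do it, hashing a client identifier. The function name `pick_server` and the server names are hypothetical, and this is not how Lemmy or Cloudflare implement it.

```python
import hashlib


def pick_server(client_id: str, servers: list[str]) -> str:
    """Map a client to one server deterministically (hash-based stickiness).

    Hypothetical helper for illustration only.
    """
    # Use a stable hash; Python's built-in hash() is randomized per process.
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["app-1", "app-2", "app-3"]  # assumed server names
first = pick_server("client-42", servers)
second = pick_server("client-42", servers)
print(first == second)  # the same client lands on the same server
```

One caveat with plain modulo hashing: adding or removing a server remaps many clients at once, which is part of why stickiness alone does not settle the question.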
I mean... this take is naive. Putting a load balancer up in front of a few servers isn't going to do anything to their database. No idea where you're even getting that from, as the two are completely unrelated.
The total number of application servers accessing the database is what would affect db performance in a negative way, and load balancing doesn't automatically mean "do something stupid like spin up 100 app servers when we normally use 3". All you've described is a need for a db proxy on the off chance that the Lemmy code has horrible access patterns for db transactions.
You can take your uninformed nerd rage elsewhere now, thank you.
You obviously haven't written one.
Simple case, without sticky sessions:
2 app servers behind a naive load balancer. Assume an actually RESTful service. Also assume a reasonable single-app design with persistent db connections and db caching. Assume a single client. The client's first connection comes in to app server 1. App server 1 makes a db connection and grabs the relevant data out of the db, then caches that information for the client, expecting a reconnect. The client makes a second call, the load balancer places it on app server 2, and app server 2 now makes a second connection and queries the same data.
The db has now done twice the work for a single client. This pattern is surprisingly common, and as the user count grows the duplication significantly degrades cache performance and increases load on the db.
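The scenario above can be sketched in a few lines of Python. This is a toy model, not Lemmy code: `Database`, `AppServer`, and `run_round_robin` are names made up for illustration, and the "database" is just a counter. It assumes each app server keeps its own in-process cache and the balancer is plain round-robin.

```python
from itertools import cycle


class Database:
    """Toy database that only counts how many queries it receives."""

    def __init__(self):
        self.queries = 0

    def fetch_profile(self, client_id):
        self.queries += 1
        return {"client": client_id}


class AppServer:
    """App server with a private, in-process cache (the assumption at issue)."""

    def __init__(self, db):
        self.db = db
        self.local_cache = {}

    def handle(self, client_id):
        # Only go to the database when this particular server has no copy.
        if client_id not in self.local_cache:
            self.local_cache[client_id] = self.db.fetch_profile(client_id)
        return self.local_cache[client_id]


def run_round_robin(num_servers, num_requests, client_id="client-1"):
    """Send num_requests from one client through a round-robin balancer."""
    db = Database()
    servers = [AppServer(db) for _ in range(num_servers)]
    rotation = cycle(servers)
    for _ in range(num_requests):
        next(rotation).handle(client_id)
    return db.queries


print(run_round_robin(1, 2))  # one server: 1 query
print(run_round_robin(2, 2))  # two servers: 2 queries for the same client
```

In this model the extra work is one cold miss per server per client, so the duplication scales with the number of servers holding private caches rather than with the number of requests.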
It's a common scenario for someone who doesn't understand the point of putting a load balancer in front of a stateful application, perhaps. Not for anyone trying to solve a traffic problem.
No idea where you are getting your ideas from, but this is an absolutely uninformed example of how NOT to set something up.
I'm really interested now in which one of you is right. While the other person put in some effort and gave a lot of actual information, you just come off as arrogant. Still, maybe you're right. Care to elaborate why?
I'm not one of the two arguing, but in general the app servers don't do caching or state handling.
You cache things in a third, external cache such as Redis or memcached. So if a user connects to app server 1 and then to app server 2, both servers will grab the cached info from Redis. No extra db calls required. This has been the basic way of doing things forever, even with old-school WordPress sites. You also store session cookies in there or in the db.
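A rough sketch of that setup in Python, with a plain dictionary standing in for Redis or memcached so it runs without any service. `SharedCache`, `CountingDatabase`, and `StatelessAppServer` are hypothetical names for illustration, not a real client library and not Lemmy code.

```python
class CountingDatabase:
    """Toy database that only counts how many queries it receives."""

    def __init__(self):
        self.queries = 0

    def fetch_profile(self, client_id):
        self.queries += 1
        return {"client": client_id}


class SharedCache:
    """In-memory stand-in for an external cache such as Redis or memcached."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value):
        self.store[key] = value


class StatelessAppServer:
    """App server that keeps no state of its own; it reads through the shared cache."""

    def __init__(self, db, cache):
        self.db = db
        self.cache = cache

    def handle(self, client_id):
        key = f"profile:{client_id}"
        cached = self.cache.get(key)
        if cached is not None:
            return cached
        value = self.db.fetch_profile(client_id)
        self.cache.set(key, value)
        return value


db = CountingDatabase()
cache = SharedCache()
server_1 = StatelessAppServer(db, cache)
server_2 = StatelessAppServer(db, cache)

server_1.handle("client-1")  # miss: goes to the database and fills the cache
server_2.handle("client-1")  # hit: served from the shared cache
print(db.queries)  # 1
```

Because both servers read through the same cache, it no longer matters which one the balancer picks for the second request.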
And even if you weren't caching externally like this, databases use a lot of memory to cache tons of data. So even if the same query hits the db twice, the second hit would probably still be hot in memory and return super fast. It's not double the load. At least that's the case with Postgres, which is what Lemmy uses.
Definitely this. I use PostgreSQL (which Lemmy uses on the backend) for an enterprise-grade system that has anywhere from 700-1k users at any given point in time, and it also takes in several million messages from external systems throughout the day. PostgreSQL is excellent at caching data in memory. I've got the code for that system up in another window while I write this.
At this point in time, it doesn't look like Lemmy is using any form of an L2 cache like Redis or Memcached. The only single point of failure (that's not horizontally scalable) looks like the pict-rs server that Lemmy uses for image hosting. If anything, that could easily be swapped over to something S3-compatible, hosted locally with something like MinIO or directly off of B2 or Linode cloud storage (which doesn't charge for requests).