Lemmy Server Performance

1

2

Lemmy server admins, please enable pg_stat_statements on your PostgreSQL server so the Lemmy community can better identify the performance problems on the more active sites (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

I have been working with the pg_stat_statements extension for PostgreSQL, and it gives us a way to see the actual SQL statements being executed by lemmy_server and the number of times each is called.

This has less overhead than cranking up logging, and several cloud computing services enable it by default (example), so I don't believe it will cause a significant slowdown of the server.

A DATABASE RESTART WILL BE REQUIRED

It does require that PostgreSQL be restarted, which typically takes 10 to 15 seconds.

Debian / Ubuntu install steps

https://pganalyze.com/docs/install/self_managed/02_enable_pg_stat_statements_deb

Following the conventions of "Lemmy from Scratch" server install commands:

sudo -iu postgres psql -c "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';"

Followed by a restart of the PostgreSQL service.
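After the restart, the view also has to be created once in the Lemmy database with CREATE EXTENSION pg_stat_statements; before it can be queried. Below is a minimal sketch of pulling the most expensive statements out of it. The function name top_statements is made up for illustration, the connection can come from any Python DB-API driver (psycopg2 is an assumption, not something Lemmy ships), and the column names assume PostgreSQL 13 or newer.

```python
# Sketch: list the statements that consumed the most total execution time.
# Assumes PostgreSQL 13+ column names (total_exec_time, mean_exec_time);
# older releases call them total_time and mean_time.
TOP_STATEMENTS_SQL = """
SELECT calls, total_exec_time, mean_exec_time, query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT {limit}
"""


def top_statements(conn, limit=20):
    """Return the `limit` most expensive statements as a list of dicts.

    `conn` is any DB-API 2.0 connection (hypothetical here, for example
    one opened with psycopg2 against the Lemmy database).
    """
    cur = conn.cursor()
    # int() keeps the formatted LIMIT value safe from injection.
    cur.execute(TOP_STATEMENTS_SQL.format(limit=int(limit)))
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]
```

Sorting by calls instead of total_exec_time answers the other question from this post: which statements run most often.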

2

9

Redis, Memcached, dragonfly for Lemmy and 2010 presentation on scaling.... (lemm.ee)

submitted 2 years ago by RoundSparrow@lemm.ee to c/lemmyperformance@lemmy.ml

3 comments

Lemmy is unusual in its stance of not using Redis, Memcached, Dragonfly... something. And look at all the CPU cores and RAM in use for what this week is reported as 57K active users across over 1,200 instance servers.

Why no Redis, Memcached, or Dragonfly? These are staples of scaling an API.

Anyway, Reddit too started with PostgreSQL and was open source.

MONDAY, MAY 17, 2010

http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html

"and growing Reddit to 7.5 million users per month"

Lesson 5: Memcache
The essence of this lesson is: memcache everything.

They store everything in memcache: 1. Database data 2. Session data 3. Rendered pages 4. Memoizing (remember previously calculated results) internal functions 5. Rate-limiting user actions, crawlers 6. Storing pre-computing listings/pages 7. Global locking.

They store more data now in Memcachedb than Postgres. It’s like memcache but stores to disk. Very fast. All queries are generated by same piece of control and is cached in memcached. Change password Links and associated state are cached for 20 minutes or so. Same for Captchas. Used for links they don’t want to store forever.

They built memoization into their framework. Results that are calculated are also cached: normalized pages, listings, everything.

3

4

post inclusion, a solid WHERE clause filter before any JOIN on SELECT post (listing of posts) (lemmy.ml)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

20 comments

4

5

lemmy Server Performance backwards from PostgreSQL data to Rust code and TRIGGER FUNCTION logic, listed observations (bulletintree.com)

submitted 2 years ago by RoundSparrow@bulletintree.com to c/lemmyperformance@lemmy.ml

0 comments

1. The post primary key has gaps in it. Is the sequence being consumed by transactions that are later canceled, or by some kind of orphan posts? This is observable from incoming federation posts from other instances.
2. comment_aggregates has its own unique id column, separate from the comment table's id column. There is a one-to-one row relationship. Can the logic be reworked to eliminate the extra column and its INDEX overhead?
3. Related to 2: the same issue probably exists in the post_aggregates table and others that have one-to-one join relationships.

5

4

Lemmy Server Performance, PostgreSQL log_min_duration_statement - can big servers share logs? value 2500ms targeting "slow queries" (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

1 comment

This is the first post or comment in Lemmy history to say log_min_duration_statement ... ;)

It is possible to instruct PostgreSQL to log any query that takes longer than a set amount of time; I think 2.5 seconds would be a useful starting point. This keeps logging activity down to only the queries causing the most serious issues.

"Possibly the most generally useful log setting for troubleshooting performance, especially on a production server. Records only long-running queries for analysis; since these are often your "problem" queries, these are the most useful ones to know about. Used for pg_fouine." - https://postgresqlco.nf/doc/en/param/log_min_duration_statement/

I think it would really help if we could get lemmy.world or some other big site to turn on this logging and share it so we can try to better reproduce the performance overloads on development/testing systems. Thank you.
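For reference, the setting can be applied with ALTER SYSTEM SET log_min_duration_statement = '2500ms'; followed by SELECT pg_reload_conf(); and it does not need a restart. Once a server has such a log, something like the sketch below could boil it down to a short list that is easier to share. The function name is made up, the text before "duration:" depends on log_line_prefix so it is ignored, and only the first line of a multi-line statement is captured.

```python
import re
from collections import defaultdict

# Matches the "duration: ... ms  statement: ..." part of a PostgreSQL log line.
# Statements sent through the extended protocol are logged as "execute <name>:".
DURATION_RE = re.compile(
    r"duration: (?P<ms>\d+(?:\.\d+)?) ms\s+"
    r"(?:statement|execute [^:]*|parse [^:]*|bind [^:]*): (?P<sql>.*)"
)


def summarize_slow_queries(lines):
    """Group logged statements by SQL text.

    Returns (sql, count, total_ms) tuples, largest total time first.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = DURATION_RE.search(line)
        if match is None:
            continue  # not a duration line
        entry = totals[match.group("sql").strip()]
        entry[0] += 1
        entry[1] += float(match.group("ms"))
    return sorted(
        ((sql, count, total) for sql, (count, total) in totals.items()),
        key=lambda row: row[2],
        reverse=True,
    )
```

Because the extended protocol logs parameter placeholders such as $1 rather than values, grouping by SQL text tends to work reasonably well, but any literal values in the log should be reviewed before a log is shared publicly.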

6

11

Code to Stress Test Lemmy for performance (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

3 comments

I thought some people were out there in June creating stress-testing scripts, but I haven't seen anything materializing/showing results in recent weeks?

I think it would be useful to have an API client that establishes some baseline performance number that can be run before a new release of Lemmy and at least ensure there is no performance regression?

The biggest problem I have had since day 1 is not being able to reproduce the data that lemmy.ml has inside. There is a lot of older content stored that does not get replicated, etc.

The site_aggregates UPDATE statement lacking a WHERE clause and hitting 1500 rows (number of known Lemmy instances) of data instead of 1 row is exactly the kind of data-centered problem that has slipped through the cracks. That was generating a ton of extra PostgreSQL I/O for every new comment and post from a local user.

The difficult things to take on:

Simulating 200 instances instead of just 5 that the current API testing code does. First, just to have 200 rows in many of the instance-specific tables so that local = false API calls are better exercised. And probably about 25 of those instances have a large number of remote subscribers to communities.
async federation testing. The API testing in lemmy right now does immediate delivery with the API call so you don't get to find out the tricky cases of servers being unreachable.
Bulk loading of data. On one hand it is good to exercise the API by inserting posts and comments one at a time, but maybe loading data directly into the PostgreSQL backend would speed up development and testing?
The impact of scheduled jobs such as updates to certain aggregate data and post ranking for sorting. We may want to add a special API feature so testing code can trigger these on demand, to stress test that concurrency with PostgreSQL isn't running into overloads.
Historically, new versions of Lemmy have changed the PostgreSQL table layout and indexes (schema), which can take significant time to execute on a production server with existing data. Server operators need some expectation of how long an upgrade can take to modify data.
Searching on communities, posts, comments with significant amounts of data in PostgreSQL. Scanning content of large numbers of posts and comments can be done by users at any time.
non-Lemmy federated content in database. Possible performance and code behavior that arises from Mastodon and other non-Lemmy interactions.
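As a starting point for the baseline number mentioned above, here is a minimal timing harness sketch. It is not tied to any existing Lemmy test code: measure_latency and has_regressed are hypothetical names, the 25% tolerance is an arbitrary assumption, and the thing being timed is passed in as a plain callable so it can be an HTTP request, a SQL query, or anything else.

```python
import statistics
import time


def measure_latency(call, repetitions=50, clock=time.perf_counter):
    """Run `call` repeatedly and report latency statistics in milliseconds."""
    samples = []
    for _ in range(repetitions):
        start = clock()
        call()
        samples.append((clock() - start) * 1000.0)
    samples.sort()
    # Nearest-rank style 95th percentile over the sorted samples.
    p95_index = max(0, int(round(0.95 * len(samples))) - 1)
    return {
        "count": len(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


def has_regressed(baseline, current, tolerance=1.25):
    """True when the current p95 is more than `tolerance` times the baseline."""
    return current["p95_ms"] > baseline["p95_ms"] * tolerance
```

In practice `call` would wrap a request against a test instance, for example a post listing fetched with urllib.request.urlopen. The numbers only mean something if the test database holds a realistic amount of data, which is the hard part described in the list above.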

I don't think it would be a big deal if the test takes 30 minutes or even longer to run.

And I'll go out and say it: is a large Lemmy server willing to offer a copy of their database for performance troubleshooting and testing? Lemmy.ca cloned their database last Sunday, which led to the discovery of the site_aggregates UPDATE without WHERE problem. Maybe we can create a procedure for removing private messages and getting a dump once a month from a big server to analyze possible causes of PostgreSQL overloads? This may be a faster path than building up from scratch with new testing logic.

7

28

How lemmy.ca took on finding why PostgreSQL was at 100% CPU and crashing the Lemmy website this past weekend. PostgreSQL auto_explain (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

6 comments

IIRC, a full copy of lemmy.ca's live data was used (the copy was made on a development system, not the live server, if I'm following). This procedure was shared on GitHub on Saturday, July 22:

...

Notable is the AUTO_EXPLAIN SQL activation statements:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;
SET auto_explain.log_analyze = true;
SET auto_explain.log_nested_statements = true;

This technique would be of great use to developers making changes and studying PostgreSQL activity. Thank you!

8

3

Lemmy Server optimization of PostgreSQL by focusing on community ownership of a post - and visibility of a post in a single control field. Also local flag transition to instance_id field, with value 1 (lemmy.ml)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

5 comments

Right now querying posts has logic like this:

WHERE (((((((((("community"."removed" = $9) AND ("community"."deleted" = $10)) AND ("post"."removed" = $11)) AND ("post"."deleted" = $12)) AND (("community"."hidden" = $13)

Note that a community can be hidden or deleted, separate fields. And it also has logic to see if the creator of the post is banned in the community:

LEFT OUTER JOIN "community_person_ban" ON (("post"."community_id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "post"."creator_id"))

And there is both a deleted boolean (end-user delete) and removed boolean (moderator removed) on a post.

Much of this also applies to comments, which are owned by the post, which in turn is owned by the community.
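To make the "single control field" idea from the title concrete, here is one possible sketch of folding those separate booleans into one integer. Everything in it is hypothetical: the HiddenReason names, the bit values, and the notion of a post.visibility column are assumptions for illustration, not Lemmy's schema, and it ignores nuances such as hidden communities still being shown to their subscribers.

```python
from enum import IntFlag


class HiddenReason(IntFlag):
    """Hypothetical reasons a post is left out of listings, one bit each."""
    COMMUNITY_REMOVED = 1
    COMMUNITY_DELETED = 2
    COMMUNITY_HIDDEN = 4
    POST_REMOVED = 8
    POST_DELETED = 16
    CREATOR_BANNED = 32


def visibility_flags(community, post, creator_banned):
    """Fold the separate booleans into one integer; 0 means listable."""
    flags = HiddenReason(0)
    if community["removed"]:
        flags |= HiddenReason.COMMUNITY_REMOVED
    if community["deleted"]:
        flags |= HiddenReason.COMMUNITY_DELETED
    if community["hidden"]:
        flags |= HiddenReason.COMMUNITY_HIDDEN
    if post["removed"]:
        flags |= HiddenReason.POST_REMOVED
    if post["deleted"]:
        flags |= HiddenReason.POST_DELETED
    if creator_banned:
        flags |= HiddenReason.CREATOR_BANNED
    return int(flags)
```

With a field like this stored on the post row, the listing filter could shrink to a single comparison against 0 before any JOIN. The trade-off is that the field has to be rewritten on every post in a community whenever the community or a ban changes, so the cost moves from reads to (rarer) writes.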

9

6

Lemmy Server Performance - is this Rust code reading the entire table of all site languages? Does it need to? (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

1 comment

In the Communities create community/edit community code there is a SiteLanguage::read with no site_id. Should that call to read have site_id = 1?

For reference, on my production instance my site_language table has 198460 rows and my site table has 1503 rows. Average of 132 languages per site. counts: https://lemmyadmin.bulletintree.com/query/pgcounts?output=table

10

17

Lemmy Server Performance, two birds with one stone, mass deletes of content: Account Delete, Community Delete, User Ban - moving these operations to a linear queue and add undo support / grace period (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

9 comments

A general description of the proposed change and reasoning behind it is on GitHub: https://github.com/LemmyNet/lemmy/issues/3697

Linear execution of these massive changes to votes/comments/posts with concurrency awareness. Also adds a layer of social awareness, the impact on a community when a bunch of content is black-holed.

An entire site federation delete / dead server - also would fall under this umbrella of mass data change with a potential for new content ownership/etc.

11

1

Lemmy Server and language choices on every individual comment, many rows in the database per-community, per-site, etc. Overhead of a comment INSERT SQL (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

3 comments

Over a short period of time, this is my incoming federation activity for new comments, with pg_stat_statements output being shown. It is interesting to note that these two INSERT statements on comments differ only in the DEFAULT value of the language column. Also note that the average execution time is way higher (4.3 vs. 1.28) when the language value is set, I assume due to INDEX updates on the column? Or possibly a TRIGGER?

About half of the comments coming in from other servers have default value.

WRITES are heavy, even if it is an INDEX that has to be revised. So INSERT and UPDATE statements are important to scrutinize.

12

3

REQUEST community review of Lemmy Server Performance on Post Votes, Comment Votes - the most frequent database writes. Optimize? (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

2 comments

Given how frequently these records are created, one for every vote by a user, I think it is important to study and review how this works.

The current design of lemmy_server 0.18.3 is to issue a SQL DELETE before (almost?) every INSERT of a new vote. The INSERT already has an UPDATE clause on it.

This is one of the few places in Lemmy that a SQL DELETE statement actually takes place. We have to be careful triggers are not firing multiple times, such as decreasing the vote to then immediately have it increase with the INSERT statement that comes later.

For insert of a comment, Lemmy doesn't seem to routinely run a DELETE before the INSERT. So why was this design chosen for votes? The likely reason is that a user can "undo" a vote and have the record of them ever voting removed from the database. Is that the actual behavior in testing?

pg_stat_statements from an instance doing almost entirely incoming federation activity of post/comments from other instances:

DELETE FROM "comment_like" WHERE (("comment_like"."comment_id" = $1) AND ("comment_like"."person_id" = $2)) executed 14736 times, with 607 matching records.
INSERT INTO "comment_like" ("person_id", "comment_id", "post_id", "score") VALUES ($1, $2, $3, $4) ON CONFLICT ("comment_id", "person_id") DO UPDATE SET "person_id" = $5, "comment_id" = $6, "post_id" = $7, "score" = $8 RETURNING "comment_like"."id", "comment_like"."person_id", "comment_like"."comment_id", "comment_like"."post_id", "comment_like"."score", "comment_like"."published" executed 15883 times - each time transacting.
update comment_aggregates ca set score = score + NEW.score, upvotes = case when NEW.score = 1 then upvotes + 1 else upvotes end, downvotes = case when NEW.score = -1 then downvotes + 1 else downvotes end where ca.comment_id = NEW.comment_id TRIGGER FUNCTION update executing 15692 times.
update person_aggregates ua set comment_score = comment_score + NEW.score from comment c where ua.person_id = c.creator_id and c.id = NEW.comment_id TRIGGER FUNCTION update, same executions as previous.

There is some understanding to gain by the count of executions not being equal.
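To experiment with the question of whether the DELETE is needed, here is a small self-contained sketch. It uses SQLite (bundled with Python) as a stand-in for PostgreSQL, with cut-down versions of the two tables. The like_update trigger is hypothetical: it is what would let a single upsert adjust the aggregate on a changed vote, and nothing in this post confirms that Lemmy has such a trigger. PostgreSQL trigger semantics should be checked separately before drawing conclusions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE comment_aggregates (
    comment_id INTEGER PRIMARY KEY,
    score INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE comment_like (
    person_id INTEGER NOT NULL,
    comment_id INTEGER NOT NULL,
    score INTEGER NOT NULL,
    UNIQUE (comment_id, person_id)
);
CREATE TRIGGER like_insert AFTER INSERT ON comment_like BEGIN
    UPDATE comment_aggregates SET score = score + NEW.score
    WHERE comment_id = NEW.comment_id;
END;
CREATE TRIGGER like_delete AFTER DELETE ON comment_like BEGIN
    UPDATE comment_aggregates SET score = score - OLD.score
    WHERE comment_id = OLD.comment_id;
END;
-- Hypothetical extra trigger: applies only the difference on a changed vote.
CREATE TRIGGER like_update AFTER UPDATE OF score ON comment_like BEGIN
    UPDATE comment_aggregates SET score = score - OLD.score + NEW.score
    WHERE comment_id = NEW.comment_id;
END;
"""

UPSERT = """
INSERT INTO comment_like (person_id, comment_id, score) VALUES (?, ?, ?)
ON CONFLICT (comment_id, person_id) DO UPDATE SET score = excluded.score
"""


def vote_with_upsert(conn, person_id, comment_id, score):
    """Record or change a vote with one statement and no prior DELETE."""
    conn.execute(UPSERT, (person_id, comment_id, score))


def current_score(conn, comment_id):
    """Read the aggregate score maintained by the triggers."""
    row = conn.execute(
        "SELECT score FROM comment_aggregates WHERE comment_id = ?",
        (comment_id,),
    ).fetchone()
    return row[0]


def demo():
    """Upvote, then switch to a downvote, and return the aggregate score."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO comment_aggregates (comment_id) VALUES (1)")
    vote_with_upsert(conn, person_id=7, comment_id=1, score=1)
    vote_with_upsert(conn, person_id=7, comment_id=1, score=-1)
    return current_score(conn, 1)
```

If the aggregate triggers only cover INSERT and DELETE, a bare upsert on an existing vote would leave the aggregate stale, which may be one reason for the DELETE-first design. An explicit "undo vote" would still need a real DELETE either way.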

13

2

GREAT NEWS about Lemmy Server Performance, another major SQL mistake has been discovered today: every single comment & post create (INSERT) is updating ~1700 rows in the site_aggregates table (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

7 comments

Details here: https://github.com/LemmyNet/lemmy/issues/3165

This will VASTLY decrease the PostgreSQL I/O load on the server, as this mistaken code is writing ~1700 rows (one for each known Lemmy instance in the database) on every single comment & post creation. Because these are writes, which are harsh on the system, it also creates record-locking issues. Once this is fixed, some site operators will be able to downgrade their hardware! ;)
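The effect is easy to reproduce in miniature. The sketch below uses SQLite (bundled with Python) as a stand-in for PostgreSQL and a cut-down site_aggregates table; the WHERE site_id = 1 filter assumes the local site is row 1 and is only an illustration, the actual fix is in the GitHub issue linked above.

```python
import sqlite3


def rows_touched(with_where, instances=1700):
    """Return how many site_aggregates rows one 'new comment' UPDATE rewrites."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE site_aggregates ("
        "site_id INTEGER PRIMARY KEY, "
        "comments INTEGER NOT NULL DEFAULT 0)"
    )
    # One row per known instance, as on a federated production server.
    conn.executemany(
        "INSERT INTO site_aggregates (site_id) VALUES (?)",
        ((i,) for i in range(1, instances + 1)),
    )
    sql = "UPDATE site_aggregates SET comments = comments + 1"
    if with_where:
        sql += " WHERE site_id = 1"  # assumption: the local site is row 1
    return conn.execute(sql).rowcount
```

Calling rows_touched(False) reports one rewritten row per instance, while rows_touched(True) reports a single row. A test database with only a handful of instances hides the difference, which is how this slipped through.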

14

1

Lemmy scaling/performance: Move expensive PostgreSQL triggers to scheduled jobs. · GitHub Issue #3528 · LemmyNet/lemmy (github.com)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

15

1

Information Overload - Beehaw style - Beehaw (beehaw.org)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

16

1

lemmy_server Rust code now exposes internal metrics via Prometheus endpoint (github.com)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

17

1

Big Lemmy server lemmy.world has put into production critical performance fixes from a runaway SQL query identified by analyzing pg_stat_statements output - and greatly reduced their server overload! (lemmy.world)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

lemmy.world announcing to public that this is installed on their production server: https://lemmy.world/post/1061471

18

2

De-facto memory leak (lemmy.world)

submitted 2 years ago by DreadTowel@lemmy.world to c/lemmyperformance@lemmy.ml

2 comments

It looks like the lack of persistent storage for the federated activity queue is leading to instances running out of memory in a matter of hours. See my comment for more details.

Furthermore, this leads to data loss, since there is no other consistency mechanism. I think it might be a high priority issue, taking into account the current momentum behind growth of Lemmy...
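For illustration, a disk-backed queue does not have to be complicated. The sketch below is an assumption-heavy toy, not a proposal for how lemmy_server should do it: DurableQueue, the outbox table, and the use of a SQLite file are all made up for the example (a real implementation would more likely use a PostgreSQL table).

```python
import json
import sqlite3


class DurableQueue:
    """Outbound activity queue stored on disk instead of in process memory."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "inbox_url TEXT NOT NULL, "
            "payload TEXT NOT NULL)"
        )
        self._conn.commit()

    def enqueue(self, inbox_url, activity):
        """Persist one activity destined for a remote inbox."""
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO outbox (inbox_url, payload) VALUES (?, ?)",
                (inbox_url, json.dumps(activity)),
            )

    def peek(self):
        """Return (id, inbox_url, activity) for the oldest item, or None."""
        row = self._conn.execute(
            "SELECT id, inbox_url, payload FROM outbox ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        return row[0], row[1], json.loads(row[2])

    def ack(self, item_id):
        """Delete an item only after the remote server accepted it."""
        with self._conn:
            self._conn.execute("DELETE FROM outbox WHERE id = ?", (item_id,))

    def close(self):
        self._conn.close()
```

Because items are removed only on ack, a crash or restart between peek and ack results in a retry rather than a lost activity, and memory use no longer grows with the backlog.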

19

1

lemmy PERFORMANCE CRISIS: Rust code in pull requests is starting to use moka caching crate (lemmy.ml)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

https://docs.rs/moka/latest/moka/

20

1

Lemmy PERFORMANCE CRISIS: popular instances with many remote instances following big communities could stop federating outbound for Votes on comments/posts - code path identified (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

I spent several hours tracing in production (updating the code a dozen times with extra logging) to identify the actual path the lemmy_server code uses for outbound federation of votes to subscribed servers.

Major popular servers (Beehaw, Lemmy.world, Lemmy.ml) have a large number of instance servers subscribing to their communities to get copies of every post/comment. Comment votes/likes are the most common activity, and it is proposed that during the PERFORMANCE CRISIS outbound vote/like sharing be turned off by these overwhelmed servers.

pull request for draft:

https://github.com/LemmyNet/lemmy/compare/main...RocketDerp:lemmy_comment_votes_nofed1:no_federation_of_votes_outbound0

EDIT: LEMMY_SKIP_FEDERATE_VOTES environment variable

21

1

PERFORMANCE OPTIMIZATION: lemmy_server Rust code all over database lookup: "= LocalSite::read" (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

Grep the lemmy_server code for "= LocalSite::read" and you find that even for a single vote by an end-user, it runs an SQL query against the local site settings to see if downvotes are disabled.

Can some Rust programmers chime in here? Can we cache this in RAM and not fetch from SQL every time?

PostgreSQL is telling me that the 2nd most run query on my system, which is receiving incoming federation post/comment/votes, is this:

SELECT "local_site"."id", "local_site"."site_id", "local_site"."site_setup", "local_site"."enable_downvotes", "local_site"."enable_nsfw", "local_site"."community_creation_admin_only", "local_site"."require_email_verification", "local_site"."application_question", "local_site"."private_instance", "local_site"."default_theme", "local_site"."default_post_listing_type", "local_site"."legal_information", "local_site"."hide_modlog_mod_names", "local_site"."application_email_admins", "local_site"."slur_filter_regex", "local_site"."actor_name_max_length", "local_site"."federation_enabled", "local_site"."captcha_enabled", "local_site"."captcha_difficulty", "local_site"."published", "local_site"."updated", "local_site"."registration_mode", "local_site"."reports_email_admins" FROM "local_site" LIMIT $1
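The general shape of "cache this in RAM" is sketched below in Python rather than Rust, just to show the idea. CachedSetting, the one-second default age, and the loader callable are all hypothetical; in lemmy_server the loader would be whatever currently performs the LocalSite::read query.

```python
import time


class CachedSetting:
    """Hold one value in RAM and reload it at most once per `max_age` seconds."""

    def __init__(self, loader, max_age=1.0, clock=time.monotonic):
        self._loader = loader      # callable that performs the real SQL read
        self._max_age = max_age
        self._clock = clock
        self._value = None
        self._loaded_at = None     # None means "nothing cached yet"

    def get(self):
        """Return the cached value, reloading it if it is missing or stale."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._max_age:
            self._value = self._loader()
            self._loaded_at = now
        return self._value

    def invalidate(self):
        """Force the next get() to reload, e.g. after an admin edits settings."""
        self._loaded_at = None
```

Even a very short max_age would collapse thousands of identical reads per minute into a handful, at the cost of settings changes taking up to that long to be noticed unless the code that saves site settings also calls invalidate().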

22

1

Admin tools (lemmy.mayes.io)

submitted 2 years ago by code@lemmy.mayes.io to c/lemmyperformance@lemmy.ml

0 comments

Has anyone come up with some admin tools to display anything helpful regarding servers like

Community federation status last sync pass fail etc

General db stats size etc

23

1

lemmy.world feedback is that 0.17.4 performs better than 0.18.1 - is this true? What is slower? (lemmy.ml)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

Specifically the database backend. Is the polling for notifications causing more database load? From my personal testing, lemmy.ml has the same performance problems with 0.17.4 that it does with 0.18.1, and I haven't seen any code changes that are that significant for the database.

24

1

lemmy_server API for Clients and Federation alike, concurrency self-awareness, load shedding, and self-tuning (lemmy.ml)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

Heavy loads have been brought up several times, major social events where people flock to social media. Such as a terrorist bombing, submarine sinking, earthquake, nuclear meltdown, celebrity airplane crash, etc.

Low-budget hosting to compete with the tech giants is also a general concern for the project. Trying not to have a server that uses tons of expensive resources during some peak usage.

Load shedding and self-tuning within the code, and a way for SysOps to get some idea of whether their Lemmy server is nearing overload thresholds.
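A minimal sketch of the load-shedding half of that idea: cap the number of requests in flight and reject the rest immediately instead of letting them pile up on the database. LoadShedder and its counters are hypothetical names, and the right limit would have to be found by measurement; this is not how lemmy_server works today.

```python
import threading


class LoadShedder:
    """Reject work immediately once `max_in_flight` requests are active."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._lock = threading.Lock()
        self.accepted = 0   # counters a SysOp could export as metrics
        self.rejected = 0

    def try_run(self, handler):
        """Run `handler()` if a slot is free; return (ok, result)."""
        if not self._slots.acquire(blocking=False):
            with self._lock:
                self.rejected += 1
            return False, None  # caller would answer "503 try again later"
        try:
            with self._lock:
                self.accepted += 1
            return True, handler()
        finally:
            self._slots.release()
```

The ratio of rejected to accepted gives operators the "nearing overload" signal, and a self-tuning version could adjust max_in_flight from observed database latency.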

See comments:

25

1

lemmy-ui seems to be doing database search for Post Title matches while typing every single character or edit, for performance reasons suggest an option to disable this feature be available to admins (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments

My concern is that Lemmy is not scaling and was not tested with enough postings in the database. These "nice to have" slick UI features might have worked when the quantity of postings was much smaller, but it puts a heavy real-time load on the database to search postings that keep growing in table size every day.

I also suggest that this kind of feature be discussed with smartphone app and alternate webapp creators, as it can really busy up a server doing text pattern matches on all prior posting content.
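Besides an admin switch to disable the feature, a common client-side mitigation is debouncing: only search once the user has paused typing. That is a different technique from what is proposed above, and the sketch below only models its effect. The function name and the 0.4 second pause are assumptions, and it says nothing about what lemmy-ui currently does.

```python
def debounced_searches(keystrokes, quiet_seconds=0.4):
    """Return the search texts that would actually be sent to the server.

    `keystrokes` is a list of (timestamp_seconds, text_so_far) in time order.
    A search fires only when no further keystroke arrives within
    `quiet_seconds`, or for the final keystroke.
    """
    sent = []
    for index, (timestamp, text) in enumerate(keystrokes):
        is_last = index == len(keystrokes) - 1
        paused = is_last or keystrokes[index + 1][0] - timestamp >= quiet_seconds
        # Skip empty input and repeats of the text just searched for.
        if paused and text and (not sent or sent[-1] != text):
            sent.append(text)
    return sent
```

Typing a five-letter word with one short pause in the middle would go from five database searches to two under this model.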