#1523 MAM uses too much CPU

Reporter kousu
Owner Nobody
Created
Updated
Stars ★ (1)
Tags
  • Status-New
  • Type-Defect
  • Priority-Medium
  1. kousu on

    What steps will reproduce the problem? 1. Construct a large (~2 year) long MAM archive 2. Install and turn on Dino, which syncs the entire MAM archive https://github.com/dino/dino/issues/521 3. Watch top and general connection quality. My MAM configuration is: ``` archive_expires_after = "3y" default_archive_policy = "roster" archive_store = "archive2" -- the old data, todo: migrate to 'archive' ``` What is the expected output? Prosody should remain stable for all my users. What do you see instead? My connections are disrupted (my peers report getting "recipient-unavailable" from their servers), my users bounce in and out of MUCs, and prosody has 70% - 98% CPU usage. What version of the product are you using? On what operating system? ArchLinux: # pacman -Qi prosody Name : prosody Version : 1:0.11.5-1 Description : Lightweight and extensible Jabber/XMPP server written in Lua Architecture : x86_64 URL : https://prosody.im/ Licenses : MIT Groups : None Provides : prosody Depends On : lua52 lua52-socket lua52-expat lua52-filesystem libidn openssl Optional Deps : lua52-sec: TLS encryption support [installed] lua52-dbi: SQL storage support Required By : None Optional For : None Conflicts With : prosody Replaces : None Installed Size : 1463.54 KiB Packager : Sergej Pupykin <pupykin.s+arch@gmail.com> Build Date : Wed 25 Mar 2020 03:51:44 PM CET Install Date : Sun 29 Mar 2020 12:00:39 AM CET Install Reason : Explicitly installed Install Script : Yes Validated By : Signature Please provide any additional information below. This bug is mostly on Dino's side; they're going to work on reducing the load soon. But it shouldn't be possible for one user with a large archive to disrupt an entire server. You could imagine someone making a malicious bot that fills up their MAM archive and then re-requests it all, from several clients, all at once. Here's the log from a single isolated client on my server, which is enough to spike the CPU: Apr 06 16:43:05 comms2.kousu.ca prosody[50073]: kousu.ca:mam: Archive query, id nil with anyone from the dawn of time until now) # sleeps here for ~2s Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: kousu.ca:mam: Archive query nil completed Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <iq to='nick@kousu.ca/dino.106396' id='ec2fe0fb-b63d-431d-97cf-5cda2023cdf8' type='result'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Received[c2s]: <a xmlns='urn:xmpp:sm:3' h='751'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: #queue = 21 Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Queuing <r> (in a moment) from handle_a - #queue=21 Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending <r> (inside timer, before send) from handle_a - #queue=22 Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <r xmlns='urn:xmpp:sm:3'> Apr 06 16:43:09 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending <r> (inside timer, after send) from handle_a - #queue=22 Apr 06 16:43:10 comms2.kousu.ca prosody[50073]: c2s564227361440: Handled 103 incoming stanzas Apr 06 16:43:10 comms2.kousu.ca prosody[50073]: c2s564227361440: Received[c2s]: <iq type='set' id='a56482f6-8137-4b3c-9082-79992e6d3121'> # loops here: Apr 06 16:43:10 comms2.kousu.ca prosody[50073]: kousu.ca:mam: Archive query, id nil with anyone from the dawn of time until now) # sleeps here for ~2s Apr 06 16:43:14 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:14 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> Apr 06 16:43:14 comms2.kousu.ca prosody[50073]: c2s564227361440: Sending[c2s]: <message to='nick@kousu.ca/dino.106396'> [...] This seems to be a simple paged download as designed, since the log rate is constant and the iq at the end looks like this: <iq to='nick@kousu.ca/dino.106396' id='ec2fe0fb-b63d-431d-97cf-5cda2023cdf8' type='result'> <fin xmlns='urn:xmpp:mam:2'> <set xmlns='http://jabber.org/protocol/rsm'> <first> h4xvoaXgQkywVLGE </first> <count> 136716 </count> <last> D7Uu42gOgs9QfxsO </last> </set> </fin> </iq> I see https://prosody.im/doc/modules/mod_mam warns "Setting large number is not recommended, as Prosody will be blocked while processing the request and will not be able to do anything else.", but I have it set to the default page size of 20 and I'm still having problems, because my client decided to download the entire archive. Could MAM downloading be simply rate-limited?

  2. Menel on

    The new c2s limits reslove this for me..(I think its them) But of course that's not a real fix.

  3. Stephen Paul Weber on

    I had similar issues until switching to a sql storage backend

  4. kousu on

    Several months ago I also switched to ``` storage = "sql" ``` I used https://prosody.im/doc/migrator to move everything from "archive2" to "sql" and now everything is much lower-resource. Phew! Prosody was getting kind of unusable.

New comment

Not published. Used for spam prevention and optional update notifications.