#1545 mod_mam and mod_muc_mam lag significantly on internal storage
Reporter
ge0rg
Owner
Zash
Created
Updated
Stars
★ (1)
Tags
Type-Defect
Status-Fixed
Milestone-0.12
Priority-Medium
ge0rg
on
Running prosody-trunk with mod_mam and mod_muc_mam (both built-in), the server lags significantly (read: multiple seconds) each time a user or MUC hits the MAM quota: the history is "shifted" by one item, which triggers a full rewrite of the backend file.
One of the affected files on my server is 52MB, and it takes 3~5s of blocking wall-clock time to load, deserialize, remove the first item, serialize, and store it back. On *each* *new* *message* to that MUC:
May 05 16:53:13 chat.yax.im:storage_internal debug someroom reached or over quota, not adding to store
May 05 16:53:13 chat.yax.im:muc_mam debug User 'someroom' over quota, truncating archive
May 05 16:53:15 chat.yax.im:storage_internal debug someroom has 9999 items out of 10000 limit in store muc_log
This is a very effective DoS on the server, which can't do anything else during that activity. While I understand that storage_internal is not your favorite child, it appears to be the default (which it shouldn't be), and the modules don't refuse to load on it (which they probably should).
A quick workaround would be to change the quota enforcement from one-message-at-a-time to something like 1%-at-a-time or 5%-at-a-time, here:
https://hg.prosody.im/trunk/file/tip/plugins/mod_muc_mam.lua#l400 and in the respective place in mod_mam.
Thank you.
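The batch-truncation idea suggested above can be sketched roughly as follows. This is a minimal illustrative model in Python, not Prosody's actual Lua storage code: the archive is modeled as a plain list, and the function and constant names are hypothetical. The point is that evicting a batch (e.g. 1% of the quota) amortizes the expensive full-file rewrite over ~100 messages instead of paying it on every single one.

```python
# Hypothetical sketch of the proposed quota enforcement: instead of
# evicting one oldest item per new message (which forces a full rewrite
# of the backing file every time on file-based storage), evict a batch
# so the rewrite only happens once per ~quota*fraction messages.

QUOTA = 10_000           # archive item limit (mirrors the 10000 in the log)
CLEANUP_FRACTION = 0.01  # delete 1% of the quota at a time

def append_with_quota(archive, item, quota=QUOTA, fraction=CLEANUP_FRACTION):
    """Append `item`, batch-evicting the oldest entries when over quota."""
    if len(archive) >= quota:
        batch = max(1, int(quota * fraction))
        del archive[:batch]              # one "rewrite" drops a whole batch
        append_with_quota.rewrites += 1  # count expensive rewrite operations
    archive.append(item)

append_with_quota.rewrites = 0

archive = list(range(QUOTA))  # start exactly at the quota
for msg in range(1000):       # 1000 new messages arrive while at quota
    append_with_quota(archive, msg)

# With 1%-at-a-time eviction, 10 rewrites handle 1000 messages,
# versus 1000 rewrites with one-at-a-time eviction.
print(append_with_quota.rewrites)
```

With one-at-a-time eviction every over-quota message pays the full load/serialize/store cycle; with a 1% batch the cycle runs once and then the next 99 messages append without touching the quota.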
Zash
on
Thanks. https://hg.prosody.im/trunk/rev/62794e065e33 sets the MAM modules to delete 1% of messages. I'll leave this as Started while we evaluate how that performs.
Zash
on
So, how does it behave?
ge0rg
on
I haven't noticed any lag so far, but on the other hand I have grepped my log files from the last two weeks for "reached or over quota, not adding to store" from the MUC component, and there were zero matches. Did anything else change in the last weeks that affects storage? At least I can't see the MUC domain in my sqlite file...
Zash
on
I forget whether we figured this out in a chat somewhere, but you were using internal storage according to the log snippet in the initial comment:
> chat.yax.im:storage_internal
I'm going to assume that the current 1% deletion strategy is good enough for now.
If a config option is desired, file a feature request for that.
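As a back-of-the-envelope check on why the 1% batching resolves the reported lag, the numbers from the initial report (a ~52MB store and roughly 3-5s of blocking time per full rewrite) can be plugged into a simple cost model. This is illustrative arithmetic only; the per-rewrite time is an assumed midpoint, not a measurement of the fixed code.

```python
# Rough cost comparison of the two eviction strategies, using the
# figures reported in this issue (assumed ~4 s per full-file rewrite).

quota = 10_000
messages = 1_000         # new messages arriving while the store is at quota
seconds_per_rewrite = 4  # midpoint of the reported 3~5 s blocking time

# One-at-a-time eviction: every message triggers a full rewrite.
old_blocking = messages * seconds_per_rewrite

# 1% batches: a rewrite only once per quota * 0.01 = 100 messages.
new_blocking = (messages // int(quota * 0.01)) * seconds_per_rewrite

print(old_blocking, new_blocking)  # 4000 vs 40 seconds of blocking time
```

Under these assumptions, 1000 messages cost about 4000 seconds of blocking time with one-at-a-time eviction, versus about 40 seconds with 1% batches, which is consistent with ge0rg no longer observing lag after the fix.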