#1986 mod_csi_simple broken in smacks hibernation mode
Reporter
Thilo Molitor
Owner
Nobody
Created
Updated
Stars
★★★ (4)
Tags
Priority-Medium
Status-New
Type-Defect
Thilo Molitor
on
What steps will reproduce the problem?
1. put your client in csi inactive mode (for Monal: just put it into the background)
2. hibernate the smacks session of your client (for Monal: just wait > 30 seconds after putting it into the background)
3. send a chatstate typing message stanza using any other client
4. your client will receive a push although the csi queue should hold the typing chatstate back as not important
What is the expected output?
No push, because csi should queue the stanza and not put it into the smacks queue
What do you see instead?
Push, because the stanza reaches the smacks queue
What version of the product are you using? On what operating system?
Prosody trunk/0.13 (not sure, I'm sure you know what Ge0rG is running better than me) on yax.im
Thilo Molitor
on
Ge0rG also has a detailed logfile, if you need more infos.
MattJ
on
mod_csi_simple is working as designed. It operates between mod_smacks and the network. Therefore this is not a bug in mod_csi_simple.
That's not to say we're not interested in fixing whatever the root problem is for you. My understanding (based on conversations a couple of years ago) is that Monal *wanted* more push notifications, so it could maintain a connection to the server. Based on this, you're saying you don't want pushes for typing notifications, or you only don't want them while in the background?
The claims about battery life being affected are likely exaggerated. Most typing notifications are followed by a real message within seconds, and that would generate a push notification regardless of foreground/background state. So the client wakes up slightly sooner before you receive a message. I don't think that's going to turn into "significant battery drain".
Maybe you could clarify the behaviour you expect from Prosody. It's difficult because Monal is basically unique among iOS clients in its approach towards push notifications and hibernation. So it would be helpful to better understand the requirements.
Thilo Molitor
on
Its not only typing notifications but presences etc. as well (when I initially reported this issue, I didn't yet read the source and wasn't sure what caused this, so I reported only the typing notification case I used to reproduce the issue).
Csi is basically off, once prosody is in smacks hibernation mode.
And of course Monal doesn't need pushes for things it would also not receive if the TCP connection was alive.
So the requirement is to have csi behave the same for TCP alive versus smacks hibernation.
That's what the csi implementation of ejabberd does as well.
Other apps might not want pushes for IQ stanzas (Monal does), but the distinction between "important/unimportant in mod push" versus "important/unimportant" in csi is different and of course should be different. You don't want a high priority push for IQs, but of course you want IQs to flush the csi queue.
MattJ
on
Thanks for the explanation.
If Monal is offline, it is almost certainly in the background, and therefore CSI inactive, right?
In other words: is there a reason to ever push unimportant things such as typing notifications?
My reasoning is that If Monal is CSI active it's in the foreground, and it should be connected (or connecting/resuming), and can receive everything via the XMPP connection in the usual way.
If it's connected but CSI inactive, traffic optimizations will apply as usual.
When it's disconnected, it's either in the foreground (maybe a network issue) and can resume when the network comes back, and catch up with all stanzas via XMPP. If it's in the background, it doesn't want to receive the unimportant things anyway.
Is my logic correct? Thanks in advance - I'm just trying to get my mental model set up correctly.
Here's the flow I'm thinking of, visualized: https://matthewwild.co.uk/uploads/xmpp-stanza-push-flow.png
Thilo Molitor
on
> If Monal is offline, it is almost certainly in the background, and therefore CSI inactive, right?
Correct.
> In other words: is there a reason to ever push unimportant things such as typing notifications?
No, there is not, but the definition of "important"/"unimportant" is...ahem...important. Different parts of our protocol define different things as "important".
For our push module, important means usually "is a message stanza with body", everything else is "unimportant" and triggers a push marked as unimportant (push servers can then use that distinction to trigger a low priority background push or don't trigger a push at all etc. rather than triggering a high priority push).
On the other hand, for csi important equally means "is a message stanza with body", but also "is an outgoing display marker from another device", "is an iq", "is some encrypted message stanza (eme)", "is an MDS update" etc. (non exhaustive list, see my csi_battery_saver module for a more complete list).
> My reasoning is that If Monal is CSI active it's in the foreground, and it should be connected (or connecting/resuming), and can receive everything via the XMPP connection in the usual way.
Yes, right, it would be CSI-active (Monal sends csi-active as soon as it either brought to foreground (and still connected) or directly after resuming the smacks session), so no CSI filtering needed nor wanted.
> If it's connected but CSI inactive, traffic optimizations will apply as usual.
Yes, also correct. As soon as Monal is put into background, it sends a CSI-inactive nonza. And we don't want typing notifications or other stuff like presences needlessly pushed through the active tcp connection .
> When it's disconnected, it's either in the foreground (maybe a network issue) and can resume when the network comes back, and catch up with all stanzas via XMPP. If it's in the background, it doesn't want to receive the unimportant things anyway.
Yes, that's why the CSI-queue should hold up unimportant things so that these don't trigger a push (important/unimportant distinction for pushes doesn't matter for Monal, even an "umportant" push triggered by an IQ will wake up Monal). You have to make an important distinction between "offline" and "smcaks hibernation", see below.
> Is my logic correct? Thanks in advance - I'm just trying to get my mental model set up correctly.
Yes, it seems to be correct, but not fully describing the interaction between the push module and smacks or mam. Let me try to explain this interaction in more detail:
If Monal is disconnected (e.g. no active tcp connection), it can be either in smacks hibernation or "really" offline. If it is really offline, pushes will be triggered by stanzas reaching the mam archive. Presences and messages NOT reaching the mam archive won't trigger any pushes. We don't need CSI filtering here, because usually only CSI-important stuff ever reaches the mam archive.
If Monal is in smacks hibernation things are different: Every stanza hitting the smacks queue will trigger a push (the push module will decide if its an important body-message or something unimportant like an IQ and trigger an important/unimportant push accordingly). And that's the thing broken in prosody right now: prosody does not do any CSI-queueing while in smacks hibernation state. Your flow diagram shows that clearly. But smacks is designed as "a virtual connection, even if the underlying tcp connection is flaky and might be disconnected for some time". Thus smacks hibernation should behave *exactly* as if the tcp connection was connected (the online branch of your diagram should be replicated there, but instead of ending in the "send stanza" block it should end in "Queue (SM)".
Sidenotes:
the push module defines an "urgent stanza" that triggers a push as soon as reaching the smacks queue even if the underlying tcp connection *seems* to be still active. Without that, the push would only be triggered once the underlying tcp connection times out (which could take several minutes) and smacks enters hibernated state (which will trigger at most one important and one unimportant push for all still unacked stanzas in the smacks queue).
An optimisation would be to only trigger that push after, say, 5 seconds without the JMI stanza not being smacks-acked.
Currently the smacks module will trigger an smacks-ack-delayed event after smacks_max_ack_delay seconds (default: 30 seconds), so without that special JMI urgent handling of the push module, a stuck tcp connection would still trigger a push after 30 seconds, even if the tcp connection didn't time out yes (and might even recover).
Setting that smacks_max_ack_delay to a lower value, say 5 seconds would trigger a push for stuck tcp connections more or less without any noticable delay and would render the urgent handling of JMI in the push module unnecessary.
MattJ
on
Thanks. I think this is the main part I was overlooking:
> If Monal is in smacks hibernation things are different: Every stanza hitting the smacks queue will trigger a push (the push module will decide if its an important body-message or something unimportant like an IQ and trigger an important/unimportant push accordingly).
If Monal is in hibernation mode, it's also very likely to be in the background (or if it is a genuine network disconnect, it may be in the foreground but it will reconnect soon). Therefore we can assume it is equivalent to CSI-inactive if it is hibernating.
In that case, why would we ever send non-important stanzas via push? It seems like we should just remove the code that does that?
Thilo Molitor
on
> Therefore we can assume it is equivalent to CSI-inactive if it is hibernating.
I'd even assume it remains in the exact same CSI state it was in before hibernation started.
> In that case, why would we ever send non-important stanzas via push? It seems like we should just remove the code that does that?
What definition of "unimportant" are you referring here? CSI-unimportant or Push-unimportant? This is very different and we need to make sure to keep our terminology right here.
To explain a bit more: XEP-0357 originally did not have a distinction between "important" and "unimportant". It then occurred, that apps like Siskin (and by extension Snikket iOS) need a distinction between "pushes triggered by something transporting a message body" and "pushes triggered by something else" (for example an IQ).
I/we dubbed the first type "important" while we called the second type "unimportant". But maybe that's a bit misleading: a push triggered by an IQ isn't unimportant in the sense that it could be ignored. Its just so that Siskin uses a push design forcing the app to show a notification for every incoming push delivered through Apple's high priority push lane not subject to throttling, so that Siskin (and no other app, neither Monal, nor Conversations or anything else) needed a distinction between "important" and "unimportant". We (I and Holger on the ejabberd side) then decided that we could shoehorn this distinction into the current XEP by using the "last - message - body" of the XEP and fill it with a dummy message (not the real one for privacy reasons) in the "important" case and don't submit it to the appserver in the "unimportant" case. That way Siskin was happy and could display a user-facing notification for those "important" pushes while using another background push mechanism thats unreliable for all other pushes (or ignore them altogether).
But that's a heuristic, it might be right in many cases, but not necessarily in all: as I already talked about briefly at FOSDEM: some message-stanza based things like XEP-0333 or XEP-0490 read markers removing a notification on device 2 if I read the message on device 1 and all IQ based flows like XEP-0202 entity time (recently implemented by Daniel in Conversations) will never(!) work in Siskin using this push design. As I said: its a heuristic, but imho not a very good one.
Maybe we should change the terminology here and use "body-push" (previously called "important") and "normal-push" (previously called "unimportant") and leave the "important"/"unimportant" terminology for CSI exclusively.
As for "Monal does it different than all other iOS apps": I'd argue that there are only 2 iOS apps out there and Monal-based apps doing it one way while Siskin-based ones using the heuristic described above, doesn't constitute an "all other apps" ;)
For that matter: the Signal app on iOS uses exactly the same approach as Monal, you can check that in the code (WhatsApp, too, but obviously there isn't any publicly available code to check): the app leverages the notification filtering entitlement to always send pushes using Apple's high priority lane, then connecting to the signal server to retrieve the actual message (if any) and suppress the user-facing notification using the notification filtering entitlement, if needed.
So I'd argue that Monal's way is rather the standard way to do messaging on iOS and the Siskin way is the outlier here ;)
Signal references:
https://github.com/signalapp/Signal-iOS/blob/main/SignalNSE/SignalNSE-AppStore.entitlements#L7https://github.com/signalapp/Signal-iOS/blob/main/SignalNSE/NotificationService.swift#L160https://github.com/signalapp/Signal-iOS/blob/main/SignalNSE/NotificationService.swift#L121
(Using an empty UNMutableNotificationContent() like Signal an Monal do is only allowed if you were granted the "com.apple.developer.usernotifications.filtering" entitlement.)
So there are two concrete things I'm interested in:
1) Does anyone ever *need* unimportant pushes? If so, why?
Based on what you're saying, I think the answer is something like:
- Nobody needs pushes of unimportant stanzas
- Important != "has a body". Some body-less stanzas are important (outgoing markers, MDS, etc. ?)
- Siskin-style apps don't want any pushes of body-less stanzas (so the XMPP server or push gateway needs to be able to suppress them)
2) Are there things that we would want to push, but not trigger a CSI flush, or vice versa? (is one a subset of the other? or are both sets equal?)
Re. (2) I know you're separating push-important vs csi-important, but I'm curious if there really is a distinction there, or if we can unify the rules.
What steps will reproduce the problem? 1. put your client in csi inactive mode (for Monal: just put it into the background) 2. hibernate the smacks session of your client (for Monal: just wait > 30 seconds after putting it into the background) 3. send a chatstate typing message stanza using any other client 4. your client will receive a push although the csi queue should hold the typing chatstate back as not important What is the expected output? No push, because csi should queue the stanza and not put it into the smacks queue What do you see instead? Push, because the stanza reaches the smacks queue What version of the product are you using? On what operating system? Prosody trunk/0.13 (not sure, I'm sure you know what Ge0rG is running better than me) on yax.im
Ge0rG also has a detailed logfile, if you need more infos.
mod_csi_simple is working as designed. It operates between mod_smacks and the network. Therefore this is not a bug in mod_csi_simple. That's not to say we're not interested in fixing whatever the root problem is for you. My understanding (based on conversations a couple of years ago) is that Monal *wanted* more push notifications, so it could maintain a connection to the server. Based on this, you're saying you don't want pushes for typing notifications, or you only don't want them while in the background? The claims about battery life being affected are likely exaggerated. Most typing notifications are followed by a real message within seconds, and that would generate a push notification regardless of foreground/background state. So the client wakes up slightly sooner before you receive a message. I don't think that's going to turn into "significant battery drain". Maybe you could clarify the behaviour you expect from Prosody. It's difficult because Monal is basically unique among iOS clients in its approach towards push notifications and hibernation. So it would be helpful to better understand the requirements.
Its not only typing notifications but presences etc. as well (when I initially reported this issue, I didn't yet read the source and wasn't sure what caused this, so I reported only the typing notification case I used to reproduce the issue). Csi is basically off, once prosody is in smacks hibernation mode. And of course Monal doesn't need pushes for things it would also not receive if the TCP connection was alive. So the requirement is to have csi behave the same for TCP alive versus smacks hibernation. That's what the csi implementation of ejabberd does as well. Other apps might not want pushes for IQ stanzas (Monal does), but the distinction between "important/unimportant in mod push" versus "important/unimportant" in csi is different and of course should be different. You don't want a high priority push for IQs, but of course you want IQs to flush the csi queue.
Thanks for the explanation. If Monal is offline, it is almost certainly in the background, and therefore CSI inactive, right? In other words: is there a reason to ever push unimportant things such as typing notifications? My reasoning is that If Monal is CSI active it's in the foreground, and it should be connected (or connecting/resuming), and can receive everything via the XMPP connection in the usual way. If it's connected but CSI inactive, traffic optimizations will apply as usual. When it's disconnected, it's either in the foreground (maybe a network issue) and can resume when the network comes back, and catch up with all stanzas via XMPP. If it's in the background, it doesn't want to receive the unimportant things anyway. Is my logic correct? Thanks in advance - I'm just trying to get my mental model set up correctly. Here's the flow I'm thinking of, visualized: https://matthewwild.co.uk/uploads/xmpp-stanza-push-flow.png
> If Monal is offline, it is almost certainly in the background, and therefore CSI inactive, right? Correct. > In other words: is there a reason to ever push unimportant things such as typing notifications? No, there is not, but the definition of "important"/"unimportant" is...ahem...important. Different parts of our protocol define different things as "important". For our push module, important means usually "is a message stanza with body", everything else is "unimportant" and triggers a push marked as unimportant (push servers can then use that distinction to trigger a low priority background push or don't trigger a push at all etc. rather than triggering a high priority push). On the other hand, for csi important equally means "is a message stanza with body", but also "is an outgoing display marker from another device", "is an iq", "is some encrypted message stanza (eme)", "is an MDS update" etc. (non exhaustive list, see my csi_battery_saver module for a more complete list). > My reasoning is that If Monal is CSI active it's in the foreground, and it should be connected (or connecting/resuming), and can receive everything via the XMPP connection in the usual way. Yes, right, it would be CSI-active (Monal sends csi-active as soon as it either brought to foreground (and still connected) or directly after resuming the smacks session), so no CSI filtering needed nor wanted. > If it's connected but CSI inactive, traffic optimizations will apply as usual. Yes, also correct. As soon as Monal is put into background, it sends a CSI-inactive nonza. And we don't want typing notifications or other stuff like presences needlessly pushed through the active tcp connection . > When it's disconnected, it's either in the foreground (maybe a network issue) and can resume when the network comes back, and catch up with all stanzas via XMPP. If it's in the background, it doesn't want to receive the unimportant things anyway. Yes, that's why the CSI-queue should hold up unimportant things so that these don't trigger a push (important/unimportant distinction for pushes doesn't matter for Monal, even an "umportant" push triggered by an IQ will wake up Monal). You have to make an important distinction between "offline" and "smcaks hibernation", see below. > Is my logic correct? Thanks in advance - I'm just trying to get my mental model set up correctly. Yes, it seems to be correct, but not fully describing the interaction between the push module and smacks or mam. Let me try to explain this interaction in more detail: If Monal is disconnected (e.g. no active tcp connection), it can be either in smacks hibernation or "really" offline. If it is really offline, pushes will be triggered by stanzas reaching the mam archive. Presences and messages NOT reaching the mam archive won't trigger any pushes. We don't need CSI filtering here, because usually only CSI-important stuff ever reaches the mam archive. If Monal is in smacks hibernation things are different: Every stanza hitting the smacks queue will trigger a push (the push module will decide if its an important body-message or something unimportant like an IQ and trigger an important/unimportant push accordingly). And that's the thing broken in prosody right now: prosody does not do any CSI-queueing while in smacks hibernation state. Your flow diagram shows that clearly. But smacks is designed as "a virtual connection, even if the underlying tcp connection is flaky and might be disconnected for some time". Thus smacks hibernation should behave *exactly* as if the tcp connection was connected (the online branch of your diagram should be replicated there, but instead of ending in the "send stanza" block it should end in "Queue (SM)". Sidenotes: the push module defines an "urgent stanza" that triggers a push as soon as reaching the smacks queue even if the underlying tcp connection *seems* to be still active. Without that, the push would only be triggered once the underlying tcp connection times out (which could take several minutes) and smacks enters hibernated state (which will trigger at most one important and one unimportant push for all still unacked stanzas in the smacks queue). An optimisation would be to only trigger that push after, say, 5 seconds without the JMI stanza not being smacks-acked. Currently the smacks module will trigger an smacks-ack-delayed event after smacks_max_ack_delay seconds (default: 30 seconds), so without that special JMI urgent handling of the push module, a stuck tcp connection would still trigger a push after 30 seconds, even if the tcp connection didn't time out yes (and might even recover). Setting that smacks_max_ack_delay to a lower value, say 5 seconds would trigger a push for stuck tcp connections more or less without any noticable delay and would render the urgent handling of JMI in the push module unnecessary.
Thanks. I think this is the main part I was overlooking: > If Monal is in smacks hibernation things are different: Every stanza hitting the smacks queue will trigger a push (the push module will decide if its an important body-message or something unimportant like an IQ and trigger an important/unimportant push accordingly). If Monal is in hibernation mode, it's also very likely to be in the background (or if it is a genuine network disconnect, it may be in the foreground but it will reconnect soon). Therefore we can assume it is equivalent to CSI-inactive if it is hibernating. In that case, why would we ever send non-important stanzas via push? It seems like we should just remove the code that does that?
> Therefore we can assume it is equivalent to CSI-inactive if it is hibernating. I'd even assume it remains in the exact same CSI state it was in before hibernation started. > In that case, why would we ever send non-important stanzas via push? It seems like we should just remove the code that does that? What definition of "unimportant" are you referring here? CSI-unimportant or Push-unimportant? This is very different and we need to make sure to keep our terminology right here. To explain a bit more: XEP-0357 originally did not have a distinction between "important" and "unimportant". It then occurred, that apps like Siskin (and by extension Snikket iOS) need a distinction between "pushes triggered by something transporting a message body" and "pushes triggered by something else" (for example an IQ). I/we dubbed the first type "important" while we called the second type "unimportant". But maybe that's a bit misleading: a push triggered by an IQ isn't unimportant in the sense that it could be ignored. Its just so that Siskin uses a push design forcing the app to show a notification for every incoming push delivered through Apple's high priority push lane not subject to throttling, so that Siskin (and no other app, neither Monal, nor Conversations or anything else) needed a distinction between "important" and "unimportant". We (I and Holger on the ejabberd side) then decided that we could shoehorn this distinction into the current XEP by using the "last - message - body" of the XEP and fill it with a dummy message (not the real one for privacy reasons) in the "important" case and don't submit it to the appserver in the "unimportant" case. That way Siskin was happy and could display a user-facing notification for those "important" pushes while using another background push mechanism thats unreliable for all other pushes (or ignore them altogether). But that's a heuristic, it might be right in many cases, but not necessarily in all: as I already talked about briefly at FOSDEM: some message-stanza based things like XEP-0333 or XEP-0490 read markers removing a notification on device 2 if I read the message on device 1 and all IQ based flows like XEP-0202 entity time (recently implemented by Daniel in Conversations) will never(!) work in Siskin using this push design. As I said: its a heuristic, but imho not a very good one. Maybe we should change the terminology here and use "body-push" (previously called "important") and "normal-push" (previously called "unimportant") and leave the "important"/"unimportant" terminology for CSI exclusively. As for "Monal does it different than all other iOS apps": I'd argue that there are only 2 iOS apps out there and Monal-based apps doing it one way while Siskin-based ones using the heuristic described above, doesn't constitute an "all other apps" ;) For that matter: the Signal app on iOS uses exactly the same approach as Monal, you can check that in the code (WhatsApp, too, but obviously there isn't any publicly available code to check): the app leverages the notification filtering entitlement to always send pushes using Apple's high priority lane, then connecting to the signal server to retrieve the actual message (if any) and suppress the user-facing notification using the notification filtering entitlement, if needed. So I'd argue that Monal's way is rather the standard way to do messaging on iOS and the Siskin way is the outlier here ;) Signal references: https://github.com/signalapp/Signal-iOS/blob/main/SignalNSE/SignalNSE-AppStore.entitlements#L7 https://github.com/signalapp/Signal-iOS/blob/main/SignalNSE/NotificationService.swift#L160 https://github.com/signalapp/Signal-iOS/blob/main/SignalNSE/NotificationService.swift#L121 (Using an empty UNMutableNotificationContent() like Signal an Monal do is only allowed if you were granted the "com.apple.developer.usernotifications.filtering" entitlement.)
Some more context: I wrote a detailed summary of iOS push modes over here: https://wiki.xmpp.org/web/Push_notifications The VoIP push mode isn't used by Signal and Monal at all (both use reportNewIncomingVoIPPushPayload to wake up the main app and handle the incoming call): https://github.com/signalapp/Signal-iOS/blob/main/SignalNSE/NSECallMessageHandler.swift#L189 The notification filtering entitlement is available since December 2019 and not likely to go away soon, and it was specifically created for Instant Messaging apps and once even mentioned as such in Apple's own documentation over here (but they removed the Instant Messaging part later on and I couldn't find an archive.org entry dating back far enough to still prove that): https://developer.apple.com/documentation/bundleresources/entitlements/com.apple.developer.usernotifications.filtering
So there are two concrete things I'm interested in: 1) Does anyone ever *need* unimportant pushes? If so, why? Based on what you're saying, I think the answer is something like: - Nobody needs pushes of unimportant stanzas - Important != "has a body". Some body-less stanzas are important (outgoing markers, MDS, etc. ?) - Siskin-style apps don't want any pushes of body-less stanzas (so the XMPP server or push gateway needs to be able to suppress them) 2) Are there things that we would want to push, but not trigger a CSI flush, or vice versa? (is one a subset of the other? or are both sets equal?) Re. (2) I know you're separating push-important vs csi-important, but I'm curious if there really is a distinction there, or if we can unify the rules.