#1995 Failure to send <stream:stream> to components if they sent theirs in multiple packets
Reporter
Link Mauve
Owner
Nobody
Created
Updated
Stars
★★ (2)
Tags
Type-Defect
Priority-Medium
Status-New
Link Mauve
on
What steps will reproduce the problem?
1. Run tokio-xmpp’s echo_component example against a local Prosody
2. To make extra sure the kernel won’t merge the TCP packets, set `RUST_LOG=trace` or run it in strace.
What is the expected output?
The component would connect, send the stream:stream, then receive the server’s stream:stream.
What do you see instead?
The component connects, sends the stream:stream, then timeouts waiting for Prosody’s which never arrives.
What version of the product are you using? On what operating system?
- Prosody 55a7143a58ec
- tokio-xmpp e72394c658c243e9f6d34022fa775a827a4b12fb.
Please provide any additional information below.
We tested both the epoll and prosody-rs network backends, both reproduce the issue.
Jonas Schäfer
on
I can reproduce this against prosody 2ec517d8d43e.
strace from prosody clearly shows the separate reads:
epoll_wait(5, [{events=EPOLLIN, data=0x557100000016}], 256, 839999) = 1
recvfrom(22, "<stream:stream xmlns='jabber:com"..., 8192, 0, NULL, NULL) = 94
recvfrom(22, 0x55711269cc58, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(5, [{events=EPOLLIN, data=0x557100000016}], 256, 839999) = 1
recvfrom(22, " to='test.localhost'", 8192, 0, NULL, NULL) = 20
recvfrom(22, 0x55711269cc58, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(5, [{events=EPOLLIN, data=0x557100000016}], 256, 839999) = 1
recvfrom(22, " version='1.0'", 8192, 0, NULL, NULL) = 14
recvfrom(22, 0x55711269cc58, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(5, [{events=EPOLLIN, data=0x557100000016}], 256, 839999) = 1
recvfrom(22, ">", 8192, 0, NULL, NULL) = 1
recvfrom(22, 0x55711269cc58, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
To increase the hit rate while prosody was running under strace, I added 10ms sleeps between each send invocation in tokio-xmpp.
So the issue is indeed that prosody chokes when the pieces are read separately.
Jonas Schäfer
on
I added debug logging to xmppstream.lua:
diff --git a/util/xmppstream.lua b/util/xmppstream.lua
--- a/util/xmppstream.lua
+++ b/util/xmppstream.lua
@@ -10,6 +10,8 @@ local lxp = require "lxp";
local st = require "prosody.util.stanza";
local stanza_mt = st.stanza_mt;
+local log = require "prosody.util.logger".init("xmppstream");
+
local error = error;
local tostring = tostring;
local t_insert = table.insert;
@@ -72,6 +74,7 @@ local function new_sax_handlers(session,
local child_quota = 0;
local non_streamns_depth = 0;
function xml_handlers:StartElement(tagname, attr)
+ log("debug", "StartElement(%q, %q)", tagname, attr)
if stanza and #chardata > 0 then
-- We have some character data in the buffer
t_insert(stanza, t_concat(chardata));
@@ -311,7 +314,9 @@ local function new(session, stream_callb
n_outstanding_bytes = n_outstanding_bytes + #data;
end
local _parser = parser;
+ log("debug", "xmppstream:feed(%q)", data);
local ok, err = parse(_parser, data);
+ log("debug", "xmppstream:feed(..) -> %q, %q", ok, err);
if lxp_supports_bytecount and n_outstanding_bytes > stanza_size_limit then
return nil, "stanza-too-large";
end
The result (for a broken stream) is this:
xmppstream debug xmppstream:feed("<?xml version=\'1.0\' encoding=\'utf-8\'?>\n")
xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil)
xmppstream debug xmppstream:feed("<stream:stream xmlns=\'jabber:component:accept\' xmlns:stream=\'http://etherx.jabber.org/streams\'")
xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil)
xmppstream debug xmppstream:feed(" to=\'test.localhost\'")
xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil)
xmppstream debug xmppstream:feed(" version=\'1.0\'")
xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil)
xmppstream debug xmppstream:feed(">")
xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil)
The result (for a working stream) is:
xmppstream debug xmppstream:feed("<?xml version=\'1.0\' encoding=\'utf-8\'?>\n<stream:stream xmlns=\'jabber:component:accept\' xmlns:stream=\'http://etherx.jabber.org/streams\' to=\'test.localhost\' version=\'1.0\'>")
xmppstream debug StartElement("http://etherx.jabber.org/streams\001stream", {"to";"version";to="test.localhost";version="1.0"})
xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x557f378c8298]] , (nil)
So.... feeding the `>` alone somehow breaks stuff? Even in the working case the stream header is not read in a complete block.
Jonas Schäfer
on
As an experiment, I replaced util/xmppstream.lua and utli/stanza.lua with the experimental Rust implementation and that does not exhibit the issue (note that I'm not running the rust network backend, only the stanza impl).
That narrows down the issue to libexpat IMO.
What steps will reproduce the problem? 1. Run tokio-xmpp’s echo_component example against a local Prosody 2. To make extra sure the kernel won’t merge the TCP packets, set `RUST_LOG=trace` or run it in strace. What is the expected output? The component would connect, send the stream:stream, then receive the server’s stream:stream. What do you see instead? The component connects, sends the stream:stream, then timeouts waiting for Prosody’s which never arrives. What version of the product are you using? On what operating system? - Prosody 55a7143a58ec - tokio-xmpp e72394c658c243e9f6d34022fa775a827a4b12fb. Please provide any additional information below. We tested both the epoll and prosody-rs network backends, both reproduce the issue.
I can reproduce this against prosody 2ec517d8d43e. strace from prosody clearly shows the separate reads: epoll_wait(5, [{events=EPOLLIN, data=0x557100000016}], 256, 839999) = 1 recvfrom(22, "<stream:stream xmlns='jabber:com"..., 8192, 0, NULL, NULL) = 94 recvfrom(22, 0x55711269cc58, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) epoll_wait(5, [{events=EPOLLIN, data=0x557100000016}], 256, 839999) = 1 recvfrom(22, " to='test.localhost'", 8192, 0, NULL, NULL) = 20 recvfrom(22, 0x55711269cc58, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) epoll_wait(5, [{events=EPOLLIN, data=0x557100000016}], 256, 839999) = 1 recvfrom(22, " version='1.0'", 8192, 0, NULL, NULL) = 14 recvfrom(22, 0x55711269cc58, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) epoll_wait(5, [{events=EPOLLIN, data=0x557100000016}], 256, 839999) = 1 recvfrom(22, ">", 8192, 0, NULL, NULL) = 1 recvfrom(22, 0x55711269cc58, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) To increase the hit rate while prosody was running under strace, I added 10ms sleeps between each send invocation in tokio-xmpp. So the issue is indeed that prosody chokes when the pieces are read separately.
I added debug logging to xmppstream.lua: diff --git a/util/xmppstream.lua b/util/xmppstream.lua --- a/util/xmppstream.lua +++ b/util/xmppstream.lua @@ -10,6 +10,8 @@ local lxp = require "lxp"; local st = require "prosody.util.stanza"; local stanza_mt = st.stanza_mt; +local log = require "prosody.util.logger".init("xmppstream"); + local error = error; local tostring = tostring; local t_insert = table.insert; @@ -72,6 +74,7 @@ local function new_sax_handlers(session, local child_quota = 0; local non_streamns_depth = 0; function xml_handlers:StartElement(tagname, attr) + log("debug", "StartElement(%q, %q)", tagname, attr) if stanza and #chardata > 0 then -- We have some character data in the buffer t_insert(stanza, t_concat(chardata)); @@ -311,7 +314,9 @@ local function new(session, stream_callb n_outstanding_bytes = n_outstanding_bytes + #data; end local _parser = parser; + log("debug", "xmppstream:feed(%q)", data); local ok, err = parse(_parser, data); + log("debug", "xmppstream:feed(..) -> %q, %q", ok, err); if lxp_supports_bytecount and n_outstanding_bytes > stanza_size_limit then return nil, "stanza-too-large"; end The result (for a broken stream) is this: xmppstream debug xmppstream:feed("<?xml version=\'1.0\' encoding=\'utf-8\'?>\n") xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil) xmppstream debug xmppstream:feed("<stream:stream xmlns=\'jabber:component:accept\' xmlns:stream=\'http://etherx.jabber.org/streams\'") xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil) xmppstream debug xmppstream:feed(" to=\'test.localhost\'") xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil) xmppstream debug xmppstream:feed(" version=\'1.0\'") xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil) xmppstream debug xmppstream:feed(">") xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x562e3294d268]] , (nil) The result (for a working stream) is: xmppstream debug xmppstream:feed("<?xml version=\'1.0\' encoding=\'utf-8\'?>\n<stream:stream xmlns=\'jabber:component:accept\' xmlns:stream=\'http://etherx.jabber.org/streams\' to=\'test.localhost\' version=\'1.0\'>") xmppstream debug StartElement("http://etherx.jabber.org/streams\001stream", {"to";"version";to="test.localhost";version="1.0"}) xmppstream debug xmppstream:feed(..) -> _[[Expat: 0x557f378c8298]] , (nil) So.... feeding the `>` alone somehow breaks stuff? Even in the working case the stream header is not read in a complete block.
As an experiment, I replaced util/xmppstream.lua and utli/stanza.lua with the experimental Rust implementation and that does not exhibit the issue (note that I'm not running the rust network backend, only the stanza impl). That narrows down the issue to libexpat IMO.
I ran a bisect on Expat and it pointed towards this commit: https://github.com/libexpat/libexpat/commit/9cdf9b8d77d5c2c2a27d15fb68dd3f83cafb45a1