There is a problem with using cyrilic roster and group names
In mysql database:
{"name":"Проверка","groups":{"Ð¡Ð¾Ð±ÐµÑ ÐµÐ´Ð½Ð¸ÐºÐ¸":true},"subscription":"both"}
but should be:
{"name":"Проверка","groups":{"Собеседники":true},"subscription":"both"}
Seems, that there is some encoding problems with database communication
sql dump from prosody table:
CREATE TABLE IF NOT EXISTS `prosody` (
`host` text,
`user` text,
`store` text,
`key` text,
`type` text,
`value` mediumtext,
KEY `prosody_index` (`host`(20),`user`(20),`store`(20),`key`(20))
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
There was latin1 in character set client, at the moment when prosody
installed, now I set collactions and charsets, so it looks like this:
character set client utf8
character set connection utf8
character set database utf8
character set filesystem binary
character set results utf8
character set server utf8
character set system utf8
collation connection utf8_general_ci
collation database utf8_general_ci
collation server utf8_general_ci
I restart prosody server, and try to test. But unfortunately, problem
do not fix.
When I change wrong Проверка to correct Проверка in database,
in client it looks like ???????
MattJ
on
Thanks for the report. I'm looking into this.
I guess I know the answer, but could you give the output of `mysqld --help --verbose | grep ^default-character-set`?
Thanks!
Changes
owner MattJ
menkovich
on
110922 10:40:27 [Warning] '--default-character-set' is deprecated and will be removed in a future release. Please use '--character-set-server' instead.
110922 10:40:27 [Warning] '--default-collation' is deprecated and will be removed in a future release. Please use '--collation-server' instead.
110922 10:40:27 [Warning] Changed limits: max_open_files: 1024 max_connections: 886 table_cache: 64
110922 10:40:27 [Warning] Can't create test file /var/lib/mysql/apollo.lower-test
110922 10:40:27 [Warning] Can't create test file /var/lib/mysql/apollo.lower-test
mysqld: Can't change dir to '/var/lib/mysql/' (Errcode: 13)
110922 10:40:27 [Warning] One can only use the --user switch if running as root
110922 10:40:27 [Note] Plugin 'FEDERATED' is disabled.
mysqld: Table 'mysql.plugin' doesn't exist
110922 10:40:27 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
default-character-set utf8
menkovich
on
I have an report, that in latest version of sourcecode this bug was fixed(in gentoo prosody-9999 ebuild), I could not check this in my environment, but if you have additional information about problem - could you say which commit fixed this?
MattJ
on
Changes
tags Milestone-0.9 Status-Accepted
MattJ
on
Sorry for the long wait on this issue. I failed to reproduce it until recently. It seems the problem appears in Debian, but not in Ubuntu (at least in mine).
We are working on a fix for the problem, and it will be available in 0.9. It will also fix any existing data in the database automatically.
Note to myself: This also affects prosody-migrator.
MattJ
on
Changes
tags Status-Started
MattJ
on
Officially pushed to 0.10 for now, to avoid delaying the 0.9 release further.
Changes
tagsMilestone-0.9 Milestone-0.10
hill.sobaka
on
the problem is in util/sql.lua inside engine:setEncoding function
if driver == "MySQL" then
set_names_query = set_names_query:gsub(";$", " COLLATE 'utf8_bin';");
local ok, charsets = self:transaction(function()
return self:select"SELECT `CHARACTER_SET_NAME` FROM `information_schema`.`CHARACTER_SETS` WHERE `CHARACTER_SET_NAME` LIKE 'utf8%' ORDER BY MAXLEN DESC LIMIT 1;";
end);
local row = ok and charsets();
charset = row and row[1] or charset;
end
self.charset = charset;
return self:transaction(function() return self:execute(set_names_query:format(charset)); end);
what if i have two utf8 charset: utf8mb4 and utf8?
utf8-bin is not a valid collaction for ut8mb4 charset it should be utf8mb4_bin, but accroding to this code you will get a query "SET NAMES 'utf8mb4' COLLATE 'utf8_bin';" which is not valid and mysql just fallback to latin1.
the fix is:
if driver == "MySQL" then
- set_names_query = set_names_query:gsub(";$", " COLLATE 'utf8_bin';");
local ok, charsets = self:transaction(function()
return self:select"SELECT `CHARACTER_SET_NAME` FROM `information_schema`.`CHARACTER_SETS` WHERE `CHARACTER_SET_NAME` LIKE 'utf8%' ORDER BY MAXLEN DESC LIMIT 1;";
end);
local row = ok and charsets();
charset = row and row[1] or charset;
+set_names_query = set_names_query:gsub(";$", (" COLLATE '%s_bin';"):format(charset));
MattJ
on
Just thinking aloud... will that query always pick the '*_bin' collation?
hill.sobaka
on
this query will just select appropriate utf8*_bin collation. utf8mb4 - is for japanese language. hardcoding simple utf8 charset and utf8_bin collation would be enough, but i'm not egoistic, so i found a universal fix ))))
hill.sobaka
on
Another case and final fix. Well previously i wrote about wrong collation for utf8 and how to fix it, but it doesn't resolve problem totally. What will happen if after some time db connection goes wrong (ex. after some time of inactivity)? Evidently you will get smth like "mysql has gone away" and of course sql util honestly restores sql connection, BUT this time it just restores it and do not call engine:setEncoding function. So i think the best way will to call engine:setEncoding function just after
local dbh, err = DBI.Connect(
params.driver, params.database,
params.username, params.password,
params.host, params.port
);
if not dbh then return nil, err; end
dbh:autocommit(false); -- don't commit automatically
self.conn = dbh;
in engine:connect function, and making engine:setEncoding a private one.
MattJ
on
Hi folks, we pushed a bunch of changes to 0.10 to fix this issue (and automatically upgrade existing databases to the correct encoding).
If you have *not* manually changed Prosody's code or the database, there should be no lost or corrupt data as a result of this issue, the migration stage moves everything over to the correct encoding safely.
In any case we recommend you have backups of your database before performing any upgrades.
Right now the fixes are only in the 'sql2' module, but we will soon be renaming this to replace the original 'sql' module.
More information at https://groups.google.com/d/msg/prosody-users/QtUg7TgSAjk/m4yuiLMs-SQJ
I'm marking this issue as fixed, but feel free to contact us if you experience this issue after upgrading.
There is a problem with using cyrilic roster and group names In mysql database: {"name":"Проверка","groups":{"Ð¡Ð¾Ð±ÐµÑ ÐµÐ´Ð½Ð¸ÐºÐ¸":true},"subscription":"both"} but should be: {"name":"Проверка","groups":{"Собеседники":true},"subscription":"both"} Seems, that there is some encoding problems with database communication sql dump from prosody table: CREATE TABLE IF NOT EXISTS `prosody` ( `host` text, `user` text, `store` text, `key` text, `type` text, `value` mediumtext, KEY `prosody_index` (`host`(20),`user`(20),`store`(20),`key`(20)) ) ENGINE=MyISAM DEFAULT CHARSET=utf8; There was latin1 in character set client, at the moment when prosody installed, now I set collactions and charsets, so it looks like this: character set client utf8 character set connection utf8 character set database utf8 character set filesystem binary character set results utf8 character set server utf8 character set system utf8 collation connection utf8_general_ci collation database utf8_general_ci collation server utf8_general_ci I restart prosody server, and try to test. But unfortunately, problem do not fix. When I change wrong Проверка to correct Проверка in database, in client it looks like ???????
Thanks for the report. I'm looking into this. I guess I know the answer, but could you give the output of `mysqld --help --verbose | grep ^default-character-set`? Thanks!
Changes110922 10:40:27 [Warning] '--default-character-set' is deprecated and will be removed in a future release. Please use '--character-set-server' instead. 110922 10:40:27 [Warning] '--default-collation' is deprecated and will be removed in a future release. Please use '--collation-server' instead. 110922 10:40:27 [Warning] Changed limits: max_open_files: 1024 max_connections: 886 table_cache: 64 110922 10:40:27 [Warning] Can't create test file /var/lib/mysql/apollo.lower-test 110922 10:40:27 [Warning] Can't create test file /var/lib/mysql/apollo.lower-test mysqld: Can't change dir to '/var/lib/mysql/' (Errcode: 13) 110922 10:40:27 [Warning] One can only use the --user switch if running as root 110922 10:40:27 [Note] Plugin 'FEDERATED' is disabled. mysqld: Table 'mysql.plugin' doesn't exist 110922 10:40:27 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it. default-character-set utf8
I have an report, that in latest version of sourcecode this bug was fixed(in gentoo prosody-9999 ebuild), I could not check this in my environment, but if you have additional information about problem - could you say which commit fixed this?
Sorry for the long wait on this issue. I failed to reproduce it until recently. It seems the problem appears in Debian, but not in Ubuntu (at least in mine). We are working on a fix for the problem, and it will be available in 0.9. It will also fix any existing data in the database automatically. Note to myself: This also affects prosody-migrator.
Officially pushed to 0.10 for now, to avoid delaying the 0.9 release further.
ChangesMilestone-0.9Milestone-0.10the problem is in util/sql.lua inside engine:setEncoding function if driver == "MySQL" then set_names_query = set_names_query:gsub(";$", " COLLATE 'utf8_bin';"); local ok, charsets = self:transaction(function() return self:select"SELECT `CHARACTER_SET_NAME` FROM `information_schema`.`CHARACTER_SETS` WHERE `CHARACTER_SET_NAME` LIKE 'utf8%' ORDER BY MAXLEN DESC LIMIT 1;"; end); local row = ok and charsets(); charset = row and row[1] or charset; end self.charset = charset; return self:transaction(function() return self:execute(set_names_query:format(charset)); end); what if i have two utf8 charset: utf8mb4 and utf8? utf8-bin is not a valid collaction for ut8mb4 charset it should be utf8mb4_bin, but accroding to this code you will get a query "SET NAMES 'utf8mb4' COLLATE 'utf8_bin';" which is not valid and mysql just fallback to latin1. the fix is: if driver == "MySQL" then - set_names_query = set_names_query:gsub(";$", " COLLATE 'utf8_bin';"); local ok, charsets = self:transaction(function() return self:select"SELECT `CHARACTER_SET_NAME` FROM `information_schema`.`CHARACTER_SETS` WHERE `CHARACTER_SET_NAME` LIKE 'utf8%' ORDER BY MAXLEN DESC LIMIT 1;"; end); local row = ok and charsets(); charset = row and row[1] or charset; +set_names_query = set_names_query:gsub(";$", (" COLLATE '%s_bin';"):format(charset));
Just thinking aloud... will that query always pick the '*_bin' collation?
this query will just select appropriate utf8*_bin collation. utf8mb4 - is for japanese language. hardcoding simple utf8 charset and utf8_bin collation would be enough, but i'm not egoistic, so i found a universal fix ))))
Another case and final fix. Well previously i wrote about wrong collation for utf8 and how to fix it, but it doesn't resolve problem totally. What will happen if after some time db connection goes wrong (ex. after some time of inactivity)? Evidently you will get smth like "mysql has gone away" and of course sql util honestly restores sql connection, BUT this time it just restores it and do not call engine:setEncoding function. So i think the best way will to call engine:setEncoding function just after local dbh, err = DBI.Connect( params.driver, params.database, params.username, params.password, params.host, params.port ); if not dbh then return nil, err; end dbh:autocommit(false); -- don't commit automatically self.conn = dbh; in engine:connect function, and making engine:setEncoding a private one.
Hi folks, we pushed a bunch of changes to 0.10 to fix this issue (and automatically upgrade existing databases to the correct encoding). If you have *not* manually changed Prosody's code or the database, there should be no lost or corrupt data as a result of this issue, the migration stage moves everything over to the correct encoding safely. In any case we recommend you have backups of your database before performing any upgrades. Right now the fixes are only in the 'sql2' module, but we will soon be renaming this to replace the original 'sql' module. More information at https://groups.google.com/d/msg/prosody-users/QtUg7TgSAjk/m4yuiLMs-SQJ I'm marking this issue as fixed, but feel free to contact us if you experience this issue after upgrading.
Changes