Cannot initialize cmap service


Failed to start Corosync Cluster Engine After PVE Reboot

Error

   # systemctl status pve-cluster
...
Jun 26 09:33:26 pve01 pmxcfs[3506]: [quorum] crit: quorum_initialize failed: 2
Jun 26 09:33:26 pve01 pmxcfs[3506]: [quorum] crit: can't initialize service
Jun 26 09:33:26 pve01 pmxcfs[3506]: [confdb] crit: cmap_initialize failed: 2
Jun 26 09:33:26 pve01 pmxcfs[3506]: [confdb] crit: can't initialize service
Jun 26 09:33:26 pve01 pmxcfs[3506]: [dcdb] crit: cpg_initialize failed: 2
Jun 26 09:33:26 pve01 pmxcfs[3506]: [dcdb] crit: can't initialize service
Jun 26 09:33:26 pve01 pmxcfs[3506]: [status] crit: cpg_initialize failed: 2
Jun 26 09:33:26 pve01 pmxcfs[3506]: [status] crit: can't initialize service
...
# journalctl -u corosync.service
...
Jun 26 09:26:17 pve01 corosync[1826]:   [MAIN  ] failed to parse node address 'pve01.xx.xx'
Jun 26 09:26:17 pve01 corosync[1826]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1417.
Jun 26 09:26:17 pve01 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a
Jun 26 09:26:17 pve01 systemd[1]: corosync.service: Failed with result 'exit-code'.
Jun 26 09:26:17 pve01 systemd[1]: Failed to start Corosync Cluster Engine.
...  
  

Fix

Change the node's ring0_addr in corosync.conf on pve01 from the hostname pve01.xx.xx
to the node's IP address.
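
For illustration, a sketch of the edited node entry is shown below; 192.168.1.11 is a placeholder for pve01's actual cluster IP, and the other keys are the usual ones PVE generates. On PVE the file is normally edited as /etc/pve/corosync.conf, bumping config_version in the totem section so the change propagates.

----
# sketch only, not the node's real configuration
node {
  name: pve01
  nodeid: 1
  quorum_votes: 1
  # before: ring0_addr: pve01.xx.xx   <- hostname corosync failed to parse
  ring0_addr: 192.168.1.11
}
----

After saving, restarting corosync and pve-cluster should let pmxcfs connect again.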

References

Cluster node can't sync after reboot


   
When creating a cluster with the command pvecm create mypve, the output contains the following messages:

Job for corosync.service failed because the control process exited with error code.
See "systemctl status corosync.service" and "journalctl -xe" for details.
command 'systemctl restart corosync pve-cluster' failed: exit code 1

At the same time, pvecm status returns:
Cannot initialize CMAP service

If you wait for a while, pvecm status returns cluster information, and after that other nodes can be added.
The journalctl output contains the following lines:
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [quorum] crit: quorum_initialize failed: 2
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [quorum] crit: can't initialize service
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [confdb] crit: cmap_initialize failed: 2
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [confdb] crit: can't initialize service
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [dcdb] crit: cpg_initialize failed: 2
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [dcdb] crit: can't initialize service
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [status] crit: cpg_initialize failed: 2
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [status] crit: can't initialize service  



2022-12-20 10:59:17 MSK

Good afternoon!

Please additionally provide the following information:

1.  The operating system and version on which the error was reproduced.
2.  The output of the following commands:

 $ uname -a
 $ cat /etc/os-release
 $ apt-repo

3.  The program versions:

 # rpm -q pve-manager pve-cluster corosync

4.  The sequence of steps leading to the error before the command

 # pvecm create mypve



2023-01-03 18:03:11 MSK

I cannot reproduce it.
  On 09/30/2013 01:45 PM, Patrick Hemmer wrote:
  

I'm running corosync 2.3.2 on ubuntu precise. I'm playing with a 3-node cluster, and whenever I try to start corosync on one of the nodes, it fails to start properly.

I just do a simple start with `corosync -f`, and whenever I try to use any of the tools, they error:

   
  # corosync-cmapctl
Failed to initialize the cmap API.  Error CS_ERR_TRY_AGAIN
# corosync-quorumtool
Cannot initialize CMAP service

  

If I wait long enough (about 9 minutes or 530 seconds), it does end up starting, and the tools work, but corosync-quorumtool shows the only member is itself.

However if I start corosync with `strace -f corosync -f` the tools work fine immediately upon start (though it still doesn't show the other nodes). Smells like race condition, but dunno where to begin.

   
   

My guess is something is wrong with your network relating to multicast.

Try using udpu mode - it is very stable now and removes multicast from the list of things that can go wrong.

  Regards
-steve
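
For reference, "udpu mode" just means setting the unicast transport in the totem section and listing every node's address explicitly in a nodelist. A minimal sketch (using the three node IPs that appear later in this thread):

----
totem {
 version: 2
 transport: udpu
}

nodelist {
 node {
  ring0_addr: 10.20.0.127
  nodeid: 1
 }
 node {
  ring0_addr: 10.20.0.212
  nodeid: 2
 }
 node {
  ring0_addr: 10.20.2.124
  nodeid: 3
 }
}
----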

  
This is the output from `corosync -f` (this node is 10.20.0.212):

notice [TOTEM ] Initializing transport (UDP/IP Unicast).
notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
notice [TOTEM ] The network interface [10.20.0.212] is now up.
notice [TOTEM ] adding new UDPU member {10.20.0.127}
notice [TOTEM ] adding new UDPU member {10.20.0.212}
notice [TOTEM ] adding new UDPU member {10.20.2.124}

notice [TOTEM ] A new membership (10.20.0.212:1122820) was formed. Members joined: 2
notice [TOTEM ] A new membership (10.20.0.127:1122824) was formed. Members joined: 1 3
### here is where it pauses for almost 9 minutes ###
error [TOTEM ] FAILED TO RECEIVE
notice [TOTEM ] A new membership (10.20.0.212:1122876) was formed. Members left: 1 3
notice [TOTEM ] A new membership (10.20.0.212:1122936) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123008) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123064) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123124) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123180) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123248) was formed. Members
notice [TOTEM ] A new membership (10.20.0.127:1123256) was formed. Members joined: 1 3

  

  

This is the config (created by `pcs` utility), it's exactly the same on all 3 nodes, and the other 2 nodes work fine:

  ----
totem {
version: 2
secauth: off
cluster_name: hapi-server
transport: udpu
}

nodelist {
 node {
 ring0_addr: i-74eb9c2f
 nodeid: 1
 }
 node {
 ring0_addr: i-a3bf0df9
 nodeid: 2
 }
 node {
 ring0_addr: i-ebcfcbb0
 nodeid: 3
 }
}

quorum {
provider: corosync_votequorum
}

logging {
to_syslog: yes
}
----



-Patrick


_______________________________________________
discuss mailing list
discuss@corosync.org
    http://lists.corosync.org/mailman/listinfo/discuss    

  
   

I tried to set up the ZFS replication that appeared in PVE 5 but never got it to work.

On forum.proxmox.com they said: "You do not run Proxmox VE, you are using Alt Linux, please contact them. This is a bug and can't be fixed by us."


Are both machines on ZFS? What is the error?


        NAME        STATE     READ WRITE CKSUM
        zfs209      ONLINE       0     0     0
          sdd       ONLINE       0     0     0

        NAME        STATE     READ WRITE CKSUM
        zfs209      ONLINE       0     0     0
          sdd       ONLINE       0     0     0

unable to open file - No such file or directory

and the second error:



2017-09-14 13:10:24 108-0: end replication job with error: command '/usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=srv-209NJ' root@172.15.11.202 -- pvesr prepare-local-job 108-0 zfs209nw:vm-108-disk-3 --last_sync 0' failed: exit code 29

I have a similar problem: one machine replicates, but the others do not. Did you manage to solve it?


In the original Proxmox, ZFS replication works fine, but in Alt there apparently are not enough people either to fix the bug or to answer on the forum and tell users what to do.

So for now I have put it aside until better times.


Have you fixed it, or have you not even been able to get it deployed?



So, how do I view the replication log, and where should I look for the cause of the non-working replication?
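
A few places worth checking are sketched below, using the job ID from the error above; the paths and unit names are those of stock PVE 5/6 and may differ in the Alt build:

 # pvesr status                                  (state of all replication jobs)
 # cat /var/log/pve/replicate/108-0              (per-job log, named after the job ID)
 # journalctl -u pvesr.service -u pvesr.timer    (runs of the replication scheduler)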

I also remembered one nuance, though I don't know whether it matters for this issue: the Alt PVE instructions had a line saying that to start the cluster services you need to run
# systemctl start syslogd
Do I need to install it manually and set ForwardToSyslog=yes in /etc/systemd/journald.conf?
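
If that route is needed, a minimal sketch would be to set ForwardToSyslog=yes under the [Journal] section of /etc/systemd/journald.conf, then restart journald and enable the syslog service (unit name taken from the instructions quoted above; whether it is needed at all depends on the Alt packaging):

 [Journal]
 ForwardToSyslog=yes

 # systemctl restart systemd-journald
 # systemctl enable --now syslogd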



Fixed ZFS replication; it is already in Sisyphus.
In addition, I still needed:
systemctl start pvesr.timer
systemctl enable pvesr.timer
and in my case I had to comment out "ENV=$HOME/.bashrc" in /root/.bashrc.
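
For clarity, "comment out" here just means turning that line into a comment, for example (or edit /root/.bashrc by hand):

 # sed -i 's|^ENV=\$HOME/\.bashrc|# &|' /root/.bashrc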


ovs-vsctl[2340]: ovs|00002|db_ctl_base|ERR|unix: /var/run/openvswitch/db.sock: database connection failed (No such file or directory)

I'm not sure that the author of the topic will be able to answer, because the topic is old. I got the same error. Did you manage to solve this problem?



And I will answer myself. The service was not running.

# systemctl start openvswitch
# systemctl enable openvswitch
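
As a quick check after enabling the service (unit name as above), ovs-vsctl should now reach the database socket instead of failing:

 # systemctl status openvswitch
 # ovs-vsctl show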

