Cannot initialize cmap service


Failed to start Corosync Cluster Engine After PVE Reboot

Error

   # systemctl status pve-cluster
...
Jun 26 09:33:26 pve01 pmxcfs[3506]: [quorum] crit: quorum_initialize failed: 2
Jun 26 09:33:26 pve01 pmxcfs[3506]: [quorum] crit: can't initialize service
Jun 26 09:33:26 pve01 pmxcfs[3506]: [confdb] crit: cmap_initialize failed: 2
Jun 26 09:33:26 pve01 pmxcfs[3506]: [confdb] crit: can't initialize service
Jun 26 09:33:26 pve01 pmxcfs[3506]: [dcdb] crit: cpg_initialize failed: 2
Jun 26 09:33:26 pve01 pmxcfs[3506]: [dcdb] crit: can't initialize service
Jun 26 09:33:26 pve01 pmxcfs[3506]: [status] crit: cpg_initialize failed: 2
Jun 26 09:33:26 pve01 pmxcfs[3506]: [status] crit: can't initialize service
...
# journalctl -u corosync.service
...
Jun 26 09:26:17 pve01 corosync[1826]:   [MAIN  ] failed to parse node address 'pve01.xx.xx'
Jun 26 09:26:17 pve01 corosync[1826]:   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1417.
Jun 26 09:26:17 pve01 systemd[1]: corosync.service: Main process exited, code=exited, status=8/n/a
Jun 26 09:26:17 pve01 systemd[1]: corosync.service: Failed with result 'exit-code'.
Jun 26 09:26:17 pve01 systemd[1]: Failed to start Corosync Cluster Engine.
...  
  

Fix

Change the node's ring0_addr in corosync.conf on pve01 from the hostname pve01.xx.xx
to the node's IP address.
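
For illustration, a sketch of the edited node entry is shown below; 192.168.1.11 is a placeholder for pve01's actual cluster IP, and the other keys are the usual ones PVE generates. On PVE the file is normally edited as /etc/pve/corosync.conf, bumping config_version in the totem section so the change propagates.

----
# sketch only, not the node's real configuration
node {
  name: pve01
  nodeid: 1
  quorum_votes: 1
  # before: ring0_addr: pve01.xx.xx   <- hostname corosync failed to parse
  ring0_addr: 192.168.1.11
}
----

After saving, restarting corosync and pve-cluster should let pmxcfs connect again.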

References

Cluster node can't sync after reboot


   
When creating a cluster with the command pvecm create mypve, the output contains the following messages:

Job for corosync.service failed because the control process exited with error code.
See "systemctl status corosync.service" and "journalctl -xe" for details.
command 'systemctl restart corosync pve-cluster' failed: exit code 1

At the same time, pvecm status returns:
Cannot initialize CMAP service

If you wait for a while, pvecm status returns cluster information, and after that other nodes can be added.
The journalctl output contains the following lines:
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [quorum] crit: quorum_initialize failed: 2
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [quorum] crit: can't initialize service
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [confdb] crit: cmap_initialize failed: 2
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [confdb] crit: can't initialize service
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [dcdb] crit: cpg_initialize failed: 2
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [dcdb] crit: can't initialize service
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [status] crit: cpg_initialize failed: 2
Jun 10 18:03:12 pvev01 pmxcfs[2431]: [status] crit: can't initialize service  



2022-12-20 10:59:17 MSK

Good afternoon!

Please additionally provide the following information:

1.  The operating system and version on which the error was reproduced.
2.  The output of the following commands:

 $ uname -a
 $ cat /etc/os-release
 $ apt-repo

3.  The program versions:

 # rpm -q pve-manager pve-cluster corosync

4.  The sequence of steps leading to the error before the command

 # pvecm create mypve



2023-01-03 18:03:11 MSK

I cannot reproduce it.
  On 09/30/2013 01:45 PM, Patrick Hemmer wrote:
  

I'm running corosync 2.3.2 on ubuntu precise. I'm playing with a 3-node cluster, and whenever I try to start corosync on one of the nodes, it fails to start properly.

I just do a simple start with `corosync -f`, and whenever I try to use any of the tools, they error:

   
  # corosync-cmapctl
Failed to initialize the cmap API.  Error CS_ERR_TRY_AGAIN
# corosync-quorumtool
Cannot initialize CMAP service

  

If I wait long enough (about 9 minutes or 530 seconds), it does end up starting, and the tools work, but corosync-quorumtool shows the only member is itself.

However if I start corosync with `strace -f corosync -f` the tools work fine immediately upon start (though it still doesn't show the other nodes). Smells like race condition, but dunno where to begin.

   
   

My guess is something is wrong with your network relating to multicast.

Try using udpu mode - it is very stable now and removes multicast from the list of things that can go wrong.

  Regards
-steve
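
For reference, "udpu mode" just means setting the unicast transport in the totem section and listing every node's address explicitly in a nodelist. A minimal sketch (using the three node IPs that appear later in this thread):

----
totem {
 version: 2
 transport: udpu
}

nodelist {
 node {
  ring0_addr: 10.20.0.127
  nodeid: 1
 }
 node {
  ring0_addr: 10.20.0.212
  nodeid: 2
 }
 node {
  ring0_addr: 10.20.2.124
  nodeid: 3
 }
}
----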

  
This is the output from `corosync -f` (this node is 10.20.0.212):

notice [TOTEM ] Initializing transport (UDP/IP Unicast).
notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
notice [TOTEM ] The network interface [10.20.0.212] is now up.
notice [TOTEM ] adding new UDPU member {10.20.0.127}
notice [TOTEM ] adding new UDPU member {10.20.0.212}
notice [TOTEM ] adding new UDPU member {10.20.2.124}

notice [TOTEM ] A new membership (10.20.0.212:1122820) was formed. Members joined: 2
notice [TOTEM ] A new membership (10.20.0.127:1122824) was formed. Members joined: 1 3
### here is where it pauses for almost 9 minutes ###
error [TOTEM ] FAILED TO RECEIVE
notice [TOTEM ] A new membership (10.20.0.212:1122876) was formed. Members left: 1 3
notice [TOTEM ] A new membership (10.20.0.212:1122936) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123008) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123064) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123124) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123180) was formed. Members
notice [TOTEM ] A new membership (10.20.0.212:1123248) was formed. Members
notice [TOTEM ] A new membership (10.20.0.127:1123256) was formed. Members joined: 1 3

  

  

This is the config (created by `pcs` utility), it's exactly the same on all 3 nodes, and the other 2 nodes work fine:

  ----
totem {
version: 2
secauth: off
cluster_name: hapi-server
transport: udpu
}

nodelist {
 node {
 ring0_addr: i-74eb9c2f
 nodeid: 1
 }
 node {
 ring0_addr: i-a3bf0df9
 nodeid: 2
 }
 node {
 ring0_addr: i-ebcfcbb0
 nodeid: 3
 }
}

quorum {
provider: corosync_votequorum
}

logging {
to_syslog: yes
}
----



-Patrick


_______________________________________________
discuss mailing list
discuss@corosync.org
    http://lists.corosync.org/mailman/listinfo/discuss    

  
   

I tried to set up the ZFS replication that appeared in PVE 5 but never got it to work.

On forum.proxmox.com they said: "You do not run Proxmox VE, you are using Alt Linux, please contact them. This is a bug and can't be fixed by us."


Are both machines on ZFS? What is the error?


        NAME        STATE     READ WRITE CKSUM
        zfs209      ONLINE       0     0     0
          sdd       ONLINE       0     0     0

        NAME        STATE     READ WRITE CKSUM
        zfs209      ONLINE       0     0     0
          sdd       ONLINE       0     0     0

unable to open file - No such file or directory

and the second error:



2017-09-14 13:10:24 108-0: end replication job with error: command '/usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=srv-209NJ' root@172.15.11.202 -- pvesr prepare-local-job 108-0 zfs209nw:vm-108-disk-3 --last_sync 0' failed: exit code 29

I have a similar problem: one machine replicates, but the others do not. Did you manage to solve it?


In the original Proxmox, ZFS replication works fine, but in Alt there apparently are not enough people either to fix the bug or to answer on the forum and tell users what to do.

So for now I have put it aside until better times.


Have you fixed it, or have you not even been able to get it deployed?



So, how do I view the replication log, and where should I look for the cause of the non-working replication?
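
A few places worth checking are sketched below, using the job ID from the error above; the paths and unit names are those of stock PVE 5/6 and may differ in the Alt build:

 # pvesr status                                  (state of all replication jobs)
 # cat /var/log/pve/replicate/108-0              (per-job log, named after the job ID)
 # journalctl -u pvesr.service -u pvesr.timer    (runs of the replication scheduler)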

I also remembered one nuance, though I don't know whether it matters for this issue: the Alt PVE instructions had a line saying that to start the cluster services you need to run
# systemctl start syslogd
Do I need to install it manually and set ForwardToSyslog=yes in /etc/systemd/journald.conf?
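
If that route is needed, a minimal sketch would be to set ForwardToSyslog=yes under the [Journal] section of /etc/systemd/journald.conf, then restart journald and enable the syslog service (unit name taken from the instructions quoted above; whether it is needed at all depends on the Alt packaging):

 [Journal]
 ForwardToSyslog=yes

 # systemctl restart systemd-journald
 # systemctl enable --now syslogd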



Fixed ZFS replication; it is already in Sisyphus.
In addition, I still needed:
systemctl start pvesr.timer
systemctl enable pvesr.timer
and in my case I had to comment out "ENV=$HOME/.bashrc" in /root/.bashrc.
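
For clarity, "comment out" here just means turning that line into a comment, for example (or edit /root/.bashrc by hand):

 # sed -i 's|^ENV=\$HOME/\.bashrc|# &|' /root/.bashrc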


ovs-vsctl[2340]: ovs|00002|db_ctl_base|ERR|unix: /var/run/openvswitch/db.sock: database connection failed (No such file or directory)

I'm not sure that the author of the topic will be able to answer, because the topic is old. I got the same error. Did you manage to solve this problem?



And I will answer myself. The service was not running.

# systemctl start openvswitch
# systemctl enable openvswitch
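
As a quick check after enabling the service (unit name as above), ovs-vsctl should now reach the database socket instead of failing:

 # systemctl status openvswitch
 # ovs-vsctl show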

