티스토리 뷰
아래에 퍼온 원본 글이 있었지만 지금은 존재하지 않는 URL:
http://www.beekhof.net/blog/2013/debugging-pengine/
아래에 퍼온 원본 글이 백업된 글을 퍼온 URL:
https://web.archive.org/web/20201202084628/http://www.beekhof.net/blog/2013/pacemaker-logging/
- 다음은 퍼온 글 -
Andrew Beekhof
Cluster guy to the rich and famous.
Follow
Tag Cloud
general tips humor announce internals documentation Site News performance Feature Spotlight openstack compute node pacemaker remotecluster concepts fencing architecture heat high availability database kubernetes galera cinder watchdog softdog best practices
Normal operation
Pacemaker inherits most of its logging setting from either CMAN or Corosync - depending on what its running on top of.
In order to avoid spamming syslog, Pacemaker only logs a summary of its actions (NOTICE and above) to syslog.
If the level of detail in syslog is insufficient, you should enable a cluster log file. Normally one is configured by default and it contains everything except debug and trace messages.
To find the location of this file, either examine your CMAN (cluster.conf) or Corosync (corosync.conf) configuration file or look for syslog entries such as:
pacemakerd[1823]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
If you do not see a line like this, either update the cluster configuration or set PCMK_debugfile in /etc/sysconfig/pacemaker
crm_report also knows how to find all the Pacemaker related logs and blackbox files
If the level of detail in the cluster log file is still insufficient, or you simply wish to go blind, you can turn on debugging in Corosync/CMAN, or set PCMK_debug in /etc/sysconfig/pacemaker.
A minor advantage of setting PCMK_debug is that the value can be a comma-separated list of processes which should produce debug logging instead of a global yes/no.
When an ERROR occurs
Pacemaker includes support for a blackbox.
When enabled, the blackbox contains a rolling buffer of all logs (not just those sent to syslog or a file) and is written to disk after a crash or assertion failure.
The blackbox recorder can be enabled by setting PCMK_blackbox in /etc/sysconfig/pacemaker or at runtime by sending SIGUSR1. Eg.
killall -USR1 crmd
When enabled you’ll see a log such as:
crmd[1811]: notice: crm_enable_blackbox: Initiated blackbox recorder: /var/lib/pacemaker/blackbox/crmd-1811
If a crash occurs, the blackbox will be available at that location. To extract the contents, pass it to qb-blackbox:
qb-blackbox /var/lib/pacemaker/blackbox/crmd-1811
Which produces output like:
Dumping the contents of /var/lib/pacemaker/blackbox/crmd-1811
[debug] shm size:5242880; real_size:5242880; rb->word_size:1310720
[debug] read total of: 5242892
Ringbuffer:
->NORMAL
->write_pt [5588]
->read_pt [0]
->size [1310720 words]
=>free [5220524 bytes]
=>used [22352 bytes]
...
trace May 19 23:20:55 gio_read_socket(368):0: 0x11ab920.5 1 (ref=1)
trace May 19 23:20:55 pcmk_ipc_accept(458):0: Connection 0x11aee00
info May 19 23:20:55 crm_client_new(302):0: Connecting 0x11aee00 for uid=0 gid=0 pid=24425 id=0e943a2a-dd64-49bc-b9d5-10fa6c6cb1bd
debug May 19 23:20:55 handle_new_connection(465):2147483648: IPC credentials authenticated (24414-24425-14)
...
[debug] Free'ing ringbuffer: /dev/shm/qb-create_from_file-header
When an ERROR occurs you’ll also see the function and line number that produced it such as:
crmd[1811]: Problem detected at child_death_dispatch:872 (mainloop.c), please see /var/lib/pacemaker/blackbox/crmd-1811.1 for additional details
crmd[1811]: Problem detected at main:94 (crmd.c), please see /var/lib/pacemaker/blackbox/crmd-1811.2 for additional details
Again, simply pass the files to qb-blackbox to extract and query the contents.
Note the a counter is added to the end so as to avoid name collisions.
Diving into files and functions
In case you have not already guessed, all logs include the name of the function that generated them. So:
crmd[1811]: notice: crm_update_peer_state: cman_event_callback: Node corosync-host-1[1] - state is now lost (was member)
came from the function crm_update_peer_state().
To obtain more detail from that or any other function, you can set PCMK_trace_functions in /etc/sysconfig/pacemaker to a comma separated list of function names. Eg.
PCMK_trace_functions=crm_update_peer_state,run_graph
For a bigger stick, you may also activate trace logging for all the functions in a particular source file or files by setting PCMK_trace_files as well.
PCMK_trace_files=cluster.c,election.c
These additional logs are sent to the cluster log file. Note that enabling tracing options also alters the output format.
Instead of:
crmd: notice: crm_cluster_connect: Connecting to cluster infrastructure: cman
the output includes file and line information:
crmd: ( cluster.c:215 ) notice: crm_cluster_connect: Connecting to cluster infrastructure: cman
But wait there’s still more
Still need more detail? You’re in luck! The blackbox can be dumped at any time, not just when an error occurs.
First, make sure the blackbox is active (we’ll assume its the crmd that needs to be debugged):
killall -USR1 crmd
Next, discard any previous contents by dumping them to disk
killall -TRAP crmd
now cause whatever condition you’re trying to debug, and send -TRAP when you’re ready to see the result.
killall -TRAP crmd
You can now look for the result in syslog:
grep -e crm_write_blackbox: /var/log/messages
This will include a filename containing the trace logging:
crmd[1811]: notice: crm_write_blackbox: Blackbox dump requested, please see /var/lib/pacemaker/blackbox/crmd-1811.1 for contents
crmd[1811]: notice: crm_write_blackbox: Blackbox dump requested, please see /var/lib/pacemaker/blackbox/crmd-1811.2 for contents
To extract the trace loging for our test, pass the most recent file to qb-blackbox:
qb-blackbox /var/lib/pacemaker/blackbox/crmd-1811.2
At this point you’ll probably want to use grep :)
Tags: documentation tips
Updated: May 20, 2013
'개발' 카테고리의 다른 글
구글링해서 찾은 Pacemaker 디버깅 관련 글들 (0) | 2021.11.04 |
---|---|
[오류해결] svn cleanup failed-previous operation has not finished; run cleanup if it was interrupted (0) | 2019.03.19 |