티스토리 뷰

아래에 퍼온 원본 글이 있었지만 지금은 존재하지 않는 URL:

http://www.beekhof.net/blog/2013/debugging-pengine/

아래에 퍼온 원본 글이 백업된 글을 퍼온 URL:

https://web.archive.org/web/20201202084628/http://www.beekhof.net/blog/2013/pacemaker-logging/

 

Pacemaker Logging

Normal operation

web.archive.org

 

- 다음은 퍼온 글 - 

 

Andrew Beekhof

Cluster guy to the rich and famous.

Follow

Tag Cloud

general tips humor announce internals documentation Site News performance Feature Spotlight openstack compute node pacemaker remotecluster concepts fencing architecture heat high availability database kubernetes galera cinder watchdog softdog best practices

 

Normal operation

Pacemaker inherits most of its logging setting from either CMAN or Corosync - depending on what its running on top of.

In order to avoid spamming syslog, Pacemaker only logs a summary of its actions (NOTICE and above) to syslog.

If the level of detail in syslog is insufficient, you should enable a cluster log file. Normally one is configured by default and it contains everything except debug and trace messages.

To find the location of this file, either examine your CMAN (cluster.conf) or Corosync (corosync.conf) configuration file or look for syslog entries such as:

 

pacemakerd[1823]: notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log

 

If you do not see a line like this, either update the cluster configuration or set PCMK_debugfile in /etc/sysconfig/pacemaker

crm_report also knows how to find all the Pacemaker related logs and blackbox files

If the level of detail in the cluster log file is still insufficient, or you simply wish to go blind, you can turn on debugging in Corosync/CMAN, or set PCMK_debug in /etc/sysconfig/pacemaker.

A minor advantage of setting PCMK_debug is that the value can be a comma-separated list of processes which should produce debug logging instead of a global yes/no.

 

When an ERROR occurs

Pacemaker includes support for a blackbox.

When enabled, the blackbox contains a rolling buffer of all logs (not just those sent to syslog or a file) and is written to disk after a crash or assertion failure.

The blackbox recorder can be enabled by setting PCMK_blackbox in /etc/sysconfig/pacemaker or at runtime by sending SIGUSR1. Eg.

 

killall -USR1 crmd

 

When enabled you’ll see a log such as:

 

crmd[1811]: notice: crm_enable_blackbox: Initiated blackbox recorder: /var/lib/pacemaker/blackbox/crmd-1811

 

If a crash occurs, the blackbox will be available at that location. To extract the contents, pass it to qb-blackbox:

 

qb-blackbox /var/lib/pacemaker/blackbox/crmd-1811

 

Which produces output like:

 

Dumping the contents of /var/lib/pacemaker/blackbox/crmd-1811

[debug] shm size:5242880; real_size:5242880; rb->word_size:1310720

[debug] read total of: 5242892

Ringbuffer:

  ->NORMAL

  ->write_pt [5588]

  ->read_pt [0]

  ->size [1310720 words]

  =>free [5220524 bytes]

  =>used [22352 bytes]

...

trace May 19 23:20:55 gio_read_socket(368):0: 0x11ab920.5 1 (ref=1)

trace May 19 23:20:55 pcmk_ipc_accept(458):0: Connection 0x11aee00

info May 19 23:20:55 crm_client_new(302):0: Connecting 0x11aee00 for uid=0 gid=0 pid=24425 id=0e943a2a-dd64-49bc-b9d5-10fa6c6cb1bd

debug May 19 23:20:55 handle_new_connection(465):2147483648: IPC credentials authenticated (24414-24425-14)

...

[debug] Free'ing ringbuffer: /dev/shm/qb-create_from_file-header

 

When an ERROR occurs you’ll also see the function and line number that produced it such as:

 

crmd[1811]: Problem detected at child_death_dispatch:872 (mainloop.c), please see /var/lib/pacemaker/blackbox/crmd-1811.1 for additional details

crmd[1811]: Problem detected at main:94 (crmd.c), please see /var/lib/pacemaker/blackbox/crmd-1811.2 for additional details

 

Again, simply pass the files to qb-blackbox to extract and query the contents.

Note the a counter is added to the end so as to avoid name collisions.

 

Diving into files and functions

In case you have not already guessed, all logs include the name of the function that generated them. So:

 

crmd[1811]: notice: crm_update_peer_state: cman_event_callback: Node corosync-host-1[1] - state is now lost (was member)

 

came from the function crm_update_peer_state().

 

To obtain more detail from that or any other function, you can set PCMK_trace_functions in /etc/sysconfig/pacemaker to a comma separated list of function names. Eg.

 

PCMK_trace_functions=crm_update_peer_state,run_graph

 

For a bigger stick, you may also activate trace logging for all the functions in a particular source file or files by setting PCMK_trace_files as well.

 

PCMK_trace_files=cluster.c,election.c

 

These additional logs are sent to the cluster log file. Note that enabling tracing options also alters the output format.

 

Instead of:

 

crmd: notice: crm_cluster_connect: Connecting to cluster infrastructure: cman

 

the output includes file and line information:

 

crmd: ( cluster.c:215 ) notice: crm_cluster_connect: Connecting to cluster infrastructure: cman

 

But wait there’s still more

Still need more detail? You’re in luck! The blackbox can be dumped at any time, not just when an error occurs.

 

First, make sure the blackbox is active (we’ll assume its the crmd that needs to be debugged):

 

killall -USR1 crmd

 

Next, discard any previous contents by dumping them to disk

 

killall -TRAP crmd

 

now cause whatever condition you’re trying to debug, and send -TRAP when you’re ready to see the result.

 

killall -TRAP crmd

 

You can now look for the result in syslog:

 

grep -e crm_write_blackbox: /var/log/messages

 

This will include a filename containing the trace logging:

 

crmd[1811]: notice: crm_write_blackbox: Blackbox dump requested, please see /var/lib/pacemaker/blackbox/crmd-1811.1 for contents

crmd[1811]: notice: crm_write_blackbox: Blackbox dump requested, please see /var/lib/pacemaker/blackbox/crmd-1811.2 for contents

 

To extract the trace loging for our test, pass the most recent file to qb-blackbox:

 

qb-blackbox /var/lib/pacemaker/blackbox/crmd-1811.2

 

At this point you’ll probably want to use grep :)

 Tags: documentation tips

 Updated: May 20, 2013

최근에 올라온 글
최근에 달린 댓글
«   2024/11   »
1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30