| .\" This file was originally generated by help2man 1.36. |
| .TH WATCHQUAGGA 8 "July 2010" |
| .SH NAME |
| watchquagga \- a program to monitor the status of quagga daemons |
| .SH SYNOPSIS |
| .B watchquagga |
| .RI [ option ...] |
| .IR daemon ... |
| .br |
| .B watchquagga |
| .BR \-h " | " \-v |
| .SH DESCRIPTION |
| .B watchquagga |
| is a watchdog program that monitors the status of the supplied quagga |
| .IR daemon s |
| and tries to restart them if they become unresponsive or shut down. |
| .PP |
| To determine whether a daemon is running, it tries to connect to the |
| daemon's VTY UNIX stream socket and sends echo commands to ensure that the |
| daemon responds. When the daemon crashes, EOF is received from the socket, |
| so watchquagga can react immediately. |
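.PP
The EOF detection described above can be illustrated with a minimal Python sketch. It is not taken from the watchquagga sources: a local socket pair stands in for the daemon's VTY socket, and the peer_closed helper and the "reply" payload are hypothetical names chosen for the illustration.
.nf
```python
import socket


def peer_closed(sock):
    """Return True if a read reports EOF, i.e. the peer closed the socket."""
    # On a stream socket, recv() returning empty bytes means EOF.
    return sock.recv(4096) == b""


# A connected UNIX stream socket pair stands in for the VTY socket
# (assumption: the real monitor connects to a socket file instead).
watcher, daemon = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# While the stand-in daemon answers, the read returns data, not EOF.
daemon.sendall(b"reply")
alive = not peer_closed(watcher)

# Closing the daemon side simulates a crash; the next read reports EOF.
daemon.close()
crashed = peer_closed(watcher)
watcher.close()
```
.fi
An empty read on a stream socket is how EOF is reported, which is what lets a monitor notice the failure as soon as the peer goes away.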
| .PP |
| This program can run in one of the following 5 modes: |
| .TP |
| .B Mode 0: monitor |
| In this mode, the program serves as a monitor and reports status changes. |
| .IP |
| Example usage: watchquagga \-d zebra ospfd bgpd |
| .TP |
| .B Mode 1: global restart |
| In this mode, whenever a daemon hangs or crashes, the given command is used |
| to restart all watched daemons. |
| .IP |
| Example usage: watchquagga \-dz \e |
| .br |
| \-R '/sbin/service zebra restart; /sbin/service ospfd restart' \e |
| .br |
| zebra ospfd |
| .TP |
| .B Mode 2: individual daemon restart |
| In this mode, whenever a single daemon hangs or crashes, the given command |
| is used to restart this daemon only. |
| .IP |
| Example usage: watchquagga \-dz \-r '/sbin/service %s restart' \e |
| .br |
| zebra ospfd bgpd |
| .TP |
| .B Mode 3: phased zebra restart |
| In this mode, whenever a single daemon hangs or crashes, the given command |
| is used to restart this daemon only. The only exception is the zebra |
| daemon; in this case, the following steps are taken: (1) all other daemons |
| are stopped, (2) zebra is restarted, and (3) other daemons are started |
| again. |
| .IP |
| Example usage: watchquagga \-adz \-r '/sbin/service %s restart' \e |
| .br |
| \-s '/sbin/service %s start' \e |
| .br |
| \-k '/sbin/service %s stop' zebra ospfd bgpd |
| .TP |
| .B Mode 4: phased global restart for any failure |
| In this mode, whenever a single daemon hangs or crashes, the following |
| steps are taken: (1) all other daemons are stopped, (2) zebra is restarted, |
| and (3) other daemons are started again. |
| .IP |
| Example usage: watchquagga \-Adz \-r '/sbin/service %s restart' \e |
| .br |
| \-s '/sbin/service %s start' \e |
| .br |
| \-k '/sbin/service %s stop' zebra ospfd bgpd |
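.PP
The three phased restart steps used by modes 3 and 4 can be sketched in Python as an ordered list of commands. This is only an illustration under stated assumptions: phased_restart_plan is a hypothetical helper, and stopping and starting the other daemons in the order they were listed is an assumption that this page does not specify.
.nf
```python
def phased_restart_plan(daemons, restart_cmd, start_cmd, stop_cmd):
    """Return the shell commands for a phased restart, in execution order."""
    others = [name for name in daemons if name != "zebra"]
    # (1) stop all other daemons
    plan = [stop_cmd.replace("%s", name) for name in others]
    # (2) restart zebra
    plan.append(restart_cmd.replace("%s", "zebra"))
    # (3) start the other daemons again
    plan.extend(start_cmd.replace("%s", name) for name in others)
    return plan


plan = phased_restart_plan(
    ["zebra", "ospfd", "bgpd"],
    restart_cmd="/sbin/service %s restart",
    start_cmd="/sbin/service %s start",
    stop_cmd="/sbin/service %s stop",
)
for command in plan:
    print(command)
```
.fi
The command templates are the ones from the example usage above; each '%s' is replaced with a daemon name before the command is run.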
| .PP |
| Important: It is believed that mode 2 (individual daemon restart) is not |
| safe, and mode 3 (phased zebra restart) may not be safe with certain |
| routing daemons. |
| .PP |
| In order to avoid restarting the daemons in quick succession, you can |
| supply the |
| .B \-m |
| and |
| .B \-M |
| options to set the minimum and maximum delay between the restart commands. |
| The minimum restart delay is recalculated each time a restart is attempted. |
| If the time since the last restart attempt exceeds twice the value of |
| .BR \-M , |
| the restart delay is set to the value of |
| .BR \-m ; |
| otherwise, the interval is doubled (but capped at the value of |
| .BR \-M ). |
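.PP
The delay recalculation described above can be written out as a small Python sketch. It follows the wording of this page rather than the program's source code; next_restart_delay and its parameter names are hypothetical, and the defaults are the documented defaults of \-m (60) and \-M (600).
.nf
```python
def next_restart_delay(current_delay, since_last_attempt,
                       min_delay=60, max_delay=600):
    """Return the restart delay to use after a restart attempt (seconds)."""
    # A long quiet period (more than twice the maximum) resets the delay.
    if since_last_attempt > 2 * max_delay:
        return min_delay
    # Otherwise the delay doubles, capped at the maximum.
    return min(current_delay * 2, max_delay)


# Repeated failures 10 seconds apart: the delay grows until it hits the cap.
delay = 60
history = []
for _ in range(5):
    delay = next_restart_delay(delay, since_last_attempt=10)
    history.append(delay)
print(history)
```
.fi
With the default values, a daemon that keeps failing is restarted less and less often until the delay reaches 600 seconds, and the delay drops back to 60 seconds only after more than 1200 seconds without a restart attempt.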
| .SH OPTIONS |
| .TP |
| .BR \-d ", " \-\-daemon |
| Run in daemon mode. When supplied, error messages are sent to syslog |
| instead of standard output (stdout). |
| .TP |
| .BI \-S " directory" "\fR, \fB\-\-statedir " directory |
| Set the VTY socket |
| .I directory |
| (the default value is "/var/run/quagga"). |
| .TP |
| .BR \-e ", " \-\-no\-echo |
| Do not ping the daemons to test whether they respond. This option is |
| necessary if one or more daemons do not support the echo command. |
| .TP |
| .BI \-l " level" "\fR, \fB\-\-loglevel " level |
| Set the logging |
| .I level |
| (the default value is "6"). The value should range from 0 (LOG_EMERG) to 7 |
| (LOG_DEBUG), but a higher number can be supplied if extra debugging |
| messages are required. |
| .TP |
| .BI \-m " number" "\fR, \fB\-\-min\-restart\-interval " number |
| Set the minimum |
| .I number |
| of seconds to wait between invocations of the daemon restart commands (the |
| default value is "60"). |
| .TP |
| .BI \-M " number" "\fR, \fB\-\-max\-restart\-interval " number |
| Set the maximum |
| .I number |
| of seconds to wait between invocations of the daemon restart commands (the |
| default value is "600"). |
| .TP |
| .BI \-i " number" "\fR, \fB\-\-interval " number |
| Set the status polling interval in seconds (the default value is "5"). |
| .TP |
| .BI \-t " number" "\fR, \fB\-\-timeout " number |
| Set the unresponsiveness timeout in seconds (the default value is "10"). |
| .TP |
| .BI \-T " number" "\fR, \fB\-\-restart\-timeout " number |
| Set the restart (kill) timeout in seconds (the default value is "20"). If |
| any background jobs are still running after this period has elapsed, they |
| will be killed. |
| .TP |
| .BI \-r " command" "\fR, \fB\-\-restart " command |
| Supply a Bourne shell |
| .I command |
| to restart a single daemon. The command string should contain the '%s' |
| placeholder to be substituted with the daemon name. |
| .IP |
| Note that the |
| .B \-r |
| and |
| .B \-R |
| options are mutually exclusive. |
| .TP |
| .BI \-s " command" "\fR, \fB\-\-start\-command " command |
| Supply a Bourne shell |
| .I command |
| to start a single daemon. The command string should contain the '%s' |
| placeholder to be substituted with the daemon name. |
| .TP |
| .BI \-k " command" "\fR, \fB\-\-kill\-command " command |
| Supply a Bourne shell |
| .I command |
| to stop a single daemon. The command string should contain the '%s' |
| placeholder to be substituted with the daemon name. |
| .TP |
| .BI \-R " command" "\fR, \fB\-\-restart\-all " command |
| When one or more daemons are shut down, try to restart them using the |
| Bourne shell command supplied on the command line. |
| .IP |
| Note that the |
| .B \-r |
| and |
| .B \-R |
| options are mutually exclusive. |
| .TP |
| .BR \-z ", " \-\-unresponsive\-restart |
| When a daemon is in an unresponsive state, treat it as being shut down for |
| restart purposes. |
| .TP |
| .BR \-a ", " \-\-all\-restart |
| When zebra hangs or crashes, restart all daemons taking the following |
| steps: (1) stop all other daemons, (2) restart zebra, and (3) start other |
| daemons again. |
| .IP |
| Note that this option also requires the |
| .BR \-r , |
| .BR \-s , |
| and |
| .B \-k |
| options to be specified. |
| .TP |
| .BR \-A ", " \-\-always\-all\-restart |
| When any daemon (i.e., not just zebra) hangs or crashes, restart all |
| daemons taking the following steps: (1) stop all other daemons, (2) restart |
| zebra, and (3) start other daemons again. |
| .IP |
| Note that this option also requires the |
| .BR \-r , |
| .BR \-s , |
| and |
| .B \-k |
| options to be specified. |
| .TP |
| .BI \-p " filename" "\fR, \fB\-\-pid\-file " filename |
| Set the process identifier |
| .I filename |
| (the default value is "/var/run/quagga/watchquagga.pid"). |
| .TP |
| .BI \-b " string" "\fR, \fB\-\-blank\-string " string |
| When the supplied |
| .I string |
| is found in any of the command line option arguments (i.e., |
| .BR \-r , |
| .BR \-s , |
| .BR \-k , |
| or |
| .BR \-R ), |
| replace it with a space. |
| .IP |
| This is an ugly hack to circumvent problems with passing command line |
| arguments that contain embedded spaces. |
| .TP |
| .BR \-v ", " \-\-version |
| Display the version information and exit. |
| .TP |
| .BR \-h ", " \-\-help |
| Display the usage information and exit. |
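.PP
The '%s' placeholder used by the \-r, \-s, and \-k commands, together with the \-b replacement, can be sketched in Python as follows. This is an illustration, not the program's implementation: build_command is a hypothetical helper, the underscore used as the blank string is an arbitrary choice, and applying the blank-string replacement before the daemon name substitution is an assumption.
.nf
```python
def build_command(template, daemon, blank_string=None):
    """Expand a command template for one daemon."""
    # Replace every occurrence of the blank string with a space.
    if blank_string:
        template = template.replace(blank_string, " ")
    # Substitute the daemon name for the '%s' placeholder.
    return template.replace("%s", daemon)


# A template written without spaces, using "_" as the blank string.
with_blank = build_command("/sbin/service_%s_restart", "ospfd",
                           blank_string="_")
# A template that already contains spaces needs no blank string.
plain = build_command("/sbin/service %s stop", "bgpd")
print(with_blank)
print(plain)
```
.fi
The blank string lets a whole command be passed as a single word that contains no spaces, which is the situation the \-b option is meant to work around.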
| .SH SEE ALSO |
| .BR zebra (8), |
| .BR bgpd (8), |
| .BR isisd (8), |
| .BR ospfd (8), |
| .BR ospf6d (8), |
| .BR ripd (8), |
| .BR ripngd (8) |
| .PP |
| See the project homepage at <http://www.quagga.net/>. |
| .SH AUTHORS |
| Copyright 2004 Andrew J. Schorr |