Discussion:
[Check_mk (english)] generate one alert for service monitored by multiple host
Schwab, Bradley
2017-12-20 13:56:23 UTC
Excuse me if this has been asked before, but I am looking for a solution to this problem.

On two servers, a primary and a failover, I have installed the same local check script. Both instances monitor a common process, a Ceph storage array.

When the common process has problems, the check fires on both servers, and both trigger a notification, sending two messages.

Is there a way I can have both systems monitor the common process, so both get related plotting data, but only receive one notification?

We are running 1.2.6 CRE, but we are working on a business case to upgrade our installations to the latest CRE.

Thank you,
Scott Schwab
CenturyLink
This communication is the property of CenturyLink and may contain confidential or privileged information. Unauthorized use of this communication is strictly prohibited and may be unlawful. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy all copies of the communication and any attachments.
Dmitry Makovey
2017-12-20 14:25:54 UTC
You can try to piggyback the check results and use them on a single service.
On each server you would have a local check for the *local* portion of the
service, and then use the piggyback data to define the "global" state of the service.
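
To make the piggyback idea concrete, here is a minimal sketch of the agent output such a script could emit. It only builds the text; the host name `ceph-shared`, the service name `Ceph_Health`, and the function name are hypothetical. It assumes the script is installed as an agent plugin, that a host called `ceph-shared` exists in the monitoring, and that the installed Check_MK version processes piggyback sections. How data arriving from two source hosts is merged is not covered here.

```python
def piggyback_local_check(target_host, service, state, text, perfdata="-"):
    """Build agent output that attaches one local check result to target_host.

    The piggyback header <<<<host>>>> redirects the following sections to
    another host; <<<<>>>> ends the redirection.
    """
    if state not in (0, 1, 2, 3):
        raise ValueError("state must be 0 (OK), 1 (WARN), 2 (CRIT) or 3 (UNKNOWN)")
    if " " in service:
        raise ValueError("service name must not contain spaces")
    lines = [
        "<<<<%s>>>>" % target_host,                        # start piggyback block
        "<<<local>>>",                                     # local check section
        "%d %s %s %s" % (state, service, perfdata, text),  # one local check line
        "<<<<>>>>",                                        # end piggyback block
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(piggyback_local_check("ceph-shared", "Ceph_Health", 0, "OK - array healthy"))
```

The single service then lives on the shared host, so a notification is raised once rather than once per server.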


--
Sr System and DevOps Engineer SoM IRT
Gregg Hine
2017-12-20 16:24:34 UTC
Just create a cluster host and monitor the service from there. It will only show down if it is down on both sides.
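
The "only down if down on both sides" behaviour amounts to a best-of aggregation over the nodes. The following is a sketch of that logic only, not Check_MK's own implementation. The function name and node names are made up, and ranking UNKNOWN between WARN and CRIT is an assumption.

```python
# Nagios-style state codes mapped to a severity rank (lower is better).
# Assumption: UNKNOWN (3) ranks between WARN (1) and CRIT (2).
SEVERITY = {0: 0, 1: 1, 3: 2, 2: 3}


def best_state(node_states):
    """Return the least severe state reported by any node.

    node_states maps a node name to its state code (0, 1, 2 or 3).
    With no data at all the result is UNKNOWN (3).
    """
    if not node_states:
        return 3
    return min(node_states.values(), key=lambda state: SEVERITY[state])


if __name__ == "__main__":
    # One node is CRIT, the other is OK: the aggregated service stays OK.
    print(best_state({"ceph-primary": 2, "ceph-failover": 0}))
    # Both nodes are CRIT: the aggregated service is CRIT, alerting once.
    print(best_state({"ceph-primary": 2, "ceph-failover": 2}))
```

With this kind of aggregation a fault visible from only one server raises no alert at all, which may or may not be what you want for a shared storage array.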

CONFIDENTIALITY NOTICE: This communication and any attachments may contain confidential and/or privileged information for the use of the designated recipients named above. If you are not the intended recipient, you are hereby notified that you have received this communication in error and that any review, disclosure, dissemination, distribution or copying of it or its contents is prohibited. If you have received this communication in error, please notify the sender immediately by telephone or email and destroy all copies of this communication and any attachments.
Tost, Lance
2017-12-20 16:40:44 UTC
This is normally what I do... but I need to put some work-arounds in so CMK does not flag the cluster as down because it cannot ping it (in many cases it is not a real cluster and therefore does not have a cluster IP). How do others handle this?

Lance


_______________________________________________
checkmk-en mailing list
checkmk-***@lists.mathias-kettner.de
http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en



The information contained in this e-mail and any attachments is confidential and
intended only for the recipient. If you are not the intended recipient, the
information contained in this message may not be used, copied, or forwarded to
third parties or otherwise distributed for any other purpose. Please notify the
sender if you received this e-mail in error and delete the e-mail and its
attachments promptly. Nothing in this e-mail may be used or deemed to form the
basis of a contractual or any other legally binding obligation unless separately
confirmed in writing by an authorized representative of ARMADA.
Paul
2017-12-21 06:10:30 UTC
You could just assign localhost (127.0.0.1) to the cluster and also add a "Host check command" rule set to always assume the host to be up.
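
For anyone configuring this by hand rather than through WATO, a rough sketch of what the cluster definition might look like in main.mk follows. It assumes the classic `clusters`, `ipaddresses` and `clustered_services` variables and that hosts are listed directly in main.mk. All host names and the service pattern `Ceph` are placeholders, so please verify the syntax against the documentation for your version. The "always assume up" host check rule is not shown; set it in WATO.

```python
# main.mk fragment (Check_MK evaluates this file as Python).
# All host names and the service pattern below are placeholders.

# The two real servers that run the local check.
all_hosts = [
    "ceph-primary",
    "ceph-failover",
]

# Cluster host "ceph-cluster" with the two servers as its nodes.
clusters = {
    "ceph-cluster": ["ceph-primary", "ceph-failover"],
}

# There is no real cluster IP, so point the cluster host at localhost.
ipaddresses = {
    "ceph-cluster": "127.0.0.1",
}

# Services on the listed nodes whose names start with "Ceph" are
# assigned to the cluster instead of to the individual nodes.
clustered_services = [
    (["ceph-primary", "ceph-failover"], ["Ceph"]),
]
```

After a change like this the services need to be re-inventoried on the nodes and the cluster so that the service moves over to the cluster host.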
