Discussion:
[Check_mk (english)] sporadic check failures of check_mk-oracle_rman
Björn Kaltofen
2018-11-19 10:59:45 UTC
Hello,

I have received no real feedback over the last few weeks.
Someone asked me for more data, but never answered my follow-up question about how to get the requested data.

Does anyone have more ideas?
The issue comes up every few days without any recognizable pattern.

Best regards,
Björn


--
Björn Kaltofen
Senior Database Consultant

MCS GmbH

Essener Bogen 17 | 22419 Hamburg
T +49 40 53773-0 | F +49 40 53773-200
***@mcs.de | www.mcs.de

Registered in Commercial Register B of the Hamburg Local Court, HRB 144607
Managing Director: Eckard Kabel

From: Björn Kaltofen
Sent: Wednesday, 17 October 2018 15:48
To: 'checkmk-***@lists.mathias-kettner.de' <checkmk-***@lists.mathias-kettner.de>
Subject: sporadic check failures of check_mk-oracle_rman

Hello,

I'm using the mk_oracle plugin to monitor several databases on a two node RAC cluster.
The mk_oracle script comes from Check_MK version 1.4.0p34.

On one database the check for RMAN INC0 Backup fails sporadically:


Host: orarac01-scan
Alias: orarac01-scan
Address: 0.0.0.0
Service: ORA FNDBP8.DB_INCR_0 RMAN Backup
State: UNKNOWN -> UNKNOWN (PROBLEM)
Command: check_mk-oracle_rman
Output: UNKNOWN - check failed - please submit a crash report!
Perfdata:
Crash dump: <I'll send the string, if it's needed for analysis>

Several other databases use the same RMAN backup type and never fail with a crash dump.
The service always recovers with the next agent check:


Host: orarac01-scan
Alias: orarac01-scan
Address: 0.0.0.0
Service: ORA FNDBP8.DB_INCR_0 RMAN Backup
State: UNKNOWN -> OK (RECOVERY)
Command: check_mk-oracle_rman
Output: OK - Last backup 16 hours ago
Perfdata: age=58800;;;;
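
As a quick consistency check on the recovery notification, the perfdata value can be converted to hours. This is only a sketch: it assumes the age value is given in seconds, which fits the "16 hours ago" text, and the helper name format_age is made up for illustration.

```python
def format_age(seconds):
    """Convert an age in seconds into a short 'H h M min' string."""
    hours, remainder = divmod(seconds, 3600)
    minutes = remainder // 60
    return f"{hours} h {minutes} min"


# Perfdata from the recovery notification: age=58800
print(format_age(58800))  # 16 h 20 min
```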

I executed the SQL from the mk_oracle script against the affected database. It takes 1 second to execute, which is the same as on the other databases, so performance does not seem to be the issue.

Do you have any idea why the check sometimes crashes?

Best regards,
Björn
Thorsten Bruhns via checkmk-en
2018-11-19 17:34:05 UTC
Hello Björn,
sorry, but I didn't see any response from you.

What is your backup strategy? Is it one Level 0 backup per week?
How is the following parameter configured in Oracle?
show parameter keep

What version of Oracle is in use?
Oracle changed the behavior from 11.2 to 12.1, and the check code had some
problems with Oracle 12.1.
I can check that, but I need information about your environment before I
do the analysis.

I suspect it is 12.1+ with the default control_file_record_keep_time
of 7 days and a weekly Level 0 backup.

Sorry for the short answer, but I am not at the office, and working without
a big external monitor is not easy for me.


Kind Regards
Thorsten

_______________________________________________
checkmk-en mailing list
Manage your subscription or unsubscribe
http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en
Björn Kaltofen
2018-11-27 07:16:14 UTC
Hello Thorsten,

sorry for my late answer; these have been busy days.

Here are the answers to your questions:

- Database is 11.2.0.4

- control_file_record_keep_time = 7

- daily Level 0 Backup

In your first email you wrote:

"I need the agent output from the report for doing some analysis. The last couple of lines from the callstack with lines from oracle_rman are welcome as well."

I answered with the agent output for the RMAN backups of the corresponding database:

<<<oracle_rman:sep(124)>>>
FNDBP8|COMPLETED|2018-10-18_04:00:40|2018-10-18_04:00:40|DB_INCR|0|788|0
FNDBP8|COMPLETED||2018-10-18_16:45:16|CONTROLFILE||24|0
FNDBP8|COMPLETED|2018-10-18_17:09:11|2018-10-18_16:45:08|ARCHIVELOG||24|
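
For anyone who wants to inspect this section by hand, here is a minimal sketch of splitting it on the pipe character (sep(124) in the section header refers to ASCII code 124, which is "|"). The column names in FIELD_NAMES are guesses made for illustration and are not taken from the oracle_rman check, and the functions are hypothetical helpers. The sketch only shows that some fields in the sample are empty. Whether that has anything to do with the crash is not established in this thread.

```python
# Sample copied from the agent output above.
AGENT_OUTPUT = """\
FNDBP8|COMPLETED|2018-10-18_04:00:40|2018-10-18_04:00:40|DB_INCR|0|788|0
FNDBP8|COMPLETED||2018-10-18_16:45:16|CONTROLFILE||24|0
FNDBP8|COMPLETED|2018-10-18_17:09:11|2018-10-18_16:45:08|ARCHIVELOG||24|
"""

SEPARATOR = chr(124)  # ASCII 124 is the pipe character "|"

# Assumed column names, for illustration only.
FIELD_NAMES = ("sid", "status", "start", "end",
               "backup_type", "level", "age", "flag")


def parse_rman_lines(text):
    """Split each non-empty line into a dict keyed by FIELD_NAMES."""
    rows = []
    for raw in text.splitlines():
        if not raw.strip():
            continue
        parts = raw.split(SEPARATOR)
        # Pad short rows so that a missing trailing field becomes "".
        parts += [""] * (len(FIELD_NAMES) - len(parts))
        rows.append(dict(zip(FIELD_NAMES, parts)))
    return rows


def to_int_or_none(value):
    """Return an int for numeric text and None for an empty field."""
    return int(value) if value.strip() else None


for row in parse_rman_lines(AGENT_OUTPUT):
    empty = [name for name, value in row.items() if value == ""]
    print(row["backup_type"], "empty fields:", empty)
```

Converting fields with to_int_or_none instead of a bare int() avoids a ValueError on empty strings such as the level column of the CONTROLFILE and ARCHIVELOG rows.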

What do you mean by callstack? Where do I get that from?

Thanks for your help.

Best regards,
Björn