Discussion:
[Check_mk (english)] Info request about check_MK and linux shared memory alert
mlist via checkmk-en
2017-12-13 13:23:23 UTC
Hi guys,

I would like your opinion on a memory alert for a Linux host.

I have a Linux server running Oracle and, with the Check_MK default thresholds, I get:

CRIT - RAM used: 6.55 GB of 31.35 GB, Swap used: 895.76 MB of 2.00 GB, Total virtual memory used: 7.42 GB of 33.35 GB (22.3%), Shared memory: 14.63 GB (46.7% of RAM, warn/crit at 20.0%/30.0%)CRIT,

Indeed, /proc/meminfo reports:


[***@radora1 ~]# cat /proc/meminfo
MemTotal: 32872552 kB
MemFree: 1833276 kB
Buffers: 586884 kB
Cached: 22609092 kB
SwapCached: 44060 kB
Active: 19674812 kB
Inactive: 7460500 kB
Active(anon): 16988784 kB
Inactive(anon): 2353340 kB
Active(file): 2686028 kB
Inactive(file): 5107160 kB
Unevictable: 1541296 kB
Mlocked: 492724 kB
SwapTotal: 2097144 kB
SwapFree: 1179888 kB
Dirty: 3136 kB
Writeback: 0 kB
AnonPages: 5436636 kB
Mapped: 11630920 kB
Shmem: 15340328 kB
Slab: 1493928 kB
SReclaimable: 934316 kB
SUnreclaim: 559612 kB
KernelStack: 7752 kB
PageTables: 548052 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 18533420 kB
Committed_AS: 25169800 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 282264 kB
VmallocChunk: 34359413760 kB
HardwareCorrupted: 0 kB
AnonHugePages: 1820672 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 8192 kB
DirectMap2M: 2088960 kB
DirectMap1G: 31457280 kB
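
The percentage in the alert can be reproduced from the two relevant fields above. The following is only a minimal sketch, not Check_MK's actual code: the function names are hypothetical, the 20%/30% levels are taken from the check output, and treating a value at or above a level as a breach is an assumption.

```python
# Minimal sketch (hypothetical helper names, not Check_MK's implementation).
# The sample holds only the two fields needed, copied from the output above.
MEMINFO_SAMPLE = """\
MemTotal: 32872552 kB
Shmem: 15340328 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each /proc/meminfo field name to its integer value."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The first token is the number; a trailing "kB" unit is ignored.
            values[name.strip()] = int(parts[0])
    return values


def shared_memory_state(meminfo, warn=20.0, crit=30.0):
    """Return (percent of RAM used by Shmem, state string).

    Assumption: a value at or above a level counts as a breach.
    """
    percent = 100.0 * meminfo["Shmem"] / meminfo["MemTotal"]
    if percent >= crit:
        state = "CRIT"
    elif percent >= warn:
        state = "WARN"
    else:
        state = "OK"
    return percent, state


info = parse_meminfo(MEMINFO_SAMPLE)
percent, state = shared_memory_state(info)
shmem_gb = info["Shmem"] / 1024 ** 2  # kB -> GB (binary units)
print(f"Shared memory: {shmem_gb:.2f} GB ({percent:.1f}% of RAM) {state}")
```

With the values above this gives 14.63 GB and 46.7% of RAM, matching the figures in the alert.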

This is the output of free:

[***@myserver ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         32102      30327       1774          0        573      22066
-/+ buffers/cache:       7688      24413
Swap:         2047        895       1152
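
For reference, the "-/+ buffers/cache" row is plain arithmetic on the "Mem:" row. The sketch below redoes it with the numbers above; the variable names are only illustrative. The last step rests on an assumption worth checking on your kernel: as far as I know, the Cached figure in /proc/meminfo includes Shmem, so a large part of what looks like cache here would be shared memory that cannot simply be dropped.

```python
# Values in MB, copied from the "Mem:" row of the free -m output above.
used, free, buffers, cached = 30327, 1774, 573, 22066

# Legacy free(1) derives the "-/+ buffers/cache" row like this:
used_minus_cache = used - buffers - cached
free_plus_cache = free + buffers + cached
print(used_minus_cache, free_plus_cache)

# Assumption: Cached includes Shmem. Shmem is taken from /proc/meminfo (kB),
# which was sampled at a slightly different moment, so this is approximate.
shmem_mb = 15340328 // 1024
cache_without_shmem = cached - shmem_mb
print(shmem_mb, cache_without_shmem)
```

The first print reproduces the 7688 and 24413 shown by free. If the assumption holds, only about 7 GB of the 22 GB "cached" would be ordinary file cache.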

The question is:

Why, in your opinion, did Check_MK set warn/crit at 20.0%/30.0%? I know I could change the thresholds but, since I assume the Check_MK developers know Linux very well, I suspect I'm missing some knowledge about shared memory on Linux. Before changing them (or ignoring the alarm completely), I would like to know whether shared memory above 30% of physical RAM can really be a problem. What do you think?
Gerd Radecke
2017-12-20 21:45:09 UTC
reply to shared memory question:

Hi,

I spent some time researching this a few months back and came to
understand that very few good explanations of it exist.

On most systems you will see little to no shared memory in use,
simply because a lot of applications don't use it. The very few
applications that do (Oracle and SAP being two of them) usually use a
lot of it.

Many old Nagios checks don't even monitor shared memory, so it is
regularly overlooked when analyzing performance problems. ("free -m"
doesn't really tell you what's happening either.)

Maybe the Check_MK guys wanted to make sure that it is not overlooked?
Also, 20/30% will prevent many false alerts since (as mentioned above) only
very few applications use such an amount of shared memory on purpose.
In many other cases, this much shared memory can be a sign of an
application going rogue.

Not the most insightful answer, but maybe it helps a little.

Gerd

_______________________________________________
checkmk-en mailing list
http://lists.mathias-kettner.de/mailman/listinfo/checkmk-en
mlist via checkmk-en
2017-12-21 08:20:18 UTC
Hi Gerd,

This is the same thing I thought. It probably just means: this may or may not be a problem, so keep an eye on it and analyze it.