Difference between revisions of "SSH Monitoring with Zabbix"

From regify WIKI

Revision as of 07:47, 18 June 2020

The Zabbix agent is not installed on the regify appliance, and you are not allowed to install third-party software. However, you can still monitor the regify appliance with Zabbix by using SSH agent items.

IMPORTANT: As of June 2020, the SSH check in Zabbix 5 is very buggy and poorly documented, and a specific bug currently prevents these tips from working. Once the Zabbix team has fixed it, you should be able to monitor your regify appliance as described here.


Allow appliance login

Log in to the regify appliance via SSH as the root user. Then create a user for Zabbix monitoring:

adduser zabbix
passwd zabbix 

The passwd command asks you for a password. Use a strong password of at least 12 characters.

In most cases, the permissions of the new zabbix user will be sufficient.

Configure Zabbix monitoring items

In Zabbix, create a new host (e.g. "regify provider"). For this host, add one SSH agent item for each check you want to run.

The following image shows an item that checks the current SSH user count on the machine every 10 minutes:

[Image: Zabbix configuration example]

Useful item commands

Free appliance memory in percent:

free|grep "Mem:"|awk '{print ($4+$6)/($2/100)}'
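The same calculation can be done in a single awk call. This is only a sketch: it assumes the newer `free` layout in which column 4 is `free` and column 6 is `buff/cache`, the function name `mem_free_percent` is hypothetical, and it rounds to one decimal place (the command above prints the unrounded value). It reads `free` output on stdin so that it can be tried against sample text.

```shell
# Hypothetical helper (not part of the original page).
# Reads `free` output on stdin and prints (free + buff/cache)
# as a percentage of total memory, rounded to one decimal place.
# Assumes column 4 is "free" and column 6 is "buff/cache".
mem_free_percent() {
    awk '/^Mem:/ { printf "%.1f\n", ($4 + $6) / ($2 / 100) }'
}

# Usage on the appliance:
#   free | mem_free_percent
```

On older `free` versions column 6 is `buffers` only, so the result would be lower than expected there.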

Used disk space on / in percent:

df|grep "/$"|awk '{print $5}'|tr -d "%"
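A slightly more defensive variant might ask `df` for the root file system only and use the POSIX output format (`-P`), which keeps each file system on one line. The helper name `root_used_percent` is hypothetical; the function reads `df -P` output on stdin so it can be checked against sample text.

```shell
# Hypothetical helper (not part of the original page).
# Reads `df -P` output on stdin and prints the used percentage
# of the file system mounted on "/", without the percent sign.
root_used_percent() {
    awk 'NR > 1 && $NF == "/" { sub(/%/, "", $5); print $5 }'
}

# Usage on the appliance:
#   df -P / | root_used_percent
```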

CPU load average of the last 5 minutes:

cat /proc/loadavg|cut -d " " -f2

Number of active SSH logins:

who|wc -l

Current MariaDB status:

systemctl status mariadb|grep active
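Note that `grep active` also matches a line containing "inactive", so the command above returns a text line in both cases and the item value has to be interpreted as text. If a numeric item is easier to build triggers on, a sketch along the following lines may help. It relies on `systemctl is-active --quiet`, which exits with status 0 only when the unit is active; the function name `service_up` is hypothetical.

```shell
# Hypothetical helper (not part of the original page).
# Prints 1 if the given systemd unit is active, 0 otherwise.
service_up() {
    if systemctl is-active --quiet "$1"; then
        echo 1
    else
        echo 0
    fi
}

# Usage on the appliance:
#   service_up mariadb
```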

Zabbix optimization and bugs

If SSH items fail with unexplained errors, try increasing the timeout value in /etc/zabbix/zabbix_server.conf and /etc/zabbix/zabbix_agent.conf:

Timeout=30

Sometimes the result of an SSH command does not come back fast enough for Zabbix, and a higher timeout avoids such failures.
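To confirm which timeout is actually set, you can list the uncommented setting in a configuration file. This is a minimal sketch that assumes the parameter is spelled `Timeout`, as in the stock Zabbix configuration files; the helper name `active_timeout` is hypothetical.

```shell
# Hypothetical helper (not part of the original page).
# Prints the active (uncommented) Timeout line of a Zabbix config file.
# Assumes the parameter is spelled "Timeout" as in the stock config files.
active_timeout() {
    grep -E '^[[:space:]]*Timeout=' "$1"
}

# Usage on the Zabbix server:
#   active_timeout /etc/zabbix/zabbix_server.conf
```

A changed value only takes effect after the corresponding Zabbix service has been restarted.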