arthur.barton.de Git - netdata.git/log
Costa Tsaousis (ktsaou) [Sun, 21 Aug 2016 00:11:32 +0000 (03:11 +0300)]
fixed context for python.d charts
Costa Tsaousis [Sat, 20 Aug 2016 21:50:04 +0000 (00:50 +0300)]
Merge pull request #804 from ktsaou/master
properly reset green and red thresholds after health configuration reload
Costa Tsaousis [Sat, 20 Aug 2016 21:35:00 +0000 (00:35 +0300)]
properly reset green and red thresholds after health configuration reload
Costa Tsaousis [Sat, 20 Aug 2016 21:26:11 +0000 (00:26 +0300)]
Merge pull request #775 from ktsaou/health
Health monitoring (alarms) & other improvements
Costa Tsaousis [Sat, 20 Aug 2016 21:20:04 +0000 (00:20 +0300)]
renamed alarm.sh to alarm-email.sh
Costa Tsaousis [Sat, 20 Aug 2016 21:05:30 +0000 (00:05 +0300)]
fix re-setting green/red threshold when reloading health configuration
Costa Tsaousis [Sat, 20 Aug 2016 20:38:20 +0000 (23:38 +0300)]
fixes left-overs from moving context index to family index
Costa Tsaousis [Sat, 20 Aug 2016 20:14:07 +0000 (23:14 +0300)]
updated config signatures
Costa Tsaousis [Sat, 20 Aug 2016 20:13:17 +0000 (23:13 +0300)]
added alarms for running named, nginx, redis, squid
Paweł Krupa [Sat, 20 Aug 2016 20:08:39 +0000 (22:08 +0200)]
Update base.py
Lower SocketService timeout
Costa Tsaousis [Sat, 20 Aug 2016 19:30:27 +0000 (22:30 +0300)]
check all alarms have an update frequency; check that alarms do have an update frequency less than the chart they are linked to; charts expose their update frequency as variables; added apache health check
Costa Tsaousis [Sat, 20 Aug 2016 19:05:19 +0000 (22:05 +0300)]
check that alarms have update frequency
Costa Tsaousis [Sat, 20 Aug 2016 18:28:00 +0000 (21:28 +0300)]
updated config.signatures
Costa Tsaousis [Sat, 20 Aug 2016 18:27:05 +0000 (21:27 +0300)]
Merge remote-tracking branch 'upstream/master' into health
Costa Tsaousis [Sat, 20 Aug 2016 18:26:26 +0000 (21:26 +0300)]
updated health configurations
Costa Tsaousis [Sat, 20 Aug 2016 18:17:08 +0000 (21:17 +0300)]
context index converted to family index; alarm.sh now knows the total time an alarm is problematic
Costa Tsaousis (ktsaou) [Thu, 18 Aug 2016 01:31:51 +0000 (04:31 +0300)]
reset green/red thresholds on all charts when reloading health configuration
Costa Tsaousis (ktsaou) [Thu, 18 Aug 2016 00:59:39 +0000 (03:59 +0300)]
updated conf signatures
Costa Tsaousis (ktsaou) [Thu, 18 Aug 2016 00:58:49 +0000 (03:58 +0300)]
do not restore green/red values when loading the database from disks
Costa Tsaousis (ktsaou) [Thu, 18 Aug 2016 00:58:16 +0000 (03:58 +0300)]
to prevent false alarms use 5min of data instead of 2min of data
Costa Tsaousis [Wed, 17 Aug 2016 23:33:52 +0000 (02:33 +0300)]
updated config signatures
Costa Tsaousis [Wed, 17 Aug 2016 23:33:02 +0000 (02:33 +0300)]
fixed typo in health.d/disks.conf
Costa Tsaousis [Wed, 17 Aug 2016 23:27:54 +0000 (02:27 +0300)]
more aesthetic changes to email template
Costa Tsaousis [Wed, 17 Aug 2016 23:21:09 +0000 (02:21 +0300)]
more aesthetic changes to email template
Costa Tsaousis [Wed, 17 Aug 2016 23:15:27 +0000 (02:15 +0300)]
aesthetic changes to email template
Costa Tsaousis [Wed, 17 Aug 2016 23:04:15 +0000 (02:04 +0300)]
properly handle interactions between CLEAR, WARNING, CRITICAL
Costa Tsaousis [Wed, 17 Aug 2016 22:50:04 +0000 (01:50 +0300)]
each alarm now has one status with the following possible values: UNINITIALIZED, UNDEFINED, CLEAR, WARNING, CRITICAL
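As a rough illustration of the single-status model this commit describes, the five values can be pictured as one enumeration. The sketch is in Python rather than netdata's C, and the `AlarmStatus` class and the `is_raised` helper are hypothetical names chosen for illustration; only the five status names come from the commit message.

```python
from enum import Enum


class AlarmStatus(Enum):
    """The five status values named in the commit message.

    The class itself is a hypothetical illustration, not netdata's definition.
    """
    UNINITIALIZED = "UNINITIALIZED"
    UNDEFINED = "UNDEFINED"
    CLEAR = "CLEAR"
    WARNING = "WARNING"
    CRITICAL = "CRITICAL"


def is_raised(status):
    """Assumed helper: treat WARNING and CRITICAL as the 'raised' states."""
    return status in (AlarmStatus.WARNING, AlarmStatus.CRITICAL)
```

Holding exactly one status per alarm means a check such as `is_raised(AlarmStatus.CLEAR)` has a single unambiguous answer.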
Costa Tsaousis [Wed, 17 Aug 2016 21:35:17 +0000 (00:35 +0300)]
removed striped header; fixed severity in recovery emails; attempt to fix broken colors on certain email clients; dynamically check the path of the sendmail command
Costa Tsaousis [Wed, 17 Aug 2016 21:11:43 +0000 (00:11 +0300)]
fixed typo that reported all warning alarms as critical and vice versa
Costa Tsaousis (ktsaou) [Wed, 17 Aug 2016 07:25:00 +0000 (10:25 +0300)]
fixed logrotate
Costa Tsaousis [Tue, 16 Aug 2016 23:58:17 +0000 (02:58 +0300)]
fixed typo that resulted in CRITICAL alarms sent by email as WARNING
Costa Tsaousis [Tue, 16 Aug 2016 23:50:58 +0000 (02:50 +0300)]
updated config signatures
Costa Tsaousis [Tue, 16 Aug 2016 23:50:30 +0000 (02:50 +0300)]
updated alarms
Costa Tsaousis [Tue, 16 Aug 2016 23:29:24 +0000 (02:29 +0300)]
updated config signatures
Costa Tsaousis [Tue, 16 Aug 2016 23:25:18 +0000 (02:25 +0300)]
track the duration of all alarms
Costa Tsaousis [Tue, 16 Aug 2016 23:23:35 +0000 (02:23 +0300)]
alarm script that sends notifications by email
Costa Tsaousis [Tue, 16 Aug 2016 20:36:38 +0000 (23:36 +0300)]
added the ability to execute scripts on alarms
Paweł Krupa [Tue, 16 Aug 2016 19:44:53 +0000 (21:44 +0200)]
Merge pull request #793 from paulfantom/master
minor changes + modules descriptions
paulfantom [Tue, 16 Aug 2016 19:40:37 +0000 (21:40 +0200)]
updated readme
paulfantom [Tue, 16 Aug 2016 19:32:43 +0000 (21:32 +0200)]
better tomcat chart naming
paulfantom [Tue, 16 Aug 2016 18:57:14 +0000 (20:57 +0200)]
better dimension names in nginx_log module
Costa Tsaousis [Tue, 16 Aug 2016 17:23:47 +0000 (20:23 +0300)]
updated config.signatures
Costa Tsaousis [Tue, 16 Aug 2016 17:21:41 +0000 (20:21 +0300)]
updated disk alarms
Paweł Krupa [Tue, 16 Aug 2016 17:06:37 +0000 (19:06 +0200)]
Update sensors.chart.py
Costa Tsaousis [Tue, 16 Aug 2016 16:14:14 +0000 (19:14 +0300)]
switched atomic operations from legacy __sync to new __atomic ones
Costa Tsaousis [Tue, 16 Aug 2016 14:49:07 +0000 (17:49 +0300)]
fix for qemu/kvm cgroup titles
Costa Tsaousis [Tue, 16 Aug 2016 14:37:27 +0000 (17:37 +0300)]
qemu and kvm cgroups are reported as VMs; fixes #753
Costa Tsaousis [Tue, 16 Aug 2016 13:51:16 +0000 (16:51 +0300)]
netdata can reload health configuration, at runtime, with SIGUSR2
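A common way to implement this kind of runtime reload is to have the signal handler only set a flag and let the main loop perform the reload. The sketch below shows that pattern in Python on a Unix system; it is not netdata's C code, and `request_reload`, `main_loop_iteration`, and `load_config` are hypothetical names.

```python
import os
import signal

# Set by the signal handler, consumed by the main loop.
reload_requested = False


def request_reload(signum, frame):
    """Signal handler: only record that a reload was requested."""
    global reload_requested
    reload_requested = True


# Assumption: the process reacts to SIGUSR2, as the commit message states.
signal.signal(signal.SIGUSR2, request_reload)


def main_loop_iteration(load_config):
    """Run one loop step; call load_config() if a reload is pending.

    Returns True when a reload was performed in this step.
    """
    global reload_requested
    if reload_requested:
        reload_requested = False
        load_config()
        return True
    return False
```

Keeping the handler this small avoids doing file I/O or parsing inside signal context; the actual reload happens at a safe point in the loop.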
Costa Tsaousis [Tue, 16 Aug 2016 12:15:58 +0000 (15:15 +0300)]
update entropy alarm
Costa Tsaousis [Tue, 16 Aug 2016 12:14:44 +0000 (15:14 +0300)]
added variable $now that resolves to the current timestamp; added entropy alarm
Costa Tsaousis [Tue, 16 Aug 2016 10:36:44 +0000 (13:36 +0300)]
the API now also supports "min" group method
Costa Tsaousis [Mon, 15 Aug 2016 23:36:12 +0000 (02:36 +0300)]
cleanup netdata.service - fixes #773; netdata now verifies it has access to its required directories before starting
Costa Tsaousis [Mon, 15 Aug 2016 18:15:34 +0000 (21:15 +0300)]
updated netdata.service to remove the pid file; fixes #773
Costa Tsaousis [Mon, 15 Aug 2016 18:08:49 +0000 (21:08 +0300)]
added abs() function to expressions; added health.d/net.conf
Costa Tsaousis [Mon, 15 Aug 2016 16:33:58 +0000 (19:33 +0300)]
Merge remote-tracking branch 'upstream/master' into health
Costa Tsaousis [Mon, 15 Aug 2016 16:27:33 +0000 (19:27 +0300)]
allow expressions to test for inf and nan values
Costa Tsaousis [Mon, 15 Aug 2016 13:55:07 +0000 (16:55 +0300)]
health now properly tracks alarm transitions; code cleanup;
Costa Tsaousis [Mon, 15 Aug 2016 00:57:23 +0000 (03:57 +0300)]
updated configs.signatures
Costa Tsaousis [Mon, 15 Aug 2016 00:54:43 +0000 (03:54 +0300)]
info logs when alarms are raised
Costa Tsaousis [Mon, 15 Aug 2016 00:01:09 +0000 (03:01 +0300)]
more tracing info for health
Costa Tsaousis [Sun, 14 Aug 2016 23:39:03 +0000 (02:39 +0300)]
operational health monitoring - we got alarms! - no notifications yet though
Paweł Krupa [Sun, 14 Aug 2016 21:54:41 +0000 (23:54 +0200)]
Merge pull request #774 from paulfantom/master
New python modules + bug fixes
paulfantom [Sun, 14 Aug 2016 21:52:13 +0000 (23:52 +0200)]
wrapper on check() in SocketService
paulfantom [Sun, 14 Aug 2016 17:51:06 +0000 (19:51 +0200)]
use sensors.py library from Pavel Rojtberg instead of PySensors. fix #781
Costa Tsaousis [Sun, 14 Aug 2016 16:28:04 +0000 (19:28 +0300)]
health configuration variables are checked for invalid characters
Costa Tsaousis [Sun, 14 Aug 2016 15:48:54 +0000 (18:48 +0300)]
added command line option without-files to disable files, pipes, sockets processing; fixes #744
Costa Tsaousis [Sun, 14 Aug 2016 15:31:19 +0000 (18:31 +0300)]
operational health templatizer
paulfantom [Sun, 14 Aug 2016 14:12:35 +0000 (16:12 +0200)]
more timestamps
paulfantom [Sun, 14 Aug 2016 14:11:32 +0000 (16:11 +0200)]
introduce timestamp to logs
paulfantom [Sun, 14 Aug 2016 13:17:20 +0000 (15:17 +0200)]
add nginx_log module
Costa Tsaousis [Sun, 14 Aug 2016 13:11:19 +0000 (16:11 +0300)]
disable capabilities at systemd.service; #773
Costa Tsaousis [Sun, 14 Aug 2016 12:33:48 +0000 (15:33 +0300)]
apps.plugin logs repeating errors, only once per process; fixes #779
paulfantom [Sun, 14 Aug 2016 11:41:19 +0000 (13:41 +0200)]
Changing hddtemp.chart.py to accept specified devices. Same way as [telegraf does it](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/hddtemp)
paulfantom [Sun, 14 Aug 2016 11:34:16 +0000 (13:34 +0200)]
add log flood prevention
Costa Tsaousis [Sun, 14 Aug 2016 11:20:17 +0000 (14:20 +0300)]
fixes for TAB to spaces replacement
paulfantom [Sun, 14 Aug 2016 10:21:24 +0000 (12:21 +0200)]
Merge remote-tracking branch 'origin/master'
# Conflicts:
# python.d/python_modules/msg.py
paulfantom [Sun, 14 Aug 2016 10:17:23 +0000 (12:17 +0200)]
prevent IOError on msg.fatal
paulfantom [Sun, 14 Aug 2016 10:17:23 +0000 (12:17 +0200)]
prevent IOError on msg.fatal
Costa Tsaousis [Sun, 14 Aug 2016 02:05:15 +0000 (05:05 +0300)]
allow health configuration to be expressed in a way that will allow us to have additional charts from the calculated values; alarms that do not do database lookups can now have an update every setting
Costa Tsaousis [Sat, 13 Aug 2016 22:25:35 +0000 (01:25 +0300)]
the disks health file
Costa Tsaousis [Sat, 13 Aug 2016 22:24:45 +0000 (01:24 +0300)]
first health configuration file to monitor disk space - not operational yet
Costa Tsaousis [Sat, 13 Aug 2016 21:39:31 +0000 (00:39 +0300)]
health configuration now parses templates too
Costa Tsaousis [Sat, 13 Aug 2016 19:08:15 +0000 (22:08 +0300)]
converted tabs for C, JS, BASH to 4 spaces
Costa Tsaousis [Sat, 13 Aug 2016 18:54:14 +0000 (21:54 +0300)]
health parsing almost complete
Costa Tsaousis [Sat, 13 Aug 2016 01:31:11 +0000 (04:31 +0300)]
preparing health configuration file parsing
Costa Tsaousis [Fri, 12 Aug 2016 23:58:53 +0000 (02:58 +0300)]
variables are now parsed into expressions
Costa Tsaousis [Fri, 12 Aug 2016 23:13:58 +0000 (02:13 +0300)]
Merge remote-tracking branch 'upstream/master' into health
Costa Tsaousis [Fri, 12 Aug 2016 23:13:34 +0000 (02:13 +0300)]
Merge pull request #776 from Busindre/patch-1
Fixed path to netdata pid file
Costa Tsaousis [Fri, 12 Aug 2016 23:02:28 +0000 (02:02 +0300)]
disable gcc atomic operations on non-supported compilers and gcc versions
Costa Tsaousis [Fri, 12 Aug 2016 22:44:31 +0000 (01:44 +0300)]
global statistics now also report max API response time; global statistics implemented with alternative lock-free code; more code cleanups
Busindre [Fri, 12 Aug 2016 20:07:41 +0000 (22:07 +0200)]
Fixed path to netdata pid file
Fixed path to netdata pid file: /var/run
File modified: system/netdata-init-d.in
The installation displays the following paths (CentOS 6.X)
- the daemon at /usr/sbin/netdata
- config files at /etc/netdata
- web files at /usr/share/netdata
- plugins at /usr/libexec/netdata
- cache files at /var/cache/netdata
- db files at /var/lib/netdata
- log files at /var/log/netdata
- pid file at /var/run
grep "pid" system/netdata-init-d
PIDFILE=/var/$DAEMON.pid
Costa Tsaousis [Fri, 12 Aug 2016 18:18:57 +0000 (21:18 +0300)]
proper log file management; re-opening logs on SIGHUP; updated logrotate; updated systemd.service
paulfantom [Fri, 12 Aug 2016 16:54:24 +0000 (18:54 +0200)]
better check in ExecutableService
Costa Tsaousis [Fri, 12 Aug 2016 11:05:50 +0000 (14:05 +0300)]
netdata now adjusts its scheduling priority to IDLE and its Out-Of-Memory score to 1000
Costa Tsaousis [Fri, 12 Aug 2016 11:04:53 +0000 (14:04 +0300)]
RRDCALC management completed
Costa Tsaousis [Thu, 11 Aug 2016 17:47:13 +0000 (20:47 +0300)]
better description of load average and better link too
Costa Tsaousis [Thu, 11 Aug 2016 17:33:30 +0000 (20:33 +0300)]
Merge remote-tracking branch 'upstream/master' into health
Costa Tsaousis [Thu, 11 Aug 2016 17:32:59 +0000 (20:32 +0300)]
code cleanup by replacing all memory allocation functions with ones that handle exceptions
Costa Tsaousis [Thu, 11 Aug 2016 13:36:12 +0000 (16:36 +0300)]
detect excess characters at expression
Costa Tsaousis [Thu, 11 Aug 2016 11:28:17 +0000 (14:28 +0300)]
expression parser now re-generates the expression showing the precedence it applied to it; more code cleanups
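To illustrate what re-generating an expression with its applied precedence can look like, the sketch below parses an arithmetic expression and prints it back fully parenthesized. It relies on Python's own `ast` module and Python's operator precedence, not on netdata's expression parser; `regenerate` and `show_precedence` are hypothetical names.

```python
import ast

# Binary operators this sketch knows how to print.
_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def show_precedence(node):
    """Render a parsed expression with explicit parentheses around every binary operation."""
    if isinstance(node, ast.Expression):
        return show_precedence(node.body)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = show_precedence(node.left)
        right = show_precedence(node.right)
        return f"({left} {_OPS[type(node.op)]} {right})"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def regenerate(expression):
    """Parse the text and return it with the parser's grouping made visible."""
    return show_precedence(ast.parse(expression, mode="eval"))


print(regenerate("a + b * c - 1"))  # prints ((a + (b * c)) - 1)
```

Printing the grouping back makes it easy to confirm that the parser read an expression the way its author intended.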