X-Git-Url: https://arthur.barton.de/gitweb/?p=netdata.git;a=blobdiff_plain;f=python.d%2FREADME.md;h=7df6e3e8689d1f8b6ff705cdfbbf9cba9275dd46;hp=cb06f67fe75266ddbf0ea902e3efeff215129437;hb=8679670bdbe3c5928ec2e266d9c72e1a758fdf37;hpb=6be563674218d92af7904c11cc368834d26e8455 diff --git a/python.d/README.md b/python.d/README.md index cb06f67f..7df6e3e8 100644 --- a/python.d/README.md +++ b/python.d/README.md @@ -125,12 +125,82 @@ If no configuration is given, module will attempt to read log file at `/var/log/ --- +# bind_rndc + +Module parses bind dump file to collect real-time performance metrics + +**Requirements:** + * Version of bind must be 9.6 + + * Netdata must have permissions to run `rndc status` + +It produces: + +1. **Name server statistics** + * requests + * responses + * success + * auth_answer + * nonauth_answer + * nxrrset + * failure + * nxdomain + * recursion + * duplicate + * rejections + +2. **Incoming queries** + * RESERVED0 + * A + * NS + * CNAME + * SOA + * PTR + * MX + * TXT + * X25 + * AAAA + * SRV + * NAPTR + * A6 + * DS + * RSIG + * DNSKEY + * SPF + * ANY + * DLV + +3. **Outgoing queries** + * Same as Incoming queries + + +### configuration + +Sample: + +```yaml +local: + named_stats_path : '/var/log/bind/named.stats' +``` + +If no configuration is given, module will attempt to read named.stats file at `/var/log/bind/named.stats` + +--- + # cpufreq -Module shows current cpu frequency by looking at appropriate files in /sys/devices +This module shows the current CPU frequency as set by the cpufreq kernel +module. **Requirement:** -Processor which presents data scaling frequency data +You need to have `CONFIG_CPU_FREQ` and (optionally) `CONFIG_CPU_FREQ_STAT` +enabled in your kernel. + +This module tries to read from one of two possible locations. On +initialization, it tries to read the `time_in_state` files provided by +cpufreq\_stats. 
If this file does not exist, or doesn't contain valid data, it +falls back to using the less accurate `scaling_cur_freq` file (which only +represents the **current** CPU frequency, and doesn't account for any state +changes which happen between updates). It produces one chart with multiple lines (one line per core). @@ -142,11 +212,23 @@ Sample: sys_dir: "/sys/devices" ``` -If no configuration is given, module will search for `scaling_cur_freq` files in `/sys/devices` directory. +If no configuration is given, module will search for cpufreq files in `/sys/devices` directory. Directory is also prefixed with `NETDATA_HOST_PREFIX` if specified. --- +# cpuidle + +This module monitors the usage of CPU idle states. + +**Requirement:** +Your kernel needs to have `CONFIG_CPU_IDLE` enabled. + +It produces one stacked chart per CPU, showing the percentage of time spent in +each state. + +--- + # dovecot This module provides statistics information from dovecot server. @@ -221,6 +303,67 @@ If no configuration is given, module will attempt to connect to dovecot using un --- +# elasticsearch + +Module monitors elasticsearch performance and health metrics + +It produces: + +1. **Search performance** charts: + * Number of queries, fetches + * Time spent on queries, fetches + * Query and fetch latency + +2. **Indexing performance** charts: + * Number of documents indexed, index refreshes, flushes + * Time spent on indexing, refreshing, flushing + * Indexing and flushing latency + +3. **Memory usage and garbage collection** charts: + * JVM heap currently in use, committed + * Count of garbage collections + * Time spent on garbage collections + +4. **Host metrics** charts: + * Available file descriptors in percent + * Opened HTTP connections + * Cluster communication transport metrics + +5. **Queues and rejections** charts: + * Number of queued/rejected threads in thread pool + +6. 
**Fielddata cache** charts: + * Fielddata cache size + * Fielddata evictions and circuit breaker tripped count + +7. **Cluster health API** charts: + * Cluster status + * Nodes and tasks statistics + * Shards statistics + +8. **Cluster stats API** charts: + * Nodes statistics + * Query cache statistics + * Docs statistics + * Store statistics + * Indices and shards statistics + +### configuration + +Sample: + +```yaml +local: + host : 'ipaddress' # Server ip address or hostname + port : 'port' # Port on which elasticsearch listens + cluster_health : True/False # Calls to cluster health elasticsearch API. Enabled by default. + cluster_stats : True/False # Calls to cluster stats elasticsearch API. Enabled by default. +``` + +If no configuration is given, module will fail to run. + +--- + # exim Simple module executing `exim -bpc` to grab exim queue. @@ -235,6 +378,151 @@ Configuration is not needed. --- +# fail2ban + +Module monitors fail2ban log file to show all bans for all active jails + +**Requirements:** + * fail2ban.log file MUST BE readable by netdata (A good idea is to add **create 0640 root netdata** to fail2ban conf at logrotate.d) + +It produces one chart with multiple lines (one line per jail) + +### configuration + +Sample: + +```yaml +local: + log_path: '/var/log/fail2ban.log' + conf_path: '/etc/fail2ban/jail.local' + exclude: 'dropbear apache' +``` +If no configuration is given, module will attempt to read log file at `/var/log/fail2ban.log` and conf file at `/etc/fail2ban/jail.local`. +If conf file is not found, default jail is `ssh`. + +--- + +# freeradius + +Uses the `radclient` command to provide freeradius statistics. It is not recommended to run it every second. + +It produces: + +1. **Authentication counters:** + * access-accepts + * access-rejects + * auth-dropped-requests + * auth-duplicate-requests + * auth-invalid-requests + * auth-malformed-requests + * auth-unknown-types + +2. 
**Accounting counters:** [optional] + * accounting-requests + * accounting-responses + * acct-dropped-requests + * acct-duplicate-requests + * acct-invalid-requests + * acct-malformed-requests + * acct-unknown-types + +3. **Proxy authentication counters:** [optional] + * proxy-access-accepts + * proxy-access-rejects + * proxy-auth-dropped-requests + * proxy-auth-duplicate-requests + * proxy-auth-invalid-requests + * proxy-auth-malformed-requests + * proxy-auth-unknown-types + +4. **Proxy accounting counters:** [optional] + * proxy-accounting-requests + * proxy-accounting-responses + * proxy-acct-dropped-requests + * proxy-acct-duplicate-requests + * proxy-acct-invalid-requests + * proxy-acct-malformed-requests + * proxy-acct-unknown-types + + +### configuration + +Sample: + +```yaml +local: + host : 'localhost' + port : '18121' + secret : 'adminsecret' + acct : False # Freeradius accounting statistics. + proxy_auth : False # Freeradius proxy authentication statistics. + proxy_acct : False # Freeradius proxy accounting statistics. +``` + +**Freeradius server configuration:** + +The configuration for the status server is automatically created in the sites-available directory. +By default, server is enabled and can be queried from every client. +FreeRADIUS will only respond to status-server messages if the status-server virtual server has been enabled. + +To do this, create a link from the sites-enabled directory to the status file in the sites-available directory: + * cd sites-enabled + * ln -s ../sites-available/status status + +and restart/reload your FreeRADIUS server. + +--- + +# haproxy + +Module monitors frontend and backend metrics such as bytes in, bytes out, sessions current, sessions in queue current. +And health metrics such as backend servers status (server check should be used). + +Plugin can obtain data from url **OR** unix socket. + +**Requirement:** +Socket MUST be readable AND writable by netdata user. + +It produces: + +1. 
**Frontend** family charts + * Kilobytes in/s + * Kilobytes out/s + * Sessions current + * Sessions in queue current + +2. **Backend** family charts + * Kilobytes in/s + * Kilobytes out/s + * Sessions current + * Sessions in queue current + +3. **Health** chart + * number of failed servers for every backend (in DOWN state) + + +### configuration + +Sample: + +```yaml +via_url: + user : 'username' # ONLY IF stats auth is used + pass : 'password' # ONLY IF stats auth is used + url : 'http://ip.address:port/url;csv;norefresh' +``` + +OR + +```yaml +via_socket: + socket : 'path/to/haproxy/sock' +``` + +If no configuration is given, module will fail to run. + +--- + # hddtemp Module monitors disk temperatures from one or more hddtemp daemons. @@ -283,6 +571,69 @@ localhost: --- +# isc_dhcpd + +Module monitors leases database to show all active leases for given pools. + +**Requirements:** + * dhcpd leases file MUST BE readable by netdata + * pools MUST BE in CIDR format + +It produces: + +1. **Pools utilization** Aggregate chart for all pools. + * utilization in percent + +2. **Total leases** + * leases (overall number of leases for all pools) + +3. **Active leases** for every pool + * leases (number of active leases in pool) + + +### configuration + +Sample: + +```yaml +local: + leases_path : '/var/lib/dhcp/dhcpd.leases' + pools : '192.168.3.0/24 192.168.4.0/24 192.168.5.0/24' +``` + +In case of python2 you need to install `py2-ipaddress` to make plugin work. +The module will not work if no configuration is given. + +--- + + +# mdstat + +Module monitors /proc/mdstat + +It produces: + +1. **Health** Number of failed disks in every array (aggregate chart). + +2. **Disks stats** + * total (number of devices array ideally would have) + * inuse (number of devices currently in use) + +3. **Current status** + * resync in percent + * recovery in percent + * reshape in percent + * check in percent + +4. 
**Operation status** (if resync/recovery/reshape/check is active) + * finish in minutes + * speed in megabytes/s + +### configuration +No configuration is needed. + +--- + # memcached Memcached monitoring module. Data grabbed from [stats interface](https://github.com/memcached/memcached/wiki/Commands#stats). @@ -353,6 +704,149 @@ If no configuration is given, module will attempt to connect to memcached instan --- +# mongodb + +Module monitors mongodb performance and health metrics + +**Requirements:** + * `python-pymongo` package. + +You need to install it manually. + + +Number of charts depends on mongodb version, storage engine and other features (replication): + +1. **Read requests**: + * query + * getmore (operation the cursor executes to get additional data from query) + +2. **Write requests**: + * insert + * delete + * update + +3. **Active clients**: + * readers (number of clients with read operations in progress or queued) + * writers (number of clients with write operations in progress or queued) + +4. **Journal transactions**: + * commits (count of transactions that have been written to the journal) + +5. **Data written to the journal**: + * volume (volume of data) + +6. **Background flush** (MMAPv1): + * average ms (average time taken by flushes to execute) + * last ms (time taken by the last flush) + +8. **Read tickets** (WiredTiger): + * in use (number of read tickets in use) + * available (number of available read tickets remaining) + +9. **Write tickets** (WiredTiger): + * in use (number of write tickets in use) + * available (number of available write tickets remaining) + +10. **Cursors**: + * opened (number of cursors currently opened by MongoDB for clients) + * timedOut (number of cursors that have timed out) + * noTimeout (number of open cursors with timeout disabled) + +11. **Connections**: + * connected (number of clients currently connected to the database server) + * unused (number of unused connections available for new clients) + +12. 
**Memory usage metrics**: + * virtual + * resident (amount of memory used by the database process) + * mapped + * non mapped + +13. **Page faults**: + * page faults (number of times MongoDB had to request data from disk) + +14. **Cache metrics** (WiredTiger): + * percentage of bytes currently in the cache (amount of space taken by cached data) + * percentage of tracked dirty bytes in the cache (amount of space taken by dirty data) + +15. **Pages evicted from cache** (WiredTiger): + * modified + * unmodified + +16. **Queued requests**: + * readers (number of read requests currently queued) + * writers (number of write requests currently queued) + +17. **Errors**: + * msg (number of message assertions raised) + * warning (number of warning assertions raised) + * regular (number of regular assertions raised) + * user (number of assertions corresponding to errors generated by users) + +18. **Storage metrics** (one chart for every database) + * dataSize (size of all documents + padding in the database) + * indexSize (size of all indexes in the database) + * storageSize (size of all extents in the database) + +19. **Documents in the database** (one chart for all databases) + * documents (number of objects in the database among all the collections) + +20. **tcmalloc metrics** + * central cache free + * current total thread cache + * pageheap free + * pageheap unmapped + * thread cache free + * transfer cache free + * heap size + +21. **Commands total/failed rate** + * count + * createIndex + * delete + * eval + * findAndModify + * insert + +22. **Locks metrics** (acquireCount metrics - number of times the lock was acquired in the specified mode) + * Global lock + * Database lock + * Collection lock + * Metadata lock + * oplog lock + +23. **Replica set members state** + * state + +24. **Oplog window** + * window (interval of time between the oldest and the latest entries in the oplog) + +25. 
**Replication lag** + * member (time when last entry from the oplog was applied for every member) + +26. **Replication set member heartbeat latency** + * member (time when last heartbeat was received from replica set member) + + +### configuration + +Sample: + +```yaml +local: + name : 'local' + host : '127.0.0.1' + port : 27017 + user : 'netdata' + pass : 'netdata' + +``` + +If no configuration is given, module will attempt to connect to mongodb daemon on `127.0.0.1:27017` address + +--- + + # mysql Module monitors one or more mysql servers @@ -490,31 +984,92 @@ Without configuration, module attempts to connect to `http://localhost/stub_stat --- -# nginx_log +# nsd + +Module uses the `nsd-control stats_noreset` command to provide `nsd` statistics. + +**Requirements:** + * Version of `nsd` must be 4.0+ + * Netdata must have permissions to run `nsd-control stats_noreset` + +It produces: + +1. **Queries** + * queries + +2. **Zones** + * master + * slave + +3. **Protocol** + * udp + * udp6 + * tcp + * tcp6 + +4. **Query Type** + * A + * NS + * CNAME + * SOA + * PTR + * HINFO + * MX + * NAPTR + * TXT + * AAAA + * SRV + * ANY + +5. **Transfer** + * NOTIFY + * AXFR + +6. **Return Code** + * NOERROR + * FORMERR + * SERVFAIL + * NXDOMAIN + * NOTIMP + * REFUSED + * YXDOMAIN + + +Configuration is not needed. + +--- + +# ovpn_status_log + +Module monitors openvpn-status log file. -Module monitors nginx access log and produces only one chart: +**Requirements:** + + * If you are running multiple OpenVPN instances out of the same directory, MAKE SURE TO EDIT DIRECTIVES which create output files + so that multiple instances do not overwrite each other's output files. -1. **nginx status codes** in requests/s - * 2xx - * 3xx - * 4xx - * 5xx + * Make sure NETDATA USER CAN READ openvpn-status.log + + * Update_every interval MUST MATCH interval on which OpenVPN writes operational status to log file. + +It produces: +1. **Users** OpenVPN active users + * users + +2. 
**Traffic** OpenVPN overall bandwidth usage in kilobit/s + * in + * out + ### configuration -Sample for two vhosts: +Sample: ```yaml -site_A: - path: '/var/log/nginx/access-A.log' - -site_B: - name: 'local' - path: '/var/log/nginx/access-B.log' +default: + log_path : '/var/log/openvpn-status.log' ``` -When no configuration file is found, module tries to parse `/var/log/nginx/access.log` file. - --- # phpfpm @@ -574,6 +1129,75 @@ Configuration is not needed. --- +# postgres + +Module monitors one or more postgres servers. + +**Requirements:** + + * `python-psycopg2` package. You have to install it manually. + +Following charts are drawn: + +1. **Database size** MB + * size + +2. **Current Backend Processes** processes + * active + +3. **Write-Ahead Logging Statistics** files/s + * total + * ready + * done + +4. **Checkpoints** writes/s + * scheduled + * requested + +5. **Current connections to db** count + * connections + +6. **Tuples returned from db** tuples/s + * sequential + * bitmap + +7. **Tuple reads from db** reads/s + * disk + * cache + +8. **Transactions on db** transactions/s + * committed + * rolled back + +9. **Tuples written to db** writes/s + * inserted + * updated + * deleted + * conflicts + +10. **Locks on db** count per type + * locks + +### configuration + +```yaml +socket: + name : 'socket' + user : 'postgres' + database : 'postgres' + +tcp: + name : 'tcp' + user : 'postgres' + database : 'postgres' + host : 'localhost' + port : 5432 +``` + +When no configuration file is found, module tries to connect to TCP/IP socket: `localhost:5432`. + +--- + # redis Get INFO data from redis instance. @@ -668,6 +1292,45 @@ Without any configuration module will try to autodetect where squid presents its --- +# smartd_log + +Module monitors `smartd` log files to collect HDD/SSD S.M.A.R.T attributes. + +It produces the following charts (you can add additional attributes in the module configuration file): + +1. **Read Error Rate** attribute 1 + +2. 
**Start/Stop Count** attribute 4 + +3. **Reallocated Sectors Count** attribute 5 + +4. **Seek Error Rate** attribute 7 + +5. **Power-On Hours Count** attribute 9 + +6. **Power Cycle Count** attribute 12 + +7. **Load/Unload Cycles** attribute 193 + +8. **Temperature** attribute 194 + +9. **Current Pending Sectors** attribute 197 + +10. **Off-Line Uncorrectable** attribute 198 + +11. **Write Error Rate** attribute 200 + +### configuration + +```yaml +local: + log_path : '/var/log/smartd/' +``` + +If no configuration is given, module will attempt to read log files in /var/log/smartd/ directory. + +--- + # tomcat Present tomcat containers memory utilization. @@ -701,3 +1364,134 @@ Without configuration, module attempts to connect to `http://localhost:8080/mana So it will probably fail. --- + +# varnish cache + +Module uses the `varnishstat` command to provide varnish cache statistics. + +It produces: + +1. **Client metrics** + * session accepted + * session dropped + * good client requests received + +2. **All history hit rate ratio** + * cache hits in percent + * cache miss in percent + * cache hits for pass percent + +3. **Current poll hit rate ratio** + * cache hits in percent + * cache miss in percent + * cache hits for pass percent + +4. **Thread-related metrics** (only for varnish version 4+) + * total number of threads + * threads created + * threads creation failed + * threads hit max + * length of session queue + * sessions queued for thread + +5. **Backend health** + * backend conn. success + * backend conn. not attempted + * backend conn. too many + * backend conn. failures + * backend conn. reuses + * backend conn. recycles + * backend conn. retry + * backend requests made + +6. **Memory usage** + * memory available in megabytes + * memory allocated in megabytes + +7. **Problems summary** + * session dropped + * session accept failures + * session pipe overflow + * backend conn. not attempted + * fetch failed (all causes) + * backend conn. 
too many + * threads hit max + * threads destroyed + * length of session queue + * HTTP header overflows + * ESI parse errors + * ESI parse warnings + +8. **Uptime** + * varnish instance uptime in seconds + +### configuration + +No configuration is needed. + +--- + +# web_log + +Tails the apache/nginx/lighttpd/gunicorn log files to collect real-time web-server statistics. + +It produces the following charts: + +1. **Response by type** requests/s + * success (1xx, 2xx, 304) + * error (5xx) + * redirect (3xx except 304) + * bad (4xx) + * other (all other responses) + +2. **Response by code family** requests/s + * 1xx (informational) + * 2xx (successful) + * 3xx (redirect) + * 4xx (bad) + * 5xx (internal server errors) + * other (non-standard responses) + * unmatched (the lines in the log file that are not matched) + +3. **Detailed Response Codes** requests/s (number of responses for each response code family individually) + +4. **Bandwidth** KB/s + * received (bandwidth of requests) + * sent (bandwidth of responses) + +5. **Timings** ms (request processing time) + * min (minimum request processing time) + * max (maximum request processing time) + * average (average request processing time) + +6. **Request per url** requests/s (configured by user) + +7. **Http Methods** requests/s (requests per http method) + +8. **Http Versions** requests/s (requests per http version) + +9. **IP protocols** requests/s (requests per ip protocol version) + +10. **Current Poll Unique Client IPs** unique ips/s (unique client IPs per data collection iteration) + +11. **All Time Unique Client IPs** unique ips/s (unique client IPs since the last restart of netdata) + + +### configuration + +```yaml +nginx_log: + name : 'nginx_log' + path : '/var/log/nginx/access.log' + +apache_log: + name : 'apache_log' + path : '/var/log/apache/other_vhosts_access.log' + categories: + cacti : 'cacti.*' + observium : 'observium' +``` + +Module has preconfigured jobs for nginx, apache and gunicorn on various distros. + +---
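
The "Response by type" grouping in the web_log section can be sketched in a few lines of Python. This is only an illustration of the bucketing rules listed in the README text (success = 1xx, 2xx, 304; redirect = 3xx except 304; bad = 4xx; error = 5xx; other = everything else). The function names `classify_status` and `count_by_type` are hypothetical and are not taken from the web_log module itself.

```python
# Hypothetical helper names; the bucketing rules follow the README's
# "Response by type" list, not the actual web_log module source.

def classify_status(code):
    """Map an HTTP status code to a 'Response by type' bucket."""
    if code == 304 or 100 <= code < 300:
        return "success"   # 1xx, 2xx and 304
    if 300 <= code < 400:
        return "redirect"  # 3xx except 304 (handled above)
    if 400 <= code < 500:
        return "bad"       # 4xx
    if 500 <= code < 600:
        return "error"     # 5xx
    return "other"         # anything outside 100..599


def count_by_type(codes):
    """Count how many status codes fall into each bucket."""
    counts = {"success": 0, "redirect": 0, "bad": 0, "error": 0, "other": 0}
    for code in codes:
        counts[classify_status(code)] += 1
    return counts
```

Called on the status codes parsed during one collection interval, `count_by_type` would give the per-bucket totals that a chart like "Response by type" plots as requests/s once divided by the interval length.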