Merge pull request #1953 from l2isbad/python_readme_update

diff --git a/python.d/README.md b/python.d/README.md

index b4b34333ba5f33864205fbf2eaed27b118a219c8..7df6e3e8689d1f8b6ff705cdfbbf9cba9275dd46 100644
--- a/python.d/README.md
+++ b/python.d/README.md
@@ -125,12 +125,82 @@ If no configuration is given, module will attempt to read log file at `/var/log/
  
  ---
  
+# bind_rndc
+
+Module parses the bind statistics dump file (`named.stats`) to collect real-time performance metrics.
+
+**Requirements:**
+ * Version of bind must be 9.6+
+ * Netdata must have permissions to run `rndc status`
+
+It produces:
+
+1. **Name server statistics**
+ * requests
+ * responses
+ * success
+ * auth_answer
+ * nonauth_answer
+ * nxrrset
+ * failure
+ * nxdomain
+ * recursion
+ * duplicate
+ * rejections
+ 
+2. **Incoming queries**
+ * RESERVED0
+ * A
+ * NS
+ * CNAME
+ * SOA
+ * PTR
+ * MX
+ * TXT
+ * X25
+ * AAAA
+ * SRV
+ * NAPTR
+ * A6
+ * DS
+ * RSIG
+ * DNSKEY
+ * SPF
+ * ANY
+ * DLV
+ 
+3. **Outgoing queries**
+ * Same as Incoming queries
+
+
+### configuration
+
+Sample:
+
+```yaml
+local:
+  named_stats_path       : '/var/log/bind/named.stats'
+```
+
+If no configuration is given, the module will attempt to read the named.stats file at `/var/log/bind/named.stats`.
+
+---
+
  # cpufreq
  
-Module shows current cpu frequency by looking at appropriate files in /sys/devices
+This module shows the current CPU frequency as set by the cpufreq kernel
+module.
  
  **Requirement:**
-Processor which presents data scaling frequency data
+You need to have `CONFIG_CPU_FREQ` and (optionally) `CONFIG_CPU_FREQ_STAT`
+enabled in your kernel.
+
+This module tries to read from one of two possible locations. On
+initialization, it tries to read the `time_in_state` files provided by
+cpufreq\_stats. If these files do not exist, or don't contain valid data, it
+falls back to using the less accurate `scaling_cur_freq` file (which only
+represents the **current** CPU frequency, and doesn't account for any state
+changes which happen between updates).
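
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: it assumes each `time_in_state` line holds a frequency in kHz followed by a cumulative tick counter, and that the frequency reported for an update is the tick-weighted mean between two reads. All function names are hypothetical.

```python
def parse_time_in_state(text):
    """Parse 'frequency_khz ticks' lines into a {frequency: ticks} dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        stats[int(parts[0])] = int(parts[1])
    return stats


def average_frequency(previous, current):
    """Tick-weighted mean frequency (kHz) between two snapshots, or None."""
    weighted = 0
    elapsed = 0
    for freq, ticks in current.items():
        delta = ticks - previous.get(freq, 0)
        if delta < 0:
            return None  # counters went backwards: treat the data as invalid
        weighted += freq * delta
        elapsed += delta
    if elapsed == 0:
        return None  # no usable data between the two reads
    return weighted / elapsed


def current_frequency(previous_text, current_text, scaling_cur_freq_text):
    """Prefer time_in_state data; fall back to scaling_cur_freq (assumed kHz)."""
    try:
        average = average_frequency(
            parse_time_in_state(previous_text),
            parse_time_in_state(current_text),
        )
    except ValueError:
        average = None  # non-numeric content counts as invalid data
    if average is None:
        return float(scaling_cur_freq_text.strip())
    return average
```

Because the weighted mean uses the counters accumulated between two reads, it reflects state changes that a single `scaling_cur_freq` sample would miss.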
  
  It produces one chart with multiple lines (one line per core).
  
@@ -142,11 +212,23 @@ Sample:
  sys_dir: "/sys/devices"
  ```
  
-If no configuration is given, module will search for `scaling_cur_freq` files in `/sys/devices` directory.
+If no configuration is given, the module will search for cpufreq files in the `/sys/devices` directory.
  Directory is also prefixed with `NETDATA_HOST_PREFIX` if specified.
  
  ---
  
+# cpuidle
+
+This module monitors the usage of CPU idle states.
+
+**Requirement:**
+Your kernel needs to have `CONFIG_CPU_IDLE` enabled.
+
+It produces one stacked chart per CPU, showing the percentage of time spent in
+each state.
+
+---
+
  # dovecot
  
  This module provides statistics information from dovecot server. 
@@ -221,6 +303,67 @@ If no configuration is given, module will attempt to connect to dovecot using un
  
  ---
  
+# elasticsearch
+
+Module monitors elasticsearch performance and health metrics.
+
+It produces:
+
+1. **Search performance** charts:
+ * Number of queries, fetches
+ * Time spent on queries, fetches
+ * Query and fetch latency
+
+2. **Indexing performance** charts:
+ * Number of documents indexed, index refreshes, flushes
+ * Time spent on indexing, refreshing, flushing
+ * Indexing and flushing latency
+
+3. **Memory usage and garbage collection** charts:
+ * JVM heap currently in use, committed
+ * Count of garbage collections
+ * Time spent on garbage collections
+
+4. **Host metrics** charts:
+ * Available file descriptors in percent 
+ * Opened HTTP connections
+ * Cluster communication transport metrics
+
+5. **Queues and rejections** charts:
+ * Number of queued/rejected threads in thread pool
+
+6. **Fielddata cache** charts:
+ * Fielddata cache size
+ * Fielddata evictions and circuit breaker tripped count
+
+7. **Cluster health API** charts:
+ * Cluster status
+ * Nodes and tasks statistics
+ * Shards statistics
+
+8. **Cluster stats API** charts:
+ * Nodes statistics
+ * Query cache statistics
+ * Docs statistics
+ * Store statistics
+ * Indices and shards statistics
+
+### configuration
+
+Sample:
+
+```yaml
+local:
+  host               :  'ipaddress'   # Server ip address or hostname
+  port               :  'port'        # Port on which elasticsearch listens
+  cluster_health     :  True/False    # Calls to cluster health elasticsearch API. Enabled by default.
+  cluster_stats      :  True/False    # Calls to cluster stats elasticsearch API. Enabled by default.
+```
+
+If no configuration is given, module will fail to run.
+
+---
+
  # exim
  
  Simple module executing `exim -bpc` to grab exim queue. 
@@ -235,6 +378,30 @@ Configuration is not needed.
  
  ---
  
+# fail2ban
+
+Module monitors the fail2ban log file to show all bans for all active jails.
+
+**Requirements:**
+ * fail2ban.log file MUST BE readable by netdata (a good idea is to add **create 0640 root netdata** to the fail2ban conf in logrotate.d)
+ 
+It produces one chart with multiple lines (one line per jail)
+ 
+### configuration
+
+Sample:
+
+```yaml
+local:
+ log_path: '/var/log/fail2ban.log'
+ conf_path: '/etc/fail2ban/jail.local'
+ exclude: 'dropbear apache'
+```
+If no configuration is given, the module will attempt to read the log file at `/var/log/fail2ban.log` and the conf file at `/etc/fail2ban/jail.local`.
+If the conf file is not found, the default jail is `ssh`.
+
+---
+
  # freeradius
  
  Uses the `radclient` command to provide freeradius statistics. It is not recommended to run it every second.
@@ -306,6 +473,56 @@ and restart/reload your FREERADIUS server.
  
  ---
  
+# haproxy
+
+Module monitors frontend and backend metrics such as bytes in, bytes out, current sessions and current sessions in queue,
+as well as health metrics such as backend server status (server checks should be enabled).
+
+Plugin can obtain data from url **OR** unix socket.
+
+**Requirement:**
+Socket MUST be readable AND writable by netdata user.
+
+It produces:
+
+1. **Frontend** family charts
+ * Kilobytes in/s 
+ * Kilobytes out/s
+ * Sessions current
+ * Sessions in queue current
+
+2. **Backend** family charts
+ * Kilobytes in/s 
+ * Kilobytes out/s
+ * Sessions current
+ * Sessions in queue current
+
+3. **Health** chart
+ * number of failed servers for every backend (in DOWN state)
+
+
+### configuration
+
+Sample:
+
+```yaml
+via_url:
+  user       : 'username' # ONLY IF stats auth is used
+  pass       : 'password' # ONLY IF stats auth is used
+  url     : 'http://ip.address:port/url;csv;norefresh'
+```
+
+OR
+
+```yaml
+via_socket:
+  socket       : 'path/to/haproxy/sock'
+```
+
+If no configuration is given, module will fail to run.
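
As an illustration of what the `;csv` output contains, the sketch below parses a small sample and derives the frontend, backend and health values listed above. It is a hedged example, not the module's code: the sample is hypothetical and abbreviated (real HAProxy output has many more columns), and the helper names are invented for this sketch. The column names `pxname`, `svname`, `qcur`, `scur`, `bin`, `bout` and `status` are taken from the HAProxy CSV header.

```python
import csv
import io

# Abbreviated, hypothetical sample: real HAProxy output has many more columns.
SAMPLE = """# pxname,svname,qcur,scur,bin,bout,status
www,FRONTEND,,3,1000,2000,OPEN
app,srv1,0,1,400,800,UP
app,srv2,0,0,0,0,DOWN
app,BACKEND,0,1,400,800,UP
"""


def parse_stats(csv_text):
    """Turn the CSV text into a list of dicts keyed by column name."""
    # The header line starts with '# ', which has to go before parsing.
    return list(csv.DictReader(io.StringIO(csv_text.lstrip("# "))))


def summarize(rows):
    """Return (frontends, backends, failed servers per backend)."""
    frontends, backends, failed = {}, {}, {}
    for row in rows:
        name, kind = row["pxname"], row["svname"]
        if kind in ("FRONTEND", "BACKEND"):
            target = frontends if kind == "FRONTEND" else backends
            # Empty fields (e.g. qcur on a frontend) are counted as 0.
            target[name] = {
                key: int(row[key] or 0) for key in ("bin", "bout", "scur", "qcur")
            }
            if kind == "BACKEND":
                failed.setdefault(name, 0)
        elif row["status"].startswith("DOWN"):
            # Any other row is a server; count the ones in DOWN state.
            failed[name] = failed.get(name, 0) + 1
    return frontends, backends, failed
```

Calling `summarize(parse_stats(SAMPLE))` yields one frontend (`www`), one backend (`app`) and one failed server for `app`.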
+
+---
+
  # hddtemp
   
  Module monitors disk temperatures from one or more hddtemp daemons.
@@ -354,6 +571,69 @@ localhost:
  
  ---
  
+# isc_dhcpd
+
+Module monitors the leases database to show all active leases for the given pools.
+
+**Requirements:**
+ * dhcpd leases file MUST BE readable by netdata
+ * pools MUST BE in CIDR format
+
+It produces:
+
+1. **Pools utilization** Aggregate chart for all pools.
+ * utilization in percent
+
+2. **Total leases**
+ * leases (overall number of leases for all pools)
+ 
+3. **Active leases** for every pool
+  * leases (number of active leases in pool)
+
+  
+### configuration
+
+Sample:
+
+```yaml
+local:
+  leases_path       : '/var/lib/dhcp/dhcpd.leases'
+  pools       : '192.168.3.0/24 192.168.4.0/24 192.168.5.0/24'
+```
+
+In case of python2 you need to install `py2-ipaddress` to make the plugin work.
+The module will not work if no configuration is given.
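
The CIDR requirement makes per-pool accounting straightforward with the `ipaddress` module (part of the standard library in Python 3). The sketch below shows one way the utilization figure could be computed. It is an illustration under assumptions, not the module's code: the function name is hypothetical, the list of active lease addresses is taken as given rather than parsed from the leases file, and the pool size is assumed to exclude the network and broadcast addresses.

```python
import ipaddress


def pool_utilization(pools, active_ips):
    """Map each CIDR pool to (active leases, utilization in percent).

    pools      -- space-separated CIDR strings, as in the sample config
    active_ips -- iterable of IP address strings with an active lease
    """
    addresses = [ipaddress.ip_address(ip) for ip in active_ips]
    result = {}
    for pool in pools.split():
        network = ipaddress.ip_network(pool)
        # Assumption: network and broadcast addresses are not leasable.
        usable = max(network.num_addresses - 2, 1)
        active = sum(1 for address in addresses if address in network)
        result[str(network)] = (active, 100.0 * active / usable)
    return result
```

For example, two active leases in `192.168.3.0/24` give a utilization of 2 out of 254 addresses under the assumption above.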
+
+---
+
+
+# mdstat
+
+Module monitors /proc/mdstat
+
+It produces:
+
+1. **Health** Number of failed disks in every array (aggregate chart).
+ 
+2. **Disks stats** 
+ * total (number of devices array ideally would have)
+ * inuse (number of devices currently in use)
+
+3. **Current status**
+ * resync in percent
+ * recovery in percent
+ * reshape in percent
+ * check in percent
+ 
+4. **Operation status** (if resync/recovery/reshape/check is active)
+ * finish in minutes
+ * speed in megabytes/s
+  
+### configuration
+No configuration is needed.
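
To show where the chart values above come from, here is a rough sketch that extracts them from `/proc/mdstat` text. It is not the module's implementation: the regular expressions are assumptions based on the usual layout of that file (`[total/inuse]`, `recovery = 12.6%`, `finish=127.5min`, `speed=33440K/sec`), and the function and key names are hypothetical.

```python
import re

ARRAY = re.compile(r"^(md\S+) : ")
DISKS = re.compile(r"\[(\d+)/(\d+)\]")
OPERATION = re.compile(r"(resync|recovery|reshape|check)\s*=\s*([\d.]+)%")
FINISH = re.compile(r"finish=([\d.]+)min")
SPEED = re.compile(r"speed=(\d+)K/sec")


def parse_mdstat(text):
    """Return {array name: metrics dict} parsed from /proc/mdstat content."""
    arrays = {}
    current = None
    for line in text.splitlines():
        match = ARRAY.match(line)
        if match:
            current = arrays.setdefault(match.group(1), {})
            continue
        if current is None:
            continue  # header lines before the first array
        match = DISKS.search(line)
        if match:
            total, inuse = int(match.group(1)), int(match.group(2))
            current.update(total=total, inuse=inuse, failed=total - inuse)
        match = OPERATION.search(line)
        if match:
            # e.g. current["recovery"] = 12.6 (percent done)
            current[match.group(1)] = float(match.group(2))
            finish = FINISH.search(line)
            speed = SPEED.search(line)
            if finish:
                current["finish_min"] = float(finish.group(1))
            if speed:
                # Assumption: K/sec divided by 1000 is close enough to MB/s.
                current["speed_mb"] = int(speed.group(1)) / 1000.0
    return arrays
```

Reading the real file would then be a matter of passing `open('/proc/mdstat').read()` to `parse_mdstat`.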
+
+---
+
  # memcached
  
  Memcached monitoring module. Data grabbed from [stats interface](https://github.com/memcached/memcached/wiki/Commands#stats).
@@ -424,6 +704,149 @@ If no configuration is given, module will attempt to connect to memcached instan
  
  ---
  
+# mongodb
+
+Module monitors mongodb performance and health metrics.
+
+**Requirements:**
+ * `python-pymongo` package.
+
+You need to install it manually.
+
+
+Number of charts depends on mongodb version, storage engine and other features (replication):
+
+1. **Read requests**:
+ * query
+ * getmore (operation the cursor executes to get additional data from query)
+
+2. **Write requests**:
+ * insert
+ * delete
+ * update
+
+3. **Active clients**:
+ * readers (number of clients with read operations in progress or queued)
+ * writers (number of clients with write operations in progress or queued)
+
+4. **Journal transactions**:
+ * commits (count of transactions that have been written to the journal)
+
+5. **Data written to the journal**:
+ * volume (volume of data)
+
+6. **Background flush** (MMAPv1):
+ * average ms (average time taken by flushes to execute)
+ * last ms (time taken by the last flush)
+
+8. **Read tickets** (WiredTiger):
+ * in use (number of read tickets in use)
+ * available (number of available read tickets remaining)
+
+9. **Write tickets** (WiredTiger):
+ * in use (number of write tickets in use)
+ * available (number of available write tickets remaining)
+
+10. **Cursors**:
+ * opened (number of cursors currently opened by MongoDB for clients)
+ * timedOut (number of cursors that have timed out)
+ * noTimeout (number of open cursors with timeout disabled)
+
+11. **Connections**:
+ * connected (number of clients currently connected to the database server)
+ * unused (number of unused connections available for new clients)
+
+12. **Memory usage metrics**:
+ * virtual
+ * resident (amount of memory used by the database process)
+ * mapped
+ * non mapped
+
+13. **Page faults**:
+ * page faults (number of times MongoDB had to request data from disk)
+
+14. **Cache metrics** (WiredTiger):
+ * percentage of bytes currently in the cache (amount of space taken by cached data)
+ * percentage of tracked dirty bytes in the cache (amount of space taken by dirty data)
+
+15. **Pages evicted from cache** (WiredTiger):
+ * modified
+ * unmodified
+
+16. **Queued requests**:
+ * readers (number of read requests currently queued)
+ * writers (number of write requests currently queued)
+
+17. **Errors**:
+ * msg (number of message assertions raised)
+ * warning (number of warning assertions raised)
+ * regular (number of regular assertions raised)
+ * user (number of assertions corresponding to errors generated by users)
+
+18. **Storage metrics** (one chart for every database)
+ * dataSize (size of all documents + padding in the database)
+ * indexSize (size of all indexes in the database)
+ * storageSize (size of all extents in the database)
+
+19. **Documents in the database** (one chart for all databases)
+ * documents (number of objects in the database among all the collections)
+
+20. **tcmalloc metrics**
+ * central cache free
+ * current total thread cache
+ * pageheap free
+ * pageheap unmapped
+ * thread cache free
+ * transfer cache free
+ * heap size
+
+21. **Commands total/failed rate**
+ * count
+ * createIndex
+ * delete
+ * eval
+ * findAndModify
+ * insert
+
+22. **Locks metrics** (acquireCount metrics - number of times the lock was acquired in the specified mode)
+ * Global lock
+ * Database lock
+ * Collection lock
+ * Metadata lock
+ * oplog lock
+
+23. **Replica set members state**
+ * state
+
+24. **Oplog window**
+  * window (interval of time between the oldest and the latest entries in the oplog)
+
+25. **Replication lag**
+  * member (time when last entry from the oplog was applied for every member)
+
+26. **Replication set member heartbeat latency**
+  * member (time when last heartbeat was received from replica set member)
+
+
+### configuration
+
+Sample:
+
+```yaml
+local:
+    name : 'local'
+    host : '127.0.0.1'
+    port : 27017
+    user : 'netdata'
+    pass : 'netdata'
+
+```
+
+If no configuration is given, the module will attempt to connect to the mongodb daemon at `127.0.0.1:27017`.
+
+---
+
+
  # mysql
  
  Module monitors one or more mysql servers
@@ -561,30 +984,58 @@ Without configuration, module attempts to connect to `http://localhost/stub_stat
  
  ---
  
-# nginx_log
+# nsd
  
-Module monitors nginx access log and produces only one chart:
+Module uses the `nsd-control stats_noreset` command to provide `nsd` statistics.
  
-1. **nginx status codes** in requests/s
- * 2xx
- * 3xx
- * 4xx
- * 5xx
+**Requirements:**
+ * Version of `nsd` must be 4.0+
+ * Netdata must have permissions to run `nsd-control stats_noreset`
  
-### configuration
+It produces:
  
-Sample for two vhosts:
+1. **Queries**
+ * queries
  
-```yaml
-site_A:
-  path: '/var/log/nginx/access-A.log'
+2. **Zones**
+ * master
+ * slave
+
+3. **Protocol**
+ * udp
+ * udp6
+ * tcp
+ * tcp6
+
+4. **Query Type**
+ * A
+ * NS
+ * CNAME
+ * SOA
+ * PTR
+ * HINFO
+ * MX
+ * NAPTR
+ * TXT
+ * AAAA
+ * SRV
+ * ANY
+
+5. **Transfer**
+ * NOTIFY
+ * AXFR
+
+6. **Return Code**
+ * NOERROR
+ * FORMERR
+ * SERVFAIL
+ * NXDOMAIN
+ * NOTIMP
+ * REFUSED
+ * YXDOMAIN
  
-site_B:
-  name: 'local'
-  path: '/var/log/nginx/access-B.log'
-```
  
-When no configuration file is found, module tries to parse `/var/log/nginx/access.log` file.
+Configuration is not needed.
  
  ---
  
@@ -594,19 +1045,19 @@ Module monitor openvpn-status log file.
  
  **Requirements:**
  
-1. If you are running multiple OpenVPN instances out of the same directory, MAKE SURE TO EDIT DIRECTIVES which create output files
+ * If you are running multiple OpenVPN instances out of the same directory, MAKE SURE TO EDIT DIRECTIVES which create output files
   so that multiple instances do not overwrite each other's output files.
  
-2. Make sure NETDATA USER CAN READ openvpn-status.log
+ * Make sure NETDATA USER CAN READ openvpn-status.log
  
-3. Update_every interval MUST MATCH interval on which OpenVPN writes operational status to log file.
+ * Update_every interval MUST MATCH interval on which OpenVPN writes operational status to log file.
   
  It produces:
  
-**Users** OpenVPN active users
+1. **Users** OpenVPN active users
   * users
   
- **Traffic** OpenVPN overall bandwidth usage in kilobit/s
+2. **Traffic** OpenVPN overall bandwidth usage in kilobit/s
   * in
   * out
   
@@ -678,6 +1129,75 @@ Configuration is not needed.
  
  ---
  
+# postgres
+
+Module monitors one or more postgres servers.
+
+**Requirements:**
+
+ * `python-psycopg2` package. You have to install it manually.
+
+Following charts are drawn:
+
+1. **Database size** MB
+ * size
+
+2. **Current Backend Processes** processes
+ * active
+
+3. **Write-Ahead Logging Statistics** files/s
+ * total
+ * ready
+ * done
+
+4. **Checkpoints** writes/s
+ * scheduled
+ * requested
+ 
+5. **Current connections to db** count
+ * connections
+ 
+6. **Tuples returned from db** tuples/s
+ * sequential
+ * bitmap
+
+7. **Tuple reads from db** reads/s
+ * disk
+ * cache
+
+8. **Transactions on db** transactions/s
+ * committed
+ * rolled back
+
+9. **Tuples written to db** writes/s
+ * inserted
+ * updated
+ * deleted
+ * conflicts
+
+10. **Locks on db** count per type
+ * locks
+ 
+### configuration
+
+```yaml
+socket:
+  name         : 'socket'
+  user         : 'postgres'
+  database     : 'postgres'
+
+tcp:
+  name         : 'tcp'
+  user         : 'postgres'
+  database     : 'postgres'
+  host         : 'localhost'
+  port         : 5432
+```
+
+When no configuration file is found, module tries to connect to TCP/IP socket: `localhost:5432`.
+
+---
+
  # redis
  
  Get INFO data from redis instance.
@@ -772,6 +1292,45 @@ Without any configuration module will try to autodetect where squid presents its
   
  ---
  
+# smartd_log
+
+Module monitors `smartd` log files to collect HDD/SSD S.M.A.R.T attributes.
+
+It produces following charts (you can add additional attributes in the module configuration file):
+
+1. **Read Error Rate** attribute 1
+
+2. **Start/Stop Count** attribute 4
+
+3. **Reallocated Sectors Count** attribute 5
+ 
+4. **Seek Error Rate** attribute 7
+
+5. **Power-On Hours Count** attribute 9
+
+6. **Power Cycle Count** attribute 12
+
+7. **Load/Unload Cycles** attribute 193
+
+8. **Temperature** attribute 194
+
+9. **Current Pending Sectors** attribute 197
+ 
+10. **Off-Line Uncorrectable** attribute 198
+
+11. **Write Error Rate** attribute 200
+ 
+### configuration
+
+```yaml
+local:
+  log_path : '/var/log/smartd/'
+```
+
+If no configuration is given, module will attempt to read log files in /var/log/smartd/ directory.
+ 
+---
+
  # tomcat
  
  Present tomcat containers memory utilization.
@@ -805,3 +1364,134 @@ Without configuration, module attempts to connect to `http://localhost:8080/mana
  So it will probably fail.
  
  --- 
+
+# varnish cache
+
+Module uses the `varnishstat` command to provide varnish cache statistics.
+
+It produces:
+
+1. **Client metrics**
+ * session accepted
+ * session dropped
+ * good client requests received
+
+2. **All history hit rate ratio**
+ * cache hits in percent
+ * cache miss in percent
+ * cache hits for pass percent
+
+3. **Current poll hit rate ratio**
+ * cache hits in percent
+ * cache miss in percent
+ * cache hits for pass percent
+
+4. **Thread-related metrics** (only for varnish version 4+)
+ * total number of threads
+ * threads created
+ * threads creation failed
+ * threads hit max
+ * length of session queue
+ * sessions queued for thread
+
+5. **Backend health**
+ * backend conn. success
+ * backend conn. not attempted
+ * backend conn. too many
+ * backend conn. failures
+ * backend conn. reuses
+ * backend conn. recycles
+ * backend conn. retry
+ * backend requests made
+
+6. **Memory usage**
+ * memory available in megabytes
+ * memory allocated in megabytes
+
+7. **Problems summary**
+ * session dropped
+ * session accept failures
+ * session pipe overflow
+ * backend conn. not attempted
+ * fetch failed (all causes)
+ * backend conn. too many
+ * threads hit max
+ * threads destroyed
+ * length of session queue
+ * HTTP header overflows
+ * ESI parse errors
+ * ESI parse warnings
+
+8. **Uptime**
+ * varnish instance uptime in seconds
+
+### configuration
+
+No configuration is needed.
+
+---
+
+# web_log
+
+Tails the apache/nginx/lighttpd/gunicorn log files to collect real-time web-server statistics.
+
+It produces following charts:
+
+1. **Response by type** requests/s
+ * success (1xx, 2xx, 304)
+ * error (5xx)
+ * redirect (3xx except 304)
+ * bad (4xx)
+ * other (all other responses)
+
+2. **Response by code family** requests/s
+ * 1xx (informational)
+ * 2xx (successful)
+ * 3xx (redirect)
+ * 4xx (bad)
+ * 5xx (internal server errors)
+ * other (non-standard responses)
+ * unmatched (the lines in the log file that are not matched)
+
+3. **Detailed Response Codes** requests/s (number of responses for each response code family individually)
+ 
+4. **Bandwidth** KB/s
+ * received (bandwidth of requests)
+ * sent (bandwidth of responses)
+
+5. **Timings** ms (request processing time)
+ * min (minimum request processing time)
+ * max (maximum request processing time)
+ * average (average request processing time)
+
+6. **Request per url** requests/s (configured by user)
+
+7. **Http Methods** requests/s (requests per http method)
+
+8. **Http Versions** requests/s (requests per http version)
+
+9. **IP protocols** requests/s (requests per ip protocol version)
+
+10. **Current Poll Unique Client IPs** unique ips/s (unique client IPs per data collection iteration)
+
+11. **All Time Unique Client IPs** unique ips/s (unique client IPs since the last restart of netdata)
+
+ 
+### configuration
+
+```yaml
+nginx_log:
+  name  : 'nginx_log'
+  path  : '/var/log/nginx/access.log'
+
+apache_log:
+  name  : 'apache_log'
+  path  : '/var/log/apache/other_vhosts_access.log'
+  categories:
+      cacti : 'cacti.*'
+      observium : 'observium'
+```
+
+Module has preconfigured jobs for nginx, apache and gunicorn on various distros.
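
The grouping used by the **Response by type** chart listed above can be expressed as a small classifier. This is only a sketch of the rules as this README states them (success is 1xx, 2xx and 304; redirect is 3xx except 304; bad is 4xx; error is 5xx); the function names are hypothetical and not taken from the module.

```python
from collections import Counter


def response_type(code):
    """Classify an HTTP status code into the 'Response by type' groups."""
    if 100 <= code < 300 or code == 304:
        return "success"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "bad"
    if 500 <= code < 600:
        return "error"
    return "other"


def count_by_type(codes):
    """Count responses per group for one data collection iteration."""
    return Counter(response_type(code) for code in codes)
```

Note that 304 is checked before the general 3xx range, so a "not modified" response counts as a success rather than a redirect.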
+
+---