Nagios 4.0 Final Released: An Enterprise-Grade Monitoring System

Source: reader submission
Author: fei
2013-09-22 00:00:00



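Nagios Core 4.0.0 was officially released on 2013-09-20. It starts a new product series and includes a large number of improvements and performance enhancements. Nagios Plugins remains at 1.4.16 (2012-06-27), and the legacy stable release is 3.5.1 (2013-08-30).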


  1. Performance Improvements:

    The performance improvements in Nagios Core 4 come primarily from the following areas:

    • Core Workers - Core workers are lightweight processes whose only job is to perform checks. Because they are smaller, they spawn much more quickly than the old process, which forked the full Nagios Core. In addition, they communicate with the main Nagios Core process using in-memory techniques, eliminating the disk I/O latencies that could previously slow things down, especially in large installations.
    • Configuration Verification - Configuration verification has been improved so that each configuration item is verified only once. Previously configuration verification was an O(n^2) operation.
    • Event Queue - The event queue now uses a data structure that has O(log n) insertion times versus the O(n) insertion time previously. This means that inserting events into the queue uses much less CPU than in Nagios Core 3.
    • Macro Resolution - Macros are now sorted on startup so macro lookup can use a binary search. In addition, frequently accessed macros $USERx$, $ARGx$, and $HOSTADDRESS$ are given special case, early lookups.
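
    The event queue and macro lookup changes can be illustrated with a minimal Python sketch. It is not the Nagios Core implementation: a binary heap stands in for whatever O(log n) structure the core uses, and the event names, the macro list, and the helper names (schedule, next_event, macro_index) are all made up for illustration.

```python
import bisect
import heapq

# Event queue: a binary heap gives O(log n) insertion, whereas inserting
# into a list that is kept sorted by run time costs O(n).
event_queue = []  # entries are (run_time, event_name); names are illustrative


def schedule(run_time, event_name):
    """Add an event; the heap keeps the earliest run_time at the front."""
    heapq.heappush(event_queue, (run_time, event_name))


def next_event():
    """Remove and return the event with the smallest run_time."""
    return heapq.heappop(event_queue)


# Macro lookup: sort the names once at startup, then binary-search them.
# The list below is a small illustrative subset, not the full macro table.
MACRO_NAMES = sorted(["HOSTNAME", "HOSTADDRESS", "SERVICEDESC", "HOSTSTATE"])


def macro_index(name):
    """Return the position of a macro name in MACRO_NAMES, or -1 if unknown."""
    i = bisect.bisect_left(MACRO_NAMES, name)
    if i < len(MACRO_NAMES) and MACRO_NAMES[i] == name:
        return i
    return -1


schedule(30, "service check: web")
schedule(10, "host check: db")
schedule(20, "log rotation")
```

    Events come back out in run-time order regardless of the order in which they were scheduled, and each macro lookup takes O(log n) comparisons instead of a linear scan.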
  2. Object Definitions:

    The following changes have been made to object definitions:

    • The host address attribute is now optional. The address attribute is set to the host name when it is absent. Most configurations set the host name attribute to the DNS host name making the address attribute redundant.
    • Both hosts and services now support an hourly value attribute. The hourly value attribute is intended to represent the value of a host or service to an organization and is used by the new minimum value contact attribute.
    • Services now support a parents attribute. A service parent performs a function similar to host parents and can be used in place of service dependencies in simple circumstances.
    • The failure_prediction_enabled flag has been removed from both host and service object definitions.
    • Contacts now support a minimum value attribute. The minimum value attribute is used with the host and service hourly value attributes to determine whether to notify a contact on host and service problems.
    • The host obsess_over_host and the service obsess_over_service attributes can now both use the shortened attribute obsess.
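
    Two of the changes above can be sketched in Python. The address fallback follows the description directly. The value comparison is only an assumption: the notes say the two attributes are used together, but not how, so a simple threshold is shown. The dictionary keys (host_name, address, hourly_value, minimum_value) and function names are illustrative.

```python
def effective_address(host):
    """Return the address to check, falling back to the host name when absent."""
    return host.get("address") or host["host_name"]


def meets_minimum_value(contact, obj):
    """ASSUMPTION: notify only when the object's hourly value reaches the
    contact's minimum value. The real comparison in Nagios Core may differ."""
    return obj.get("hourly_value", 0) >= contact.get("minimum_value", 0)


# Example objects with made-up names and values.
web = {"host_name": "web01.example.com", "hourly_value": 50}
db = {"host_name": "db01", "address": "192.0.2.10", "hourly_value": 5}
on_call = {"contact_name": "oncall", "minimum_value": 10}
```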
  3. Object Behavior:
    • Contact Inheritance - According to the documentation, contacts should only be inherited from host to service if the service has no other contacts whatsoever (and the same goes for escalations), but the way the code previously worked was that it handled contact_groups and contacts directives separately, meaning services with only 'contacts' specified were still eligible for inheriting 'contact_groups' from the host. This has been updated to comply with the documentation.
    • Timeperiods - There were several issues processing timeperiods when both exclusions and exceptions were involved. The issues have been corrected.
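
    The corrected contact inheritance rule can be expressed as a short Python sketch. It only models the rule as described above (inherit from the host when the service has neither contacts nor contact_groups); the dictionary layout and the function name effective_contacts are assumptions made for illustration.

```python
def effective_contacts(service, host):
    """Return (contacts, contact_groups) that apply to a service.

    The service inherits from its host only when it defines no contacts
    and no contact groups at all.
    """
    own_contacts = service.get("contacts", [])
    own_groups = service.get("contact_groups", [])
    if own_contacts or own_groups:
        # Anything defined on the service blocks inheritance entirely.
        return list(own_contacts), list(own_groups)
    return list(host.get("contacts", [])), list(host.get("contact_groups", []))


host = {"contacts": ["alice"], "contact_groups": ["admins"]}
svc_with_contact = {"contacts": ["bob"]}
svc_bare = {}
```

    Under the old behavior described above, svc_with_contact would still have picked up the admins group from the host; with the documented rule it keeps only bob.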
  4. Configuration:

    The following changes have been made to the main Nagios Core configuration, nagios.cfg:

    • Because there are many ways to obtain object information, the object information is no longer stored in the object cache if the configuration variable object_cache_file equals '/dev/null'. Setting the variable to '/dev/null' will reduce the disk I/O load.
    • Because there are many ways to obtain status information, the status information is no longer stored in the status data file if the configuration variable status_file equals '/dev/null'. Setting the variable to '/dev/null' will reduce the disk I/O load.
    • There is a new configuration variable, log_current_states, which determines whether current states will be logged in the log files when they are rotated. In Nagios Core 3, this was always the behavior and it is the default in Nagios Core 4. Disabling the logging of current states on log rotation can save considerable disk space for large installations.
    • There is a new configuration variable, check_workers, which specifies how many worker processes are created when Nagios Core starts. If not specified, the number of worker processes is determined by the number of CPUs on the system.
    • There is a new configuration variable, query_socket, which specifies the location of the query handler socket. The default location is /usr/local/nagios/var/rw/nagios.qh.
    • The configuration variables, check_result_reaper_frequency and max_check_result_reaper_time, have been deprecated. Because of the new worker architecture, checks are no longer reaped, but they are fed back to core by the worker processes. As a result, these variables no longer make sense.
    • All file and directory configuration variables in the main nagios.cfg can now use paths that are relative to the location of nagios.cfg.
    • Although rarely used in the past, creating nagios objects in the main nagios.cfg configuration file was allowed. This is now prohibited.
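
    The path handling described above can be sketched in Python. This is an illustration of the stated rules, not Nagios Core code: the helper names and the example location /usr/local/nagios/etc/nagios.cfg are assumptions.

```python
import os


def resolve_cfg_path(main_cfg, value):
    """Resolve a path from nagios.cfg relative to the directory of nagios.cfg."""
    if os.path.isabs(value):
        return value
    return os.path.normpath(os.path.join(os.path.dirname(main_cfg), value))


def should_write(path):
    """Object cache and status data are skipped when the target is /dev/null."""
    return path != "/dev/null"


MAIN_CFG = "/usr/local/nagios/etc/nagios.cfg"  # example location, an assumption
```

    For example, with the main file at MAIN_CFG, the relative value ../var/status.dat resolves to /usr/local/nagios/var/status.dat.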
  5. Macros:
    • Additions - A new macro, $CHECKSOURCE$, has been added which contains information about what process performed a check.
    • Changes - If use_large_installation_tweaks is set, the $HOSTGROUPMEMBERS$ and $SERVICEGROUPMEMBERS$ macros are no longer exported because they can consume the available space for environment variables.
    • Macros are normally available as environment variables when check, event handler, notification, and other commands are run. This can be rather CPU intensive in large Nagios installations, so you can disable the export of environment variables completely with the enable_environment_macros option.
    • Macro information can be found here.
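
    A rough Python sketch of the environment export rules described above follows. The function name build_check_env and the sample macro values are invented; the NAGIOS_ prefix reflects how Nagios names exported macros, and the rest simply mirrors the two options mentioned in this section.

```python
def build_check_env(macros, enable_environment_macros=True,
                    use_large_installation_tweaks=False):
    """Build the environment variables handed to a check or notification command."""
    if not enable_environment_macros:
        # Export disabled completely: no macro variables are passed.
        return {}
    skipped = set()
    if use_large_installation_tweaks:
        # These two can grow large enough to exhaust the environment space.
        skipped = {"HOSTGROUPMEMBERS", "SERVICEGROUPMEMBERS"}
    return {
        "NAGIOS_" + name: str(value)
        for name, value in macros.items()
        if name not in skipped
    }


sample = {
    "HOSTNAME": "web01",
    "HOSTGROUPMEMBERS": "web01,web02,web03",
}
```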
  6. Query Handler:

    The query handler is a general purpose communication mechanism that allows external entities to communicate with Nagios Core in a well-defined manner. As of this writing, all communication with the query handler takes place through a Unix-domain socket whose location is defined by the query_socket configuration variable.

    There are currently 5 built-in query handlers.

    • core - provides Nagios Core management and information
    • wproc - provides worker process registration, management and information
    • nerd - provides a subscription service to the Nagios Event Radio Dispatcher (NERD)
    • help - provides help for the query handler
    • echo - implements a basic query handler that simply echoes back the queries sent to it

    More information about the query handler interface, including an introduction to creating a custom query handler, can be found in the source-supplied documentation.
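
    As a rough illustration, a client could talk to the query handler socket as sketched below in Python. The wire format shown here is an assumption recalled from the source-supplied documentation and should be verified there: a '#' (one-shot) or '@' (persistent) prefix, the handler name, a space, the query, and a terminating NUL byte. The function names are made up.

```python
import socket


def frame_query(handler, query, persistent=False):
    """Build one request. ASSUMPTION: '<prefix><handler> <query>' plus a NUL byte."""
    prefix = "@" if persistent else "#"
    return f"{prefix}{handler} {query}".encode() + b"\0"


def send_query(socket_path, handler, query, timeout=5.0):
    """Send a one-shot query over the Unix-domain socket and read until EOF."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        sock.sendall(frame_query(handler, query))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

    With a running Nagios Core 4 instance, send_query("/usr/local/nagios/var/rw/nagios.qh", "echo", "hello") would exercise the echo handler listed above, provided the process has permission to open the socket.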

  7. Core Workers:

    Previously, all host and service checks were performed by the full Nagios Core process. This required forking the Nagios Core process for every check. The full Nagios Core process includes a lot of things that are not required to actually perform the check, including check scheduling, downtime handling, processing external commands, etc. As a result, forking the Nagios Core process was much slower than was necessary. When the actual check was run, the forked process again forked a shell to run the check and the shell forked to run the plugin.

    In addition, disk files were used as the inter-process communication (IPC) mechanism between the forked Nagios process doing the checking and the main Nagios process handling the check results.

    In Nagios Core 4, host and service checks are performed by lightweight worker processes. Standard worker processes start up with the main Nagios Core process, and additional special-purpose workers can be started at any time after Nagios Core starts. If the check command is "simple" (no shell escapes), the worker process can run the command directly, avoiding the 2 additional forks previously required.
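
    The "simple command" shortcut can be sketched in Python as follows. This is not the worker's actual code: the set of characters treated as shell syntax and the helper names are assumptions, chosen only to show the idea of executing directly when no shell is needed.

```python
import shlex
import subprocess

# ASSUMPTION: characters that force a command through the shell. The real
# worker may use a different set.
SHELL_CHARS = set("|&;<>()$`\\\"'*?[]#~=%!{}\n")


def is_simple(command):
    """True when the command contains nothing that needs shell interpretation."""
    return not any(ch in SHELL_CHARS for ch in command)


def run_check(command, timeout=10):
    """Run a check and return (exit_code, first-pass plugin output)."""
    if is_simple(command):
        # Direct exec: no intermediate shell process is created.
        proc = subprocess.run(shlex.split(command), capture_output=True,
                              text=True, timeout=timeout)
    else:
        # Fall back to the shell for pipes, redirections, quoting and so on.
        proc = subprocess.run(command, shell=True, capture_output=True,
                              text=True, timeout=timeout)
    return proc.returncode, proc.stdout.strip()
```

    A plain plugin invocation takes the direct path, while a command containing a pipe or a semicolon still goes through the shell.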

    Also in Nagios Core 4, the worker processes report the check results to the main Nagios Core process using in-memory IPC mechanisms (the query handler interface), eliminating the disk I/O bottleneck that used to be an issue in large installations.

    When a worker process registers with the main Nagios Core process, it tells Nagios Core what checks it will handle. This feature allows external authors to create special-purpose workers which are optimized to perform certain checks. A sample special-purpose ping check worker is included with the Nagios Core source code in the worker/ping subdirectory.

    More information about workers, including an introduction to creating custom workers, can be found in the source-supplied documentation.

  8. Nagios Event Radio Dispatcher (NERD):

    The Nagios Event Radio Dispatcher (NERD) is a query handler based service that streams Nagios Core events to the subscriber. Currently, there are three channels that can be subscribed to: hostchecks, servicechecks and opathchecks.

  9. libnagios:

    libnagios is a library of functions that can be used by developers of query handlers and worker processes. libnagios currently contains the following components.

    • bitmap - bitmap library for calculating dependency graphs
    • dkhash - dual-keyed hash api
    • fanout - sparsely populated array used for downtime, comments, and worker jobs
    • iobroker - I/O broker library for multiplexing between running tasks and the master nagios process.
    • iocache - I/O caching library for bulk-reading requests and parsing them
    • kvvec - key/value library for parsing requests and building responses
    • nsock - socket library for connecting to and communicating through the qh socket
    • nspath - general purpose path library for converting between relative and absolute paths
    • nsutils - small library with worker related utilities
    • pqueue - pqueue library written by Volkan Yazici
    • runcmd - for spawning and reaping commands
    • skiplist - skiplist library used within Nagios Core
    • squeue - for maintaining a queue of the running job's timeouts
    • worker - for utils and stuff nifty to have if you're a worker
  10. Documentation:

    Documentation of Nagios Core internals is now provided as part of the source distribution. To create an HTML version of this documentation run 'make dox' from the root of the source distribution tree. The doxygen utilities must be installed to make this documentation.

  11. Tests:

    A much more complete test suite is now included with the Nagios Core source distribution.

  12. RPM Spec File:

    The RPM spec file has been completely overhauled to support more current standards.

  13. Deprecated Features:
    • Extended Host and Service Information - The hostextinfo and serviceextinfo objects are now deprecated and should not be used. Support for them will be removed in a future version. The same information specified in the hostextinfo and serviceextinfo objects can be specified in the host and service object respectively.
    • -x/--dont-verify-paths command line option (Don't check for circular object paths) - Because configuration checking is now so much faster, the option to skip checking for circular object paths has been deprecated.
    • The following configuration variables have been deprecated: check_result_reaper_frequency, max_check_result_reaper_time, sleep_time, external_command_buffer_slots, command_check_interval
  14. Obsoleted Features:
    • Failure Prediction - As noted above, the failure_prediction_enabled flag has been removed from both host and service object definitions. Failure prediction was never fully implemented and would require breaking the paradigm that Nagios Core knows nothing about the performance data returned by plugins. Failure prediction is much more appropriately handled by an add-on than by Nagios Core.
    • -o/--dont-verify-objects command line option - This option, while accepted in Nagios Core 3, has not been advertised and has had no effect for quite some time. The option has been removed in Nagios Core 4.
    • Embedded Perl - Embedded Perl has historically been the least tested and the most problem-prone part of Nagios Core. A significant part of the issue is that there are so many versions of Perl available. The performance enhancements provided by the new worker process architecture make up for any performance loss due to the removal of embedded Perl. In addition, the worker process architecture makes possible the implementation of a special-purpose worker to persistently load and run Perl plugins. The following configuration variables that were related to embedded Perl have been obsoleted: use_embedded_perl_implicitly, enable_embedded_perl, p1_file.
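
    When migrating an existing nagios.cfg, the deprecated and obsoleted variables listed in sections 13 and 14 can be located with a small script. The Python sketch below is a hypothetical helper, not a tool shipped with Nagios; it assumes the usual name=value line format with # comments.

```python
# Variable names taken from the deprecated and obsoleted lists above.
DEPRECATED = {
    "check_result_reaper_frequency",
    "max_check_result_reaper_time",
    "sleep_time",
    "external_command_buffer_slots",
    "command_check_interval",
}
OBSOLETE = {"use_embedded_perl_implicitly", "enable_embedded_perl", "p1_file"}


def find_legacy_settings(cfg_text):
    """Return (line_number, variable, status) for each legacy variable found."""
    findings = []
    for lineno, line in enumerate(cfg_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue  # skip blanks, comments, and lines without an assignment
        name = stripped.split("=", 1)[0].strip()
        if name in DEPRECATED:
            findings.append((lineno, name, "deprecated"))
        elif name in OBSOLETE:
            findings.append((lineno, name, "obsolete"))
    return findings


sample_cfg = "sleep_time=0.25\n# a comment\nenable_embedded_perl=1\nlog_file=nagios.log\n"
```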
  15. Miscellaneous:
    • Object IDs - Primarily only of interest to developers, all of the first-class objects now have object IDs. First-class objects are timeperiod, command, contact, host, service, escalations, dependencies and all kinds of groups. Object IDs are not persistent and are recreated on each restart.

