Database Table: csv2_system_status

This table maintains the status of CSV2 services and is used to indicate the ‘health’ of the CSV2 system. The table has a single row. Within this row, each service represented will have one or more columns indicating the status of the service.

The csv2-system-status poller periodically checks the status of CSV2 services and maintains the following fields:

  • xxx_status and xxx_msg, where ‘xxx’ is a service name
  • load
  • disk, disk_size, disk_used
  • ram, ram_size, ram_used
  • swap, swap_size, swap_used
  • last_updated

In addition, many of the services are multi-process services in which a parent process calls the CSV2 library function ‘ProcessMonitor’ to instantiate and monitor its data-gathering child processes. Services of this kind have an error count (‘xxx_error_count’) maintained by the parent process/ProcessMonitor, which increments or decrements the counter depending on whether child errors are observed during its monitoring cycle. Each time a child process error is observed, the count is incremented by one. Otherwise, any count greater than zero is decremented by one. Low counts are considered normal, since transient polling errors may arise from many sources. However, counts equal to or greater than the ProcessMonitor’s configurable ‘orange_threshold’ constitute a warning, which is highlighted by the User Interface (UI).
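
The counting rule described above can be sketched as follows. This is a minimal illustration, not the ProcessMonitor code itself: the function names ‘update_error_count’ and ‘is_warning’ and the threshold value of 3 are hypothetical, chosen only for this example.

```python
def update_error_count(count: int, child_error_seen: bool) -> int:
    """Return the error count after one monitoring cycle.

    Hypothetical helper: +1 when a child error was observed,
    otherwise -1, never dropping below zero.
    """
    if child_error_seen:
        return count + 1
    return max(count - 1, 0)


def is_warning(count: int, orange_threshold: int) -> bool:
    """A count at or above the threshold constitutes a warning."""
    return count >= orange_threshold


# Example: five monitoring cycles, with an assumed threshold of 3.
ORANGE_THRESHOLD = 3  # illustrative value only; the real one is configurable
count = 0
for error_seen in [True, True, False, True, True]:
    count = update_error_count(count, error_seen)

print(count, is_warning(count, ORANGE_THRESHOLD))
```

Because the count decays by one on every clean cycle, an isolated error clears itself, while a run of consecutive errors pushes the count up to the threshold.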

Keys:

  • id (Integer):

    A unique numeric key for the status record.

Columns:

  • csv2_main_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_main_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • mariadb_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • mariadb_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • csv2_openstack_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_openstack_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_openstack_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • csv2_jobs_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_jobs_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_jobs_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • csv2_machines_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_machines_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_machines_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

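  • csv2_condor_gsi_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_condor_gsi_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_condor_gsi_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.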

  • csv2_status_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_status_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_status_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • csv2_timeseries_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_timeseries_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_timeseries_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • csv2_ec2_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_ec2_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_ec2_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • csv2_htc_agent_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_htc_agent_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_htc_agent_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • csv2_glint_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_glint_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_glint_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • csv2_watch_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_watch_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_watch_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • csv2_vm_data_error_count (Integer):

    Transient poller error count maintained by the ProcessMonitor (see above).

  • csv2_vm_data_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • csv2_vm_data_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • condor_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • condor_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • rabbitmq_server_status (Boolean):

    If set to 1, the service is up. Otherwise, no part of the service is running.

  • rabbitmq_server_msg (String(512)):

    A detailed status message indicating the service run time or the failure time.

  • load (Float):

    The current load average on the CSV2 server.

  • ram (Float):

    The percentage of RAM used on the CSV2 server.

  • ram_size (Float):

    The size of RAM on the CSV2 server.

  • ram_used (Float):

    The size of used RAM on the CSV2 server.

  • swap (Float):

    The percentage of swap space used on the CSV2 server.

  • swap_size (Float):

    The size of swap space on the CSV2 server.

  • swap_used (Float):

    The size of used swap space on the CSV2 server.

  • disk (Float):

    The percentage of disk used on the CSV2 server.

  • disk_size (Float):

    The size of disk on the CSV2 server.

  • disk_used (Float):

    The size of used disk on the CSV2 server.

  • last_updated (Integer):

    The time the status record was last updated.
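
Since every service follows the same ‘xxx_status’ / ‘xxx_error_count’ column naming, a consumer of this table could summarize system health generically. The sketch below assumes the single status row has already been fetched into a Python dictionary keyed by column name; the helper names ‘down_services’ and ‘services_over_threshold’, the sample values, and the threshold of 3 are all hypothetical.

```python
STATUS_SUFFIX = "_status"
ERROR_SUFFIX = "_error_count"


def down_services(row: dict) -> list:
    """Names of services whose xxx_status column is not set to 1."""
    return sorted(
        key[: -len(STATUS_SUFFIX)]
        for key, value in row.items()
        if key.endswith(STATUS_SUFFIX) and not value
    )


def services_over_threshold(row: dict, orange_threshold: int) -> list:
    """Names of services whose xxx_error_count is at or above the threshold."""
    return sorted(
        key[: -len(ERROR_SUFFIX)]
        for key, value in row.items()
        if key.endswith(ERROR_SUFFIX)
        and value is not None
        and value >= orange_threshold
    )


# Illustrative row with made-up values; a real row holds every column above.
sample_row = {
    "id": 1,
    "csv2_main_status": 1,
    "mariadb_status": 0,
    "csv2_status_status": 1,
    "csv2_jobs_error_count": 4,
    "csv2_jobs_status": 1,
    "csv2_openstack_error_count": 0,
    "csv2_openstack_status": 1,
    "load": 0.5,
}

print(down_services(sample_row))               # ['mariadb']
print(services_over_threshold(sample_row, 3))  # ['csv2_jobs']
```

Matching on the column suffix rather than on a fixed list of services means the summary keeps working when columns for a new service are added to the table.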