The Key to Healthy IT Systems

The keys to maintaining personal and public health have much in common with the keys to maintaining IT system health. The Centers for Disease Control and Prevention (CDC) published the Local Public Health System Performance Standards, which list 10 essential public health services. The first is monitoring health status. That is also the first step in maintaining IT system health.

In our previous post, mHealth – Health IT Going Mobile, we covered the growth in the use of technology to monitor personal health. The high-level objectives of these tools are:

  • Continuously monitor health status
  • Proactively alert when undesirable conditions are identified
  • Adjust conditions as necessary to prevent recurrence

When these same principles are applied to IT system health monitoring, it becomes obvious that it’s not enough to monitor networks and servers alone. To continue with the personal health analogy, that would be akin to monitoring only a person’s pulse. While a pulse is important, more specific measurements are required for diabetes and sleep apnea patients. Likewise, to measure the health of a specific software application, we need to monitor the application itself and not just the server and network it resides on.

The CES Monitor Everything approach improves on traditional monitoring models by adding deep monitoring tools that monitor the availability of specific software functions – user interfaces, database connections, reports, etc. CES uses automated tools to continually assess system status and auto-alert technicians as necessary so they can prevent service degradation. The goal is to predict the next user’s experience before that user actually logs on and use that information to ensure all necessary resources are available.
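
To make the idea of function-level monitoring concrete, here is a minimal Python sketch. It is not the CES tooling; the check names and probe functions are hypothetical placeholders, and the database outage is simulated. It only illustrates probing individual software functions rather than the server alone.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    name: str
    available: bool
    seconds: float
    detail: str = ""


def run_checks(checks: Dict[str, Callable[[], None]]) -> List[CheckResult]:
    """Run each probe; a probe signals failure by raising an exception."""
    results = []
    for name, check in checks.items():
        start = time.perf_counter()
        try:
            check()
            results.append(CheckResult(name, True, time.perf_counter() - start))
        except Exception as exc:  # any failure marks the function unavailable
            elapsed = time.perf_counter() - start
            results.append(CheckResult(name, False, elapsed, str(exc)))
    return results


# Hypothetical probes. Real ones would load a page, open a database
# connection, or generate a report, and raise an exception on failure.
def check_login_page() -> None:
    pass


def check_database_connection() -> None:
    raise ConnectionError("connection refused")  # simulated outage


def check_monthly_report() -> None:
    pass


results = run_checks({
    "login page": check_login_page,
    "database connection": check_database_connection,
    "monthly report": check_monthly_report,
})
for r in results:
    status = "OK" if r.available else f"UNAVAILABLE ({r.detail})"
    print(f"{r.name}: {status}")
```

Run on a schedule, a loop like this reports which specific functions are unavailable before the next user encounters them.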

Monitor Everything uses automated tools to continuously monitor the availability of each IT infrastructure element from the network all the way down to specific features within each software application. The following IT resources, whether local or remote, on stationary servers and mobile devices, are monitored:

  • Hardware
  • Operating systems
  • Databases
  • Services
  • COTS software product features and functions
  • Custom developed software application features and functions
  • The monitoring tools themselves

When used by a responsive support team, Monitor Everything promotes excellent system health. Through collaboration with software users, IT experts establish acceptable usage loads for each monitored item based on the business needs of the user community. System administrators establish server load thresholds and software engineers set load thresholds for COTS and custom software. When the load on a monitored resource reaches its threshold, the IT team is alerted via email, text and/or phone calls.
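
The threshold check described above can be sketched in a few lines of Python. The resource names, limits, and readings below are invented for illustration, and the sketch stops at building alert messages; delivering them by email, text or phone is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Threshold:
    resource: str
    limit: float  # acceptable load agreed with the user community
    unit: str


def find_breaches(thresholds: List[Threshold], readings: Dict[str, float]) -> List[str]:
    """Return one alert message per resource whose load has reached its threshold."""
    messages = []
    for t in thresholds:
        value = readings.get(t.resource)
        if value is not None and value >= t.limit:
            messages.append(
                f"ALERT {t.resource}: {value:g} {t.unit} "
                f"(threshold {t.limit:g} {t.unit})"
            )
    return messages


# Hypothetical thresholds and current readings.
thresholds = [
    Threshold("server CPU", 85.0, "%"),
    Threshold("report queue", 50.0, "jobs"),
]
readings = {"server CPU": 91.5, "report queue": 12.0}

alerts = find_breaches(thresholds, readings)
for message in alerts:
    print(message)
```

Keeping the thresholds as data rather than code makes it easier for system administrators and software engineers to adjust them as business needs change.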

As with medical standards of care, the type of response depends on the type of condition encountered. Each known condition requires a specific response, documented in the Standard Operating Procedures (SOP) that govern the IT operation. An unexpected condition provides an opportunity – and a mandate – to improve. To prevent recurrence, appropriate responses include monitoring additional software features, adjusting existing thresholds and updating the SOP.
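
One way to picture the SOP is as a lookup from known conditions to documented responses, with anything unrecognized queued for review. The Python sketch below assumes that structure; the conditions and responses are hypothetical examples, not taken from an actual SOP.

```python
from typing import Dict, List

# Hypothetical SOP entries: known condition -> documented response.
SOP: Dict[str, str] = {
    "database connection unavailable": "Restart the connection pool, then notify the DBA on call.",
    "server CPU above threshold": "Identify the top process and rebalance scheduled jobs.",
}

# Conditions with no documented response, kept so the SOP can be updated.
sop_review_queue: List[str] = []


def respond(condition: str) -> str:
    """Return the documented response, or escalate and flag the gap in the SOP."""
    response = SOP.get(condition)
    if response is None:
        sop_review_queue.append(condition)
        return "Escalate to the on-call engineer; no documented response."
    return response


print(respond("server CPU above threshold"))
print(respond("report generation timeout"))
print("Needs SOP update:", sop_review_queue)
```

The review queue is the improvement loop in miniature: each unexpected condition becomes a prompt to add a check, adjust a threshold or extend the SOP.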

The thresholds offer specific performance benchmarks against which system health is measured. If that sounds similar to healthcare best practices, it should. The purpose of the CDC publication referenced above is to “provide measurable performance standards that public health systems can use to ensure the delivery of public health services.” What’s good for personal and public health is also good for IT system health.