HP Integrity rx2600 Specifications (page 27)

The Integrity rx1600, rx2600, rx4640, and rx5670 servers also employ dynamic processor resiliency (DPR). With DPR, any CPU generating correctable cache errors at a rate deemed unacceptable is de-allocated from use by the system. This feature helps protect against a CPU degrading to the point where it may cause system crashes.

DPR works like this: when excessive errors are reported against a CPU, the CPU is deactivated (that is, the operating system will not schedule any new processes on it). The system firmware records the CPU’s serial number and the time when this action was taken. From then on, at each poll interval, the system monitor compares serial numbers to see whether the CPU has been replaced. If the processor has been replaced, its history is reset.

If the system is rebooted before the offending CPU has been replaced, the monitor generates a warning message and immediately de-allocates the CPU. (Such CPU de-allocation is supported only in the HP-UX operating system; it is not supported in Windows or Linux.)
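The DPR flow described above can be sketched in a few lines of Python. This is only an illustration of the described logic, not HP's firmware or monitor code: the class and method names, the slot-based bookkeeping, and the error threshold are all assumptions (the text says only that the error rate is "deemed unacceptable").

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical threshold; the source gives no actual number.
MAX_CORRECTABLE_ERRORS_PER_INTERVAL = 10


@dataclass
class DeallocationRecord:
    serial: str       # serial number of the offending CPU
    timestamp: float  # time at which the CPU was de-allocated


@dataclass
class DprMonitor:
    # slot -> record kept across polls and reboots (the firmware's role in the text)
    history: Dict[int, DeallocationRecord] = field(default_factory=dict)
    # slot -> whether the OS may schedule new processes on that CPU
    active: Dict[int, bool] = field(default_factory=dict)

    def report_errors(self, slot: int, serial: str, error_count: int, now: float) -> bool:
        """De-allocate the CPU if its correctable-error count is excessive."""
        if error_count > MAX_CORRECTABLE_ERRORS_PER_INTERVAL:
            self.active[slot] = False
            self.history[slot] = DeallocationRecord(serial, now)
            return True
        self.active.setdefault(slot, True)
        return False

    def poll(self, slot: int, current_serial: str) -> bool:
        """Reset the history if the serial number shows the CPU was replaced."""
        record = self.history.get(slot)
        if record is not None and record.serial != current_serial:
            del self.history[slot]
            self.active[slot] = True
            return True
        return False

    def on_reboot(self, slot: int, current_serial: str) -> Optional[str]:
        """After a reboot, de-allocate again if the offending CPU is still installed."""
        record = self.history.get(slot)
        if record is not None and record.serial == current_serial:
            self.active[slot] = False
            return f"warning: CPU in slot {slot} (serial {current_serial}) de-allocated again"
        return None
```

Keying the history on the serial number rather than on the slot alone is what lets a replaced processor start with a clean record while an unreplaced one stays de-allocated across reboots.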

Comprehensive error logs

All system events are stored in the system event log (SEL) in nonvolatile memory. In addition, system firmware creates activity and forward progress logs (FPLs) in nonvolatile memory. In all but the most extreme situations (that is, in more than 95 percent of cases), this information is sufficient to diagnose system failures to a single replaceable part. The SEL and FPLs are available both to the management processor (and therefore remotely) and to system-level tools, leading to quick and accurate diagnosis.

Fault management throughout the lifecycle

Fault management is HP’s overall strategy and program to provide a complete value chain for detection, notification, and repair of system problems. Fault management starts in the design phase, when hardware and OS designers include capabilities and instrumentation points that make it possible to detect and isolate system anomalies. Monitors are created to poll for system health information or to respond asynchronously to instrumentation points designed into the system to report problems or faults.

Fault management also involves implementing several methods for maintaining historical event information, allowing preservation of information for analysis or trending. Faults that generate errors and warnings are automatically logged to syslog, while notes and audit information are copied to an event log. Other options are available for preserving historical information as well.
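The routing rule in this paragraph (errors and warnings to syslog, notes and audit information to an event log) can be illustrated with a minimal sketch. The `Severity` enum and `route_event` function are hypothetical names, and plain Python lists stand in for the real syslog and event log destinations.

```python
from enum import Enum
from typing import List


class Severity(Enum):
    ERROR = "error"
    WARNING = "warning"
    NOTE = "note"
    AUDIT = "audit"


def route_event(severity: Severity, message: str,
                syslog: List[str], event_log: List[str]) -> None:
    """Append the event to the destination that matches its severity."""
    line = f"[{severity.value}] {message}"
    if severity in (Severity.ERROR, Severity.WARNING):
        syslog.append(line)      # errors and warnings go to syslog
    else:
        event_log.append(line)   # notes and audit information go to the event log
```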

Fault management provides immediate alerts of problems, and even potential problems, as soon as they are detected, so customers can take corrective action. In some cases, fault monitors can repair faults themselves or prevent them from occurring.

Capabilities of fault monitors

Fault management, coupled with the monitoring capabilities, keeps tabs on the health of system components and generates near-real-time events when problems develop. These events can trigger corrective action that lets the system continue functioning, or they can alert systems personnel so they can handle the situation before it becomes more severe.

Fault monitors are able to:

• Poll the system for health information

• Handle asynchronous events that have been designed into the hardware or software

• Perform corrective action when possible

• De-allocate failing memory before it fails (dynamic memory resiliency)

• De-allocate failing processors before they fail (dynamic processor resiliency)

• De-configure failed processors from the working set before the next reboot

• Shut down the system when power failure causes a switch to UPS

• Manage events so that system performance is not hindered in the face of errors

• Provide information on problem causes and what actions to take
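As a rough illustration of the first three capabilities in the list (polling, handling asynchronous events, and corrective action), the sketch below models a monitor in Python. It is not the HP fault monitor or EMS interface; `FaultMonitor`, `register`, `handle`, and `poll_once` are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, Optional

HealthCheck = Callable[[], Optional[str]]  # returns a problem description, or None if healthy
CorrectiveAction = Callable[[str], bool]   # returns True if the problem was corrected


class FaultMonitor:
    def __init__(self, notify: Callable[[str], None]) -> None:
        self.checks: Dict[str, HealthCheck] = {}
        self.actions: Dict[str, CorrectiveAction] = {}
        self.notify = notify  # e.g. a function that sends e-mail or writes a log line

    def register(self, component: str, check: HealthCheck,
                 action: Optional[CorrectiveAction] = None) -> None:
        """Register a health check and, optionally, a corrective action."""
        self.checks[component] = check
        if action is not None:
            self.actions[component] = action

    def handle(self, component: str, problem: str) -> bool:
        """Handle one problem; also the entry point for asynchronous events."""
        action = self.actions.get(component)
        corrected = action(problem) if action is not None else False
        status = "corrected" if corrected else "needs attention"
        self.notify(f"{component}: {problem} ({status})")
        return corrected

    def poll_once(self) -> int:
        """Run every registered check once and return the number of problems found."""
        found = 0
        for component, check in self.checks.items():
            problem = check()
            if problem is not None:
                found += 1
                self.handle(component, problem)
        return found
```

Both the polling path and the asynchronous path go through `handle`, so every problem produces a notification regardless of how it was detected.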

Notification and integrated enterprise management

Fault management currently uses the HP EMS (Event Monitoring Service) infrastructure for its notification methodology. EMS enables a wide variety of notification methods, including pager, e-mail, SNMP traps, system console, system log, text log file, TCP/UDP, and HP OpenView Operations center (OPC) messaging. Fault
