COVID-19 Pandemic & DCIM Essentials
Published on March 30, 2020
COVID-19 is a new challenge to data center managers, IT, and facilities teams. We in the DCIM world are constantly planning and developing contingency and disaster plans. But most of these address physical disasters: fire, environmental events, and power disruption. Now we are faced with a workforce disruption and a demand to shift connectivity from centralized networks to external networks that are entirely out of our control. What do we do to keep our data centers and critical workloads running and the lights on (so to speak)?
Managing Resources & Workload More Efficiently
Without trying to be self-serving, the bulk of the answer, and of any disaster plan, relies on a DCIM software implementation. Functions like floor space planning, cable management, and audits might not be critical right now, since there most likely isn’t a person on site to do the physical work of asset moves, compliance audits, or running cables. But what DCIM does bring is:
1) Centralized Management, the ability to see all facilities and assets from a centralized point
2) Network mapping, to identify dependencies between a workload, server, storage, power, and network connections
3) Fault simulations
4) Automated monitoring of any asset changes to the networked infrastructure
5) Automated audit trails for any workflows executed
Let’s look at why each of these is important in the current environment (not that they are unimportant at any other time).
Why These Matter
Centralized Management. Several things come to mind: you may have fewer people available, and all of them are working remotely. If you can monitor all data center, colocation, and edge computing activity from a single view, an unstaffed monitoring center has less opportunity to disrupt business. A central view also reduces the bandwidth and connectivity uncertainties that come with a disaggregated management system as more traffic moves to the public network. While the hybrid compute world that we have been striving for has given users more freedom, that laissez-faire approach needs to be monitored and checked, at least for now.
Network Mapping. If you don’t know what happens when you pull a plug, you could be in big trouble! DCIM automation scans the hybrid digital infrastructure to identify the network and power connections to each asset and the corresponding workloads running on those assets. This discovery process gives a clear picture of the effects of planned and unplanned disruptions. With data on the dependencies, you eliminate the time-consuming “needle in the haystack” search approach.
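As a rough illustration of why a dependency map removes the guesswork, the sketch below walks a small, made-up map to list everything affected when one component fails. The asset names, the `DEPENDS_ON` structure, and the `impacted_by` function are all hypothetical; this is not how any particular DCIM product stores or queries its data.

```python
from collections import deque

# Hypothetical dependency map: each key depends on the items in its list.
DEPENDS_ON = {
    "billing-app": ["server-01"],
    "reporting-app": ["server-02"],
    "server-01": ["pdu-a", "switch-1"],
    "server-02": ["pdu-b", "switch-1"],
}


def impacted_by(failed, depends_on):
    """Return everything that directly or indirectly depends on `failed`."""
    # Invert the map so we can walk from a component to its dependents.
    dependents = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(item)

    # Breadth-first walk outward from the failed component.
    impacted, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for item in dependents.get(current, ()):
            if item not in impacted:
                impacted.add(item)
                queue.append(item)
    return impacted
```

Calling `impacted_by("pdu-a", DEPENDS_ON)` on this sample data returns the one server fed by that power unit and the application running on it, which is the answer to "what happens when you pull this plug?"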
Fault Simulations. Network mapping is great, but it is still essential to be able to run a fault simulation. When a power connection is lost, what is the cascade effect? And just because power has N+1 redundancy does not mean the problem is solved. If a rack is already running at a high consumption rate, that second “redundant” supply may be overwhelmed. It is best to know that before the lights go out.
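The redundancy point above comes down to simple arithmetic, sketched here under simplifying assumptions: each feed has a fixed capacity in kW, and the rack load must fit entirely on whatever feeds remain. The function name and the figures are illustrative only; a real simulation would also account for breaker derating, phase balance, and upstream dependencies.

```python
def survives_feed_loss(rack_load_kw, feed_capacities_kw):
    """For each feed, report whether the rack load still fits if that feed is lost.

    Returns a dict mapping the index of the lost feed to True (load still
    fits on the remaining feeds) or False (remaining feeds are overloaded).
    """
    results = {}
    for lost in range(len(feed_capacities_kw)):
        remaining = sum(
            cap for i, cap in enumerate(feed_capacities_kw) if i != lost
        )
        results[lost] = rack_load_kw <= remaining
    return results
```

With two hypothetical 5 kW feeds, a 4 kW rack survives the loss of either feed, while a 7 kW rack does not, even though it runs comfortably when both feeds are up.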
Automated Asset Monitoring. Even with the best-laid plans, things we don’t expect still happen. While the on-hand workforce may be limited, there are still people around to “touch” things in the data center, in colocation facilities, and certainly out at edge sites. Even with the Centralized Management we already talked about, people still do things out of our sight. DCIM automated monitoring scans the network as part of the network mapping function and takes inventory of everything joining and leaving the network. Network mapping ensures unplanned changes are captured, and planned changes are validated to be complete and correct. This mapping also provides reassurance that no one has slipped in a nefarious implementation by taking advantage of the reduced number of eyes on the infrastructure.
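One way to picture the scan-and-validate step is as a comparison of two inventory snapshots against the list of planned changes. The sketch below assumes assets are identified by simple string IDs; the function and its field names are hypothetical, not a description of any product's scanner.

```python
def compare_scans(previous, current,
                  planned_adds=frozenset(), planned_removes=frozenset()):
    """Compare two inventory scans (sets of asset IDs) against planned changes."""
    added = current - previous
    removed = previous - current
    return {
        # Assets that appeared or vanished without a matching change order.
        "unplanned_adds": added - planned_adds,
        "unplanned_removes": removed - planned_removes,
        # Planned changes that the latest scan shows were not carried out.
        "incomplete_adds": planned_adds - current,
        "incomplete_removes": planned_removes & current,
    }
```

An asset that shows up in `unplanned_adds` is exactly the kind of surprise worth investigating when fewer people are watching the floor.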
Automated Audit Trails. While we said we are probably not concerning ourselves with routine compliance and finance audits, some change orders are still happening. Given the potential reduction of personnel and the possibility of unskilled help, having an audit trail of who did what and when is critical should something need to be addressed, again avoiding that “needle in the haystack.”
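At its simplest, a "who did what and when" trail is an append-only list of timestamped entries that can be filtered by asset. The minimal sketch below uses hypothetical class and field names and keeps entries in memory; a production system would persist them and protect them from tampering.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    who: str
    what: str
    asset: str
    when: datetime


class AuditTrail:
    """Append-only record of actions; entries are never edited or removed."""

    def __init__(self):
        self._entries = []

    def record(self, who, what, asset, when=None):
        # Default to the current UTC time when no timestamp is supplied.
        entry = AuditEntry(who, what, asset, when or datetime.now(timezone.utc))
        self._entries.append(entry)
        return entry

    def for_asset(self, asset):
        """Return all entries for one asset, in the order they were recorded."""
        return [e for e in self._entries if e.asset == asset]
```

When something goes wrong on a given server, `for_asset` narrows the search to the handful of actions taken on it.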
How Nlyte Can Help
Nlyte DCIM is a complete Automated DCIM solution delivering all of the classic components of DCIM, from Asset Lifecycle Management, Space and Capacity Planning, and Automated Workflows to Power and Cooling Management. Additionally, as part of its core DCIM solution, Nlyte offers Nlyte Command Center, which further delivers critical support for a Pandemic Disaster Plan by:
- Enabling credentialed personnel to log in and perform specific tasks on individual end devices from a remote location.
- Power cycling assets that have frozen or need to be hard-booted.
- Remotely managing access by locking/unlocking cabinets.
- Allowing authorized personnel to make changes to environmental systems such as changing fan speeds or adjusting set points.
- Extending capabilities to data centers, data rooms, remote facilities and colocation footprints.
For more information on what to do with your Data Center and Hybrid Cloud during this COVID-19 pandemic, check out Uptime Institute’s UI Intelligence Report 37.