Why automation must move from dashboards to closed-loop control, and the concrete targets each operator archetype should hit by 2030
The growing imperative for data centre operational optimisation
Historically, data centre operators have been content to rely on a combination of rudimentary environmental sensors, alerts and manual interventions to maintain facility efficiency, prioritising minimum-intervention, maximum-uptime operations over chasing marginal efficiency gains. The rise of data centre infrastructure management (DCIM) platforms and simple automations based on dead-band logic has begun the move towards dynamic facility optimisation, but a step change towards multi-domain orchestration is both accessible and desirable, particularly for new facilities coming online.
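For readers unfamiliar with dead-band logic, the Python sketch below shows the idea: a cooling stage switches on only when the temperature rises above the band and off only when it falls below it, so small fluctuations around the setpoint trigger no action. The class name, the 24 °C setpoint and the 1 °C half-width are illustrative assumptions, not values taken from any particular DCIM product.

```python
from dataclasses import dataclass


@dataclass
class DeadBandController:
    """Hypothetical on/off cooling stage with a dead band around a setpoint."""
    setpoint_c: float        # target temperature in degrees C (assumed value)
    half_width_c: float      # half the width of the dead band in degrees C
    cooling_on: bool = False

    def update(self, measured_c: float) -> bool:
        """Return the cooling state after seeing one temperature reading."""
        if measured_c > self.setpoint_c + self.half_width_c:
            self.cooling_on = True       # too warm: switch cooling on
        elif measured_c < self.setpoint_c - self.half_width_c:
            self.cooling_on = False      # cool enough: switch cooling off
        # Inside the band the previous state is kept, which avoids rapid cycling.
        return self.cooling_on


controller = DeadBandController(setpoint_c=24.0, half_width_c=1.0)
readings_c = [24.5, 25.2, 24.5, 23.5, 22.9, 23.5]
states = [controller.update(t) for t in readings_c]
print(states)  # [False, True, True, True, False, False]
```

Each such controller reacts to a single measurement in a single domain, which is the baseline that multi-domain orchestration goes beyond.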
Four key drivers are pushing data centre operators towards multi-domain orchestration:
- Scale: Individual data centres are growing in scale, meaning a proportional efficiency saving will have a larger impact in dollar terms on opex.
- Density & operational risk: Increasing density is creating new failure modes across the facility, such as secondary-loop coolant fouling and the safety concerns associated with DC electrical architectures (although such architectures are not yet common in even the densest facilities today). This in turn increases both the frequency and the impact of potentially damaging events across the data centre, against a backdrop of ‘sensitive’ GPUs with tight environmental windows for optimum performance.
- Public perception: Regulation and public perception are adding to the efficiency imperative for data centre operators, with countries such as Germany mandating that new data centres reach a PUE below 1.2 within two years of commissioning, and regulated metrics expected to broaden at least to WUE in the near future.
- Competition: Competition is increasing, and market-leading PUE is increasingly mandatory just to get a seat at the table. Innovations such as cross IT/facility optimisation and grid interactivity are necessary for those seeking to differentiate on operational maturity and efficiency.
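To make the scale argument concrete, here is a rough Python sketch of how the same PUE improvement translates into annual energy cost at two facility sizes. It uses the standard definition of PUE as total facility energy divided by IT energy. The IT loads, the improvement from 1.30 to 1.20 and the flat price of $80 per MWh are hypothetical inputs chosen for illustration, and the calculation assumes a constant IT load all year.

```python
HOURS_PER_YEAR = 8760


def facility_energy_mwh(it_load_mw: float, pue: float) -> float:
    """Annual facility energy, from PUE = total facility energy / IT energy."""
    return it_load_mw * pue * HOURS_PER_YEAR


def annual_saving_usd(it_load_mw: float, pue_before: float,
                      pue_after: float, price_usd_per_mwh: float) -> float:
    """Annual energy cost saved by a PUE improvement at a constant IT load."""
    saved_mwh = (facility_energy_mwh(it_load_mw, pue_before)
                 - facility_energy_mwh(it_load_mw, pue_after))
    return saved_mwh * price_usd_per_mwh


# Hypothetical inputs: same PUE improvement, two facility sizes.
for it_load_mw in (10, 100):
    saving = annual_saving_usd(it_load_mw, 1.30, 1.20, 80.0)
    print(f"{it_load_mw} MW IT load: about ${saving:,.0f} per year")
```

Under these assumptions the saving grows linearly with IT load, from roughly $0.7 million a year at 10 MW to roughly $7 million a year at 100 MW.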
From human-led oversight to closed-loop automation
DCIM has historically aimed to maximise the visibility a human operator has into the data centre environment under their control. Automated systems are pushing this workflow beyond observe-and-alert, towards closed-loop workflows that replace the alert with an action, a review of that action and subsequent continuous learning. This does not necessarily mean the human is out of the loop: they can still approve some or all of the actions proposed by an automation engine. The most advanced data centres have removed the human from the loop for routine decisions, freeing them to focus on anomalous events and on cross-checking recorded data across sources and against what they see on the ground. This stretches the role of the data centre manager, but it will remain an integral and immensely valuable position.
This cross-domain automation is usually achieved using a combination of advanced telemetry and sensing, an AI-backed interaction engine (often referred to as a digital twin) plus a linked analytics and recommendation engine, and a front-end UI from which a range of roles can analyse a wide breadth of data related to the facility and interrogate automated optimisation decisions. A simple maturity model for this transition is as follows:
1. Visibility (DCIM)
2. Human-in-the-loop automation
3. Single domain autonomy
4. Multi-domain autonomy
5. Self-optimising facility
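The difference between stage 2 and stage 3 can be sketched in a few lines of Python. The function below runs one pass of a closed loop: it observes a reading, proposes an action in place of an alert, asks an approval callback, applies the action if approved, and records the outcome for later review. All names, the 27 °C limit and the fixed 1 °C setpoint step are hypothetical, chosen only for illustration; a real automation engine would be considerably richer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ProposedAction:
    reason: str
    new_setpoint_c: float


def closed_loop_step(
    inlet_temp_c: float,
    limit_c: float,
    setpoint_c: float,
    approve: Callable[[ProposedAction], bool],
    history: List[Tuple[ProposedAction, bool]],
) -> float:
    """Observe, propose, seek approval, act and record. Returns the setpoint."""
    if inlet_temp_c <= limit_c:
        return setpoint_c  # within limits: nothing to propose
    action = ProposedAction(
        reason=f"inlet {inlet_temp_c:.1f} C above limit {limit_c:.1f} C",
        new_setpoint_c=setpoint_c - 1.0,  # illustrative fixed step
    )
    approved = approve(action)
    history.append((action, approved))  # kept for review and later learning
    return action.new_setpoint_c if approved else setpoint_c


def operator_approves(action: ProposedAction) -> bool:
    """Stand-in for a human approval step (stage 2)."""
    return True


def auto_approve(action: ProposedAction) -> bool:
    """Stand-in for autonomous operation within one domain (stage 3)."""
    return True


history: List[Tuple[ProposedAction, bool]] = []
new_setpoint = closed_loop_step(28.0, 27.0, 22.0, operator_approves, history)
print(new_setpoint, history[0][0].reason)
```

Keeping approval as a pluggable callback means the same loop can move from human-in-the-loop to autonomous operation without changing how actions are proposed or recorded.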
Key challenges in moving from visibility to autonomous operations
Key challenges for data centre operators progressing from visibility to autonomy include:
- Challenge: data centre environments vary significantly based on location, customers, age of facility, power/cooling infrastructure, and more, and are
- Resolution: data centre operators should ensure close collaboration with vendors across the landscape to ensure accurate data collection and analysis
- Challenge: transforming legacy data centre facilities with equipment from a wide range of vendors into an environment which is both data-rich and able to be analysed holistically, as opposed to within several domain-specific data platforms (e.g. cooling optimisation software)
- Resolution: either focus on a third-party sensing and analysis layer on top of existing infrastructure (such as EkkoSense), or invest in a staggered retrofit where the business case permits, to prove the value within your facility prior to investing in a full retrofit.
- Challenge: security, uptime and not introducing a single point of failure
- Resolution: design for security from the outset rather than squeezing it in as an afterthought, through stringent access controls, redundant instances and parallel processes. Any automations should be documented and have named owners, so that operational employees can course-correct smoothly and carry out subsequent root cause analysis.
The emerging data centre automation ecosystem
A number of vendors are emerging to help data centre operators progress towards more autonomous operations by applying AI-driven optimisation across facility systems. These platforms typically overlay existing operational data from across the data centre and apply advanced analytics or machine learning to dynamically optimise systems such as power distribution, cooling infrastructure, and plant equipment.
One example is Phaidra, which develops autonomous AI agents designed to optimise complex industrial systems, including data centre environments. Its approach uses reinforcement learning to model the dynamics of individual facilities and adjust operational parameters accordingly. This allows optimisation strategies to be tailored to the objectives of each deployment, whether that is minimising operating costs in enterprise environments or maximising GPU utilisation and uptime in AI-focused facilities. Phaidra’s positioning increasingly reflects the growing demands of AI infrastructure, with its platform designed to orchestrate the power, cooling and workload management systems underpinning large-scale GPU deployments. While still an emerging category, companies like Phaidra illustrate the type of platforms that are beginning to enable more autonomous, cross-domain optimisation in modern data centres.
Key recommendations
For AI factories
Pilot closed-loop automations if you haven’t already, focusing on single domains to start with before progressing to cross-domain automations should those initial automations prove valuable. Bring standardised guidelines into design and commissioning around sensing across the facility, and retrofit older sites where necessary. This standardisation is crucial to building up a proprietary bank of operational data which can be leveraged by AI models now and in the future to optimise facility operations.
For wholesale colocation operators
Engage wholesale tenants early to understand what requirements they have around integrating the BMS with any automation software they plan to implement. Even if there are no immediate dependencies on the facility side of operations, guiding tenants towards futureproofing their deployments for such software is a strong way to demonstrate credibility and technical excellence to customers and prospects. Evaluating opportunities to monetise advanced data packages for premium customers is worthwhile, and can be explored through regular customer engagement.
For multi-tenant, retail colocation operators
Consider low-friction open-loop automations first, on facility-side infrastructure which has no direct impact on tenants. This will help the culture slowly become comfortable with the concept of automation, as opposed to a data centre manager always having full visibility into the why and how of every decision. Develop this into targeted single-domain automations, liaising with larger customers to evaluate willingness to pay for advanced data packages related to their deployments.