Beyond Copilot: driving real business outcomes for defence with AI
Oliver Rees
AI promises battlefield transformation, but in government and defence progress often stops at meeting minutes and email summaries.
In the last few months “AI [has changed the war in] Ukraine”, a “major NHS AI trial [has delivered] unprecedented time and cost savings,” and civil servants have used “AI [to save] two weeks a year”. Dig a little deeper, though, and the picture becomes more nuanced. Yes, the NHS and civil service trials were impactful, but not in providing better patient care or delivering citizen outcomes. Instead, their most significant impact was to summarise long emails and take notes during meetings. While beneficial, these are hardly the radical changes that AI evangelists have promised us. Similarly, the impact of AI on the battlefield in Ukraine is nothing compared to the introduction by Russian forces of simple, unjammable fibre-optic drones in late 2024. It was this advance, not AI, that led Lt. Gen. Joseph Ryan to say “we are so far behind” in March of this year. Meanwhile, AI-assisted targeting still fails in cluttered environments, where drones are often confused by puddles and trees.
These examples suffer from what the American sociologist Robert Merton called “goal displacement”. In his brilliant 1940 analysis of bureaucratic failure, he noted that “adherence to the rules, originally conceived as a means, becomes transformed into an end-in-itself.” In other words, while the processes of an organisation - meetings, compliance, committees - are at first designed to support an outcome or series of outcomes, they tend over time to become the outcome themselves. A close reading of the NHS and civil service trials illustrates this: it would be easy to conclude that the output of both organisations was emails and meeting transcripts.
In Defence, the SDR addresses this issue head on, stating that “in modern warfare, simple metrics such as the number of people and platforms deployed are outdated and inadequate.” However, as RUSI commentary on the SDR points out: “what, for example, does increasing the Army’s ‘lethality’ by a factor of ten mean? … The danger here is in the absence of more specific benchmarks (some of which the Review says the MOD can define for itself) there is significant wriggle room in ‘transforming’ the Armed Forces.” This ‘wriggle room’, in the context of AI implementations, means that the danger of falling into Merton’s goal displacement trap is real. Without clearly defined, outcome-led goals, there is a real risk of vendors offering generic AI productivity tooling that delivers only marginal gains. To avoid this, and to harness the genuinely transformative shift that AI enables, Defence leaders should focus on three things: articulating a clear North Star vision, defining guardrail metrics, and developing a process for continued experimentation.
North Star
Driving genuine change with AI requires leaders to have open and honest conversations about their North Star - and for MOD leaders to use the “wriggle room” created by the SDR to develop North Stars that support the lethality imperative. A carefully chosen North Star becomes the all-encompassing metric at the heart of an organisation or business unit, and can be used to keep every new AI initiative aligned to delivering genuine value, not low-impact activity around the edges.
Guardrail metrics
In parallel, leaders need honest conversations about the guardrail metrics and principles put in place to avoid significant harm to products, people and the organisation. Guardrail metrics are more than broad ethical frameworks: they are tangible, tactical rules that prevent AI initiatives from creating the wrong kind of disruption.
Process for experimentation
The key enabler for transformative AI initiatives is a clear experimentation process. Empowering teams to experiment with AI to drive key North Star metrics, within the defined guardrails, is how commercial organisations like Spotify are able to facilitate “58 teams [running] 520 experiments on Spotify’s mobile home screen alone” - driving genuine outcomes while avoiding negative effects. Setting up AI initiatives in this way, with the right governance, data quality and hypotheses, is critical to driving real outcomes.
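To make the interplay of the three elements concrete, the decision rule at the heart of such an experimentation process can be sketched in a few lines. This is an illustration only - the metric names and thresholds below are hypothetical, not drawn from any MOD or Spotify system: an experiment ships only if it meaningfully moves the North Star metric and breaches no guardrail.

```python
from dataclasses import dataclass


@dataclass
class ExperimentResult:
    """Observed change in one metric, relative to the control group."""
    metric: str
    relative_change: float  # e.g. +0.04 means a 4% improvement over control


def should_ship(north_star: ExperimentResult,
                guardrails: list[ExperimentResult],
                min_lift: float = 0.01,
                max_regression: float = -0.02) -> bool:
    """Ship only if the North Star improves meaningfully AND no
    guardrail metric regresses beyond its tolerated threshold."""
    if north_star.relative_change < min_lift:
        return False  # no meaningful movement on the outcome that matters
    return all(g.relative_change > max_regression for g in guardrails)


# Hypothetical example: a trial lifts the North Star by 5%, but one
# guardrail metric drops by 6% - so the experiment is rejected.
north_star = ExperimentResult("mission_planning_cycle_time_reduction", 0.05)
guardrails = [ExperimentResult("decision_audit_trail_completeness", -0.06)]
print(should_ship(north_star, guardrails))  # False
```

The point of the sketch is that neither check works alone: a North Star without guardrails invites the wrong kind of disruption, while guardrails without a North Star default back to goal displacement.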