23 Jan 2024
by Adam C, Richard Carter

Large language models and intelligence analysis

Guest blog by Adam C, Chief Data Scientist at GCHQ and Richard Carter, Senior Research Consultant at CETaS #NatSec2024

Large Language Models in the Wild

When OpenAI released ChatGPT, many were impressed by its ability to synthesise information and produce amusing content. Practically overnight, the Internet was flooded with interesting, funny, scary, and perplexing examples of it being used for various purposes.

This new generation of large language models (LLMs) has also produced surprising behaviour where the chatbot would get maths or logic problems right or wrong depending on the prompt or would refuse to answer a direct question citing moral constraints but would subsequently supply the answer if it was requested in a more obscure way. This raises questions about how organisations can most effectively use LLMs and security or safety concerns, three of which are worthy of particular attention:

  1. Prompt hacking (tricking LLMs into providing erroneous or malicious results). AutoGPT is an example of how this risk is amplified through its automation of complex tasks through chained prompts, combining use of GPT4 for reasoning, GPT3.5 for content generation and natural language conversation, and Internet access to perform web searches and examine websites. This semi-autonomous capability surpasses traditional chatbots, enabling the system to take real-world actions independently, potentially leading to unforeseen security risks as LLMs integrate with physical and digital assets.
  2. Diminished cybersecurity standards. There are serious concerns that individuals are providing proprietary or sensitive information to LLMs such as ChatGPT, or that sensitive information was inappropriately used in training; these issues have the potential to introduce new data security risks.
  3. Disinformation and threats to democratic processes. Generative AI, such as LLMs, has significantly improved the ability not only of state actors or organised crime groups to launch disinformation campaigns but, worryingly, lowered the barrier to entry for less-sophisticated actors to produce their own campaigns and to potentially cause significant damage. This has become a pressing national security concern akin to conventional cyber-attacks such as hacking. There are also concerns about the security of democratic processes, and how institutions cope with the potential deluge of fake but realistic-looking content flooding different channels. A new report from The Alan Turing Institute’s Centre for Emerging Technology and Security explores the security risks posed by generative AI in more depth.

Large Language Models in the Intelligence Context

However, assuming barriers can be overcome and these risks appropriately managed, there are numerous potential practical uses of LLMs. This includes within the intelligence community, where the manual processing of very large volumes of data has historically been a highly resource-intensive and time-consuming process.

Five areas where LLMs could potentially provide significant improvements to the intelligence analysis process include:

  1. Productivity assistants. For example, autocompleting sentences, proofreading emails, and automating certain repetitive tasks. These will offer valuable efficiency gains to those working within the intelligence community, as with any other large organisation.
  2. Automated software development and cybersecurity. As LLMs can now also automate software development, GCHQ is encouraging cybersecurity analysts to study LLM-written code for vulnerabilities. In the future, LLMs could significantly enhance the efficiency of software development within the intelligence community.
  3. Automated generation of intelligence reports. Intelligence reports are core products for the intelligence community. While LLMs are unlikely to be trusted to generate finished reporting for the foreseeable future, there might be a role for them in the early drafting stages, similar to an extremely junior analyst whose work requires supervision and substantial revision before release.
  4. Knowledge search: A game-changing LLM capability would be one that could, in a self-supervised manner, extract knowledge from massive corpora of information, distilling facts and identifying where and how they evolve over time, and which entities (individuals and organisations) are most influential.
  5. Text analytics: LLMs have the potential to improve text analytics quality, enable instantaneous deployment, and support an iterative chain-of-reasoning process for analysts seeking further detail or to extract further summaries on targeted themes.

While these capabilities are promising, the real potential for LLMs to augment intelligence work will not be fully realised by the current generation of LLMs. Significant improvements will need to be made along three alignment criteria before integration of such capabilities into everyday intelligence work:

  • Helpfulness: The model’s ability to follow instructions; a model that does not follow the user’s instructions is not always helpful in all circumstances.
  • Honesty: The propensity of the tool to output answers that are convincing, yet factually incorrect. Unless the user is more knowledgeable than the tool, then there is a risk that the user accepts such outputs as being true.
  • Harmlessness: A model might create harm either by producing biased or toxic outputs due to the data it was trained on, or by producing erroneous outputs which lead the user to act in a way that subsequently results in some form of harm.

To be truly game-changing for the national security community, models must reliably provide verifiable sources and explain how it came to its conclusions. Current text-based foundation models have the right framework to generate text but analysts need to be able to query the model’s knowledge: facts it has gleaned, why it believes those facts, and pieces of evidence that support and/or contradict its conclusions.

Models must also be rapidly updateable to be used in dynamic, real-world mission-critical situations. Current foundation models, trained over a long period of time, lock in information at time of training. A mechanism for ‘live’ updating of the model with new information is still a fundamental requirement. To address this, emerging trends with encouraging results involve training and fine-tuning smaller models on specific, highly relevant data for a specific community.

Finally, models must support complex chain-of-reasoning and multi-modal reasoning. Achieving this may require the development of hybrid architectures such as neurosymbolic networks, that combine the statistical inference power of neural networks with the logic and interpretability of symbol processing. Additionally, machine learning models must be significantly more robust to tampering, in addition to being explainable and citeable. This is particularly important in the national security context, where the decisions made based on the insights provided could have significant consequences for individuals and wider society.

Looking ahead

In the intelligence community work is mostly conducted in secret. Were we to naively trust an LLM, we might be inadvertently exposing our analytic rigor to substantial misinformation, which could have significant consequences. The cost of putting in place the necessary safeguards required to manage the risks will need to be weighed against the possible benefits this technology could offer for intelligence work. Current LLMs show promising potential but the most promising use cases are still on the horizon, and future efforts should focus on developing models that understand the context of the information they are processing – rather than just predicting what the next word is likely to be.

techUK’s National Security Week 2024 #NatSec2024

The National Security team are delighted to be hosting our annual National Security Week between Monday, 22 January 2024, and Friday, 26 January 2024.

Read all the insights here.

National Security Programme

techUK's National Security programme aims to lead debate on new and emerging technologies which present opportunities to strengthen UK national security, but also expose vulnerabilities which threaten it. Through a variety of market engagement and policy activities, it assesses the capability of these technologies against various national security threats, developing thought-leadership on topics such as procurement, innovation, diversity and skills.

Learn more

National Security updates

Sign-up to get the latest updates and opportunities from our National Security programme.





Adam C

Chief Data Scientist, GCHQ

Richard Carter

Richard Carter

Senior Research Consultant, CETaS