A new lab is hoping to use generative AI to enhance defense wargaming.
The goal of the new GenWar lab, scheduled to open in 2026 at the Johns Hopkins Applied Physics Laboratory in Laurel, Maryland, is to improve tabletop exercises by harnessing the speed and user-friendliness of large language models, or LLMs, which include popular chatbots such as ChatGPT.
By accessing AI during an exercise, human players can quickly experiment with different strategies. At the same time, human participants will be assisted by AI agents playing the role of staff advisers or even enemy leaders.
But the GenWar lab will offer even more intriguing possibilities. One is to allow human players to interact directly with sophisticated computer models operating behind the scenes of tabletop exercises. Another is the possibility of wargames played solely by AI actors on both sides.
“We’ve heard a demand signal from our sponsors of the need to do wargaming faster,” Kevin Mather, who heads the GenWar Lab, told Defense News. “The ability to get more in depth, the ability sometimes to be able to include modeling and simulation and do what-if analysis.”
Since the Prussian Army began using "kriegsspiel" to train staff officers in the 19th century, tabletop exercises have essentially pitted a Blue team against a Red team, with an umpire to judge.
The problem is that these wargames are labor-intensive to design and adjudicate, and too cumbersome to replay or to incorporate lessons learned.
But AI could yield multiple iterations of a game, or allow a scenario to be redone. For example, players might choose a strategy, only to have a subject matter expert rule that the idea wasn’t realistic.
“Let’s rewind the gameplay and go back one turn,” said GenWar Lab program manager Kelly Diaz, who discussed a hypothetical scenario with Defense News. “We’re going to retry that move. And because it’s all digital, we’ll have a log. Afterward for the post-game analytics, we can kind of trace through how the decisions were being made.”
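The rewind-and-replay mechanic Diaz describes depends on keeping every move in a digital log. A minimal sketch of that idea, with all class and method names hypothetical (GenWar's actual implementation is not public), might look like:

```python
from dataclasses import dataclass, field

@dataclass
class Move:
    turn: int
    player: str
    order: str

@dataclass
class GameLog:
    """Append-only record of moves, supporting rewind and post-game tracing."""
    moves: list = field(default_factory=list)

    def record(self, turn, player, order):
        self.moves.append(Move(turn, player, order))

    def rewind(self, turns=1):
        """Discard the last `turns` turns so they can be replayed."""
        if not self.moves:
            return
        cutoff = self.moves[-1].turn - turns
        self.moves = [m for m in self.moves if m.turn <= cutoff]

    def trace(self):
        """Replay the decision history for post-game analytics."""
        return [f"T{m.turn} {m.player}: {m.order}" for m in self.moves]

log = GameLog()
log.record(1, "Blue", "advance on the coastal road")
log.record(1, "Red", "fortify the port")
log.record(2, "Blue", "amphibious assault")   # ruled unrealistic by an SME
log.rewind(1)                                 # rewind one turn and retry
log.record(2, "Blue", "naval blockade instead")
```

Because every move survives in the log, the same structure that makes rewinding cheap also supports tracing how decisions were made after the game.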
To facilitate this, GenWar Lab uses an array of tools. GenWar TTX creates the digital environment and AI agents for the exercise. GenWar Sim — built on the government-owned Advanced Framework for Simulation, Integration and Modeling, or AFSIM — allows players to interact with physics-based models used for adjudication.
Thus, GenWar Sim functions as a translator that lets humans communicate in plain speech with the mathematical models operating behind the scenes.
“As a human, you give your commands: ‘I’d like to attack here, I’d like to defend there,’” explained Mather. “We’ve built code in the modeling and sim engine to read from that database layer and automatically execute those commands.”
Conversely, the LLMs can communicate with humans in ordinary speech.
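The translation loop Mather describes — plain-speech orders written to a database layer, then read and executed by the simulation engine — can be sketched as follows. Everything here is an assumption for illustration: a simple rule-based parser stands in for the LLM, and the "engine" is a stub, since GenWar Sim's actual interface is not public.

```python
import re

# Hypothetical stand-in for the LLM step: parse a plain-speech order
# into a structured command the sim engine can execute.
def parse_order(text):
    m = re.search(r"\b(attack|defend|move)\b.*?\bat\s+(\w+)", text, re.IGNORECASE)
    if not m:
        return None
    return {"action": m.group(1).lower(), "location": m.group(2)}

# Hypothetical "database layer" that sits between the players and the engine.
command_queue = []

def submit(text):
    cmd = parse_order(text)
    if cmd:
        command_queue.append(cmd)

# Stub sim engine: reads pending commands from the queue and executes them.
def run_engine():
    results = []
    while command_queue:
        cmd = command_queue.pop(0)
        results.append(f"executing {cmd['action']} at {cmd['location']}")
    return results

submit("I'd like to attack at GridA7")
submit("Please defend at Harbor")
results = run_engine()
```

The design choice worth noting is the intermediate structured command: the language model never drives the physics models directly, it only populates a queue the engine knows how to read.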
Still, the thought of AI players may give some humans pause. As anyone who has played commercial strategy games can attest, computer players are not the sharpest of opponents.
“We won’t remotely claim that they’re making optimal decisions,” Mather said.
However, the hope is that their decisions are realistic enough to facilitate gameplay and allow the players to explore multiple strategies during an exercise.
Mather and Diaz emphasize that AI will not replace traditional wargaming techniques.
“It’s not nearly as in depth, for example, as a traditional ops analysis or modeling and sim study,” Mather said. But AI can offer “those 70% to 80% solutions that are not the answer, but really accelerate the human learning,” he added.
As AI is woven into the fabric of society, the general consensus is that it will inevitably become part of wargaming. The question is how dominant it will become in the space.
Benjamin Jensen, a researcher at the Center for Strategic and International Studies, believes AI can enhance wargames, provided its use is properly documented and evaluated.
The risk is that “we find strategic analysis reduced to, ‘Here is what an LLM said,’” he told Defense News.
How well LLMs relate to national security policy is also unclear.
“The larger challenge is that most foundation models commonly used haven’t been sufficiently benchmarked against strategy and statecraft,” Jensen said. “So, using AI to support game design, development and execution is a great idea. The question is how far that use goes and how well it is documented to avoid common pitfalls.”
Michael Peck is a correspondent for Defense News and a columnist for the Center for European Policy Analysis. He holds an M.A. in political science from Rutgers University. Find him on X at @Mipeck1. His email is [email protected].