Developer’s Insights: AI Development

Greetings all,

Today is a special day for Developer’s Insights, because we’re discussing a topic that doesn’t involve anything designed or visual in the game: the AI. To create an AI for Exodus: Proxima Centauri, we had to find a way for the game to generate behaviors that emulate human players. As Exodus is the most complex game we’ve made so far, it was our biggest AI challenge yet. Luckily, Gordon, our AI expert, had a few tricks up his sleeve.

The Infrastructure

The creation of our Exodus AI began with a decision matrix, a list of things that matter to a player in Exodus with a score value assigned to each item. These include ideas like “Planetary Control” and “Superior Mobility.” Here’s an example of the early decision matrix associated with our AI:

[[PLANETARY_CONTROL, 0]; [OVERALL_BALANCE, 0]; [AVOIDING_LOSSES, 0];
 [STRONG_SHIELDS, 0]; [SPECIALIZATION, 0]; [FLEET_ACTION, 0];
 [FLEET_SUPERIORITY, 0]; [POP_CARRIER_DEFENSE, 0]; [FIRST_STRIKE, 0];
 [SUPERIOR_MOBILITY, 0]; [DEFENDING_SYSTEMS, 0]; [HOARDING, 0];
 [FLEXIBILITY, 0]; [STEALTH, 0]; [CAPTURING_PLANETS, 0];
 [SNIPING_VP, 0]; [FIXING_WEAKPOINTS, 0]; [SAVING_MONEY, 0];
 [PENALTY, 1.00]; [PICKING_OFF_CENTAURIANS, 0]; [KILLING_POP_CARRIERS, 0];
 [EFFICIENCY, 0]; [POLITICAL_POWER, 0]; [OVERWHELMING_FIREPOWER, 0];
 [AGGRESSIVE_COLONIZATION, 0]]
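In code, a matrix like this is just a mapping from trait names to weights. Here’s a minimal sketch of how it could be represented; the trait names come from the matrix above, but the class-free structure, function name, and example weights are my own illustration, not the game’s actual code.

```python
# A hypothetical representation of the decision matrix: trait -> weight.
# Trait names are taken from the matrix in the article; everything else
# (make_matrix, the sample weights) is invented for illustration.

TRAITS = [
    "PLANETARY_CONTROL", "OVERALL_BALANCE", "AVOIDING_LOSSES",
    "STRONG_SHIELDS", "FIRST_STRIKE", "SUPERIOR_MOBILITY",
]

def make_matrix(values=None):
    """Build a trait -> weight mapping; every weight defaults to 0."""
    matrix = dict.fromkeys(TRAITS, 0.0)
    if values:
        matrix.update(values)
    return matrix

# An AI that cares mostly about holding planets, a little about striking first.
ai = make_matrix({"PLANETARY_CONTROL": 0.8, "FIRST_STRIKE": 0.3})
```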

This has evolved dramatically since the beginning, but I won’t give away any of the AIs secrets, so we’ll reference this matrix moving forward.

To use these values, the game needs logic for each decision it may have to make, ranging from installing shields to fighting Centaurians. So Gordon put together a controller for the AI, essentially telling it which values in the decision matrix matter for each decision. This was a lot of work, but by the end we had a machine that could emulate human behavior, yet still no rules to govern that behavior.
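One common way to wire a decision matrix into a controller is to score each candidate action as a weighted sum of the traits it serves, then pick the highest-scoring one. The sketch below assumes that shape; the action names and their trait contributions are invented for illustration, not taken from the game.

```python
# Hypothetical controller sketch: each action declares which decision-matrix
# traits it serves, and the AI picks the action whose weighted score is
# highest under its own matrix. All names and numbers here are illustrative.

def score(action_traits, matrix):
    """Weighted sum of the traits an action contributes to."""
    return sum(matrix.get(trait, 0.0) * amount
               for trait, amount in action_traits.items())

def choose_action(actions, matrix):
    """Pick the action with the highest score under this AI's weights."""
    return max(actions, key=lambda name: score(actions[name], matrix))

actions = {
    "install_shields":   {"STRONG_SHIELDS": 1.0, "AVOIDING_LOSSES": 0.5},
    "attack_centaurian": {"PICKING_OFF_CENTAURIANS": 1.0, "FLEET_ACTION": 0.4},
}
# An AI whose matrix rewards hunting Centaurians over defense.
aggressive = {"PICKING_OFF_CENTAURIANS": 0.9, "STRONG_SHIELDS": 0.1}
```

With this split, the controller logic stays fixed while the matrix values change, which is exactly what lets the training process tune behavior without rewriting game rules.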

AI Training

In the decision matrix above, the values are still placeholders. To both assign and optimize them, we used a technique called Particle Swarm Optimization. Here’s where things get math-y.

In math, this decision matrix is called a vector, and it exists within a vector space. Somewhere in that vector space, under the rules defined by Gordon, is an optimized vector: the one that performs best in games. We needed the AI to find it. To do that, we created a batch of AIs with randomly assigned values. Then we let them play against each other in tournaments, thousands of times, while each made small adjustments to explore the space around it. Eventually, one AI emerged as the strongest, the “local leader”, so we pushed all the other AIs to be more like the winner and repeated the process.

In the picture above, each point is an AI, each arrow is the exploration of its local space, and the star is the optimization point. As we run more and more tournaments, the swarm closes in on the optimum vector.
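The loop described above is, in essence, standard Particle Swarm Optimization. Here is a compact sketch of the idea, with a toy fitness function standing in for “wins the tournament”. To mirror the blog’s description it tracks only the global leader (full PSO also keeps each particle’s personal best), and every name and constant in it is illustrative, not the actual training code.

```python
import random

def pso(fitness, dims, n_particles=20, iters=100, inertia=0.7, pull=1.5):
    """Toy Particle Swarm Optimization: each particle is a weight vector."""
    # Start every particle (candidate AI) at a random point in the space.
    particles = [[random.uniform(-1.0, 1.0) for _ in range(dims)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dims for _ in range(n_particles)]
    leader = max(particles, key=fitness)[:]  # the current "local leader"

    for _ in range(iters):
        for i, p in enumerate(particles):
            for d in range(dims):
                # Keep some momentum, then tug the particle toward the leader.
                velocities[i][d] = (inertia * velocities[i][d]
                                    + pull * random.random() * (leader[d] - p[d]))
                p[d] += velocities[i][d]
            if fitness(p) > fitness(leader):
                leader = p[:]  # a stronger AI takes over as leader
    return leader

random.seed(42)  # reproducible toy run
# Toy fitness: the optimum vector is all zeros.
best = pso(lambda v: -sum(x * x for x in v), dims=3)
```

In the real system, evaluating `fitness` means playing tournaments of Exodus, which is why the process was so slow: every step of the swarm costs thousands of games.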

This let the computer set the AI values and got the process of building our AI underway. We could then load the AIs into the game and play against them to check their progress. But we ran into a few challenges along the way.

Challenges

There were two major challenges in creating the AI. First, the training process was quite slow. We would make incremental changes and climb toward stronger play, but it took the AI weeks to figure out how to colonize planets, and even then its wins felt more like luck than skill. If Particle Swarm Optimization alone were to drive our complex AI, we would need far more time and computing power than we had available.

The second issue was local optima. An AI could find a way to beat itself and all the opponents around it, so that any small change to its play style yielded worse results, yet better ways to play existed outside its explored space. The AI just couldn’t reach them.

Solution

After some consideration, we solved the issue in a surprisingly simple way: introduce human trainers. We know how to play Exodus inside and out (or at least we hope we do by now). It’s easy for us to spot the better play in a given game state, so why not let us humans become the “local leaders”? We added the ability for the AI to train on games we played against it, using our decisions to tug on the particles and bring them closer to our optimum. We also added a neural network to control decision making and trained it in the same manner. These changes yielded dramatic results, and the AI improved substantially.
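The “tug” can be as simple as blending each particle a small step toward the vector implied by the human’s play. This one-liner is a guess at the mechanism, not the actual implementation; the function name and learning rate are invented.

```python
def tug_toward_human(particle, human_vector, rate=0.1):
    """Nudge an AI's weight vector a small step toward the vector
    implied by a human trainer's decisions (illustrative only)."""
    return [p + rate * (h - p) for p, h in zip(particle, human_vector)]

# Each recorded human game moves the particle 10% of the way over.
nudged = tug_toward_human([0.0, 1.0], [1.0, 0.0])
```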

However, there were only three of us playing games for the AI to train against. Even with a year of daily play, we couldn’t amass anywhere near the number of games the tournaments could, so the AI still has room to improve. That’s where you all come in.

Future Evolution

We plan to continue training the AI using ALL OF YOU, the players, as trainers. We’ll record your game decisions, feed them into our AI system, and run our AI training continuously. We’ll push updates to your systems, and the AI will keep improving. This will also create an AI metagame! The more you play, the more the AI will adapt to beat you, and the more you’ll have to adapt to beat the AI, and so on. We hope this will create an ever-evolving gameplay experience against the AI and keep you coming back for more of Exodus: Proxima Centauri.

Thanks for reading! I’ll be back next week with an article on the evolution of our Research Interface.

Cheers,

Tyler
