Developing a High-Volume NPC System
The Scope & Problem
In this game, the goal is to have many dozens, or even hundreds, of NPCs active at a time, each roaming around with its own animations and identity. While this is easy to implement in singleplayer, in multiplayer replicating NPC/entity data can be expensive, especially at higher counts (say, 200 entities trying to replicate states, transforms, and stats every physics tick). Furthermore, playing hundreds of animations at once while also computing pathfinding for every entity is going to be very expensive for the server (which in this case is the host client, as the game implements multiplayer via Steam P2P).
The Solution: Swarm
Swarm is the name of the system I've implemented into TaikaSteam in order to support the scope explained above.
Swarm revolves around replicating a swarm entity's transform slowly and interpolating it on the client, while replicating state data immediately on change.
It's called Swarm because players may be swarmed by enemies.
The system is broken into three parts:
Swarm Director
Type: Global Node
The Swarm Director is a singleton responsible for being the brain of all swarm entities. A swarm entity is simply an NPC that uses Swarm. The Swarm Director only runs on the host client (a.k.a. the server) and performs all calculations for each entity, such as:
- Target Position: Not the next instant position, but the target position.
- Pathfinding Waypoints: Used to gather the list of target positions towards a destination
- Target: A player it may be attempting to target
- States: Various states like is_attacking, or animation states like is_running.
- More: anything that is a "logic" state that must be kept synchronized between all peers.
Swarm Proxy
Type: Node3D
A Swarm Proxy is the networking bridge between the Swarm Director, which exists on the host, and Swarm Entities, which exist on all clients. A swarm entity always has a unique corresponding swarm proxy. The swarm proxy exists because there isn't a good way to interpolate the values synced by MultiplayerSynchronizers. For the swarm entity transform, the values replicated to the Swarm Proxy only represent the target transform (where the entity should be). The Swarm Entity then uses the target transform to interpolate smoothly toward it.
While entity transforms are replicated slowly, entity states are decoupled from them and can therefore be replicated immediately on change. The states are also replicated to the proxy node, where the Swarm Entity reads them to determine client-side visuals.
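The split between slow transform replication and immediate state replication can be sketched in a few lines. This is an engine-agnostic Python illustration rather than the actual TaikaSteam code: the class name ProxySender, the outbox list (standing in for the network layer), and the 0.25 second interval are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ProxySender:
    """Host-side bookkeeping for one swarm proxy (hypothetical name)."""
    transform_interval: float = 0.25  # seconds between transform sends (assumed value)
    since_transform: float = 0.0
    last_states: dict = field(default_factory=dict)
    outbox: list = field(default_factory=list)  # stands in for the network layer

    def tick(self, delta, target_position, states):
        # States are sent right away, but only the keys whose value changed.
        changed = {
            key: value
            for key, value in states.items()
            if key not in self.last_states or self.last_states[key] != value
        }
        if changed:
            self.outbox.append(("state", changed))
            self.last_states.update(changed)

        # The target transform is sent on a slow, fixed timer.
        self.since_transform += delta
        if self.since_transform >= self.transform_interval:
            self.since_transform -= self.transform_interval
            self.outbox.append(("transform", target_position))
```

With this shape, an entity that keeps running in a straight line costs one small transform message per interval and no state traffic at all, while a state flip such as is_attacking still reaches clients on the very next tick.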
Swarm Entity
Type: CharacterBody3D
The Swarm Entity is the "front end": it is what the player sees. It is the character controller that reads the Swarm Proxy data, interpolates to the target transform, and sets animations depending on the states. There isn't much more to say about swarm entities because, by this point, the data has already been replicated to the client.
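As a rough illustration of the client side, the sketch below moves a position toward the replicated target each frame and picks an animation from the replicated states. It is written in plain Python rather than engine code, and the exponential smoothing, the sharpness value, and the animation names ("attack", "run", "idle") are assumptions for the example, not the method used in the game.

```python
import math


def smooth_toward(current, target, delta, sharpness=10.0):
    """Move `current` toward `target` using frame-rate independent smoothing.

    `sharpness` is an assumed tuning value: higher means the entity
    catches up to the replicated target faster.
    """
    t = 1.0 - math.exp(-sharpness * delta)
    return tuple(c + (g - c) * t for c, g in zip(current, target))


def pick_animation(states):
    """Map replicated states to an animation name (names are hypothetical)."""
    if states.get("is_attacking"):
        return "attack"
    if states.get("is_running"):
        return "run"
    return "idle"
```

Because the blend factor is derived from delta, the entity converges on the target at the same speed whether the client renders at 30 or 144 frames per second, which matters when the target itself only updates a few times per second.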
Further Optimizations:
Pathfinding
Attempting to pathfind for hundreds of entities at once is expensive. As such, there are a few options:
No pathfinding:
- Instead of pathfinding, simply have all enemies go directly to the player, Vampire Survivors style.
- This can lead to enemies getting stuck, which could be worked around by adding enemy lifetimes to despawn hidden enemies that have been stuck somewhere, or by allowing them to traverse almost any obstacle.
- This is very efficient, but obviously leads to very degraded movement.
Staggered Pathfinding:
- Instead of calling pathfinding frequently, stagger the calls so that each entity recomputes its path much less often.
- Fewer calls mean far less computation and better performance.
- However, there are some big drawbacks, most noticeable in close proximity to a target (player). Because paths are refreshed infrequently, moving targets such as players have usually already left by the time an NPC reaches the target position.
- Furthermore, even when pathfinding is only called once every 5 seconds, calling it for hundreds of entities will still produce a noticeable lag spike.
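One way to stagger calls so they do not all land on the same tick is to give each entity a time slot based on its id. The helper below is a minimal sketch of that idea, under the assumption that entities have integer ids; the function name and the slice count are hypothetical and not taken from the game.

```python
def entities_due(entity_ids, tick, slices):
    """Return the ids whose turn it is to recompute a path on this tick.

    Each entity is assigned to one of `slices` slots by its id, so every
    entity is refreshed once per `slices` ticks and no tick handles all of them.
    """
    slot = tick % slices
    return [entity_id for entity_id in entity_ids if entity_id % slices == slot]
```

With 200 entities and 100 slices, each tick recomputes only two paths while every entity is still refreshed once per cycle. The staleness drawback described above remains, since each individual path is still only as fresh as the cycle length.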
Dynamic Pathfinding:
- Dynamic pathfinding is the solution I went with for Taika and also for TaikaSteam. Entities transition between no pathfinding, staggered pathfinding, and grouped pathfinding. This is implemented via a state machine.
- Enemies that are very far from the player either despawn or follow a path generated by the Swarm Director (once per long interval) that is shared by multiple swarm entities in the area. The path leads to an approximate location of the player (because the player could have moved). This is the wandering state.
- Once the enemies have come relatively near the player, they may enter the targeting state, in which pathfinding is on and calculated for each entity (at shorter intervals). To prevent lag when tons of entities are in the targeting state, the farther an entity is from a player, the longer the interval. As a further optimization (which I have not applied because it has not been necessary), entities could form groups that all follow one pathfinding route determined by a leader.
- Finally, once an entity is in close proximity to the target, pathfinding is turned off and it enters the direct chase state, in which it simply keeps moving straight towards the target point without any pathfinding calls. At this stage, it is unlikely to get stuck, as it is already in close proximity to the player.
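The three states and the distance-scaled interval can be sketched as follows. This is a simplified Python illustration: the radii, the interval bounds, and the linear scaling are assumed values chosen for the example, and the despawn option for the wandering state is left out.

```python
from enum import Enum


class SwarmState(Enum):
    WANDERING = "wandering"
    TARGETING = "targeting"
    DIRECT_CHASE = "direct_chase"


# Assumed thresholds in world units; real values would be tuned per game.
CHASE_RADIUS = 5.0
TARGET_RADIUS = 40.0


def choose_state(distance):
    """Pick a state from the distance between an entity and its target."""
    if distance <= CHASE_RADIUS:
        return SwarmState.DIRECT_CHASE
    if distance <= TARGET_RADIUS:
        return SwarmState.TARGETING
    return SwarmState.WANDERING


def repath_interval(distance, min_interval=0.25, max_interval=2.0):
    """Seconds between pathfinding calls while targeting; farther means longer."""
    span = TARGET_RADIUS - CHASE_RADIUS
    t = min(max((distance - CHASE_RADIUS) / span, 0.0), 1.0)
    return min_interval + (max_interval - min_interval) * t
```

A real implementation would probably add some hysteresis around the two radii so that an entity hovering near a boundary does not flip between states every tick.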
Misc:
Swarm Instance
Type: Node
The Swarm Instance is the "whole" of an entity. It contains both the Swarm Proxy and the Swarm Entity and is the entry point for initializing both. The setup logic is contained in the Swarm Instance.