Of the three most important models, there is one that I haven’t talked about
much but that can offer much more than what has been said so far.
Let’s take a closer look at the big matrix model, to find out how to make it
suitable even for more demanding software.
Introduction
The big matrix is quite simple in its basic implementation. The following sketch can give a quick grasp of it.
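In its simplest form, it’s nothing more than one array per component type, all indexed directly by the entity, plus a bitmask per entity that says which components it currently owns. A minimal sketch (names and layout are mine, purely for illustration):

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t max_components = 32u;
using entity = std::uint32_t;
using mask = std::bitset<max_components>;

struct position { float x{}, y{}; };
struct velocity { float dx{}, dy{}; };

struct registry {
    std::vector<mask> masks;          // one bitmask per entity
    std::vector<position> positions;  // slot N belongs to entity N
    std::vector<velocity> velocities; // slot N belongs to entity N

    entity create() {
        masks.emplace_back();
        positions.emplace_back();
        velocities.emplace_back();
        return static_cast<entity>(masks.size() - 1u);
    }

    // Adding or removing a component is a bit flip plus, at most, an
    // assignment: hence the unbeatable performance of these operations.
    void add_position(entity entt, position value) {
        positions[entt] = value;
        masks[entt].set(0u); // bit 0: position
    }

    void remove_position(entity entt) {
        masks[entt].reset(0u); // the slot stays where it is, unused
    }
};
```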
It has the obvious advantage of being trivial to implement and to maintain.
Furthermore, it’s very close to the way ECS is usually explained. We could say
that it’s the closest practical representation to the theory.
Its key features and downsides are:
- The best performance ever when it comes to adding and removing components.
- Positional access, the entity is also the index used to get its elements.
- Expensive iteration, the cost is roughly that of the number of systems multiplied by the number of entities.
- Risk of wasting a lot of memory because of many unused buffered elements.
Can we take a few steps forward from the basic implementation and make it more
interesting?
Remember that this model has independent pools and often this aspect leaves a
lot of room for optimizations of all kinds.
Pagination
The first thing to address is memory usage. Fortunately, there are many simple
solutions for that.
The most obvious one is to paginate the arrays of components and only create a
page when a component from that page is required.
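As a sketch of the idea (mine, not taken from any specific library), a lazily paged pool could look like this, with a page size of 1024 elements:

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <vector>

template<typename Component, std::size_t page_size = 1024u>
class paged_pool {
    using page = std::array<Component, page_size>;
    std::vector<std::unique_ptr<page>> pages; // null until first use

public:
    bool has_page(std::size_t idx) const {
        return idx < pages.size() && pages[idx] != nullptr;
    }

    // Returns the slot for the given position, creating its page on demand.
    Component & assure(std::size_t pos) {
        const auto idx = pos / page_size;

        if(idx >= pages.size()) {
            pages.resize(idx + 1u);
        }

        if(!pages[idx]) {
            pages[idx] = std::make_unique<page>();
        }

        return (*pages[idx])[pos % page_size];
    }

    // Random access pays an extra check and a couple of jumps in memory.
    const Component * find(std::size_t pos) const {
        const auto idx = pos / page_size;
        return has_page(idx) ? &(*pages[idx])[pos % page_size] : nullptr;
    }
};
```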
For very dense pools, pagination may even increase memory usage due to the small
array of pointers to pages. However, most of the time it helps to save a lot
of memory. Furthermore, since we have an independent pools model, we don’t have
to paginate all arrays if we don’t want to (even though I would suggest avoiding
this kind of optimization unless strictly required).
A nice side effect of pagination is that it also avoids the peaks due to
reallocations. In the worst case, we no longer have to copy a possibly large
array of components. Instead, we only have to copy a small array of pointers to
pages.
Iterations aren’t usually affected by pagination in real world cases. It’s true that we now have to jump from one page to another when iterating entities and components. However:
- As we’ll see, pagination can save a lot of elaboration time in many (many) cases.
- The small array of pointers to pages is likely to fit your cache and stay hot most of the time.
- Iterating a packed array has cache misses already, so don’t expect pagination to increase that number drastically.
On the other hand, random access to components is the feature that suffers the
most from pagination. The guard that decides whether a component exists or not is
more complex (we now also have to check if its page exists) and every access
requires a couple of jumps in memory.
Is this a problem? In general, it is not, unless you spend 90% of the time doing
random accesses to components. Moreover, I have some good news for you: pagination
gives us another (very, very interesting) feature that can avoid them altogether.
Pointer stability
The basic implementation of the big matrix is almost there when it comes to
pointer stability. In fact, components never move: they occupy the same position
within their arrays. No swap-and-pop (sparse sets and archetypes), no full move
between tables (archetypes).
Still, a vector based implementation that reallocates and moves all
components when we run out of memory isn’t suitable for this purpose. However,
we want to have our cake and eat it too, right? Here, pagination allows us to do
exactly that! Even better, it gives it to us for free: once created, components
have stable pointers for their entire lifetimes, out of the box.
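To see it in action with the paged pool sketched above (again, an illustration of mine rather than a real API):

```cpp
void stability_demo() {
    paged_pool<position> positions;

    // Grab a direct reference to a component...
    position *stable = &positions.assure(42u);

    // ...then grow the pool a lot. New pages are allocated, but the
    // existing ones never move: only the small page table is resized.
    for(std::size_t pos = 0u; pos < 100000u; ++pos) {
        positions.assure(pos);
    }

    stable->x = 1.f; // still valid, by design
}
```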
This stability gives us the possibility to correctly balance random accesses
with direct references, so as to better optimize this aspect and manage all
costs. Moreover, we can pass pointers around, easily create hierarchies and the
like, or even use pointers with external data structures. In all cases, we are
guaranteed that they’ll never break, by design.
Iteration
Let’s go back to talking about iterations instead. In its basic implementation,
the big matrix has a cost equal to the number of systems multiplied by the
number of entities. At each step, we check whether the entity mask coincides
with that of the system and the requested data are provided if there is a
match.
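In code, reusing the registry sketched in the introduction (the bit positions are the ones I assigned there, purely for illustration), a system boils down to this:

```cpp
// The total cost is roughly systems * entities: every system walks the
// whole entity list and tests every single bitmask.
void movement_system(registry &reg) {
    mask required;
    required.set(0u); // position
    required.set(1u); // velocity

    for(std::size_t entt = 0u, last = reg.masks.size(); entt < last; ++entt) {
        if((reg.masks[entt] & required) == required) {
            reg.positions[entt].x += reg.velocities[entt].dx;
            reg.positions[entt].y += reg.velocities[entt].dy;
        }
    }
}
```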
So far, so good. Can we improve on this? There are at least a couple of tricks,
as obvious as they are simple to put into practice, that can give us a boost in
terms of performance.
Shortest set
Once again, remember that the big matrix has an independent pools model. Why
not exploit it then?
When iterating N components, the most obvious thing to do is to identify the
smallest pool (a negligible cost) and stop the iteration once its size is
reached.
Why does this work? Because, with the big matrix, entity N has its components at
position N in each pool. So, if the size of the smallest pool is M (with M < N,
the number of entities), we know it won’t do any good to iterate the remaining
entities looking for a match, because none will be found for sure.
This doesn’t give much of an advantage when iterating very dense component arrays
assigned to roughly the same entities. However, for higher level systems, you
can save a lot of CPU cycles with this little trick.
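The change to the loop above is tiny. This sketch assumes a variant of the registry where each array only grows as far as the largest entity that ever received that component, so pools can have different sizes:

```cpp
#include <algorithm>

void movement_system(registry &reg) {
    mask required;
    required.set(0u); // position
    required.set(1u); // velocity

    // Entity N sits at slot N in every pool: nothing past the end of
    // the smallest pool can possibly match.
    const auto last = std::min(reg.positions.size(), reg.velocities.size());

    for(std::size_t entt = 0u; entt < last; ++entt) {
        if((reg.masks[entt] & required) == required) {
            reg.positions[entt].x += reg.velocities[entt].dx;
            reg.positions[entt].y += reg.velocities[entt].dy;
        }
    }
}
```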
All that remains is to combine it with pagination and the trick described in the
next section to push this model even harder.
Fast forward
Pagination has a huge advantage: if no entity is assigned a component within a
page, the page isn’t even created. In other words, the array of pointers to
pages contains holes, literally null pointers.
Now suppose we’re iterating over our entities and components and have chosen the
pool of component C to drive the iteration. We know the page size of this pool
(let’s say 1024) and therefore we know which entities represent the page turn
during an iteration.
At this point, it’s enough to check for the existence of the page in the pool
(or the pages in the pools, even better, though this only works if all pools
share a common page size) when reaching the sentinel entities. We’ll then be
able to jump forward thousands of elements in one go, avoiding checking whether
the bitmasks of those entities match the one required by the system.
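A sketch of the loop, pretending the components of C live in the paged pool shown earlier and drive the iteration (the has_page helper belongs to that sketch, not to any real API):

```cpp
constexpr std::size_t page_size = 1024u; // must match the pool's page size

void fast_forward_system(registry &reg, const paged_pool<position> &driver, std::size_t last) {
    mask required;
    required.set(0u); // position
    required.set(1u); // velocity

    for(std::size_t entt = 0u; entt < last;) {
        // Sentinel entity: the first one of each page. If the page was
        // never created, nobody in it can match, so skip it entirely.
        if((entt % page_size) == 0u && !driver.has_page(entt / page_size)) {
            entt += page_size;
            continue;
        }

        if((reg.masks[entt] & required) == required) {
            // ... process entity entt ...
        }

        ++entt;
    }
}
```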
Of course, this won’t give us much of an advantage if the pool driving the
iteration is mostly dense or contains all the pages. For example, imagine the
worst case, where only one element was created per page.
However, we also know that the average case is the most interesting one and, all
in all, it’s very likely that many systems (especially high-level ones) will be
able to skip large amounts of entities, allowing us to save several CPU cycles
and many memory accesses.
Hybrid solution
Finally, a few words about a variant that combines the above by adding an
additional parameter.
In fact, it’s possible to choose the pool with the least number of elements
assigned instead of the one with the smallest extension, assuming that the
former information is available at all.
Or rather, we can iterate the pool with the least number of elements, but still
stop at the minimum extension (the end of the smallest pool).
Why should this give us an advantage? It’s indeed debatable, but it could make
things even better in some cases.
Roughly speaking, the extension only tells us which pool ends first, allowing us
to stop early. However, it doesn’t tell us much about pagination and how pages
are used. On the other hand, using the pool with the fewest elements increases
the chance of encountering holes, that is, pages never created so far, allowing
us to skip many elements in one go.
Therefore, combining the two pieces of information minimizes the linear extension of the iteration and, on average, increases the likelihood of jumping forward during it, thus maximizing performance.
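A sketch of the selection step, with hypothetical per-pool bookkeeping of both values:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical bookkeeping: how many components a pool actually holds
// and one past the largest entity it ever touched (its extension).
struct pool_info {
    std::size_t count;
    std::size_t extension;
};

// Returns the index of the driving pool (the fewest elements, hence the
// most holes to exploit) and the bound for the iteration (the smallest
// extension among all pools involved).
std::pair<std::size_t, std::size_t> pick_driver(const std::vector<pool_info> &pools) {
    std::size_t driver = 0u;
    std::size_t last = pools.front().extension;

    for(std::size_t i = 0u; i < pools.size(); ++i) {
        last = std::min(last, pools[i].extension);

        if(pools[i].count < pools[driver].count) {
            driver = i;
        }
    }

    return { driver, last };
}
```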
Conclusions
All in all, the simplicity of implementation and understanding of this model
makes it one of my favorites.
It’s easy for novices to understand and work with, and it has some features that
other models cannot offer, but it also has its cons.
However, as we have seen, it’s enough to take a few more steps than the basic
implementation to squeeze the big matrix and bring out more performance, while
also reducing memory usage.
All this makes the solution much more interesting and certainly usable in a
greater number of areas, easily putting it in competition with other models with
different characteristics.
As always, the perfect solution doesn’t exist. It depends on our needs, on our
coding style and on our team, on how complex or accessible we want to make
everything and on a thousand other factors.
Should you decide to opt for the big matrix, I hope I’ve given you some ideas
to push it to its limits and save your memory.
Let me know that it helped
I hope you enjoyed what you’ve read so far.
If you liked this post and want to say thanks, consider starring the GitHub project that hosts this blog. It’s the only way you have to let me know that you appreciate my work.
Thanks.