Designing an efficient system with version numbers

"Change versions" are numbers that the ECS database already keeps track of by default. By using them, you can skip unnecessary work. Let's see how they work, and how much we can skip.

Remember, they are already there! Free! Start asking yourself: does your system really want to work on every frame, or only when something changed? Practice adding "changed" to your system's definition.

For example, I might have made a QuaternionToRadianSystem which keeps converting the character's orientation to a radian representation in order to display it in the UI. Knowing that "changed" can be part of the design spec, I realize I only have to convert to radians if that quaternion changed.

Global System Version (GSV)

An integer shared globally, global as in “ECS World” (literally “globe”), because the EntityManager keeps it and you can have only one EntityManager per world.

It is the backbone of everything. Along the way someone will do something based on this version. Let's see the overall picture in this handmade diagram.

The blue arrow indicates the ECS update flow that runs through the systems. The CG (ComponentGroup) in the image has since been renamed to EntityQuery (or EQ).

We need 2 versions to detect a change

The question "did it change?" is answered with a bool. The idea is to let 2 version numbers fight each other. If the version to check is higher than the required version, it is considered "changed". Equal is not enough. (To do it by hand: (version - required) > 0)

However, if the required version is 0, it is a special case that is always considered "changed". This handles the "fresh" case where there is no history to compare with yet, such as a system that has never updated before.
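The comparison can be sketched in plain C#, without any Unity types. This is a simplified stand-in, not Unity's actual source: the class name ChangeVersionSketch and the parameter names are made up for illustration, and the sketch assumes the 0 special case keys on the required version (the "nothing to compare against yet" side).

```csharp
// Simplified stand-in for the two-version comparison; not Unity's actual source.
public static class ChangeVersionSketch
{
    // changeVersion:   the version stored on the thing being checked (e.g. a chunk).
    // requiredVersion: the version it has to exceed (e.g. a system's LastSystemVersion).
    public static bool DidChange(uint changeVersion, uint requiredVersion)
    {
        // Assumption: 0 marks the "fresh" case with no history, so always report a change.
        if (requiredVersion == 0)
            return true;

        // Strictly greater, equal is not enough. Subtracting first and reading the
        // result as a signed int keeps the test meaningful if the counter wraps around.
        return unchecked((int)(changeVersion - requiredVersion)) > 0;
    }
}
```

For example, DidChange(10, 6) is true, while DidChange(10, 10) and DidChange(10, 11) are both false.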

Next we will learn the various places where this 2-version comparison takes place.

Change version in : chunk iteration

When doing chunk iteration, we can use the method chunk.DidChange(requiredVer) to check whether the chunk changed or not. Then you can branch on the result and maybe decide to skip the chunk completely! The next question is: changed since when? Changed from what?

You do the chunk iteration in a system (or schedule a chunk iteration, if it is a JobComponentSystem). The method is designed to check whether the chunk was changed since this system's previous update.

This mechanism is achieved by...

Version on the system

Just before every system's OnUpdate, the global system version is increased by 1 (the first blue square). So you get the general idea that this number increases rapidly depending on how many systems you have. By the time it loops back to this system, it has increased by a lot. If you are updating systems manually, the same happens every time you call .Update.

And there is a number called LastSystemVersion in each system (a protected property you can use) which is refreshed to the GSV just before the code leaves the system, after OnUpdate (the 3rd blue square). Right after GSV+1, it is not applied to the system yet. As you might have noticed, the property name is "last" system version, not "current" system version. It wouldn't make sense if it applied immediately.

Meaning that in OnUpdate, where we do the work, LastSystemVersion is true to its name: it represents the GSV that this system used the last time it updated. We use this LastSystemVersion as the required version. (So, it is the argument of chunk.DidChange(reqVer).)
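The order of those three steps can be modelled with a toy class. ToySystem and its members are hypothetical names for this sketch and have nothing to do with Unity's real system classes; the two "Seen" properties only exist so the values observed during OnUpdate can be inspected afterwards.

```csharp
// Toy model of the update flow described above; these are not Unity types.
public sealed class ToySystem
{
    // Keeps its old value for the whole duration of OnUpdate.
    public uint LastSystemVersion { get; private set; }

    // What the system observed while it was updating, kept for inspection.
    public uint SeenLastSystemVersion { get; private set; }
    public uint SeenGlobalSystemVersion { get; private set; }

    public void Update(ref uint globalSystemVersion)
    {
        globalSystemVersion++;                    // 1st: GSV + 1, before OnUpdate
        OnUpdate(globalSystemVersion);            // 2nd: the work happens here
        LastSystemVersion = globalSystemVersion;  // 3rd: remembered on the way out
    }

    private void OnUpdate(uint globalSystemVersion)
    {
        SeenLastSystemVersion = LastSystemVersion;
        SeenGlobalSystemVersion = globalSystemVersion;
    }
}
```

With two such systems updated in turn from a GSV of 1, the first system sees LastSystemVersion 2 and GSV 4 during its second update, so the gap between the two numbers is exactly what happened "since last time".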

As I said we need one more version number, which is...

Chunk’s version

Each time you request an ECS data chunk with write access, the versions of all relevant chunks are immediately updated to the GSV. (And remember that the GSV just got +1'ed before OnUpdate.)

This is the "write intent". You can see this on the arrow that says "RW Request" pointing to the OnUpdate box. You usually request write access in OnUpdate (or in a job scheduled from OnUpdate).

The code cannot detect whether you actually wrote any data or not. Because the write goes directly to memory, there is no wrapper left and no place to detect writes other than this “get with write access”. It's better this way for performance anyway.

  • In the context of main thread chunk iteration or threaded chunk iteration (IJobChunk), this RW request moment is NOT GetArchetypeChunkComponentType<T>(isReadOnly : false) but the moment you call chunk.GetNativeArray(acct); that seals the deal that you are going to write. (Of course it knows, because acct was created with isReadOnly : false.)
  • Outside of chunk iteration, methods like EntityManager.SetComponentData<T> obviously do this immediately. SetSingleton<T> too, of course.
  • The ForEach syntax cannot use [ReadOnly] with an IComponentData (denoted with ref). So it is always instantly, unavoidably RW and bumps the version.
  • IJobForEach can use [ReadOnly] ref to prevent the RW from happening. If you use just ref, it instantly bumps the version (even if you didn't actually write for some reason).
  • ToComponentDataArray<T> gathers, linearizes, and makes a copy, so this is not considered a write yet, and you can do whatever you want to the returned NativeArray without triggering a write since it is completely disconnected from the ECS database. (However CopyFromComponentDataArray<T>, which is used to write it back, is obviously a write.)

This number is checked against the required version (the argument of DidChange). If it is higher, then yes, changed.
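The "write intent" idea, where asking for write access is what counts and not the write itself, can be sketched like this. ToyChunk, GetForRead and GetForWrite are invented names for this sketch; the real API goes through ArchetypeChunkComponentType and GetNativeArray as described in the list above.

```csharp
using System.Collections.Generic;

// Toy chunk holding one component type; not a Unity type.
public sealed class ToyChunk
{
    private readonly float[] data = new float[4];

    public uint ChangeVersion { get; private set; }

    // Read-only access leaves the version alone.
    public IReadOnlyList<float> GetForRead()
    {
        return data;
    }

    // Write access stamps the version right away, whether or not the caller
    // ever stores anything into the returned array. Nothing can observe the
    // actual writes, because they go straight to memory.
    public float[] GetForWrite(uint globalSystemVersion)
    {
        ChangeVersion = globalSystemVersion;
        return data;
    }
}
```

Calling GetForWrite(10) and then never touching the array still leaves ChangeVersion at 10, which is exactly the "even if you didn't actually write" case.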

One chunk can hold change version for EACH component type

You know that one chunk strictly belongs to one EntityArchetype. An archetype contains multiple types. Each of these types is versioned!

That’s why when you do chunk.DidChange(reqVersion) you also have to say which type, through the <T> of the ArchetypeChunkComponentType you pass in.
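A minimal sketch of "one version per component type per chunk", using a dictionary keyed by type. ToyArchetypeChunk, RequestWrite, and the Position and Health structs are all hypothetical; starting every version at 0 and treating required version 0 as "always changed" are assumptions of the sketch.

```csharp
using System;
using System.Collections.Generic;

// Toy chunk that keeps a separate change version for every component type.
public sealed class ToyArchetypeChunk
{
    private readonly Dictionary<Type, uint> changeVersions = new Dictionary<Type, uint>();

    public ToyArchetypeChunk(params Type[] componentTypes)
    {
        foreach (var type in componentTypes)
            changeVersions[type] = 0; // assumption: versions start at 0
    }

    // A write request only stamps the version of the requested type.
    public void RequestWrite<T>(uint globalSystemVersion)
    {
        changeVersions[typeof(T)] = globalSystemVersion;
    }

    // The type argument picks which of the chunk's versions is compared.
    public bool DidChange<T>(uint requiredVersion)
    {
        if (requiredVersion == 0)
            return true;
        return unchecked((int)(changeVersions[typeof(T)] - requiredVersion)) > 0;
    }
}

// Hypothetical component types for the example.
public struct Position { }
public struct Health { }
```

After RequestWrite<Position>(10), a system with required version 6 sees Position as changed but Health as unchanged, even though both live in the same chunk.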

You are not alone in a chunk

The version stored with the chunk has a certain problem. One chunk has many entities, and when any of them changes, the change version is updated for the whole chunk.

You may find it annoying that just one entity's change causes the whole chunk to be marked as changed, potentially causing useless iteration through entities inside it that didn't really change and that you don't want to work with (especially in IJobForEach, where you are handed each entity to work on, you may think you could be doing less useless work).

But the chunk is the most granular layer at which Unity can implement this feature performantly. Per-entity change detection, if possible at all, would be very, very costly. It must be the major headache that services like Dropbox and Google Drive had to go through to be able to detect any minuscule change. Now that we have ECS-specific chunk-level data, you should be glad we can detect changes much more cheaply than Dropbox!

So that should not discourage you from optimizing with change versions; it is fine that you may be iterating through entities that didn't really change. Getting some chunk skips is always better than none.

You may think your current game has only a few chunks and it turns out they are always changing, rendering the DidChange skip useless. But before you know it, things will be split into multiple chunks once you increase your data size or add tag components, shared component data, etc. to the design later on. (I even have one SCD to force chunk separation in my game, just so that the change detection can be made more granular manually.)

DidChange simulation challenge

Did you realize it is already working as intended? In the same place as RW-ing and bumping the chunk version, the DidChange check also usually occurs in OnUpdate or in a job like IJobChunk. (You can do both if you want, even on the same component type.)

Remember, this function tests "did this chunk change (got requested with write permission, actually) since the last round of this system's update or not" (when given LastSystemVersion as the argument).

Let's test your understanding with this example situation.

We have 5 systems updating in this order: A > B > C > D > E. GSV starts at 1. We have Z : IComponentData, which all systems read or write in some way.

In the first update round no one did anything to Z. The final LastSystemVersion values stored after a complete loop, in order: 2 > 3 > 4 > 5 > 6

In the 2nd update round, only System D requests write access (and probably writes to Z). The final LastSystemVersion values stored after a complete loop, in order: 7 > 8 > 9 > 10 > 11. What is the version number that got baked into the chunk? It's 10; the RW access uses the GSV that was newly increased before OnUpdate.

First question: System E, which follows immediately afterwards, only wants RO. However, being well designed, it asks DidChange so it can skip some work. Did it change? Common sense says yes, it did. Just a while ago. (And yes, it did lol)

But by the numbers, DidChange is putting required version 6 against 10. 10 is higher than 6, and so DidChange returns true ("over the requirement"). Can you see where 6 came from? It's the current LastSystemVersion, which is what you pass to DidChange. At this moment it hasn't been updated to 11 yet. That happens after OnUpdate.

2nd question: no one touches Z anymore, and it loops back to System E again. E asks DidChange again. Did it change?

NO, the chunk was not baked with any new number since no one RW-ed it, so it stays at 10. However LastSystemVersion is now 11. The required version is 11. The chunk version is still 10. It is not over the requirement, so DidChange returns false. Didn't change.

What to keep in mind is that your common sense might say that it did change! ...2 rounds ago. But the definition of DidChange is "changed since the previous round/update" (round = looping back to the same system again). So it lasts for only 1 loop. If you update your systems manually, this is even more complicated, as the GSV is baked in after every update.

To solidify your understanding that "round" is not equal to "same frame": what if, in A > B > C > D > E, System D did the RW, and in the next frame System C asks DidChange? The answer is yes, it did.

Also, in effect, a system cannot detect in the next frame a change it made itself. C does the RW. No one else touches the Z component. In the next frame C somehow changes its own logic to check DidChange on Z instead of RW-ing Z, trying to detect the change it made in the last frame. The answer is no, it didn't change.
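The whole challenge can be replayed with a few lines of plain C#. This is a toy simulation, not Unity code: RoundSimulation and its parameters are made-up names, and the chunk version starting at 1 is an assumption (the text only says GSV starts at 1). Each system records what DidChange would answer at the start of its OnUpdate, before any write it makes itself.

```csharp
public static class RoundSimulation
{
    private static bool DidChange(uint chunkVersion, uint requiredVersion)
    {
        return requiredVersion == 0 ||
               unchecked((int)(chunkVersion - requiredVersion)) > 0;
    }

    // writerPerRound has one character per round naming the system that
    // requests write access to Z in that round ('-' means nobody does).
    // Returns seen[round, system]: what DidChange answered for that system.
    public static bool[,] Run(string writerPerRound, out uint chunkVersion)
    {
        const string names = "ABCDE";
        var lastSystemVersion = new uint[names.Length];
        uint gsv = 1;
        chunkVersion = 1; // assumption: the chunk starts at the initial GSV

        var seen = new bool[writerPerRound.Length, names.Length];
        for (int round = 0; round < writerPerRound.Length; round++)
        {
            for (int s = 0; s < names.Length; s++)
            {
                gsv++;                                              // before OnUpdate
                seen[round, s] = DidChange(chunkVersion, lastSystemVersion[s]);
                if (names[s] == writerPerRound[round])
                    chunkVersion = gsv;                             // RW request
                lastSystemVersion[s] = gsv;                         // after OnUpdate
            }
        }
        return seen;
    }
}
```

Running Run("-D-", out version) reproduces the numbers above: the chunk ends at version 10, E sees a change in the 2nd round (10 against 6) but not in the 3rd (10 against 11), C sees it in the 3rd round (10 against 9), and D never sees its own write (10 against 10). In the very first round everything reports "changed" because every LastSystemVersion is still 0.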

Lesson Learned

  • The “changed” state lasts for only one round of Update. “Round” refers to the system’s update cycle.
  • Imagine you have systems A B C D A B C D A B C D … running. If the marked B in A [B] C D A B C D A B C D makes changes and all other updates only read, then the C, D and A that immediately follow can detect that change, but every update from the next B onward cannot.
  • B cannot detect its own change from the previous round.
  • All of these detect changes via DidChange, which implies you are doing chunk iteration either in OnUpdate of a ComponentSystem, or in an IJob of a JobComponentSystem with NativeArray<ArchetypeChunk>, or with IJobChunk. (So you have the thing to .DidChange in the first place, the chunk.)

    But! This same mental simulation can easily be applied to the EntityQuery changed filter and also to the [ChangedFilter] attribute on IJobForEach. They work almost the same way when checking. I will explain their differences now.

Change version in :  eq.SetFilterChanged(ComponentType)

Change versions are not only used by chunk iteration with LastSystemVersion vs chunk version. This time the change detection is not on your chunk, but on your EntityQuery! An EntityQuery contains an assortment of chunks depending on your query.

By putting a filter on your query with a ComponentType as the filter target, various methods are affected: To____ will now return only what is in chunks that are detected as changed. The same goes for the EntityQuery overloads of the batched EntityManager.__ methods; a destroy, for example, will destroy only the chunks that changed in that EQ. It is very handy that you can do a "whole chunk operation" on only the subset of things you want. (The whole chunk operations on EntityManager do not cause any data movement!) Also it used to affect CDA... uh, you-know-who. But that's dead, you'd better not know about it.

You see an arrow going up from the 2nd blue square that I haven't explained yet. That's how it works. Just before OnUpdate, the filter gets LastSystemVersion as its required version. And so it works the same way as what you learned from the DidChange simulation challenge; just think of it as LastSystemVersion having already been applied for you instead of having to put it in DidChange manually.

Also, when using GetEntityQuery ends up creating a new query, it saves LastSystemVersion to the changed filter version number immediately, even before you add any changed filter. (Of course a new query doesn't have any filter.) When you later add one with eq.SetFilter, the number does not start at 0.

One more thing: the version number in the filter is always preserved. If you eq.SetFilter where you already have one (of any kind), the version number carries over. Even eq.ResetFilter keeps the remembered LastSystemVersion.

(One other kind of filter is eq.SetFilter<SCD>(value), used to filter out chunks with an SCD index you don't want.)
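The behaviour described in this section can be modelled as a query that remembers a required version on its own. ToyQuery and its method names are invented for this sketch (only SetFilterChanged and ResetFilter borrow their names from the text), and the internals of the real EntityQuery are certainly different; the point is only which number the filter compares against and that resetting the filter does not forget it.

```csharp
using System.Collections.Generic;

// Toy query over chunks, each represented only by its change version.
public sealed class ToyQuery
{
    private readonly List<uint> chunkVersions;
    private bool changedFilterEnabled;
    private uint requiredVersion; // survives SetFilterChanged / ResetFilter

    public ToyQuery(IEnumerable<uint> chunkVersions)
    {
        this.chunkVersions = new List<uint>(chunkVersions);
    }

    public void SetFilterChanged() { changedFilterEnabled = true; }

    public void ResetFilter() { changedFilterEnabled = false; }

    // Called on behalf of the owning system just before its OnUpdate.
    public void BeforeUpdate(uint lastSystemVersion)
    {
        requiredVersion = lastSystemVersion;
    }

    // Indices of the chunks that pass the filter.
    public List<int> MatchingChunkIndices()
    {
        var result = new List<int>();
        for (int i = 0; i < chunkVersions.Count; i++)
        {
            bool changed = requiredVersion == 0 ||
                unchecked((int)(chunkVersions[i] - requiredVersion)) > 0;
            if (!changedFilterEnabled || changed)
                result.Add(i);
        }
        return result;
    }
}
```

With chunk versions 10, 3 and 12 and a LastSystemVersion of 9, the unfiltered query matches all three chunks, while the changed filter leaves only the first and the last.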

Change version in : IJobForEach

To use "did change" in IJFE, just put the [ChangedFilter] attribute before your ref, which will affect the underlying EntityQuery. Put it like you would [ReadOnly].

Remember that when you Schedule an IJFE you can pass this as the argument, this being the system, and here is what it does with your system:

// Update cached EntityQuery and ComponentSystem data.
if (system != null)
{
    if (cache.ComponentSystem != system)
    {
        cache.EntityQuery = system.GetEntityQueryInternal(cache.Types);

        // If the cached filter has changed, update the newly cached EntityQuery with those changes.
        if (cache.FilterChanged.Length != 0)
            cache.EntityQuery.SetFilterChanged(cache.FilterChanged);

        // Otherwise, just reset our newly cached EntityQuery's filter.
        else
            cache.EntityQuery.ResetFilter();

        cache.ComponentSystem = system;
    }
}
else if (entityQuery != null)
{
    if (cache.EntityQuery != entityQuery)
    {
        // Cache the new EntityQuery and cache that our system is null.
        cache.EntityQuery = entityQuery;
        cache.ComponentSystem = null;
    }
}

You can also see that if you pass in an EntityQuery instead of a system, it fully trusts the query and ignores your [ChangedFilter] tag. Use a real filter on that query instead.

IJFE works in parallel per chunk, so you could be skipping some chunks or even all chunks with the attribute. BUT even though you can check for changes, you will have some trouble controlling whether to RW or not in IJFE, as detailed in the "Avoiding a write in IJobForEach" section.

Change version in : ComponentDataFromEntity

A ComponentDataFromEntity can be created in your OnUpdate; it is then your handy portal to read or write any Entity through its indexer. Create it with GetComponentDataFromEntity along with the isReadOnly parameter.

Have you ever wondered why you can't cache this seemingly simple thing in OnCreate? It is because it captures the system's version number when it is created, at the OnUpdate moment, so it can use that to record the chunk version when you write through it in jobs or even on the main thread.

Therefore, the captured version is only used if you created it with isReadOnly: false and you use the indexer to write (e.g. myCdfe[myEntity] = ...).
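A toy model of that behaviour: the version is captured at construction time and only lands on the chunk when the indexer's setter runs. ToyDataFromEntity and ToyChunkVersion are hypothetical, entities are plain ints, and everything lives in a single imaginary chunk; this is not the real ComponentDataFromEntity source.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the change version stored on a chunk.
public sealed class ToyChunkVersion
{
    public uint Value;
}

public sealed class ToyDataFromEntity
{
    private readonly Dictionary<int, float> values;
    private readonly ToyChunkVersion chunkVersion;
    private readonly uint capturedVersion; // taken when created, i.e. during OnUpdate
    private readonly bool isReadOnly;

    public ToyDataFromEntity(Dictionary<int, float> values, ToyChunkVersion chunkVersion,
                             uint globalSystemVersion, bool isReadOnly)
    {
        this.values = values;
        this.chunkVersion = chunkVersion;
        this.capturedVersion = globalSystemVersion;
        this.isReadOnly = isReadOnly;
    }

    public float this[int entity]
    {
        // Reading never touches the chunk version.
        get { return values[entity]; }
        set
        {
            if (isReadOnly)
                throw new InvalidOperationException("Created with isReadOnly: true.");
            chunkVersion.Value = capturedVersion; // stamped only when a write happens
            values[entity] = value;
        }
    }
}
```

Creating it with isReadOnly: false and only reading from it leaves the chunk version untouched; the first assignment through the indexer is what stamps it. It also shows why a cached instance would be wrong: it would keep stamping a stale version.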

How to NOT write

You now realize there’s a lot of speed-up potential in the change version system. To take full advantage of it, you have to do your best not to cause a version change. (In other words, use RO as much as possible.) If you are always dirtying things then your changed checks will be wasted.

Sometimes it is not as easy as simply choosing RO or RW for your ComponentType. Imagine job code which checks some data first; if some conditions are met, it computes more things and writes to some data. If not, you don’t want to touch the chunk’s version. You need ComponentType.ReadWrite just in case it needs to write. But in the cases where it doesn't write, you don't want to bump the version number. Can you make sure of that?

You should avoid “write anyway even if the value is the same” logic, usually caused by calculations that clamp values, or by bad conditionals with a common catch-all case that over-assigns values. You always have to plan a “do nothing” path.

Avoiding a write in chunk iteration

  • If you use chunk iteration, the write is decided when you use an isReadOnly : false ArchetypeChunkComponentType to GetNativeArray(acct) your type from a chunk. Since chunk iteration is the method with the most flexibility, the fact that the RW request is decided as late as GetNativeArray is in fact a boon. You may check conditions with just the ACCT and the chunk as much as you want, without looking at the data (maybe with the DidChange we just learned), before deciding to do GetNativeArray(ACCT) to finally commit a write intent.

    This is because after GetNativeArray, you have arrived at the "nothing between you and memory" state of Unity ECS. It has already allowed you to mess things up. No more write checks are possible and ECS does not know whether you replace the content or not.
  • This means you cannot decide whether or not to write in chunk iteration based on the component's own data if you brought in the ACCT with write access; just “peeking” at the data requires getting the NativeArray, and that’s already a write.
  • Also, you cannot bring both an RO and an RW ACCT to the job; it will say: InvalidOperationException: The writable NativeArray … is the same NativeArray as …, two NativeArrays may not be the same (aliasing). (An ACCT is actually a pointer to a memory area, and you are bringing the same memory area to the job with just a different access policy.) Unfortunately, you often have to look at the current data to decide whether the write is necessary.
  • However, you may peek with the read-only version outside of the job; if you want to write, then schedule the job with the writable version.
  • Or, if you do not want to break the dependency chain by doing something on the main thread which might require completing some jobs, make a new job just for peeking at the data with the RO version, and send the peek result out via a NativeArray to a 2nd job that uses the 1st job’s JobHandle and the RW version, so it can exit early without writing. This also avoids the aliasing problem because the aliasing arrays are now in separate jobs. Holy shit! All this trouble just to avoid a write. But it might be worth it if you have set up a chain of systems that rely on the changed mechanism.
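The two-job trick from the last bullet boils down to "stage 1 peeks read-only and produces a flag, stage 2 exits before it ever asks for write access". Here it is as a plain C# sketch with two static methods standing in for the two jobs. ToyVersionedArray, PeekThenWrite and the clamping scenario are all made up for illustration; real code would use jobs, a JobHandle dependency and a NativeArray for the flag.

```csharp
using System;

// Toy component array whose version is stamped on write access only.
public sealed class ToyVersionedArray
{
    private readonly float[] data;

    public ToyVersionedArray(float[] data) { this.data = data; }

    public uint ChangeVersion { get; private set; }
    public int Length { get { return data.Length; } }

    public float Peek(int index) { return data[index]; } // read-only access

    public float[] GetForWrite(uint globalSystemVersion)
    {
        ChangeVersion = globalSystemVersion;              // write intent
        return data;
    }
}

public static class PeekThenWrite
{
    // Stage 1 (the "peek job"): read-only, outputs a flag for stage 2.
    public static bool NeedsClamp(ToyVersionedArray array, float max)
    {
        for (int i = 0; i < array.Length; i++)
            if (array.Peek(i) > max)
                return true;
        return false;
    }

    // Stage 2 (the "write job"): leaves before requesting write access
    // when stage 1 found nothing to do, so the version is not bumped.
    public static void ClampIfNeeded(ToyVersionedArray array, bool needsClamp,
                                     float max, uint globalSystemVersion)
    {
        if (!needsClamp)
            return;

        float[] data = array.GetForWrite(globalSystemVersion);
        for (int i = 0; i < data.Length; i++)
            data[i] = Math.Min(data[i], max);
    }
}
```

When every value is already within range, ChangeVersion stays where it was; only a round that really has something to clamp stamps the new version.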

Avoiding a write in IJobForEach

If you use IJobForEach, you may prepare your write destination as a ref without [ReadOnly]. Unfortunately that already counts as a write and will increase the chunk version immediately, and you can't change your mind mid-Execute. You cannot check other things and then decide not to write later. (It is impossible to NOT write once you have committed to leaving out [ReadOnly].)

The ref uses a pointer; it has no mechanism to know whether you actually assigned something new to it or not. Again, you are already in the "nothing between you and memory" state of Unity ECS. That ref can directly modify memory and no one knows whether you did or didn't. (Other than that promise of [ReadOnly].)

This is the work of the ComponentChunkIterator contained inside your IJFE, in the method UpdateCacheToCurrentChunk. On executing each chunk, any component without [ReadOnly] immediately gets its version increased.

Also with IJFE, you cannot prevent a job’s scheduling (and thus the change) with a prior job that checks for conditions, since Unity does not allow scheduling a job from a job. Both jobs are already on their way. If you don’t use IJFE, that trick is possible and you just exit early from the 2nd job before it can write data.

Avoiding a write with ComponentDataFromEntity

If you want to delay a write in your IJobForEach, so that the version is not bumped until the last moment, you can do it with the CDFE! Remember to remove the ref from your IJobForEach too, because having both will trigger a memory aliasing error since both serve the same purpose.

As explained, the version number is recorded on the chunk only if you write. Just bringing the ComponentDataFromEntity to the job does not commit a write yet. You can check conditions as much as you want. See this excerpt from CDFE's source code on its indexer's setter. m_GlobalSystemVersion was escorted from the system creating the CDFE.

Finally it would come to this :

Beware of feedback changes!

Imagine systems A B C running in that order in a frame.

  • A : Uses IComponentData X to update the UI position of some hybrid objects in a component array. You think there is no prior system that always runs and could change X, so it must be efficient.
  • B : Scans those UI objects and grabs their calculated RectTransform back, saving it as a Rect in the hope of doing some parallel raycasting in ECS next. You use the same changed condition as A, thinking that if A does not happen then this system should not run either.
  • C : The calculated rects are saved back to X.

On the first loop DidChange enters the special case where anything against version 0 returns true. A B C run, then in the next frame A detects the change C made in the previous frame, which in turn causes C to write again. The first activation alone is enough to kick off this “feedback loop”. The result is the same behaviour you would get from a normal design without the changed version trick, so nothing breaks, but you are overworking.

You have to carefully design some other stopping condition in some cases. In this case A’s condition should not be based on X, or at least not purely on X.
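The feedback loop is easy to reproduce with version numbers alone. In this toy model (FeedbackLoop and CountActiveFrames are invented names, and X starting at version 1 is an assumption) B and C are simplified to "run whenever A ran", and the cWritesBack flag switches C's write-back to X on or off so the two outcomes can be compared.

```csharp
public static class FeedbackLoop
{
    private static bool DidChange(uint version, uint requiredVersion)
    {
        return requiredVersion == 0 ||
               unchecked((int)(version - requiredVersion)) > 0;
    }

    // Returns in how many frames A (and therefore B and C) did any work.
    public static int CountActiveFrames(int frames, bool cWritesBack)
    {
        uint gsv = 1;
        uint versionOfX = 1;          // assumption: X starts at the initial GSV
        uint lastSystemVersionOfA = 0;
        int activeFrames = 0;

        for (int frame = 0; frame < frames; frame++)
        {
            gsv++;                                        // system A
            bool aRuns = DidChange(versionOfX, lastSystemVersionOfA);
            lastSystemVersionOfA = gsv;

            gsv++;                                        // system B, follows A

            gsv++;                                        // system C
            if (aRuns)
            {
                activeFrames++;
                if (cWritesBack)
                    versionOfX = gsv;                     // RW request on X
            }
        }
        return activeFrames;
    }
}
```

CountActiveFrames(5, true) returns 5: the fresh-case activation in the first frame keeps re-triggering A forever. CountActiveFrames(5, false) returns 1: without the write-back to X, only the initial activation happens.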

WHERE is the change? Help!

You scattered Debug.Log under each DidChange and found your system still runs when you do nothing in the game. You don’t remember writing anything!

Basic : Look for non-readonly containers

Native containers should be decorated with [ReadOnly] when you are not going to write, or carry some kind of flag (ACCT uses a flag, the ACCT’s field in the job also uses [ReadOnly], and so on). And since the ECS lib is built on strongly typed generics, you can use your IDE to search for the problematic type that is unknowingly being changed (remember that the change version is per component type per chunk).

Expert : See the change version number and figure out what’s going on

You can go to the source code in Library/PackageCache, file ChangeVersionUtility.cs, and put something in the method DidChange.

Also try to debug the protected property LastSystemVersion around the change checking.

Challenge : Debug.Log the source of version change

One difficult spot to find is an IJobForEach ref argument without [ReadOnly], in which case it causes a write right at the job’s schedule.

In this case you can go to ComponentChunkIterator.cs and, in the methods UpdateCacheToCurrentChunk and UpdateChangeVersion, put something like Debug.Log($"Wow! changed to {m_GlobalSystemVersion}"); and you will see it when the IJFE is scheduled! Then just follow the stack trace.

Other sources include ArchetypeChunkArray.cs, which is related to chunk iteration’s native array generation (where, as I said, the version is written immediately on getting the array), or ChunkDataUtility.cs, which is a bit higher level.