Unity ECS

Everything about ISharedComponentData

This is one of the most misunderstood features of ECS. It usually results in questions like "How could I get SCD data in a job? I can't? Useless!" when you don't know how it was designed. So let's get to know how it works thoroughly.

5argon / Sirawat Pitaksarit

Mar 25, 2019 • 18 min read

You know that the ECS database packs entity data very tightly in chunks. Also, the whole point of the High Performance C# (HPC#) restrictions is ensuring you do not have any "portal" to the outside world. static is unusable. Aliasing is disabled. Bringing a reference type into a job via its public field is prevented by the analyzer. It doesn't let you sneak pointers into your IComponentData either. Also, the memory must be linear.

The concept of something "shared" doesn't sound ECS-like at all. You got everything you need to work with right in the chunk. You are not jumping anywhere, and this is the source of that "performance by default". Predictable, optimizable, and a lot of cache hits.

But what if I told you we can have a real shared value associated with any Entity? Wouldn't that be a big cheat in ECS? What if someone changed that shared value on the main thread, and a job that is mid-iteration over entities suddenly gets something else? It goes against ECS big time to introduce value sharing.

Turns out ISharedComponentData tries to do this with restrictions that make it possible to live together with ECS.

Just keep the pointer with you : the SCD index

How to store data in only one place and "share" that with multiple entities? In the same fashion as pointers in a language like C++, we store the data elsewhere and give out just a simple number as an address of sorts.

But with a pointer, which is a memory address, it is easy to jump to the real thing by dereferencing it (the * operator in C++). You can imagine there is no way in hell ECS would let you do that mid-job.

With an SCD index you kinda could do the same by asking EntityManager with that number. EntityManager keeps the real values of shared data in a biiiiiiig List<object> per world. The SCD index it gives out is just an index into this list of lawless objects. If you want the object, just tell EntityManager the index. It's pretty simple, right?

It will not work in a job

EntityManager is a main-thread-only thing. It is not usable from inside a job. I have seen too many programmers assume they could use the SCD value as part of heavy computation in a job, but that is not going to happen. (But you still have the SCD index in the job! Useless? Not entirely.)

It would be very bad for performance anyway, even if it were possible. We have gone this far to prevent jumping addresses...

SCD could store anything

I mentioned that List<object> is the data storage of shared component data. This means you could store anything from a simple int to a GameObject to a Material to your EpicMonster : MonoBehaviour. Reference types are also possible. Anything goes!

Now you are saving space too. Imagine you got a struct of 5 float3. That data is fit to live directly in the chunk, on each Entity. But if you store it as ISharedComponentData, you are turning 5 * float3 * number of entities in the chunk into just 1 integer (the SCD index) sitting on the chunk header.

What is really happening when you get back the SCD value

When using something like GetSharedComponentData<T>(entity), how does it interact with that one big List<object>? Doesn't it have to be multiple lists, one per type T? Or something?

The get is actually pretty simple. The entity knows its own chunk. The chunk has multiple SCD indexes, depending on how many ISharedComponentData types it has. The correct index is picked with T. The SCD index pierces straight into List<object> as an indexer to the list. Then that object is cast to T. (Like (T)scdValue )
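The lookup path described above can be sketched in plain C#. This is only a toy model: ToyChunk, ToyEntityManager and their members are hypothetical stand-ins I made up for illustration, not Unity's real internals.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: these types are hypothetical stand-ins, not Unity's real internals.
public struct Measure { public int measure; }

public class ToyChunk
{
    // One SCD index per shared component type present in this chunk's archetype.
    public Dictionary<Type, int> ScdIndexByType = new Dictionary<Type, int>();
}

public class ToyEntityManager
{
    // Every shared value of every type lives in this one list, boxed as object.
    public List<object> SharedValues = new List<object>();

    public T GetSharedComponentData<T>(ToyChunk chunkOfEntity) where T : struct
    {
        int scdIndex = chunkOfEntity.ScdIndexByType[typeof(T)]; // pick the index by type T
        return (T)SharedValues[scdIndex];                       // index straight in, then unbox
    }
}

public static class LookupDemo
{
    public static void Main()
    {
        var em = new ToyEntityManager();
        em.SharedValues.Add(new Measure { measure = 12 }); // stored at index 0

        var chunk = new ToyChunk();
        chunk.ScdIndexByType[typeof(Measure)] = 0;

        Measure m = em.GetSharedComponentData<Measure>(chunk);
        Console.WriteLine(m.measure); // prints 12
    }
}
```

Notice that the type T is used twice: once to pick the right index on the chunk, and once for the final cast. The list itself does not care about types at all.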

SCD is typed

Sure you could store anything, but you are going to talk in ECS language.

If you want to store one Material and share it with multiple entities, you have to do this :

public struct SharedMaterial : ISharedComponentData
{
    public Material material;
}

If you want to store an int and share it you have to do this :

public struct Measure : ISharedComponentData
{
    //Which music measure since the beginning of song this note belongs to
    public int measure;
}

It's a bit of a hassle, but :

You could store multiple things and call them one unit.
You could store content of the same type but call it differently. (For example, my shared Measure for a music game has the same single int content as a shared Lap in a racing game.)
Component typing is at the heart of ECS. It will streamline everything later on, as you will soon see.

SCD index is per chunk

Instead of giving out this SCD index to each Entity, the design is that the chunk holds it. This allows a lot of entities to be associated with a certain shared value, because an Entity surely knows its own chunk.

Association is the keyword. The shared value is not right here in the chunk. Just the SCD index.

This connects directly to why SCD must be typed : one chunk is associated with one archetype. The SCD is now a part of the archetype! So depending on how many kinds of SCD are in this chunk, you would have that many SCD indexes on the chunk header.

Also, chunk iteration gives you ArchetypeChunkSharedComponentType even though you are never getting the real value of the SCD inside the chunk iteration job. Just the type itself is quite useful, for example for checking type existence on a chunk or even checking if the SCD index is a certain number on a chunk.

Smart self-hashing to determine value uniqueness

This is one of the most awesome things about SCD.

Think about traditional value sharing : you declare an int = 555 in C++. You want to share this int with multiple objects. You wouldn't give 555 to everyone, because that's just a copy of the number and nowhere near the definition of "shared". You would have to take the address of this int and distribute that. When you change the original int, all of them get the update.

In Unity ECS, you could call SetSharedComponentData on an Entity with the value of int = 555. This 555 is then added to List<object>, and an SCD index is generated and put on the chunk as an association. At this point it also creates a special record in a dictionary that remembers hash-to-index.

On another Entity, you call SetSharedComponentData with the value of int = 555 again. It somehow knows that 555 already exists in the big List<object> and gives out the same SCD index without adding any more 555 to the list!! It is really shared, but magically.

Thanks to the earlier dictionary, it knows immediately that this hash exists in the SCD List<object> database. This dictionary makes hash-to-index O(1). And that List<object> makes index-to-value also O(1). No linear search or anything.

In other words, ISharedComponentData makes a value type reference-like automatically by value hashing. As for the hashing algorithm, it drills down into "your thing" recursively through all fields to check if it is equal. If it finds a reference type, it simply stops drilling down and hashes the pointer number of that reference type variable.
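Here is a minimal sketch of the "same value, same index" behavior in plain C#. ToySharedStore is a hypothetical name, and it cheats by keying a Dictionary on the boxed struct (so .NET's built-in ValueType.Equals and GetHashCode do the field comparison) instead of using Unity's own hashing code.

```csharp
using System;
using System.Collections.Generic;

public struct B { public int value; }

// Toy model: deduplicates shared values so equal values get the same index.
public class ToySharedStore
{
    private readonly List<object> values = new List<object>();
    // Boxed structs use ValueType.Equals/GetHashCode, which compare by field content.
    private readonly Dictionary<object, int> valueToIndex = new Dictionary<object, int>();

    public int Insert<T>(T value) where T : struct
    {
        object boxed = value;
        if (valueToIndex.TryGetValue(boxed, out int existing))
            return existing;          // already stored: hand out the same index
        values.Add(boxed);            // new value: append and remember where it went
        int index = values.Count - 1;
        valueToIndex[boxed] = index;
        return index;
    }

    public T Get<T>(int index) where T : struct => (T)values[index];
    public int Count => values.Count;
}

public static class DedupDemo
{
    public static void Main()
    {
        var store = new ToySharedStore();
        int first = store.Insert(new B { value = 555 });
        int second = store.Insert(new B { value = 555 });
        int third = store.Insert(new B { value = 666 });
        Console.WriteLine(first == second); // prints True
        Console.WriteLine(first == third);  // prints False
        Console.WriteLine(store.Count);     // prints 2
    }
}
```

Both directions stay cheap: value-to-index goes through the dictionary, index-to-value goes through the list.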

You are not ever going to "change" the SCD value, just swap.

Still, ECS does not allow "magic" to happen. You cannot ask EntityManager to change that 555 value inside the SCD database to 666 so that everyone holding the SCD index for 555 would instead get 666 on GetSharedComponentData. You can only set an Entity or a whole chunk to a certain value. That value gets reference count +1, and the old value gets reference count -1, moving towards real removal.

Instead of "editing" the SCD value, you might as well call it correctly : "swapping" the SCD value. You can swap an SCD value such that the old one completely disappears by getting an EntityQuery first, then setting a filter to the old value you want to swap, then finally using EntityManager.SetSharedComponentData with the EntityQuery overload to change to the new value. The reference count of the old one should instantly drop to zero, provided that your query covered all of them.

However, if it is a reference type that you stored in the SCD, you can make the magic happen by keeping that reference and changing things in it. An example is an SCD with a Material. You could change the Material's color long after assigning it with SetSharedComponentData, and any Entity asking for it afterwards would get the new color, thanks to it being a reference type. (However, it could not suddenly be linked to another Material instance, for the same reason that 555 could not suddenly change to 666.)
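The reference type loophole can be shown with a small plain C# sketch. ToyMaterial is a hypothetical stand-in for UnityEngine.Material, and a bare List<object> stands in for the SCD storage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for UnityEngine.Material: a reference type with mutable state.
public class ToyMaterial { public string Color = "white"; }

public struct SharedMaterial { public ToyMaterial material; }

public static class ReferenceDemo
{
    public static string ColorSeenAfterMutation()
    {
        var store = new List<object>();                   // stand-in for the per-world SCD list
        var mat = new ToyMaterial();
        store.Add(new SharedMaterial { material = mat }); // the boxed copy holds the same reference

        mat.Color = "red";                                // mutate the object, not the stored struct

        var fetched = (SharedMaterial)store[0];           // what a later "get" would unbox
        return fetched.material.Color;
    }

    public static void Main()
    {
        Console.WriteLine(ColorSeenAfterMutation()); // prints red
    }
}
```

The stored struct itself never changed. It still points to the same ToyMaterial, which is why no new index is needed and nobody is moved.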

SCD data removes itself when no one is holding that SCD index anymore

Like reference counting, it knows to delete itself when the reference count reaches 0, that is, when no chunk is using that SCD index anymore. Removing things from List<object> is done by just nulling the element. So all that ever happens to this List<object> is .Add. Never .Remove. This way there is no need to go and update the existing indexes on all the chunks; they always stay usable.

Then the hash-to-index dictionary removes that hash entry too, so when that hash comes again it knows that the value is new again. (But the new item will just be .Add-ed to the end, ignoring null holes you might have created.)
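A rough sketch of this bookkeeping, following the behavior as described in this section (null holes, append-only). ToyRefCountedStore and its method names are hypothetical, and the real implementation may differ in details.

```csharp
using System;
using System.Collections.Generic;

public struct B { public int value; }

// Toy model of reference counting with null holes, as described in the text.
public class ToyRefCountedStore
{
    private readonly List<object> values = new List<object>();
    private readonly List<int> refCounts = new List<int>();
    private readonly Dictionary<object, int> valueToIndex = new Dictionary<object, int>();

    public int Acquire(object value)
    {
        if (valueToIndex.TryGetValue(value, out int index))
        {
            refCounts[index]++;      // value already known: one more user
            return index;
        }
        values.Add(value);           // always append, never reuse a hole
        refCounts.Add(1);
        index = values.Count - 1;
        valueToIndex[value] = index;
        return index;
    }

    public void Release(int index)
    {
        if (--refCounts[index] > 0) return;
        valueToIndex.Remove(values[index]); // forget the hash entry
        values[index] = null;               // leave a hole so other indexes stay valid
    }

    public int SlotCount => values.Count;
    public bool IsHole(int index) => values[index] == null;
}

public static class RefCountDemo
{
    public static void Main()
    {
        var store = new ToyRefCountedStore();
        int a = store.Acquire(new B { value = 555 });
        store.Acquire(new B { value = 555 });             // same index, count is now 2
        store.Release(a);
        store.Release(a);                                 // count hits 0, slot becomes a hole
        int again = store.Acquire(new B { value = 555 }); // treated as brand new

        Console.WriteLine(store.IsHole(a)); // prints True
        Console.WriteLine(again);           // prints 1
        Console.WriteLine(store.SlotCount); // prints 2
    }
}
```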

The List<object> data structure

By this point you may already be able to picture what the List<object> looks like. It does not even keep values of the same type close to each other. It just keeps adding to the end in chronological order.

And remember that each object is a boxed value, a pointer to somewhere. Adjacent SCD objects do not even really sit next to each other in memory, even if they contain just a simple int. So, SCD values are a cache miss galore if you dare to touch them. (So please work on just the SCD type as much as possible)

Exercise

We have these :

public struct A : IComponentData { }
public struct B : ISharedComponentData { public int value; }

Create 5 entities with archetype : A. There is now 1 chunk of archetype A with 5 entities. The entities are named v w x y z.
Add B SCD to entities v and w with value new B { value = 555 }. Just one 555 is added to the List<object>, and the same SCD index is returned for both adds. We now have 2 chunks : the first chunk of archetype A is reduced to 3 entities x y z, and the new chunk with archetype A B has 2 entities v w.
Use SetSharedComponentData on entity w with new B { value = 666 } instead. 666 is added to List<object>, and a new SCD index is returned. Now w can no longer be in the same chunk as v, as there is only one SCD index per chunk per type (type B). We now have 3 chunks : the chunk with v and the chunk with w have the same archetype A B, but because of the differing SCD index stored they must be split.
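The three steps can be replayed with a tiny plain C# simulation. It assumes one chunk per distinct (archetype, SCD index) pair and ignores chunk capacity; the concrete index numbers 1 and 2 are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ExerciseDemo
{
    // An entity's placement: its archetype and the SCD index its chunk carries (-1 = no SCD).
    // Each distinct placement stands in for one chunk; chunk capacity is ignored here.
    public static int CountChunks(Dictionary<string, (string archetype, int scdIndex)> placement)
    {
        var distinct = new HashSet<(string, int)>();
        foreach (var p in placement.Values) distinct.Add(p);
        return distinct.Count;
    }

    public static int[] RunExercise()
    {
        var placement = new Dictionary<string, (string archetype, int scdIndex)>();
        foreach (var name in new[] { "v", "w", "x", "y", "z" })
            placement[name] = ("A", -1);
        int afterCreate = CountChunks(placement); // step 1

        placement["v"] = ("A B", 1);              // 555 lives at a hypothetical index 1
        placement["w"] = ("A B", 1);
        int afterAdd = CountChunks(placement);    // step 2

        placement["w"] = ("A B", 2);              // 666 gets a new hypothetical index 2
        int afterSet = CountChunks(placement);    // step 3

        return new[] { afterCreate, afterAdd, afterSet };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunExercise())); // prints 1, 2, 3
    }
}
```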

Chunk splitting/segmenting via different SCD value

Of course, adding a new SCD type changes the archetype and splits up the chunk. This is typical because it works like adding an IComponentData type. But with SCD, from the "one index per chunk" rule, changing an SCD value also results in chunk splitting, since the entity needs a chunk with the new SCD index!

(If it allowed multiple SCD indexes per chunk, we would additionally have to think of a way to say which Entity in the chunk has which index, a mess in design IMO, so it is great as-is.)

This "chunk splitting technique", where chunks split even though the SCD is of the same type, is used extensively in the hybrid renderer package. When rendering, you usually want to work on the same set of unique Mesh and Material in one go, then work on the next. Instead of trying a fancy algorithm to sort and iterate, how about we just have an SCD like this for automatic categorizing?

public struct C : ISharedComponentData { 
    public Material mat; 
    public Mesh mesh;
}

A chunk has a fixed size and you may think this is a waste of space. (It is.) But having multiple chunks is not all bad, as something like IJobChunk or IJobForEach could parallelize the work so that each chunk is processed on a different worker thread!

At the same time, it means any change in SCD value will cause a structural change. This is not the case with normal IComponentData, where changing a value does nothing to the structure. No need to move anywhere.

In fact, this seems to be the real point of SCD, rather than sharing data. Quoted from this thread :

Shared component data is really for segmenting your entities into forced chunk grouping. The name is unfortunate I think. Because really if you use it as data sharing mechanism, you will mostly shoot yourself in the foot because often you just end up with too small chunks.

Some ways to not move data with SCD

EntityManager.AddSharedComponentData<T>(EntityQuery entityQuery,...
EntityCommandBuffer.AddSharedComponentData<T>(EntityQuery entityQuery,...

With the EntityQuery overload, you are doing it equally to everyone in the chunk rather than per Entity. And by doing so it just modifies the chunk header, allowing all entities to remain where they were. RemoveComponent also works for SCD and comes with an EntityQuery overload too. I have mentioned this before in the section on swapping (changing) SCD values.

SCD version number

Because you cannot change the SCD data, as learned in the prior sections, you might think there is no concept of a version number on SCD types. (What is a version number? http://gametorrahod.com/designing-an-efficient-system-with-version-numbers/)

- EntityManager -

public int GetSharedComponentOrderVersion<T>(T sharedComponent) where T : struct, ISharedComponentData

- ArchetypeChunk -

public bool DidChange<T>(ArchetypeChunkSharedComponentType<T> chunkSharedComponentData, uint version) where T : struct, ISharedComponentData

Turns out there is a version number per chunk per SCD type, just like for normal types anyway! But because an SCD cannot change its own data, it instead tracks all structural changes happening on the chunk. It is not entirely unrelated to switching SCD values though. As you know, switching a value causes a structural change, and so this version number can track that. However, adding an unrelated IComponentData to a chunk with an SCD type also increases all the SCD version numbers on that chunk.

Ways to get an actual SCD value

EntityManager.GetSharedComponentData<SCDTYPE>(entity)

The straightforward get. The Entity asks its own chunk for the SCD index, then asks EntityManager for the actual value.

EntityManager.GetAllUniqueSharedComponentData<T>(List<T> sharedComponentValues, List<int> sharedComponentIndices)

This method is rather rough, as it returns every stored SCD in the List<object>, though at least constrained to one SCD type. You prepare an allocated List and it fills in the data for you. (fill = append; you need to Clear on your own if you want to.)

The List<int> is for when you also want the corresponding SCD indexes. Usually you don't want just all the unique values but also want to know the index representing each value, which you can bring into a job for advanced filtering. The Count of both lists will be the same, provided you Clear both beforehand.

Here is a gotcha for you. This test passes.

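using System.Collections.Generic;
using NUnit.Framework;
using Unity.Entities;

// The struct and the test below both live inside your test fixture class.
private struct SampleScd : ISharedComponentData
{
    public int value;
}

[Test]
public void UniqueScd()
{
    var w = new World("Test World");
    var em = w.EntityManager;
    List<SampleScd> unique = new List<SampleScd>(8);
    // No entity has been created yet, so you might expect an empty list here.
    em.GetAllUniqueSharedComponentData<SampleScd>(unique);
    Assert.That(unique.Count, Is.EqualTo(1));
    w.Dispose(); // clean up the world created just for this test
}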

There is not a single entity yet; I just requested all unique SCDs from the start, so why is there one thing already? The default value of that type will always be in the list for some reason, even if no entity at all currently has that SCD type.

EntityManager.GetSharedComponentData<T>(int sharedComponentIndex) where T : struct, ISharedComponentData

This is a rather hardcore way, because it asks you for the SCD index instead. You can find it out with something like entityManager.GetAllUniqueSharedComponentData<T>(List<T> sharedComponentValues, List<int> sharedComponentIndices) or just ask the chunk while in chunk iteration : archetypeChunk.GetSharedComponentIndex<T>(ArchetypeChunkSharedComponentType<T> chunkSharedComponentData)

The index that goes into this method is used directly as an indexer into the List<object>, then the result is simply cast to T.

Entities.ForEach(SCDTYPE scd, ...)

This is a very hip way of getting SCD values. Just put the ISharedComponentData type as the first thing in your lambda signature (it only supports 1 SCD, and it must come first). ForEach only works on the main thread, so it makes sense that it can ask EntityManager to prepare a lot of SCD values for you. Remember that as you go through ForEach, the SCD value will likely stay the same for a long time and only change when the iteration crosses over to a new chunk. SCD is a per-chunk thing.

chunk.GetSharedComponentData(ArchetypeChunkSharedComponentType, EntityManager)

You can also do this while doing chunk iteration. Ha! Are you thinking you could get the shared value in a job now? It requires EntityManager as the final argument to prevent you from doing that, as you can't ever bring the whole EM, a thing full of reference types, into a job.

And it is quite logical and barebones here. The chunk has only a bunch of stupid SCD indexes. The ACSCT (ArchetypeChunkSharedComponentType) knows how to pick a specific SCD index from that bunch. EntityManager uses that and returns things from List<object>. All 3 basic things that make SCD work are right here in the same line.

In a job

There are many who have complained about why we can't get an SCD value from a job. After reading this article, I hope you realize why that would be a big violation of everything else in the design. Also, do you really want to access something like a managed List<object>, which could "warp" to anywhere, in a job?

Filtering

The most popular use of ISharedComponentData is filtering. Filtering does not need to access the actual value held by EntityManager, so it is quite performant and useful even inside a job.

Getting only a subset of `EntityQuery` and its archetypes

Each system contains EntityQuery objects. You only work on entities that match the query. That's the first filter you get, so you don't iterate through everything in the universe. The EntityQuery can return multiple chunks even though you may have only one pure IComponentData type in it, since a chunk has limited storage; when it is full, entities spill into a new chunk.

However, you may want to remove some more chunks from the query. Suppose that one of the types in your EntityQuery is an ISharedComponentData. Having an SCD type in your query is a signal that you may be getting multiple chunks even if those chunks are far from full, because they have different SCD indexes on them.

You are now able to filter for one more level : discarding chunks without the SCD index you want.

Shared component data filter : `eq.SetSharedComponentFilter<SCDTYPE>(scdValue)`

A filter is something you can add to or remove from your EntityQuery. Currently there are only 2 kinds of filter : the SCD filter and the changed filter. Only one kind can be active on your EntityQuery at a time. Each filter can look out for a limited number of component types, currently 2. (For changed filter you could read this)

An added filter takes effect when you do something involving the EntityQuery to get data. Removing the filter ( eq.ResetFilter ) restores the query to returning all chunks matching the query's archetypes.

You don't even have to tell the filter the SCD index. Just say the value, and it will work out which SCD index you want (thanks to smart hashing) and then filter out the rest of the chunks not matching that. There is an overload with 2 SCD types too, just in case you have 2 ISharedComponentData in the archetype.

After setting the filter,

To___ methods will start returning less data automatically.
CalculateLength on the EntityQuery returns a correctly reduced count.
eq.CreateArchetypeChunkArray returns fewer ArchetypeChunk in the NativeArray<ArchetypeChunk> automatically.
Entities.With(eqWithScdFilter).ForEach(SCDTYPE scd, ...) also iterates through less data automatically.

SCD filtering in `IJobForEach`

You can pass an EntityQuery to IJobForEach's Schedule instead of relying on the ref types in the IJobForEach + [RequireComponentTag] + [ExcludeComponent]. (It will ignore the attributes and the ref types' contribution.) It is also possible to include an SCD type in the query to get only entities with that SCD type.

Then, if your EntityQuery has the filter attached, you have successfully filtered by SCD in a job, easily! (Not getting the value, however. Just filtering based on the value.)

SCD filtering in main thread chunk iteration / `IJob` / `IJobChunk`

If you want to go hardcore and use neither the EntityQuery filter feature nor the IJobForEach I mentioned earlier, you can instead compare the SCD index on the chunk with the SCD index of the SCD value you want by yourself, while traveling through chunks. In my opinion this is the best way to get the "aha" moment about how filtering works, since you are doing it all by hand!

Make an EntityQuery containing your ISharedComponentData type and other data.
Call eq.CreateArchetypeChunkArray. You get all relevant chunks with the SCD type, but still including wrong chunks with an ISharedComponentData index that you don't want. You get an archetype chunk array (ACA), a NativeArray<ArchetypeChunk>.
Get the archetype chunk type outside the job, to work with the ACA in the job later. Use GetArchetypeChunkSharedComponentType<T>() to get the ACSCT.
When iterating through each ArchetypeChunk (potentially in a job), it only has the SCD index on it. You want to know if that index is correct or wrong, so before beginning the iteration you first have to find out the SCD index for the SCD value you want to compare with. Unfortunately there is no "SCD value to index" method. You have to allocate 2 List, use entityManager.GetAllUniqueSharedComponentData<T>(List<T> sharedComponentValues, List<int> sharedComponentIndices), then search for your desired SCD value and get its index from the other list at the same position. This can be done outside a job, prior to your real work.
You should now have these 3 things : the ArchetypeChunkSharedComponentType, the int SCD index representing the SCD value you found earlier, and finally the chunks : NativeArray<ArchetypeChunk>. (For IJobChunk you would already be iterating through each ArchetypeChunk.) This holds no matter whether you are on the main thread, in an IJob, or in an IJobChunk.
Remember that all chunks in NativeArray<ArchetypeChunk> have the ISharedComponentData type you want, but not necessarily the correct SCD index on them, so we are looking for that (this is essentially "filtering"). For each chunk (ArchetypeChunk), chunk.GetSharedComponentIndex(ACSCT) gives you an int. Compare the returned int with the int you prepared before. If they are the same, you just found a chunk with not only the correct ISharedComponentData type but also the correct ISharedComponentData value (even though you cannot see what value it is inside a job, because you lack EntityManager). You can then decide to skip (filter out) or work on (filter in) the chunk.
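The core of the steps above, boiled down to plain C# with no Unity types. The two lists mimic what GetAllUniqueSharedComponentData fills in, and an int array stands in for the SCD index read from each chunk header. ManualFilterDemo and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ManualFilterDemo
{
    // The two lists mirror what GetAllUniqueSharedComponentData fills in:
    // a value and its SCD index sit at the same position.
    public static int FindScdIndex(List<int> uniqueValues, List<int> uniqueIndices, int wantedValue)
    {
        int position = uniqueValues.IndexOf(wantedValue);
        return position < 0 ? -1 : uniqueIndices[position];
    }

    // One int comparison per chunk; no entity is touched.
    // Returns positions of the chunks to work on. invert = true keeps every chunk except the matches.
    public static List<int> FilterChunks(int[] scdIndexPerChunk, int wantedIndex, bool invert = false)
    {
        var kept = new List<int>();
        for (int i = 0; i < scdIndexPerChunk.Length; i++)
        {
            bool matches = scdIndexPerChunk[i] == wantedIndex;
            if (matches != invert) kept.Add(i);
        }
        return kept;
    }

    public static void Main()
    {
        var uniqueValues = new List<int> { 0, 555, 666 }; // position 0 plays the default value
        var uniqueIndices = new List<int> { 0, 1, 2 };
        int wanted = FindScdIndex(uniqueValues, uniqueIndices, 666);

        int[] scdIndexPerChunk = { 1, 2, 1, 2 };
        Console.WriteLine(string.Join(", ", FilterChunks(scdIndexPerChunk, wanted)));       // prints 1, 3
        Console.WriteLine(string.Join(", ", FilterChunks(scdIndexPerChunk, wanted, true))); // prints 0, 2
    }
}
```

The invert flag is the kind of freedom you gain by doing the comparison yourself.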

You may have already noticed some unique advantages of going hardcore : eq.SetSharedComponentFilter works as "filter in", that is, you want only the chunks with this SCD value. Manual SCD filtering gives you the flexibility to invert the filter, or look for multiple indexes at once. (The EQ filter has overloads for up to 2 at once, max)

SCD filtered batched EntityManager operation

This is my favorite move with the SCD filter.

Usually, things like AddComponentData, RemoveComponent, or DestroyEntity on EntityManager are costly because of structural change.

However, if you are doing the same thing to everyone in the chunk, there is no need to move any data, as we can change the chunk's header instead. This "whole chunk operation", instead of one on each Entity, is achieved by using the EntityManager overloads that accept an EntityQuery. (Read more about that here)

Speaking of EntityQuery, if it is currently filtered with an SCD filter, then we can do a "filtered whole chunk operation"! Since differing SCD values separate chunks, it all makes sense that the batched EntityManager operation can know which chunks to discard or work on.

For example, how about removing all SCDs of a particular type with entityManager.RemoveComponent(eq), but not from all of them, only from entities with a certain SCD value? Just put an SCD filter on your query and throw your query in there for a surgical batch remove.

Recap

At this point you should realize that "filtering" builds on the property "chunk segmentation via SCD value", and in the end is just a simple SCD index comparison per chunk. (No entity accessed at all!)

It is nothing like iterating through all entities and taking some out like LINQ, but rather iterating through all matched chunks and removing some chunks from them.

With only Chilli : IComponentData, suppose that its data is so big that one chunk can contain only 100 entities with Chilli. If I have 600 entities, I get 6 chunks from an EntityQuery with only Chilli.

Suppose that I got an additional SCD named ChilliColor : ISharedComponentData with one enum that says the Chilli is Red, Yellow, or Green. (Let's assume that an entity with Chilli and ChilliColor still makes the chunk capacity = 100.) And it turns out we have 50 Yellow, 150 Green, and 400 Red. Can you work out how many chunks we will get from an EntityQuery with Chilli and ChilliColor ?

The answer is :

Chunk 1 : 50/100 Yellow
Chunk 2 : 100/100 Green
Chunk 3 : 50/100 Green
Chunk 4~7 : 100/100 Red

The number of chunks returned increased with the addition of the SCD, and each one conveniently contains only one color of chilli. In the real world, you could imagine them grouped satisfyingly by color in piles of <= 100 where some piles are not quite full, but for the sake of neat colors we are OK with that. Whereas if your ChilliColor were just a normal IComponentData, it would be 6 full piles of colorful mess.
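The chunk counts above are just a ceiling division per color group. A quick sketch in plain C#, assuming the chunk capacity of 100 from the example (ChilliChunks and its methods are hypothetical names) :

```csharp
using System;
using System.Collections.Generic;

public static class ChilliChunks
{
    // Chunks needed for one group of entities: ceiling division by the chunk capacity.
    public static int ChunksFor(int entityCount, int capacity)
        => (entityCount + capacity - 1) / capacity;

    // Each group (one per distinct SCD value) fills its own chunks.
    public static int TotalChunks(IEnumerable<int> groupSizes, int capacity)
    {
        int total = 0;
        foreach (int size in groupSizes) total += ChunksFor(size, capacity);
        return total;
    }

    public static void Main()
    {
        const int capacity = 100; // assumed capacity from the example
        Console.WriteLine(TotalChunks(new[] { 600 }, capacity));          // prints 6
        Console.WriteLine(TotalChunks(new[] { 50, 150, 400 }, capacity)); // prints 7
    }
}
```

Without the SCD there is a single group of 600. With the SCD, Yellow, Green and Red each round up on their own, which is where the extra half-empty chunks come from.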

What would happen if I add an SCD value filter with Red to the EntityQuery? Because the filter makes only that value remain, we now get only chunks no. 4~7 from the query. How it achieves this is efficient, because it literally has to for loop just 7 times through chunk headers and throw away the chunks without that SCD value, without actually looking at any entity. The check is also fast thanks to how we keep just an integer SCD index on the chunk header. This is the real identity of "filtering". It is essentially just an equality test of SCD index.

If I were to add Rotten : ISharedComponentData with a bool set to true or false randomly on all chillies, you could imagine we would get around 8~9 chunks (each color splits into a rotten and a not-rotten group : 2 chunks for Yellow, 2 for Green, and 4 or 5 for Red), and then we could put a not-rotten SCD filter on the EntityQuery to get only the remaining good chillies regardless of color.

BlobBuilder & BlobAssetReference

This is not SCD, so what is it doing here in this article? I said before that many are looking to share something in a manner where each Entity can access that resource. It's blob data that you are looking for. But let's look at the alternatives first.

You could keep copying a NativeArray<T> into a job for everyone, but who does that data belong to? Is it the system that keeps the array and kicks off the job? Not only is this a not-so-ECS way because the system now holds data, you also cannot associate it with an Entity, since if you create an IComponentData with a NativeArray<T> as a public field, it will complain that you have a native container in there.

What about a DynamicBuffer as a new component on an Entity? That isn't shared, so it is out of the question.

Blob data to the rescue. This article is not about it, but as a short introduction : with BlobBuilder, you can allocate memory in any shape imaginable. You should take a look at BlobificationTests.cs in the package.

struct MyData
{
    public BlobArray<float> floatArray;
    public BlobPtr<float> nullPtr;
    public BlobPtr<Vector3> oneVector3;
    public float embeddedFloat;
    public BlobArray<BlobArray<int>> nestedArray;
}

Then you can get a BlobAssetReference<MyData> from the builder. This BlobAssetReference is valid as a public field on IComponentData, yet when you distribute it to many entities it is by reference, not actually copying the allocated values. (Much like putting a NativeArray in a job looks like a copy since it is a struct, but you are copying just a wrapped pointer inside it.) There you go, now this is real shared data.

Data sharing is not so ECS