A tag component, or "zero-sized component", is a special case in Unity ECS : an IComponentData that doesn't contain any fields.

public struct TagComponent : IComponentData { }

Advantages of tagging

Tagging is used to intentionally separate entities into more chunks, because each chunk belongs to exactly one archetype (one set of component types). Calling it IComponentData may sound wrong when there is no "data", which is why we call it a "tag" instead.

Many queries in the ECS API hand you results in units of chunks. By tagging, you make it easier for the API to bring you "filtered" data without any per-element filtering (an O(n) iteration over the entity count, for example), since it just picks chunks for you without even looking at the data inside.

For example, say you have a Name component and an Occupation component attached to every entity (to represent a human). You also have an ISharedComponentData called Sex that holds an enum with the possible values Male, Female, and Undefined. A shared component separates chunks based on its hashed value, so right now the possible chunk archetypes you could have are :

  • Name Occupation Sex = Male
  • Name Occupation Sex = Female
  • Name Occupation Sex = Undefined

Sanity check : you do not necessarily have exactly 3 chunks. If an archetype is packed with many entities, it can span several chunks.

You could store "NEET" in the Occupation component data, but you want it to be special because you are going to do something to all the NEETs a lot. You are thinking about separating the NEET entities into their own chunks, so you create a NEET tag component. (You have a self-defined rule that Occupation is meaningless on an entity that has the NEET tag.)

The possible chunk archetypes you could have are now :

  • Name Occupation Sex = Male
  • Name Occupation Sex = Female
  • Name Occupation Sex = Undefined
  • Name Occupation NEET Sex = Male
  • Name Occupation NEET Sex = Female
  • Name Occupation NEET Sex = Undefined

No matter how many chunks each archetype has, it is now fast to get the chunks with NEET. Some examples :

  • An EntityQuery with NEET could be used to get all the chunks with NEET regardless of Sex or other components. From it you can go on to the "To" APIs (such as ToEntityArray and ToComponentDataArray), the chunk iteration API, or the ForEach API.
  • IJobForEach<Name> with the attribute [RequireComponentTag(typeof(NEET))], scheduled without an EntityQuery argument (if you pass one, all your attributes are ignored). Without the attribute the job received all 6 kinds of chunks, but now it receives only those with NEET. This "filtering" is very fast because it sifts through chunks, not individual entities.
  • Many EntityManager commands have an EntityQuery batched overload that does its work on whole chunks (on all entities in a chunk, not on a "chunk component") rather than per entity, and avoids data movement between chunks even for costly-sounding actions such as adding or removing a component.
    For example, you can easily change Sex to Undefined for all NEETs. If you instead stored the "NEET" state in Occupation, you would have to iterate through all entities with Occupation and run an if check on each, which is not only O(n) but also causes data movement at every step.
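
To make the "picking chunks without looking inside" idea concrete, here is a minimal sketch in plain C#. It is a toy model, not the Unity API : ToyChunk, ToyQuery, and MatchingChunks are hypothetical names made up for illustration, and a chunk is reduced to a set of component type names plus an entity count.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model (NOT the Unity API): a chunk is described by the set of component
// type names in its archetype plus the number of entities stored in it.
public sealed class ToyChunk
{
    public HashSet<string> Archetype { get; }
    public int EntityCount { get; }

    public ToyChunk(int entityCount, params string[] componentTypes)
    {
        Archetype = new HashSet<string>(componentTypes);
        EntityCount = entityCount;
    }
}

public static class ToyQuery
{
    // Selects chunks whose archetype contains every required type.
    // The work scales with the number of chunks; no entity is inspected.
    public static List<ToyChunk> MatchingChunks(IEnumerable<ToyChunk> chunks, params string[] required)
    {
        return chunks
            .Where(chunk => required.All(type => chunk.Archetype.Contains(type)))
            .ToList();
    }

    public static void Main()
    {
        var chunks = new List<ToyChunk>
        {
            new ToyChunk(900, "Name", "Occupation", "Sex"),
            new ToyChunk(40, "Name", "Occupation", "Sex", "NEET"),
            new ToyChunk(25, "Name", "Occupation", "Sex", "NEET"),
        };

        List<ToyChunk> neetChunks = MatchingChunks(chunks, "NEET");
        Console.WriteLine(neetChunks.Count);                    // chunks carrying the tag
        Console.WriteLine(neetChunks.Sum(c => c.EntityCount));  // entities reachable through them
    }
}
```

The 900 untagged entities never take part in the query; only three archetype checks are made, however many entities the chunks hold.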

Special treatments in the source code

Tag components are not just a ComponentType with no data.

  • They never become a read/write dependency.
  • Adding or removing a tag component on an entity is cheaper than for a non-tag component, since the actual data write can be skipped, saving a memory write (even though you explicitly new it, as in em.AddComponent(entity, new NEET());). I am not talking about chunk movement yet; both tag and non-tag components cause data movement.
  • When adding a tag component via EntityCommandBuffer, you also save some of the internal buffer memory used to hold your commands, because it doesn't have to remember any field values.
  • Adding or removing a tag component on all entities in a chunk via a batched EntityManager command is a lot cheaper. If you add a normal component, a new chunk with an entirely new data layout has to be prepared. What preparation? Remember that chunks are arranged as SoA (structure of arrays). If your component has even one small field, the chunk has to reserve a contiguous region of that size times the chunk's capacity (how many entities fit, given the archetype). A tag component needs no such region!

The anti-tagging camp

A lot of gains are available after tagging, but most ECS practitioners worry about the act of tagging itself. Tagging gives us its advantages by separating entities into more chunks, and that separation is a memory copy.

What if we tag often (like every frame) and it causes too much data movement between chunks? You start to fear that the query speedup available after tagging would be outweighed by this chunk data movement.

In this case, wouldn't it be better to just have a NEET boolean field in Occupation, iterate through them all, and check?

Facts about chunk movement

Before you fear the cost of tagging, make sure you know what this "chunk movement" actually is when we attach a component to an entity.

  1. The tagged entity cannot stay in its chunk anymore, since its archetype changed.
  2. If a destination chunk (with the new tag component, or without the removed one) doesn't exist, it is allocated. This allocation doesn't necessarily have to be a malloc; it could reuse a "chunk hole" that was freed earlier, saving you a bit of time.
  3. Once the destination is confirmed, the entity data is moved to that destination chunk.
  4. The "entity hole" left in the old chunk is filled by the final entity of that old chunk. The final slot is then dropped by reducing the chunk's entity count. We do not have to shift all the following entities back by one step to restore tightly packed memory.
  5. This has to happen on the main thread, or on a worker thread with help from ExclusiveEntityTransaction, but then you can't touch the same world's EntityManager on the main thread in the meantime.
  6. None of this occurs if you use the EntityQuery overload of EntityManager, which operates on everything in a chunk equally.
  7. Writes can use the same cache machinery as reads. A write can go to the (faster) cache memory first and then, by some policy, reach slower and more permanent memory later, for example when the cache entry is replaced. It depends on your CPU. So tagging many entities in a row may save you some performance, depending on where they are written.
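
Step 4 is the part people tend to overestimate, so here is a small sketch of the "fill the hole with the last entity" idea in plain C#. This is only a toy model of a single component array inside a chunk, not Unity's actual implementation; ToyChunkColumn and RemoveAtSwapBack are hypothetical names.

```csharp
using System;

// Toy model of one component array inside a chunk (NOT Unity's implementation).
public sealed class ToyChunkColumn
{
    private readonly int[] values;

    // Number of live entries; slots at index >= Count are considered empty.
    public int Count { get; private set; }

    public ToyChunkColumn(int capacity)
    {
        values = new int[capacity];
    }

    public void Add(int value)
    {
        values[Count] = value;
        Count++;
    }

    public int Get(int index)
    {
        return values[index];
    }

    // Removes the entry at index by overwriting it with the last entry and
    // shrinking Count. One copy is made regardless of how many entries follow.
    public void RemoveAtSwapBack(int index)
    {
        if (index < 0 || index >= Count)
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }
        values[index] = values[Count - 1];
        Count--;
    }

    public static void Main()
    {
        var column = new ToyChunkColumn(8);
        column.Add(10);
        column.Add(20);
        column.Add(30);
        column.Add(40);

        column.RemoveAtSwapBack(1); // the entity holding 20 moved to another chunk

        for (int i = 0; i < column.Count; i++)
        {
            Console.WriteLine(column.Get(i));
        }
    }
}
```

The trade-off is that entity order inside a chunk is not stable, which is fine as long as nothing relies on it.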

So maybe chunk movements are not as costly as you think? Still, nothing is certain; it depends on your situation. You might be guessing that a SIMD-friendly, multithreaded if on Occupation could beat main-thread chunk movement. But also remember that an if means branching assembly, and branches are costly in their own way.

The only way to find out is to profile it. And here I will provide my own benchmark.

Benchmarking tag components

What we are finding out

Adding a tag sounds costly because of chunk movement. Iterating with an if also sounds costly because we have to iterate through more entities and generate tons of branches. When is it better to avoid using a tag component?

Note that if you can apply the tag via EntityManager's EntityQuery chunk overload, the tag component approach will very likely always win. So in this benchmark I will ignore that possibility.

Setup

public struct Fish : IComponentData 
{ 
    public int hunger;
    public int hp;
}

public struct Disease : ISharedComponentData
{
    public InfectiousDisease infectiousDisease;
    public NonInfectiousDisease nonInfectiousDisease;
}

public struct Hungry : IComponentData { }

I designed a fishy test : each entity contains a Fish and a Disease component. When hunger is equal to or over 100, the fish is considered "hungry" and its hp decreases continuously.

We could check if hunger >= 100 and reduce hp, or spend time adding the Hungry tag to fishes that have hunger >= 100 and then reduce hp based on chunks with the Hungry tag component. This way we can switch approaches and compare performance.

The Disease is just there to intentionally separate chunks easily in some tests. Each member is an enum with the following values :

public enum InfectiousDisease
{
    Normal,
    Parasitic,
    Bacterial,
    Viral,
    Fungal
}

public enum NonInfectiousDisease
{
    Normal,
    Environmental,
    Nutritional,
    Genetic
}

Because the fish archetype contains the Disease component by default, all fishes start at Normal and Normal, so the shared component does not cause any chunk separation yet.

Measurement

The tests run on a real Android device, a Xiaomi Mi A2, using C#'s Stopwatch to print a message to adb logcat; the data is collected via the pidcat terminal program.

All tests are repeated 50 times and the times averaged. The first run is ignored, as it often contains "warm-up" cost (so 49 runs are averaged). Outliers are also filtered out; they are determined manually by eye (when they are about 2x~3x their neighbours).

Variables for all tests

World w;
EntityManager em;
FishTankSystem fts;
EntityCommandBuffer ecb;
System.Diagnostics.Stopwatch sw;

[SetUp]
public void SetUp()
{
    w = new World("TestWorld");
    fts = w.GetOrCreateSystem<FishTankSystem>();
    em = w.EntityManager;
    ecb = new EntityCommandBuffer(Allocator.Persistent);
    ecb.MinimumChunkSize = 3200000; //So that ECB doesn't have to expand
    sw = new System.Diagnostics.Stopwatch();
}

[TearDown]
public void TearDown()
{
    ecb.Dispose();
    w.Dispose();
}

Packages

  • Entities : 0.0.12-preview.30
  • Burst : 1.0.0-preview.13

Tagging cost

We will benchmark just the cost of tagging, without yet doing any work on the tagged entities. The tagging job iterates through entities one by one in parallel, using EntityCommandBuffer.Concurrent to remember which ones should be tagged, then the commands are played back on the main thread to finally tag them sequentially.

Note that this is definitely slower than just iterating and tagging on the main thread, but the ECB "handicap" is there to accommodate games that have no choice but to do it from a job.

All fishes start with hunger of either 0 or 150, depending on the test's parameters.

[ExcludeComponent(typeof(Hungry))]
public struct HungryTaggingJob : IJobForEachWithEntity<Fish>
{
    public EntityCommandBuffer.Concurrent taggingBuffer;
    public void Execute(Entity entity, int index, ref Fish c0)
    {
        if (c0.hunger >= 100)
        {
            taggingBuffer.AddComponent(index, entity, new Hungry());
        }
    }
}
[Test]
[Repeat(50)]
public void TaggingCost([Values(100, 1000, 10000, 100000)] int amount, [Values(0, 50, 100, 500, 1000, 5000, 10000, 50000, 100000)] int hungryFishAmount)
{
    if(hungryFishAmount > amount) return;

    fts.SetUpFish(amount, hungryFishAmount);
    var job = new HungryTaggingJob
    {
        taggingBuffer = ecb.ToConcurrent(),
    }.Schedule(fts);

    sw.Start();
    job.Complete();
    ecb.Playback(em);
    sw.Stop();

    Assert.That(em.CreateEntityQuery(ComponentType.ReadOnly<Fish>()).CalculateLength(), Is.EqualTo(amount));
    Assert.That(em.CreateEntityQuery(ComponentType.ReadOnly<Hungry>()).CalculateLength(), Is.EqualTo(hungryFishAmount));
    Assert.That(em.CreateEntityQuery(ComponentType.Exclude<Hungry>()).CalculateLength(), Is.EqualTo(amount - hungryFishAmount));

    Debug.Log($"{nameof(TaggingCost)},{amount},{hungryFishAmount},{sw.ElapsedTicks}");
}

Entity count  Tagging count  Ticks    Average ticks per entity
100           0              1183     11.83
100           50             2871     28.71
100           100            3918     39.18
1000          0              1338     1.34
1000          50             3049     3.05
1000          100            4160     4.16
1000          500            13258    13.26
1000          1000           23885    23.89
10000         0              2918     0.29
10000         50             4608     0.46
10000         100            5734     0.57
10000         500            14892    1.49
10000         1000           25379    2.54
10000         5000           114304   11.43
10000         10000          225922   22.59
100000        0              12779    0.13
100000        50             14761    0.15
100000        100            15807    0.16
100000        500            24953    0.25
100000        1000           35489    0.35
100000        5000           121874   1.22
100000        10000          233941   2.34
100000        50000          1119026  11.19
100000        100000         2212125  22.12

The rows where "tagging count" is 0 show just the iteration cost with every if failing. The rows where "entity count" = "tagging count" show the cost when every entity has to be tagged.

Interestingly, when the count is large enough (~1000 ?), the average cost to tag each entity settles at around 23 ticks. (Accordingly, when only half of the entities have to be tagged, the average per entity halves to around 11 ticks.)

By the way, 10000 ticks = 1 ms. You get 16.66 ms in total per frame to hit 60 FPS. Typical mobile games already spend around 6 ms on layout and rendering of UI and meshes.
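
If you want to convert the tick counts in these tables yourself, a small helper like the following could be used. It is only a sketch : TickMath and TicksToMilliseconds are hypothetical names, and the "10000 ticks = 1 ms" rule assumes a Stopwatch.Frequency of 10,000,000 ticks per second. The frequency can differ between platforms, so it is worth printing it on your own device.

```csharp
using System;
using System.Diagnostics;

public static class TickMath
{
    // Converts Stopwatch ticks to milliseconds for a given timer frequency
    // (ticks per second).
    public static double TicksToMilliseconds(long ticks, long ticksPerSecond)
    {
        return ticks * 1000.0 / ticksPerSecond;
    }

    public static void Main()
    {
        // Assumption: 10,000,000 ticks per second, which is what makes
        // "10000 ticks = 1 ms" true. Check the real value on your device.
        const long assumedFrequency = 10000000;

        Console.WriteLine(Stopwatch.Frequency);

        // Tagging all 10000 of 10000 entities took 225922 ticks in the table above.
        Console.WriteLine(TicksToMilliseconds(225922, assumedFrequency));
    }
}
```

Under that assumption, tagging all 10000 entities through the ECB costs about 22.6 ms, which is already more than a whole 60 FPS frame.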

Lesson learned : just iterating without tagging anything costs ticks too (because of the if). Iterating through 1000 entities for about 1300 ticks doesn't look like a lot, but imagine you have many systems doing empty iterations that accomplish nothing; it may add up to 1 ms quickly.

Iterating the tagged

Time to make the fish hurt. We continue from the previous test by iterating once over the tagged entities to reduce hp by 1 for fishes that have the Hungry tag.

[BurstCompile]
[RequireComponentTag(typeof(Hungry))]
public struct TagDyingJob : IJobForEach<Fish>
{
    public void Execute(ref Fish c0)
    {
        c0.hp -= 1;
    }
}
[Test]
[Repeat(50)]
public void IteratingTheTagged([Values(100, 1000, 10000, 100000)] int amount, [Values(0, 50, 100, 500, 1000, 5000, 10000, 50000, 100000)] int hungryFishAmount)
{
    if(hungryFishAmount > amount) return;

    fts.SetUpFish(amount, hungryFishAmount);
    var job = new HungryTaggingJob
    {
        taggingBuffer = ecb.ToConcurrent(),
    }.Schedule(fts);

    job.Complete();
    ecb.Playback(em);

    Assert.That(em.CreateEntityQuery(ComponentType.ReadOnly<Fish>()).CalculateLength(), Is.EqualTo(amount));
    Assert.That(em.CreateEntityQuery(ComponentType.ReadOnly<Hungry>()).CalculateLength(), Is.EqualTo(hungryFishAmount));
    Assert.That(em.CreateEntityQuery(ComponentType.Exclude<Hungry>()).CalculateLength(), Is.EqualTo(amount - hungryFishAmount));

    var job2 = new TagDyingJob().Schedule(fts);


    sw.Start();
    job2.Complete();
    sw.Stop();

    using (var cda = em.CreateEntityQuery(
            ComponentType.ReadOnly<Fish>(),
            ComponentType.ReadOnly<Hungry>()
    ).ToComponentDataArray<Fish>(Allocator.TempJob))
    {
        Assert.That(cda.All(x => x.hp == 99));
    }

    using (var cda = em.CreateEntityQuery(
            ComponentType.ReadOnly<Fish>(),
            ComponentType.Exclude<Hungry>()
    ).ToComponentDataArray<Fish>(Allocator.TempJob))
    {
        Assert.That(cda.All(x => x.hp == 100));
    }

    Debug.Log($"{nameof(IteratingTheTagged)},{amount},{hungryFishAmount},{sw.ElapsedTicks}");
}

Entity count  Tagged count  Matching chunks  Ticks  Average ticks per tagged entity
100           0             0                532
100           50            1                522    -0.2
100           100           1                531    -0.01
1000          0             0                509
1000          50            1                701    3.84
1000          100           1                556    0.47
1000          500           1                572    0.126
1000          1000          1                617    0.108
10000         0             0                475
10000         50            1                510    0.7
10000         100           1                516    0.41
10000         500           1                608    0.266
10000         1000          1                622    0.147
10000         5000          5                1625   0.23
10000         10000         10               2106   0.1631
100000        0             0                711
100000        50            1                672    -0.78
100000        100           1                671    -0.4
100000        500           1                724    0.026
100000        1000          1                754    0.043
100000        5000          5                1747   0.2072
100000        10000         10               2147   0.1436
100000        50000         50               4461   0.075
100000        100000        100              6844   0.06133

  • The rows with tagged count 0 do nothing, since no chunk matches the job at all. These rows therefore show the job completion overhead, as the stopwatch only surrounds job2.Complete().
  • The other rows have an "Average ticks per tagged entity". It is computed by first subtracting the ticks of the 0 row (to remove the job completion overhead) and then dividing by the number of tagged entities. You can see it scales really well : with as many as 100000 tagged entities the total only grows to 6844 ticks.
  • Notice also that the "1000 tagged" rows for entity counts 1000, 10000, and 100000 have about the same ticks (600~700). This is the expected strength of the tag component approach : entities without the tag are not in the queried chunks at all. It doesn't matter if you have a million entities in total if only 1000 are tagged.
  • The matching chunk count stays at 1 for a long time because each entity is small, so chunk capacity is large. But no matter how small each entity is, a chunk has a constant upper limit on its entity count of (16 * 1024 - 256) / 8 = 2016. So by the time it gets to 5000 entities, they must spill into other chunks.
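
The capacity arithmetic in the last bullet can be sketched as below. ChunkMath and its members are hypothetical names, and the 16 KB chunk size and 256-byte overhead are simply the figures used above. The 16 bytes per entity (8 for the Entity itself plus 8 for the two ints in Fish) is my own assumption about this benchmark's layout; it happens to reproduce the "Matching chunks" column, but the real layout may carry other overhead.

```csharp
using System;

public static class ChunkMath
{
    public const int ChunkBytes = 16 * 1024; // chunk size used in the text
    public const int OverheadBytes = 256;    // per-chunk overhead used in the text

    // How many entities fit in one chunk when each needs bytesPerEntity bytes.
    public static int Capacity(int bytesPerEntity)
    {
        return (ChunkBytes - OverheadBytes) / bytesPerEntity;
    }

    // Chunks needed to hold a number of entities (integer ceiling division).
    public static int ChunksNeeded(int entities, int capacity)
    {
        return (entities + capacity - 1) / capacity;
    }

    public static void Main()
    {
        // Upper limit: an entity that costs only the 8 bytes of its Entity value.
        Console.WriteLine(Capacity(8));

        // Assumption: a fish costs 8 (Entity) + 8 (Fish) = 16 bytes.
        int fishCapacity = Capacity(16);
        Console.WriteLine(fishCapacity);
        Console.WriteLine(ChunksNeeded(5000, fishCapacity));
        Console.WriteLine(ChunksNeeded(100000, fishCapacity));
    }
}
```

With 16 bytes per fish the capacity comes out at 1008, which gives 5 chunks for 5000 tagged fishes and 100 chunks for 100000.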

Next I will intentionally increase the number of chunks by infecting fishes with diseases, that is, by assigning different Disease SCD combinations and then running the same test again. Because the Disease SCD contains 2 fields, I can combine their values to get many different chunks quickly.

public void InfectAllFishes()
{
    var fishes = Entities.WithAll<Fish>().ToEntityQuery().ToEntityArray(Allocator.TempJob);
    for(int i = 0; i < fishes.Length; i++)
    {
        InfectiousDisease d1 = (InfectiousDisease)(i % Enum.GetValues(typeof(InfectiousDisease)).Length);
        NonInfectiousDisease d2 = (NonInfectiousDisease)(i % Enum.GetValues(typeof(NonInfectiousDisease)).Length);
        EntityManager.SetSharedComponentData(fishes[i], new Disease { infectiousDisease = d1, nonInfectiousDisease = d2 });
    }
    fishes.Dispose();
}
  • There's no table to see since it doesn't make much difference and I'm too lazy to paste it.
  • The number of chunks doesn't meaningfully affect iteration time. This is as expected, since the job receives its work chunk by chunk.

So in the other tests we will not consider the number of chunks anymore.

Iterating with conditional

Next, we will just use an if on the component's hunger field to decide whether to reduce the fish's hp. It is the previous test, but with the hp-reducing job replaced by this one :

[BurstCompile]
public struct ConditionalDyingJob : IJobForEach<Fish>
{
    public void Execute(ref Fish c0)
    {
        if (c0.hunger >= 100)
        {
            c0.hp -= 1;
        }
    }
}

Entity count  True conditional count  Ticks  Average ticks per entity
100           0                       546    5.46
100           50                      607    6.07
100           100                     575    5.75
1000          0                       592    0.59
1000          50                      706    0.71
1000          100                     720    0.72
1000          500                     794    0.79
1000          1000                    697    0.7
10000         0                       1730   0.17
10000         50                      1835   0.18
10000         100                     1874   0.19
10000         500                     1909   0.19
10000         1000                    1965   0.2
10000         5000                    2184   0.22
10000         10000                   2297   0.23
100000        0                       8364   0.08
100000        50                      8540   0.09
100000        100                     8512   0.09
100000        500                     8701   0.09
100000        1000                    8193   0.08
100000        5000                    8440   0.08
100000        10000                   8644   0.09
100000        50000                   9052   0.09
100000        100000                  9568   0.1

  • Ticks grow with your total entity count, as expected. 1000 -> 10000 increases ticks by about 3x, 10000 -> 100000 by about 4x.
  • Whether the conditional evaluates to true or false, the ticks are similar. Theoretically the true case should cost more since it also performs the subtraction, but at most it is only about 1000 ticks more than the all-false case. Thank you, branch predictor!
  • 1183, 1338, 2918, and 12779 are the all-false ticks from the "tagging cost" table. The all-false rows in this table should cost the same, but they are instead 546, 592, 1730, and 8364. Why are they cheaper when the work is essentially the same, just an if that evaluates to false?
    Remember that the tagging job couldn't be Burst compiled because it uses EntityCommandBuffer. When it can be, these numbers may come out about the same.

Tag component iteration vs conditional iteration

Combining the 2 tables, we get this :

Entity count  True conditional count / Tagged count  Conditional ticks  Tagged ticks  Tagged faster by (ticks)  Tagged faster by (%)
100           0                                      546                532           14                        2.63
100           50                                     607                522           85                        16.28
100           100                                    575                531           44                        8.29
1000          0                                      592                509           83                        16.31
1000          50                                     706                701           5                         0.71
1000          100                                    720                556           164                       29.5
1000          500                                    794                572           222                       38.81
1000          1000                                   697                617           80                        12.97
10000         0                                      1730               475           1255                      264.21
10000         50                                     1835               510           1325                      259.8
10000         100                                    1874               516           1358                      263.18
10000         500                                    1909               608           1301                      213.98
10000         1000                                   1965               622           1343                      215.92
10000         5000                                   2184               1625          559                       34.4
10000         10000                                  2297               2106          191                       9.07
100000        0                                      8364               711           7653                      1076.37
100000        50                                     8540               672           7868                      1170.83
100000        100                                    8512               671           7841                      1168.55
100000        500                                    8701               724           7977                      1101.8
100000        1000                                   8193               754           7439                      986.6
100000        5000                                   8440               1747          6693                      383.11
100000        10000                                  8644               2147          6497                      302.61
100000        50000                                  9052               4461          4591                      102.91
100000        100000                                 9568               6844          2724                      39.8

  • You get the idea : the if conditional is at a disadvantage when there are a lot of entities but only a few match the condition and need action. With tags, you already paid the filtering cost when tagging, so you get to work on only the data that matters.
  • You can also think of tagging as a "pre-if" : you pay a higher cost up front in order to save ticks later.
  • Even when the number of entities matching the if grows to be about the same as the total, tagged iteration still wins by a small margin, probably because it doesn't have to deal with the if. It's like you have "baked" the if already. The lack of an if could also potentially enable Burst SIMD optimizations that are not possible otherwise.

Time to think about your game

We can now use this data to choose an appropriate approach.

Example scenario : I have 10000 entities, about 5000 of which flip state every frame. There is work I want to do based on that state. Tag them every frame or use if?

  • if them : 2184 ticks per frame.
  • Tag them : 114304 ticks to iterate through 10000 entities and move 5000 of them to a new chunk, then 1625 ticks to work on them.

Answer : if is faster. You shouldn't be tagging that many every frame!

Example scenario : Like the previous scenario, but I have 300 systems working on that state, and all of them have to run in the same frame the state changed.

  • Now the stakes have changed. Notice that if you pay 114304 ticks to tag them on the main thread, you are doing a favor for all the systems working on them later (potentially completely on worker threads, in parallel). The estimated cost this frame for the tag component approach : 114304 + (300 * 1625) = 601804.
  • Using if without tagging : (300 * 2184) = 655200.

Answer : tagging is faster. The "300 systems" sounds absurd and is obviously designed to make tagging win, but my point is that when you tag your entities, all following systems benefit, whereas with if you have to iterate through everything over and over, paying once per iteration. Also, games usually do not have to tag 5000 entities every frame. Tagging gains even more of an advantage as more frames go by without any further tagging, diverging further from if's performance.
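
If 300 systems sounds too contrived, you can work out where the break-even point actually sits. The sketch below uses only the three numbers from the 10000-entity, 5000-tagged rows of the tables above (114304, 1625, and 2184 ticks); the BreakEven class and its method names are hypothetical, and it assumes every system pays the same iteration cost as in the benchmark.

```csharp
using System;

public static class BreakEven
{
    // Total ticks in one frame when tagging first, then running N systems on tagged chunks.
    public static long TagCost(long taggingTicks, long taggedIterationTicks, int systems)
    {
        return taggingTicks + taggedIterationTicks * systems;
    }

    // Total ticks in one frame when N systems each iterate everything with an if.
    public static long IfCost(long conditionalTicks, int systems)
    {
        return conditionalTicks * systems;
    }

    // Smallest number of systems for which tagging is strictly cheaper than if,
    // or -1 when tagged iteration saves nothing per system.
    public static int FirstSystemCountWhereTaggingWins(long taggingTicks, long taggedIterationTicks, long conditionalTicks)
    {
        long savedPerSystem = conditionalTicks - taggedIterationTicks;
        if (savedPerSystem <= 0)
        {
            return -1;
        }
        return (int)(taggingTicks / savedPerSystem) + 1;
    }

    public static void Main()
    {
        Console.WriteLine(TagCost(114304, 1625, 300));
        Console.WriteLine(IfCost(2184, 300));
        Console.WriteLine(FirstSystemCountWhereTaggingWins(114304, 1625, 2184));
    }
}
```

With these numbers each system saves 2184 - 1625 = 559 ticks, so tagging 5000 entities in the same frame only pays off from 205 systems onward. The picture changes quickly once the tags persist across frames, since the 114304 ticks are then paid once rather than every frame.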

Example scenario : I have 10000 entities, all of them changing state little by little over time. These state changes already happen in an existing system that has to run every frame. Occasionally, in some frames, it has to tag or untag around 50 entities. Such a frame occurs once per second or so. At any moment, only about 50 entities are tagged. There is work I want to do based on that state. Tag them each time they change state or just use if?

  • if them : 1835 ticks per frame.
  • Tag them : this may not be accurate, but we can estimate using the 10000 entities rows. With nothing to tag it costs 2918 ticks; with 50 entities to tag it costs 4608 ticks. As an estimate, tagging costs 4608 - 2918 = 1690. After tagging, you would be doing about 500 ticks of work each frame. In some frames you have to pay +1690. 500 + 1690 = 2190, so even in that kind of frame you are doing only a bit more than if.

Answer : tagging is faster. If more than 1 system works on the state change, tagging will be faster still.

Remember

  • Tagging is still handicapped by not being Burst compilable. It will be in the future. (And I will return to benchmark it; let's hope the benchmark project is still here by then.)
  • If you can do per-chunk EntityManager actions, it is a no-brainer to use tags instead of an if on the value. (So try to design the game this way, maybe by cleverly using ISharedComponentData so that the things you want to tag are always grouped together.)
  • You are still encouraged to benchmark your own game! Do not trust this benchmark too much, as it is too simple.