To be or not to be: discovery reference counting.

To be or not to be, that is the question, asks Hamlet, Prince of Denmark. He can ask it because he has self-awareness and consciousness. SCOM class instances (or objects) cannot, so it is up to SCOM to decide whether a class instance lives or dies. But how?

First, let’s recall how class instances appear in SCOM. There is a special workflow type called a discovery. A discovery returns a list of objects to add to, remove from, or update in the SCOM database. Usually, objects represent monitored subjects, such as computers, logical disks, virtual machines, applications, etc. Each object has inventory attributes, but more importantly, each object has a unique key (which may consist of the object’s own key, its parent’s key, its parent’s parent’s key, and so on). Because a key uniquely identifies an object, two objects with the same key are the same object. In other words, if a discovery returns an object and an object with the same key is already in the SCOM database, the existing object is updated with the new details.

Most discoveries run in snapshot mode. What does this mean? A snapshot discovery always returns the full list of objects. When SCOM processes the list from a snapshot discovery:

  • if an object from the discovery doesn’t exist in the database, it is created.
  • if an object from the discovery already exists in the database, it is updated.
  • if an object in the database is not in the snapshot discovery list, it is deleted from the database.
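
The three snapshot rules above can be sketched as a toy model. This is only an illustration of the behaviour described here, assuming a single discovery and a plain dictionary standing in for the SCOM database; the function name and the SizeGB property are hypothetical, not SCOM APIs.

```python
def apply_snapshot(database, discovered):
    """Apply a full snapshot to `database` (dict: object key -> properties), in place."""
    # Rules 1 and 2: create missing objects, update existing ones.
    for key, props in discovered.items():
        database[key] = dict(props)
    # Rule 3: anything in the database that the snapshot did not return is deleted.
    for key in list(database):
        if key not in discovered:
            del database[key]
    return database


# Hypothetical data: D: has disappeared, E: is new, C: has changed.
snapshot_db = {"computer1\\C:": {"SizeGB": 100}, "computer1\\D:": {"SizeGB": 500}}
apply_snapshot(
    snapshot_db,
    {"computer1\\C:": {"SizeGB": 120}, "computer1\\E:": {"SizeGB": 50}},
)
print(sorted(snapshot_db))  # C: and E: remain, D: is gone
```

Note that the snapshot has to carry every object each time, which is exactly what becomes expensive when the list is long.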

However, a discovery can also run in “Add-Update” and “Remove” modes. This may significantly improve performance when the number of objects is large and only one object needs an update. In Add-Update mode:

  • if an object from the discovery doesn’t exist in the database, it is created.
  • if an object from the discovery already exists in the database, it is updated.

And in Remove mode:

  • All objects returned by the discovery are removed from the database.
  • If an object returned by a remove discovery doesn’t exist in the database, nothing happens.
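
For comparison with snapshot mode, the incremental modes can be modelled in the same toy fashion. Again, this is a sketch under the assumption of a single discovery and a dictionary as the database; the function names and data are hypothetical.

```python
def apply_add_update(database, discovered):
    """Add-Update mode: create or update the returned objects, touch nothing else."""
    for key, props in discovered.items():
        database[key] = dict(props)


def apply_remove(database, keys):
    """Remove mode: delete the returned objects; unknown keys are silently ignored."""
    for key in keys:
        database.pop(key, None)


incremental_db = {"computer1\\C:": {"SizeGB": 100}}
apply_add_update(incremental_db, {"computer1\\D:": {"SizeGB": 500}})  # C: is left untouched
apply_remove(incremental_db, ["computer1\\C:", "computer1\\Z:"])      # Z: never existed
print(sorted(incremental_db))  # only D: remains
```

Unlike a snapshot, each call only carries the objects that actually changed.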

The above works perfectly if there is only one discovery (which, however, can return discovery data in all three modes: the mode is not set in the discovery properties and can be chosen at runtime). But what if there is more than one discovery? To resolve this, SCOM uses a process called reference counting. How does it work?

Discovery Reference Counting Process.

SCOM keeps a reference count for each unique discovery. A discovery’s reference count for an object is set to 1 when the discovery returns the object instance in add-update or snapshot mode. It is set to 0 when the discovery doesn’t return an existing object in snapshot mode, or returns the object in remove mode. Once all references are set:

  • An object exists in the SCOM database if at least one reference is 1.
  • An object is deleted from the SCOM database if all references are 0.
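
The counting rule itself fits in one line. The sketch below assumes the references for a single object are held as a dictionary from unique discovery id to 0 or 1; that representation and the rule names are illustrative only and say nothing about how SCOM stores them internally.

```python
def object_survives(references):
    """Return True if at least one unique discovery still references the object."""
    return any(count == 1 for count in references.values())


# Hypothetical references for one object, keyed by (rule id, target id).
one_ref_left = object_survives({("rule A", "computer1"): 1, ("rule B", "computer1"): 0})
no_refs_left = object_survives({("rule A", "computer1"): 0, ("rule B", "computer1"): 0})
print(one_ref_left, no_refs_left)  # True False
```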

I also mentioned that references are kept per “unique discovery”. What is this? The unique discovery id consists of the discovery rule id and the discovery target instance id. The discovery rule id is the same id that is returned, for example, by Get-SCOMDiscovery. The target instance id is the id of the specific class instance for which the discovery was executed. For example, a discovery returns logical disks and has its target class set to “Windows Computer”. There are two Windows computers in SCOM. So the discovery runs twice, once for each computer, and there are two unique discoveries for reference counting: (discovery id + computer 1 id) and (discovery id + computer 2 id).

Normally, this process doesn’t make any difference, because in the scenario above the discoveries return disjoint sets of objects. For example, one discovery returns computer1\C:. The other returns computer2\C: and computer2\D:. No object appears in both results.
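
To make the “unique discovery” pair concrete, here is a small sketch of the two-computer example. The rule name is hypothetical, and plain strings stand in for the ids that SCOM would actually use.

```python
from collections import namedtuple

# A unique discovery is the pair (discovery rule id, target instance id).
UniqueDiscovery = namedtuple("UniqueDiscovery", ["rule_id", "target_id"])

rule_id = "Hypothetical.LogicalDisk.Discovery"  # one rule targeting Windows Computer

# The same rule runs once per computer, so it forms two unique discoveries.
results = {
    UniqueDiscovery(rule_id, "computer1"): {"computer1\\C:"},
    UniqueDiscovery(rule_id, "computer2"): {"computer2\\C:", "computer2\\D:"},
}

unique_ids = list(results)
overlap = set.intersection(*results.values())
print(len(unique_ids), overlap)  # 2 set()
```

Because the two result sets share no object, every object ends up with exactly one reference and the counting never comes into play.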

However, consider the following scenario. There are two discoveries. One targets Windows Computer and always returns logical disks in add-update mode. The other runs for each logical disk and tests whether the disk still exists. If the disk doesn’t exist, the discovery returns the disk object in remove mode. In this case, each discovery has its own reference. When a disk is added, say computer1\D:, that instance gets count 1 for (add-update discovery id + computer 1 object id). When the D: drive is removed, the add-update discovery simply stops supplying updates for the object. The disk-targeted second discovery then sets another reference: count 0 for (remove discovery id + the disk’s own object id, since this discovery targets the disk) on the computer1\D: instance. Therefore, as per the counting rule, the D: drive’s object is not removed, because at least one unique discovery still has count 1. So, this scenario doesn’t work.
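
The failure can be replayed with a toy model of the counting rule. The class and the discovery names below are hypothetical and only mirror the behaviour described in this article; this is not SCOM’s actual implementation.

```python
class ReferenceStore:
    """Toy model: one 0/1 reference per (unique discovery, object key)."""

    def __init__(self):
        # object key -> {(rule id, target id): 0 or 1}
        self.refs = {}

    def report(self, rule_id, target_id, object_key, mode):
        if mode not in ("add-update", "remove"):
            raise ValueError(f"unsupported mode: {mode}")
        count = 1 if mode == "add-update" else 0
        self.refs.setdefault(object_key, {})[(rule_id, target_id)] = count

    def exists(self, object_key):
        return any(c == 1 for c in self.refs.get(object_key, {}).values())


store = ReferenceStore()
disk = "computer1\\D:"

# The computer-targeted discovery adds the disk.
store.report("Disk.AddUpdate.Discovery", "computer1", disk, "add-update")

# The disk is pulled out. The add-update discovery just stops reporting it,
# and the disk-targeted discovery returns it in remove mode under its OWN ids.
store.report("Disk.Remove.Discovery", disk, disk, "remove")

still_there = store.exists(disk)
print(still_there)  # True: the add-update reference is still 1
```

The remove call only ever writes a second reference; it never touches the one that keeps the object alive.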

Rituals in discovery.

As human beings, we have lots of rituals in our lives. Some, like the tea ceremony, bring some meaning into our mortal lives. Others are relics of the past: we follow them without understanding the rationale behind them, and therefore cannot judge whether they still apply. There are some rituals of the second type in SCOM authoring. When creating a discovery, we always:

  1. Define sourceId and managedEntityId parameters in the discovery’s data source module type, like:
<Configuration>
  <xsd:element minOccurs="1" name="sourceId" type="xsd:string" />
  <xsd:element minOccurs="1" name="managedEntityId" type="xsd:string" />
</Configuration>
  2. Use them in the discovery script to initialize the discovery data structure:
$discoveryData = $api.CreateDiscoveryData(0, $sourceId, $managedEntityId)
  3. Always pass the following values in the discovery definition:
<sourceId>$MPElement$</sourceId>
<managedEntityId>$Target/Id$</managedEntityId>

But no one really explains why. If you dig a little deeper, you will find that $MPElement$ refers to the current workflow id (i.e. the discovery id) and $Target/Id$ refers to the discovery target instance id. But can we use values other than these?

The solution.

So, the last scenario with two discoveries (add-update and a separate remove) didn’t work because it created two references to the same object instance. But how does SCOM know which discovery is returning the data? And how does SCOM know which object instance the discovery runs against? It seems very simple: of course SCOM knows which workflow it executes, and therefore its id, as well as the target object id. It doesn’t look like we can do anything about this. But wait…!!!

These “discovery id” and “target instance id”… Don’t they sound like the same terms from the “ritual” section? So maybe it’s not “SCOM knows both ids”, but “the discovery reports its own id and the current target object id”? Maybe we can pass different values? And not just different values, but exactly the same values as the add-update discovery passes? Luckily, the answer is YES!

The discovery id is easy. Normally, we refer to the workflow’s own id as $MPElement$, but there is an extended form of this syntax: $MPElement[Name='any management pack element name']$. Therefore, to match the reference from the second discovery, we need to replace $MPElement$ with $MPElement[Name='The.AddUpdate.Discovery.Name']$. What about the instance id parameter? The first discovery runs against a computer, so it passes the computer object id. The second discovery runs against a disk object. Luckily, in our scenario the disk is hosted by the computer, so we can try the ‘host’ notation, i.e. use $Target/Host/Id$ instead of $Target/Id$. And this works! So, the first discovery uses the traditional notation:

<sourceId>$MPElement$</sourceId>
<managedEntityId>$Target/Id$</managedEntityId>

But the second, remove-mode discovery disguises itself as the first and uses:

<sourceId>$MPElement[Name='The.AddUpdate.Discovery.Name']$</sourceId>
<managedEntityId>$Target/Host/Id$</managedEntityId>

And this works like a charm!
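
Why this works can be seen in a toy model of the counting rule. As before, the class and names are hypothetical and only mirror the behaviour described in this article: the sourceId and managedEntityId values passed by the discovery play the role of the (rule id, target id) pair.

```python
class ReferenceStore:
    """Toy model: one 0/1 reference per (unique discovery, object key)."""

    def __init__(self):
        # object key -> {(source id, managed entity id): 0 or 1}
        self.refs = {}

    def report(self, source_id, managed_entity_id, object_key, mode):
        if mode not in ("add-update", "remove"):
            raise ValueError(f"unsupported mode: {mode}")
        count = 1 if mode == "add-update" else 0
        self.refs.setdefault(object_key, {})[(source_id, managed_entity_id)] = count

    def exists(self, object_key):
        return any(c == 1 for c in self.refs.get(object_key, {}).values())


fixed_store = ReferenceStore()
disk = "computer1\\D:"

# First discovery: $MPElement$ and $Target/Id$ of the computer.
fixed_store.report("The.AddUpdate.Discovery.Name", "computer1", disk, "add-update")

# Second discovery, disguised: it passes the first discovery's name and the
# host computer's id, so it overwrites the SAME reference instead of adding one.
fixed_store.report("The.AddUpdate.Discovery.Name", "computer1", disk, "remove")

removed = not fixed_store.exists(disk)
print(removed)  # True: the only reference is now 0
```

With matching ids there is a single reference per disk, so setting it to 0 leaves nothing to keep the object alive.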

Conclusion.

This is an unusual technique, and normally no one bothers with it unless there is a strong argument for it. One such argument is a discovery that needs to return several thousand or tens of thousands of objects. Returning them in a single snapshot puts significant load on the SCOM management servers and database, and may also exceed the data item limit. Returning objects continuously in small batches, and then verifying each current object’s existence, is a much more lightweight approach.

Appendix: query references from SCOM SQL database.

You can use the following query to list all references for all objects in the SCOM database. Note that when one discovery mimics another using the technique from this article, it does not appear as a separate workflow in this query, but is shown as the original one.

(Replace the condition in the WHERE clause with your MP name prefix to keep the result set from growing too large.)

SELECT dtme.DiscoverySourceId as UniqueDiscoveryId,
       d.DiscoveryId as DiscoveryId,
       d.DiscoveryName as DiscoveryName,
       boundbme.BaseManagedEntityId as DiscoveryTargetObjectId,
       boundbme.DisplayName as DiscoveryTargetObjectName,
       dtme.IsDeleted as IsDeletedFromDiscovery,
       ' | ' as Separator,
       bme.BaseManagedEntityId as DiscoveredObjectId,
       bme.DisplayName as DiscoveredObjectName,
       bme.FullName as DiscoveredObjectFullName,
       bme.IsDeleted as IsDiscoveredObjectDeleted
  FROM [OperationsManager].[dbo].[DiscoverySource] ds
  join Discovery d on d.DiscoveryId = ds.DiscoveryRuleId
  join DiscoverySourceToTypedManagedEntity dtme on dtme.DiscoverySourceId = ds.DiscoverySourceId
  join TypedManagedEntity tme on tme.BaseManagedEntityId = dtme.TypedManagedEntityId
  join BaseManagedEntity bme on bme.BaseManagedEntityId = tme.BaseManagedEntityId
  join BaseManagedEntity boundbme on boundbme.BaseManagedEntityId = ds.BoundManagedEntityId
  where d.DiscoveryName like 'your.management.pack.prefix.%'
  order by boundbme.DisplayName, bme.DisplayName
