External Data - Attributes - Profiling Data Access

Mariusz Postol
Jan 06, 2024


Introduction

External data is data that a program must pull from, or push to, somewhere outside the boundary of the process hosting it. In general, external data may be grouped as follows:

- streaming: bitstreams managed as the content of files or as network payload
- structural: data fetched from, or pushed to, external database management systems using queries
- graphical: data rendered on a Graphical User Interface (GUI)

This article collects examples that explain selected aspects of using streaming data, whether sent over a network or managed as files (see also External Data - File and Stream Concepts). Here the first problem arises: to save or read working data, we need a conversion in both directions between streaming data and object-oriented data, regardless of the types used to build the graph of objects held in working memory. The transition from the object world to the streaming world is called serialization. Deserialization is the reverse process, which replaces the bitstream with interconnected objects located in the working memory of a computer. Serialization and deserialization are usually implemented with reflection, and attributes are employed to profile the behavior of this functionality. This article explains selected examples related to the definition and usage of attributes.

This is the next part of the series of articles titled Programming in Practice - External Data.

Of course, attributes are not used only for this particular application. The code examples described here are intentionally kept simple because they serve an educational purpose.

Profiling Development Environment

Let's start by creating a very simple AttributedClass example as a starting point for the discussion on attributes. It has only one method, whose functionality is not important here: it creates an object and returns it. Imagine that after some time we conclude that this method is no longer correct, and we want to stop referencing it. We know it is used in many places in the program, so to preserve backward compatibility we cannot simply remove it from the program text without triggering a bunch of compilation errors. Hence, we must keep the definition in place and associate additional information with the code in a declarative way. This additional information should discourage any further use of the method. This way, we address the issue by preventing new references to inadequate code instead of replacing it. In other words, new programs should no longer refer to it.

We may use the Obsolete attribute for this purpose. To observe this attribute and its effects, let's open a test window and add a test method. In the test method, we simply call the method previously marked with the Obsolete attribute, and we see that the compiler now reports a warning, which also appears in the error list. This is a clear signal that we should not use this method because it is no longer valid.
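The effect described above can be sketched as follows. This is only an illustration, not the original listing: the method name CreateObject, the message text, and the ObsoleteDemo helper are assumptions made for the example.

```csharp
using System;

public class AttributedClass
{
    // Assumption: the method name and the message are illustrative only.
    [Obsolete("This method is no longer valid; use an alternative instead.")]
    public object CreateObject()
    {
        return new object();
    }
}

public static class ObsoleteDemo
{
    public static object CallObsoleteMethod()
    {
        // The code still compiles and runs, but the compiler reports
        // warning CS0618 for the call below.
        return new AttributedClass().CreateObject();
    }
}
```

The call still works at run-time; only the build output changes, which is exactly what keeps existing code compatible while discouraging new references.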

This warning should make us look for an alternative solution. Of course, we could use a regular comment instead. Unfortunately, we would then lose the compiler warning that discourages using this method in newly created program fragments. From this, we can conclude that a comment is a very good tool for communicating with the reader of the program text - after all, any program is a text. Attributes, on the other hand, are a means of communicating with the compiler. And, as we will see next, not only with the compiler.

The F12 key takes us to the definition, where we see that the attribute is a class derived from the Attribute class. Now we can formulate a key question: can we define our own attributes and use them to associate additional information with code in a declarative way, to be used at run-time?

Attribute Definition

To create a custom attribute, I have created an additional CustomAttribute class. Like Obsolete, it inherits from the System.Attribute base class. Its main goal is to provide additional information related to the program content. Therefore, to define it, we need to specify the following things:

- what additional information we want to convey using it
- how the information is to be represented using types
- how to restrict the locations where the attribute can be used

The first two tasks, selecting the information to be conveyed by the attribute and deciding how to represent it, are accomplished by choosing data types and adding appropriate properties (value holders) that carry this data. In this example, it is the Description property of type string. This way, a description expressed in natural language may be attached to the target construct. Notice that a constructor is also added; it is responsible for initializing this description when the attribute is instantiated.

The remaining task, restricting attribute usage, is accomplished by associating an existing, dedicated attribute with the definition of the new attribute class. It is a message to the compiler. In other words, we use an existing attribute to define a new one. The AttributeUsage attribute is predefined in the .NET base class library (in the System namespace) and allows us to express where applying the new attribute makes sense.
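The three decisions discussed above could be put together as in the minimal sketch below. The choice of AttributeTargets.Class and of the AllowMultiple and Inherited values is an assumption made for illustration; the original definition may restrict usage differently.

```csharp
using System;

// AttributeUsage is itself an attribute. Here it tells the compiler that
// CustomAttribute may be applied to classes only (an assumed restriction).
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false, Inherited = false)]
public class CustomAttribute : Attribute
{
    // The constructor initializes the description when the attribute is instantiated.
    public CustomAttribute(string description)
    {
        Description = description;
    }

    // The information conveyed by the attribute, represented as a string.
    public string Description { get; }
}
```

With this restriction in place, applying the attribute to a method or a property is rejected by the compiler.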

And here's a crucial note about terminology. I use the term attribute in two senses:

- to name a class that is derived from the System.Attribute base class
- to name an identifier that is used elsewhere, enclosed in square brackets

Maybe it sounds puzzling, but it is typical for one term to be used for two such closely related concepts.

Let's examine the features of the newly created CustomAttribute class using the CustomAttributeTest test method. It simply instantiates an object of this class in the traditional way, using the new operator, and then compares the value of the embedded property with the expected one. This way, we can prove that this class behaves like any other regular class.

Keeping in mind that the newly created attribute is a class, let's use it to add information to the previously defined AttributedClass class. A linguistic construct appears in which, between square brackets, we have the name of the attribute class and the additional data that we want to associate with AttributedClass. Since this is additional data, we call it metadata; in other words, data describing data. Since, in this case, the data being described is a linguistic construct, that is, the text of the program, the program itself becomes data. The question is how this metadata may be used while the program is executing, hence at run-time. Let's see this with an example of a unit test where we try to recover the associated data.
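As a rough illustration of the construct just described, the sketch below attaches a description to AttributedClass. The description text, the CreateObject method name, and the AttributeTargets.Class restriction are assumptions made for the example.

```csharp
using System;

[AttributeUsage(AttributeTargets.Class)]
public class CustomAttribute : Attribute
{
    public CustomAttribute(string description)
    {
        Description = description;
    }

    public string Description { get; }
}

// The attribute class name appears between square brackets, followed by
// the metadata we want to associate with the class declared right below.
[CustomAttribute("Description of the AttributedClass class")]
public class AttributedClass
{
    public object CreateObject()
    {
        return new object();
    }
}
```

C# also allows the Attribute suffix to be omitted inside the brackets, so writing Custom instead of CustomAttribute there refers to the same class.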

From this example, we see that the attribute name can also be followed by actual parameters placed between round brackets. In other words, it looks like a method call, doesn't it? Moreover, because the name is the same as the class name, it looks like a constructor call. Unfortunately, a detailed discussion of the syntax of these linguistic constructs is beyond the scope of the article. To fill in a possible gap in this respect, I recommend the C# language user manual. To keep the discussion generic, from now on we will focus only on the semantics, i.e. on the meaning of these entries.

Attribute Use Based Directly on Type Definition

So let's add a test method, AttributedClassTypeTest, to a test class; its code refers to AttributedClass, which has been associated with an attribute. To refer to the type, the typeof operator is applied. It yields an object of the Type type for the selected type. This object contains details of the source type definition. And here we encounter reflection for the first time. Reflection means that we can create objects in the program that represent selected linguistic constructs. In this case, _objectType is a variable of the Type type that holds a reference to the object representing the AttributedClass class definition. Notice that, to avoid code cloning, the main test functionality is gathered in the GoTest method. From this object, we can read all attributes related to the selected type using the GetCustomAttributes instance method. Additionally, in this case, we specify that we are interested only in attributes of the selected type. We can then determine that an array with exactly one element is returned. This element is of the CustomAttribute type, i.e. the type we associated with the class as a class attribute.
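The steps above can be sketched as follows. This is not the original test code: the AttributeReader helper only plays the role that the text assigns to GoTest, and its name, the method names, and the description text are assumptions.

```csharp
using System;

[AttributeUsage(AttributeTargets.Class)]
public class CustomAttribute : Attribute
{
    public CustomAttribute(string description)
    {
        Description = description;
    }

    public string Description { get; }
}

[CustomAttribute("Description of AttributedClass")]
public class AttributedClass { }

public static class AttributeReader
{
    // Hypothetical counterpart of GoTest: it works on any Type object.
    public static string ReadDescription(Type objectType)
    {
        // Ask only for attributes of the CustomAttribute type;
        // inherit: false means base types are not searched.
        object[] attributes = objectType.GetCustomAttributes(typeof(CustomAttribute), false);
        if (attributes.Length != 1)
            throw new InvalidOperationException("Expected exactly one CustomAttribute.");
        return ((CustomAttribute)attributes[0]).Description;
    }

    public static string ReadFromTypeDefinition()
    {
        // typeof needs the identifier of the type definition.
        Type _objectType = typeof(AttributedClass);
        return ReadDescription(_objectType);
    }
}
```

Keeping the reading logic in a method that accepts a Type makes it independent of how the Type object was obtained.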

Therefore, we can return to the discussion about semantics, i.e. the meaning of the notation between square brackets. We see that the GetCustomAttributes method creates objects that are associated with the selected language construct, in this case AttributedClass. The result is the same as if we had used the new keyword to create an object of the CustomAttribute class, and the returned object can be used in exactly the same way.

So, once again, back to the heart of the topic. We can ask what role is played by the linguistic construct in which a class name is placed between square brackets. This class has to be an attribute, i.e. a class derived from the System.Attribute class. The main purpose of this construct is to describe the instantiation of an object; it answers the question of how to create the object. We can therefore conclude that it is equivalent to the constructor call that typically follows the new operator. Here it plays the same role, except that it is not part of an expression in an assignment statement. Hence, it has to provide all values for the constructor parameters and the initial values for the members of the attribute class.
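The equivalence described above can be illustrated with a small sketch. The settable Revision property is hypothetical; it is added here only to show how the bracket notation supplies initial values for members in addition to constructor arguments.

```csharp
using System;

[AttributeUsage(AttributeTargets.Class)]
public class CustomAttribute : Attribute
{
    public CustomAttribute(string description)
    {
        Description = description;
    }

    public string Description { get; }

    // Hypothetical settable member, present only to illustrate named arguments.
    public int Revision { get; set; }
}

// Constructor argument first, then an initial value for a member.
[CustomAttribute("Described class", Revision = 3)]
public class AttributedClass { }

public static class EquivalenceDemo
{
    // The same instantiation written as an ordinary expression.
    public static CustomAttribute CreatedWithNew()
    {
        return new CustomAttribute("Described class") { Revision = 3 };
    }

    // The instantiation described by the bracket notation, performed by reflection.
    public static CustomAttribute CreatedByReflection()
    {
        return (CustomAttribute)Attribute.GetCustomAttribute(
            typeof(AttributedClass), typeof(CustomAttribute));
    }
}
```

Both methods return objects with the same Description and Revision values, which is the sense in which the bracket notation plays the role of a constructor call.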

The AttributedClass class is preceded by CustomAttribute. In the unit test, the AttributedClassTypeTest test method shows how to retrieve features of this type definition by creating an instance of the Type type. The main testing code has been gathered in the GoTest method to reuse this functionality and avoid code cloning. This example shows that we can recover type features that are provided in the form of attributes. Additionally, we can perform operations on the attributes (instances of classes derived from the System.Attribute base type) that are created as a result of the GetCustomAttributes operation. In this approach, the identifier of the type definition is used directly. In this code snippet, typeof is an operator that yields an object representing the metadata of a type, given the identifier of the type definition. The argument of the typeof operator must be the name of a type.

Attribute Use Based Indirectly on Type Instance

The AttributedClassTypeTest example above has a weak point. When talking about serialization/deserialization, we have to implement the algorithms in such a way that they do not refer directly to type definitions, because we simply do not know them. In general, we must assume that the type is defined later, and it doesn't matter whether it is defined milliseconds or years later. Let's try to imagine a scenario in which we have to deal with objects whose types we do not know.

To prepare an example that resembles the above scenario, I have added the Siyova16 class with all identifiers created randomly by a password generator. The main idea of creating a random definition is to stress that there is no need to refer directly to these identifiers while implementing the required functionality. To create a generic solution, we need to be prepared for a situation where referencing identifiers directly is impossible. Reflection can be applied in both cases, so we can investigate it using a simplified case.

To continue building an example that shows how to operate on objects of unknown types, I have added the ObjectFactory class. Its main task is to create objects of a randomly named type. More precisely, the objects are of different types, but they have one thing in common: their types are preceded by the same CustomAttribute attribute. An enumeration type is also defined in this class, and its values are random as well. It is worth stressing that there is no direct relationship between the enum identifier and the identifier of the instantiated type. The AttributedClassInstanceTest test shows that it is possible to detect the common feature without referring to identifiers associated with the object type. For this purpose, it mimics the creation of objects of various types using the dedicated ObjectFactory class. Regardless of the object type, the same GoTest test method is executed, which checks the presence of the selected attribute.

The ObjectFactory method is responsible for creating objects of various types. Because it creates objects of different, unrelated types, the return value must be of the object type, which allows returning objects of any type. Therefore, after calling ObjectFactory we don't know the type of the returned instance. Hiding the type of the created objects is intended to mimic operating on unknown types. Of course, this is just a simulation that keeps the example as simple as possible. I want to emphasize that the tests are used solely to demonstrate certain features and the possibility of using reflection for serialization/deserialization.

To show how to operate on objects without referring directly to their type definitions, we have to recover the features of types from their instances. To follow up, check out the example from the AttributedClassInstanceTest test method. Once again, the test method instantiates a variety of types having the same feature and executes a test against this feature.

We already know how to recover the features of a type when we can refer to it directly: we create a Type instance for the selected type definition using the typeof operator and the identifier of this definition. For an object whose type is not known for some reason, the GetType instance method comes in handy. This method is inherited from the Object base class, which is the ultimate base class of all .NET classes and the root of the type hierarchy. So in our case, reflection starts when an instance of Type is obtained using the GetType method. It should be emphasized that, based on this example, reflection is related even to the Object base type.
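A minimal sketch of the instance-based approach is shown below. Only Siyova16, ObjectFactory, and CustomAttribute are names taken from the text; the second class, the enum and its values, the Create method, and the InstanceInspector helper are hypothetical and serve only to make the example self-contained.

```csharp
using System;

[AttributeUsage(AttributeTargets.Class)]
public class CustomAttribute : Attribute
{
    public CustomAttribute(string description)
    {
        Description = description;
    }

    public string Description { get; }
}

[CustomAttribute("First randomly named type")]
public class Siyova16 { }

// Hypothetical second type with an arbitrary name.
[CustomAttribute("Second randomly named type")]
public class Qebuta42 { }

// Hypothetical enum; its values are unrelated to the type names.
public enum Kind { Xeru, Pamo }

public static class ObjectFactory
{
    // Returning object hides the actual type from the caller.
    public static object Create(Kind kind)
    {
        switch (kind)
        {
            case Kind.Xeru: return new Siyova16();
            case Kind.Pamo: return new Qebuta42();
            default: throw new ArgumentOutOfRangeException(nameof(kind));
        }
    }
}

public static class InstanceInspector
{
    public static string DescribeInstance(object instance)
    {
        // GetType is inherited from Object, so it is available on any instance.
        Type objectType = instance.GetType();
        object[] attributes = objectType.GetCustomAttributes(typeof(CustomAttribute), false);
        if (attributes.Length != 1)
            throw new InvalidOperationException("Expected exactly one CustomAttribute.");
        return ((CustomAttribute)attributes[0]).Description;
    }
}
```

DescribeInstance never mentions Siyova16 or Qebuta42, so it keeps working for any type that is defined later and carries the same attribute.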

Conclusion

To summarize, regardless of whether the type definition is visible or we have to recover the type description from an instance, the common point in further processing and in converting the associated attribute to an object is obtaining an instance of the System.Type abstract type, which holds a detailed description of the type concerned. Because it is abstract, we cannot create this instance directly and have to use the typeof operator or the GetType instance method. Since both paths lead to this common point, we can use the same GoTest test method and avoid code cloning.