MSIL Programming: Part 2

Before reading this article, I highly recommend reading the previous part:
 

Abstract

 
The primary goal of this article is to show how to define, in both syntax and semantics, the typical object-oriented programming constructs, such as namespaces, interfaces, fields, and classes, from the perspective of CIL metadata programming. We do so without IntelliSense support, because we author IL code in common text editors that do not offer such features. IL coding enables developers to implement the typical functionality of the C# language directly with opcode instructions, for instance creating and consuming DLL files or invoking methods of an unmanaged DLL from IL code, without relying on the Visual Studio IDE. Such an inside understanding of IL coding can be beneficial for code optimization, debugging, and reverse engineering, particularly for malicious code detection and for subverting essential security mechanisms.
 

Field Metadata

 
Fields are data locations that hold values of diverse .NET types, for instance numeric, character, and decimal. A field is defined using the .field directive, which carries three pieces of information: flags (such as the access modifier), signature (type), and name. Moreover, a field can be of a value type or a reference type, and when referencing a field it is also important to identify its owner, such as a TypeRef, TypeDef, or ModuleRef.
    .field <flags> <type> <name>
The field flags determine, among other things, whether the field is accessible inside or outside the assembly. Flags fall into three categories: accessibility (public, private), contract (static, initonly, literal), and reserved (marshal, rtspecialname). The type indicates the kind of data, such as strings, characters, or numbers, that will be stored in the field.
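The flag categories above can be sketched in a small, self-contained program. This is only an illustration: the assembly name FieldFlagsSketch, the class Demo, and all field names are hypothetical and do not come from the listings in this article.

```il
.assembly extern mscorlib {}
.assembly FieldFlagsSketch {}
.module FieldFlagsSketch.exe

// Hypothetical class used only to illustrate field flags.
.class public auto ansi beforefieldinit FieldFlagsSketch.Demo
       extends [mscorlib]System.Object
{
    // Accessibility flag only: an instance field visible to every caller.
    .field public int32 itemCount
    // Accessibility + contract flag: one shared copy for the whole type.
    .field private static int32 sharedTotal
    // initonly: verifiable code may write it only inside a constructor.
    .field public initonly string ownerName
    // literal: a compile-time constant; it must be static and carry a value.
    .field public static literal int32 MaxItems = int32(100)

    .method public hidebysig static void Main() cil managed
    {
        .entrypoint
        .maxstack 2
        // Store 7 into the static field.
        ldc.i4.7
        stsfld int32 FieldFlagsSketch.Demo::sharedTotal
        // Read the static field back and print it.
        ldstr "sharedTotal is {0}"
        ldsfld int32 FieldFlagsSketch.Demo::sharedTotal
        box [mscorlib]System.Int32
        call void [mscorlib]System.Console::WriteLine(string, object)
        ret
    }
}
```

Assembled with ilasm and run, this sketch should print "sharedTotal is 7". The literal field never occupies storage at runtime, which is why code cannot load it with ldsfld; compilers embed its value at the point of use instead.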
 
Data fields can be either global (declared inside the class type) or local (declared inside a method's scope). The following code snippet shows data fields defined at class-level scope.
 
Syntax 1: Fields declaration in IL code:
    // for integer type data
    .field private int32 iVal

    // for float type data
    .field private float32 fVal

    // for string type data
    .field private string sVal

    // for character type data
    .field private char cVal
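For class-level data, a default value can be attached directly to the field declaration as in the following. Note that this value is recorded as metadata only; for non-literal fields the runtime does not apply it automatically, so such fields are normally initialized in a constructor:
    // declaration with assignment

    .field private int32 iVal = int32(50)
    .field private string sVal = "Ajay"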
Data defined inside a method body is considered "local data" and is declared using the .locals directive. Here, we define a local integer variable as in the following:
    .locals init ([0] int32 x)
The following code demonstrates some common operations that use both local variables and a class-level field (declared outside the method):
 
Listing 1: Fields declaration in IL code
    ..
    .module cilFields.exe

    .class private auto ansi beforefieldinit cilFields.fldsDemo
           extends [mscorlib]System.Object
    {
       .field private int32 x
       .method public hidebysig instance void testCal() cil managed
       {
          .maxstack 2
          .locals init ([0] int32 z, [1] int32 y)
          IL_0000: nop
          // y = 50
          IL_0001: ldc.i4.s 50
          IL_0003: stloc.1
          // z = this.x + y
          IL_0004: ldarg.0
          IL_0005: ldfld int32 cilFields.fldsDemo::x
          IL_000a: ldloc.1
          IL_000b: add
          IL_000c: stloc.0
          // print the boxed result
          IL_000d: ldstr "Fields Demo:: Calculation is {0}"
          IL_0012: ldloc.0
          IL_0013: box [mscorlib]System.Int32
          IL_0018: call void [mscorlib]System.Console::WriteLine(string, object)
          IL_001d: nop
          IL_001e: ret
       }
    }

Properties Metadata

 
Properties enable strict control of access to the internal state of an object. A property behaves like a public field, and the notation for accessing a property on an instance is the same as for a public field. In effect, a property is shorthand for calling accessor methods that read and write a field. The .property directive defines a property, and the .get and .set directives inside it name the related accessor methods, as in the following.
 
Syntax 2: Properties declaration in IL code:
    .property instance int32 iVal()
    {
       .get instance int32 NamespaceName.Class::get_iVal()
       .set instance void NamespaceName.Class::set_iVal(int32)
    }
Listing 2: Properties declaration in IL code
    ...

    .class private auto ansi beforefieldinit cilProperties.cPrptDemo extends [mscorlib]System.Object
    {
       .field private string 'Color__Field'

       .method public hidebysig specialname instance string get_Color() cil managed
       {
          .maxstack 1
          .locals init (string V_0)
          IL_0000: ldarg.0
          IL_0001: ldfld string cilProperties.cPrptDemo::'Color__Field'
          IL_0006: stloc.0
          IL_0007: br.s IL_0009

          IL_0009: ldloc.0
          IL_000a: ret
       }

       .method public hidebysig specialname instance void set_Color(string 'value') cil managed
       {
          .maxstack 8
          IL_0000: ldarg.0
          IL_0001: ldarg.1
          IL_0002: stfld string cilProperties.cPrptDemo::'Color__Field'
          IL_0007: ret
       }

       .method public hidebysig instance void Display() cil managed
       {
          .maxstack 8
          IL_0000: nop
          IL_0001: ldstr "Property Demo::Color is {0}"
          IL_0006: ldarg.0
          IL_0007: call instance string cilProperties.cPrptDemo::get_Color()
          IL_000c: call void [mscorlib]System.Console::WriteLine(string,object)
          IL_0011: nop
          IL_0012: ret
       }

       ..
       .property instance string Color()
       {
          .get instance string cilProperties.cPrptDemo::get_Color()
          .set instance void cilProperties.cPrptDemo::set_Color(string)
       }
       ..
    }
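To show how a caller consumes a property, here is a minimal, self-contained sketch. It assumes a hypothetical PropertySketch.Car class rather than the cPrptDemo class above, whose elided parts are not shown. The point it illustrates is that IL has no property-access instruction: reading or writing a property is an ordinary call to the accessor methods.

```il
.assembly extern mscorlib {}
.assembly PropertySketch {}
.module PropertySketch.exe

// Hypothetical class; all names are illustrative assumptions.
.class public auto ansi beforefieldinit PropertySketch.Car
       extends [mscorlib]System.Object
{
    .field private string colorField

    .method public hidebysig specialname rtspecialname
            instance void .ctor() cil managed
    {
        .maxstack 1
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }

    .method public hidebysig specialname instance string get_Color() cil managed
    {
        .maxstack 1
        ldarg.0
        ldfld string PropertySketch.Car::colorField
        ret
    }

    .method public hidebysig specialname instance void set_Color(string 'value') cil managed
    {
        .maxstack 2
        ldarg.0
        ldarg.1
        stfld string PropertySketch.Car::colorField
        ret
    }

    // The property only ties the two accessors together in metadata.
    .property instance string Color()
    {
        .get instance string PropertySketch.Car::get_Color()
        .set instance void PropertySketch.Car::set_Color(string)
    }

    .method public hidebysig static void Main() cil managed
    {
        .entrypoint
        .maxstack 2
        .locals init ([0] class PropertySketch.Car carRef)
        newobj instance void PropertySketch.Car::.ctor()
        stloc.0
        // Equivalent of carRef.Color = "Red": a call to the setter.
        ldloc.0
        ldstr "Red"
        callvirt instance void PropertySketch.Car::set_Color(string)
        // Equivalent of reading carRef.Color: a call to the getter.
        ldloc.0
        callvirt instance string PropertySketch.Car::get_Color()
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}
```

Run after assembling with ilasm, the sketch should print "Red". Removing the .property block would not change the program's behavior, since Main calls the accessors directly; the block matters to compilers and tools that present Color as a property.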

Namespace

 
A namespace is a grouping of related .NET types, such as classes and interfaces, contained in an assembly, and is used to fully qualify a type name. Moreover, a single assembly can contain more than one namespace. A namespace is declared in IL code using the .namespace directive in the following way.
 
Syntax 3: Namespace declaration:
    .namespace testNamespace
    {
       ...
       // Classes declaration section
    }
Namespaces can be nested, like class types, since one namespace can contain the definition of another and so on.
    .namespace parent
    {
       ...
       // Classes declaration section

       .namespace child
       {
          ...
          // Classes declaration section
       }
    }
    // or nested namespace can be specified as
    .namespace parent.child { }
The important point to remember is that namespaces are not separate metadata items and are not referenced by tokens; a namespace is simply part of a type's full name. Examine the following metadata, which contains no entry or token that pertains to the namespace itself:
 
[Figure: Namespace]
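Because a namespace is only a prefix of the type name, a class declared inside a .namespace block and a class declared with a dotted full name end up in the same namespace. The following sketch illustrates this; the assembly name NamespaceSketch and the class names First and Second are hypothetical.

```il
.assembly extern mscorlib {}
.assembly NamespaceSketch {}
.module NamespaceSketch.exe

// Declared through a .namespace block.
.namespace parent.child
{
    .class public auto ansi beforefieldinit First
           extends [mscorlib]System.Object
    {
        .method public hidebysig static void Hello() cil managed
        {
            .maxstack 1
            ldstr "Hello from parent.child.First"
            call void [mscorlib]System.Console::WriteLine(string)
            ret
        }
    }
}

// Declared through its dotted full name; same namespace as First.
.class public auto ansi beforefieldinit parent.child.Second
       extends [mscorlib]System.Object
{
    .method public hidebysig static void Main() cil managed
    {
        .entrypoint
        .maxstack 1
        // Types are always referenced by full name, namespace included.
        call void parent.child.First::Hello()
        ret
    }
}
```

Both forms produce a type whose namespace is parent.child, so the choice between them is a matter of readability in the source file.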
 

Class Metadata

 
A class is defined in IL code using the .class directive. If no extends clause is given, the class implicitly derives from the .NET System.Object base class. Moreover, a class must be referenced by its full name, even when it resides in the same assembly.
 
Syntax 4: Class declaration:
    .namespace testNamespace
    {
       ...
       // Classes declaration section

       .class public myClass
       {
          // Class members
       }
    }
As we know, C# controls the visibility and behavior of fields, methods, classes, and properties using keywords such as public, private, abstract, and sealed. IL code has counterpart keywords that control the availability of types inside or outside the assembly. The following table gives a brief description of these IL code attributes.
 
Table 1: Visibility attributes IL code
 
 

Constructor Metadata

 
A constructor initializes class fields and is defined in IL code as a method with the special name .ctor or .cctor, where .ctor represents an instance constructor and .cctor represents a static constructor (type initializer). A constructor always has a void return type because it does not return a value. The following code shows a default class constructor.
 
Syntax 5: Parameterless Constructor declaration:
    .method public hidebysig specialname rtspecialname instance void .ctor() cil managed {}
In the previous code, the constructor must be marked with the specialname and rtspecialname attributes, which tell tools and the runtime to treat the method name in a special way. To initialize class fields through a constructor, for example an integer field, the declaration should be as in the following:
 
Syntax 6: Parameter Constructor declaration:
    .field private int32 iValue

    .method public hidebysig specialname rtspecialname instance void .ctor(int32 i) cil managed
    {
       // Implementation code
    }
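One way the "Implementation code" placeholder above could be filled in is sketched below, as a complete program. It is an assumption about a typical body, not the author's original code: the constructor first calls the base System.Object constructor and then stores its argument in the field. The assembly name CtorSketch and the class Counter are hypothetical.

```il
.assembly extern mscorlib {}
.assembly CtorSketch {}
.module CtorSketch.exe

// Hypothetical class illustrating a parameterized constructor.
.class public auto ansi beforefieldinit CtorSketch.Counter
       extends [mscorlib]System.Object
{
    .field private int32 iValue

    .method public hidebysig specialname rtspecialname
            instance void .ctor(int32 i) cil managed
    {
        .maxstack 2
        // Initialize the base part of the object first.
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        // this.iValue = i (arg 0 is 'this', arg 1 is the parameter)
        ldarg.0
        ldarg.1
        stfld int32 CtorSketch.Counter::iValue
        ret
    }

    .method public hidebysig static void Main() cil managed
    {
        .entrypoint
        .maxstack 2
        ldstr "iValue is {0}"
        // newobj allocates the object and runs .ctor(int32) with 42.
        ldc.i4.s 42
        newobj instance void CtorSketch.Counter::.ctor(int32)
        ldfld int32 CtorSketch.Counter::iValue
        box [mscorlib]System.Int32
        call void [mscorlib]System.Console::WriteLine(string, object)
        ret
    }
}
```

Assembled and run, the sketch should print "iValue is 42". Note that the caller never pushes the this reference for newobj; the instruction creates the instance and passes it to the constructor implicitly.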
The following listing shows a constructor that is invoked during class instantiation. The portion shown is the parameterless instance constructor, which first calls the base System.Object constructor:
 
Listing 3: Constructor declaration in IL code:
    ..
    .module cilConstructor.exe

    .class public auto ansi beforefieldinit cilConstructor.EntryPint
           extends [mscorlib]System.Object
    {
       ..

       .method public hidebysig specialname rtspecialname
               instance void .ctor() cil managed
       {
          .maxstack 8
          IL_0000: ldarg.0
          IL_0001: call instance void [mscorlib]System.Object::.ctor()
          IL_0006: ret
       }
    }

Interface Metadata

 
An interface is similar to a classic COM interface: it is a descriptor of the properties and methods that a class type exposes. An interface cannot provide the implementation of these exposed items, except for static members, and an interface can only implement other interfaces. An interface neither derives from another type, as a class does, nor can any other type derive from it. An interface must not be sealed, and the methods defined in an interface must be marked as virtual. Interface types are defined using the .class directive together with the interface keyword in IL code as in the following:
 
Syntax 7: Interface declaration:
    .namespace testNamespace
    {
       ...
       // Interface declaration section

       .class public interface abstract myInterface
       {
          // properties and methods
       }
    }
A class implements the items exposed by an interface using the implements keyword. Here, we must specify the full name of the interface as in the following:
 
Syntax 8: Interface implementation in class:
    .class public myClass implements testNamespace.myInterface
    {
       // members utilization
    }
Listing 4: Interface declaration in IL code:
    ..
    .class interface public abstract auto ansi cilInterface.ITestInterface
    {
       .method public hidebysig newslot abstract
               virtual instance void sqrt(float64 i) cil managed { }
    }

    .class public auto ansi beforefieldinit cilInterface.cInrfDemo
           extends [mscorlib]System.Object
           implements cilInterface.ITestInterface
    {
       .method public hidebysig newslot virtual final
               instance void sqrt(float64 i) cil managed
       {
          .maxstack 2
          .locals init ([0] float64 cal)
          IL_0000: nop
          IL_0001: ldarg.1
          IL_0002: call float64 [mscorlib]System.Math::Sqrt(float64)
          IL_0007: stloc.0
          IL_0008: ldstr "Interface Demo:: Sqrt is {0}"
          IL_000d: ldloc.0
          IL_000e: box [mscorlib]System.Double
          IL_0013: call void [mscorlib]System.Console::WriteLine(string,object)
          IL_0018: nop
          IL_0019: ret
       }
       ..
    }

Structure Metadata

 
Structures are user-defined types that contain any number of data fields and members that operate on these fields. A structure is a CTS value type: it must be defined as sealed and must derive from System.ValueType. In IL code this base type is stated explicitly with the extends keyword, since a .class without an extends clause derives from System.Object. A .class directive defines a structure as in the following:
 
Syntax 9: Structure Declaration:
    .namespace testNamespace
    {
       ...
       // Structure declaration section

       .class public sealed myStructure extends [mscorlib]System.ValueType
       {
          // members
       }
    }
The following program defines a structure type that has one integer field and a method that performs an operation on that field:
 
Listing 5: Structure declaration in IL code:
    ..
    .module cilStructure.exe

    .class private sequential ansi sealed beforefieldinit cilStructure.sTest
           extends [mscorlib]System.ValueType
    {
       .field public int32 y
       .method public hidebysig instance void square() cil managed
       {
          .maxstack 8
          IL_0000: nop
          IL_0001: ldarg.0
          IL_0002: ldc.i4.4
          IL_0003: stfld int32 cilStructure.sTest::y
          IL_0008: ldstr "Square is {0}"
          IL_000d: ldarg.0
          IL_000e: ldfld int32 cilStructure.sTest::y
          IL_0013: ldarg.0
          IL_0014: ldfld int32 cilStructure.sTest::y
          IL_0019: mul
          IL_001a: box [mscorlib]System.Int32
          IL_001f: call void [mscorlib]System.Console::WriteLine(string,object)
          IL_0024: nop
          IL_0025: ret
       }
    }
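A structure is typically used through a local variable rather than through newobj and a heap reference. The following self-contained sketch illustrates that pattern under assumed names (StructSketch, Point, Program are hypothetical): the local holds the value itself, and fields are accessed through the local's address loaded with ldloca.

```il
.assembly extern mscorlib {}
.assembly StructSketch {}
.module StructSketch.exe

// Hypothetical value type with a single field.
.class public sequential ansi sealed beforefieldinit StructSketch.Point
       extends [mscorlib]System.ValueType
{
    .field public int32 x
}

.class public auto ansi beforefieldinit StructSketch.Program
       extends [mscorlib]System.Object
{
    .method public hidebysig static void Main() cil managed
    {
        .entrypoint
        .maxstack 2
        // 'init' zero-initializes the local, so pt.x starts at 0.
        .locals init ([0] valuetype StructSketch.Point pt)
        // pt.x = 5, written through the address of the local
        ldloca.s pt
        ldc.i4.5
        stfld int32 StructSketch.Point::x
        // print pt.x
        ldstr "x is {0}"
        ldloca.s pt
        ldfld int32 StructSketch.Point::x
        box [mscorlib]System.Int32
        call void [mscorlib]System.Console::WriteLine(string, object)
        ret
    }
}
```

Assembled and run, the sketch should print "x is 5". The local is declared with the valuetype keyword, whereas a reference type local would use class, which is how the IL signature distinguishes the two kinds of types.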

Enum Metadata

 
An enum is also a value type in the CLR. It must, therefore, be marked with the sealed keyword in the IL code, it derives from System.Enum, and it is specified using the .class directive as in the following:
 
Syntax 10: Enum Declarations:
    .namespace testNamespace
    {
       ...
       // Enumerator declaration section

       .class public sealed myEnum extends [mscorlib]System.Enum
       {
          // members
       }
    }
An enumeration typically contains constant fields, each of which must have a value within the range of the underlying type. The following code illustrates an enumeration that defines three constant values: Red, Green, and Blue.
 
Listing 6: Enumerator declaration in IL code:
    ..
    .module cilEnum.exe

    .class public auto ansi sealed cilEnum.eColor extends [mscorlib]System.Enum
    {
       .field public specialname rtspecialname int32 value__
       .field public static literal valuetype cilEnum.eColor Red = int32(0x00000014)
       .field public static literal valuetype cilEnum.eColor Green = int32(0x00000032)
       .field public static literal valuetype cilEnum.eColor Blue = int32(0x00000050)
    }
    ..
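Since enum members are literal fields, code that uses them loads the underlying integer constant directly. The sketch below illustrates this with a hypothetical EnumSketch.Color type that mirrors the values in the listing above (20, 50, 80 in decimal); the names are assumptions made for the example.

```il
.assembly extern mscorlib {}
.assembly EnumSketch {}
.module EnumSketch.exe

// Hypothetical enum with the same numeric values as the listing above.
.class public auto ansi sealed EnumSketch.Color
       extends [mscorlib]System.Enum
{
    .field public specialname rtspecialname int32 value__
    .field public static literal valuetype EnumSketch.Color Red = int32(20)
    .field public static literal valuetype EnumSketch.Color Green = int32(50)
    .field public static literal valuetype EnumSketch.Color Blue = int32(80)
}

.class public auto ansi beforefieldinit EnumSketch.Program
       extends [mscorlib]System.Object
{
    .method public hidebysig static void Main() cil managed
    {
        .entrypoint
        .maxstack 1
        // Color.Green is not loaded from a field; its constant 50 is pushed.
        ldc.i4.s 50
        // Boxing as the enum type lets ToString() resolve the member name.
        box EnumSketch.Color
        call void [mscorlib]System.Console::WriteLine(object)
        ret
    }
}
```

Assembled and run, the sketch should print "Green", because the boxed value carries the enum type and the runtime maps 50 back to the matching member name.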

Generics Metadata

 
Generics allow us to define open types, parameterized by one or more types, that are bound to closed types at runtime. We can build generic classes that hold integers, strings, or any object type. Generics are superior to their non-generic collection counterparts, such as ArrayList, because they offer type safety and avoid boxing and unboxing of value types, which also improves performance.
 
Generic types are identified in IL code by a backtick ( ` ) after the type name, followed by a number that represents the number of type parameters of the generic type.
 
Syntax 11: Generics Declarations:
    // Generic instantiation (an instruction inside a method body)

    newobj instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()
The previous IL code maps to the following C# code, where we instantiate a generic list that holds integers:
    List<int> gObj = new List<int>();
Similarly, the following code snippet uses a generic list with int32 as the type argument, adds two numbers to it, and prints their sum without any type conversion at runtime.
 
Listing 7: Generic declaration in IL code
    ..
    .module cilGenerics.exe

    .class private auto ansi beforefieldinit cilGenerics.cGenrcDemo extends [mscorlib]System.Object
    {
       .method public hidebysig instance void Addition() cil managed
       {
          .maxstack 4
          .locals init ([0] class [mscorlib]System.Collections.Generic.List`1<int32> iCal)
          IL_0000: nop
          IL_0001: newobj instance void class [mscorlib]System.Collections.Generic.List`1<int32>::.ctor()
          IL_0006: stloc.0
          // iCal.Add(10)
          IL_0007: ldloc.0
          IL_0008: ldc.i4.s 10
          IL_000a: callvirt instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0)
          IL_000f: nop
          // iCal.Add(20)
          IL_0010: ldloc.0
          IL_0011: ldc.i4.s 20
          IL_0013: callvirt instance void class [mscorlib]System.Collections.Generic.List`1<int32>::Add(!0)
          IL_0018: nop
          // print iCal[0] + iCal[1]
          IL_0019: ldstr "Generic Demo::Addition is {0}"
          IL_001e: ldloc.0
          IL_001f: ldc.i4.0
          IL_0020: callvirt instance !0 class [mscorlib]System.Collections.Generic.List`1<int32>::get_Item(int32)
          IL_0025: ldloc.0
          IL_0026: ldc.i4.1
          IL_0027: callvirt instance !0 class [mscorlib]System.Collections.Generic.List`1<int32>::get_Item(int32)
          IL_002c: add
          IL_002d: box [mscorlib]System.Int32
          IL_0032: call void [mscorlib]System.Console::WriteLine(string, object)
          IL_0037: nop
          IL_0038: ret
       }
       ..
    }
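The listing above consumes an existing generic type. Defining one follows the same backtick convention, as the following self-contained sketch suggests. The names GenericSketch, Holder, storedItem, and Unwrap are hypothetical; inside the class, !0 refers to the first type parameter, just as it does in the List`1 signatures above.

```il
.assembly extern mscorlib {}
.assembly GenericSketch {}
.module GenericSketch.exe

// Hypothetical generic class with one type parameter (hence `1).
.class public auto ansi beforefieldinit GenericSketch.Holder`1<T>
       extends [mscorlib]System.Object
{
    // A field whose type is the first type parameter.
    .field private !0 storedItem

    .method public hidebysig specialname rtspecialname
            instance void .ctor(!0 initialItem) cil managed
    {
        .maxstack 2
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        // this.storedItem = initialItem
        ldarg.0
        ldarg.1
        stfld !0 class GenericSketch.Holder`1<!0>::storedItem
        ret
    }

    .method public hidebysig instance !0 Unwrap() cil managed
    {
        .maxstack 1
        ldarg.0
        ldfld !0 class GenericSketch.Holder`1<!0>::storedItem
        ret
    }
}

.class public auto ansi beforefieldinit GenericSketch.Program
       extends [mscorlib]System.Object
{
    .method public hidebysig static void Main() cil managed
    {
        .entrypoint
        .maxstack 2
        ldstr "Holder contains {0}"
        // Close the generic type over int32 and construct it with 42.
        ldc.i4.s 42
        newobj instance void class GenericSketch.Holder`1<int32>::.ctor(!0)
        call instance !0 class GenericSketch.Holder`1<int32>::Unwrap()
        box [mscorlib]System.Int32
        call void [mscorlib]System.Console::WriteLine(string, object)
        ret
    }
}
```

Assembled and run, the sketch should print "Holder contains 42". The member signatures keep the placeholder !0 even at the call site; only the owning type, Holder`1<int32>, names the concrete type argument.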

Inheritance Metadata

 
Inheritance is a way in which a derived type guarantees support for all of the type contracts of its base class. In addition, the derived type usually provides additional functionality or specialized behavior. In IL code, a derived class inherits the base class contracts through the extends keyword as in the following.
 
Syntax 12: Inheritance Declarations:
    .namespace testNamespace
    {
       ...
       // Inheritance declaration section

       .class public auto ansi beforefieldinit child_class_name extends base_class
       {
          // members
       }
    }
The following sample demonstrates inheritance with the Father class serving as the base class of the Child class. The Child class can therefore use all of the Father class functionality as well as add new features.
 
Listing 8: Inheritance declaration in IL code
    ..
    .module cilInheritance.exe

    .class public auto ansi beforefieldinit cilInheritance.Father
           extends [mscorlib]System.Object
    {
       .method public hidebysig instance void FatherMethod() cil managed
       {
          .maxstack 8
          IL_0000: nop
          IL_0001: ldstr "this property belong to Father"
          IL_0006: call void [mscorlib]System.Console::WriteLine(string)
          IL_000b: nop
          IL_000c: ret
       }
       ..
    }

    .class public auto ansi beforefieldinit cilInheritance.Child
           extends cilInheritance.Father
    {
       .method public hidebysig instance void ChildMethod() cil managed
       {
          .maxstack 8
          IL_0000: nop
          IL_0001: ldstr "this property belong to Child"
          IL_0006: call void [mscorlib]System.Console::WriteLine(string)
          IL_000b: nop
          IL_000c: ret
       }
       ..
    }

Polymorphism Metadata

 
Polymorphism lets a derived class redefine an existing method to suit its specific needs by overriding a virtual method of the base class. A virtual method is marked with the newslot attribute in the base class only, which creates a new virtual method slot for the defining class and any classes derived from it. The important point to remember is that the newslot attribute must not be specified on the overriding method in the derived class, as in the following:
 
Syntax 13: Polymorphism Declarations:
    // Polymorphic method declaration section (base class) virtual

    .method public hidebysig newslot virtual
            instance void method_name() cil managed

    // Polymorphic method declaration section (derived class) override

    .method public hidebysig virtual
            instance void method_name() cil managed
The following sample changes the inner operation of a method, Calculation(), on its numeric parameters using polymorphism. In the base class, the method performs addition, whereas the derived class overrides the base class method and multiplies the same arguments.
 
Listing 9: Polymorphism declaration in IL code
    ..
    .module cilPolymorphism.exe

    .class public auto ansi beforefieldinit cilPolymorphism.cBase extends [mscorlib]System.Object
    {
       .method public hidebysig newslot virtual
               instance void Calculation(int32 x,int32 y) cil managed
       {
          .maxstack 2
          .locals init ([0] int32 z)
          IL_0000: nop
          IL_0001: ldarg.1
          IL_0002: ldarg.2
          IL_0003: add
          IL_0004: stloc.0
          IL_0005: ldstr "Addition is {0}"
          IL_000a: ldloc.0
          IL_000b: box [mscorlib]System.Int32
          IL_0010: call void [mscorlib]System.Console::WriteLine(string,object)
          IL_0015: nop
          IL_0016: ret
       }
       ..
    }

    .class public auto ansi beforefieldinit cilPolymorphism.cChild extends cilPolymorphism.cBase
    {
       .method public hidebysig virtual
               instance void Calculation(int32 x,int32 y) cil managed
       {
          .maxstack 2
          .locals init ([0] int32 z)
          IL_0000: nop
          IL_0001: ldarg.1
          IL_0002: ldarg.2
          IL_0003: mul
          IL_0004: stloc.0
          IL_0005: ldstr "Multiplication is {0}"
          IL_000a: ldloc.0
          IL_000b: box [mscorlib]System.Int32
          IL_0010: call void [mscorlib]System.Console::WriteLine(string,object)
          IL_0015: nop
          IL_0016: ret
       }
       ..
    }
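How the override is selected at the call site can be sketched with a small, self-contained program. The names DispatchSketch, Shape, Circle, and Describe are hypothetical and stand in for the cBase and cChild classes, whose constructors are elided in the listing above. The caller holds a base-typed reference and uses callvirt, so the runtime picks the method from the object's actual type.

```il
.assembly extern mscorlib {}
.assembly DispatchSketch {}
.module DispatchSketch.exe

.class public auto ansi beforefieldinit DispatchSketch.Shape
       extends [mscorlib]System.Object
{
    .method public hidebysig specialname rtspecialname
            instance void .ctor() cil managed
    {
        .maxstack 1
        ldarg.0
        call instance void [mscorlib]System.Object::.ctor()
        ret
    }

    // newslot virtual: introduces a new virtual slot.
    .method public hidebysig newslot virtual
            instance void Describe() cil managed
    {
        .maxstack 1
        ldstr "base Describe"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}

.class public auto ansi beforefieldinit DispatchSketch.Circle
       extends DispatchSketch.Shape
{
    .method public hidebysig specialname rtspecialname
            instance void .ctor() cil managed
    {
        .maxstack 1
        ldarg.0
        call instance void DispatchSketch.Shape::.ctor()
        ret
    }

    // virtual without newslot: overrides the inherited slot.
    .method public hidebysig virtual
            instance void Describe() cil managed
    {
        .maxstack 1
        ldstr "derived Describe"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }

    .method public hidebysig static void Main() cil managed
    {
        .entrypoint
        .maxstack 1
        // The local is typed as the base class but holds a Circle.
        .locals init ([0] class DispatchSketch.Shape shapeRef)
        newobj instance void DispatchSketch.Circle::.ctor()
        stloc.0
        ldloc.0
        // callvirt names the base method; dispatch uses the runtime type.
        callvirt instance void DispatchSketch.Shape::Describe()
        ret
    }
}
```

Assembled and run, the sketch should print "derived Describe". If Circle::Describe were also marked newslot, it would occupy a separate slot and the same callvirt would reach the base method instead, which is why the attribute belongs on the base declaration only.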

Synopsis

 
This article provided an overview of IL coding and syntax. As with the higher-level C# language, we saw how to code various inherent types of the CLR using IL opcodes and examined the corresponding metadata. We dug deeper into the coding mechanism of typical CLR programming constructs such as constructors, structures, and generics, as well as object-oriented programming implementations such as inheritance, interfaces, encapsulation, and polymorphism, in terms of CIL coding.