Disassembler Mechanized: Part 2

Before reading this article, you should read the first part of this series.

Introduction

The previous article covered the essential configuration: importing the external DLLs into the solution and installing the NuGet packages. As stated earlier, building the custom disassembler involves several layers of development, and the previous article already covered the user interface design, obtaining the assembly origin information and decompiling the assembly members. We now continue by explaining how to obtain the disassembled code in both C# and MSIL.

The .NET CLR supports several programming languages, such as C#, Visual Basic .NET, F# and managed C++. Components written in, for example, VB.NET or C++ can easily be reused in code written in another language, for instance C#. Code from these high-level languages is compiled into a Common Intermediate Language (IL) that runs on the Common Language Runtime (CLR). There are several reasons to disassemble code, ranging from interoperability to recovering lost source code and finding security vulnerabilities. Disassembling can assist in auditing the implementation of security-sensitive features such as authentication, authorization and encryption. Disassembling .NET clients for security purposes can also help to ensure that the software performs the expected tasks without hidden features such as spyware or adware.

UI Design recap
 
How we design the user interface of this software does not matter much; our main goal is to implement the disassembling features. Before moving forward, however, it is necessary to become familiar with the controls placed in the user interface, because we shall direct specific form controls to respond to events. For instance, the Tree View control that displays the modules and members of the assembly will display the corresponding MSIL or C# source when a method is selected. Although this software uses numerous form controls, we elaborate here only on the controls that are used in the code.



Getting Started

The moment the user loads a .NET assembly, the TreeView control is activated and lists the entire contents of the assembly in terms of modules and methods. The functionality covered in this article is to show the source code of a selected member in C# or IL. The TreeView control simplifies this job: when we select a specific method in the assembly, the equivalent source code (C# and MSIL) appears in the Rich Text Box controls located in the Tab control. Hence, we create an AfterSelect event handler for the TreeView control and place the following code into it:
    private void tvMembers_AfterSelect(object sender, TreeViewEventArgs e)
    {
        try
        {
            populateCsharpCode();
            populateILCode();
        }
        catch
        {
            MessageBox.Show("Expand the Namespace");
            return;
        }
    }
We have placed two methods inside tvMembers_AfterSelect(): populateCsharpCode(), which displays the C# source code, and populateILCode(), which displays the MSIL code.

C# Code Disassembling

In this section we describe the process of producing C# source code for a method selected in the TreeView control. You may have seen this kind of source code generation in popular disassemblers such as ILSpy, ILPeek and Reflector. We are, in fact, implementing the same functionality in our software.

The very first line of code in populateCsharpCode() reads the assembly whose path is in the text box control into an implicitly typed (var) variable. Using this variable, we then enumerate the types in the main module of the assembly with a while loop as in the following:
    var assembly = AssemblyDefinition.ReadAssembly(txtURL.Text);
    IEnumerator enumerator = assembly.MainModule.Types.GetEnumerator();
    while (enumerator.MoveNext())
    {
        … ..
    }
In the while loop, we define a TypeDefinition object that holds the current type of the assembly. It is also used to explore the methods inside that type as in the following:
    TypeDefinition td = (TypeDefinition)enumerator.Current;
    IEnumerator enumerator2 = td.Methods.GetEnumerator();
    while (enumerator2.MoveNext())
    {..}
Now we get a reference to the current method of that type in a MethodDefinition object and declare an AstBuilder variable; the AstBuilder class is what performs the decompilation.
    MethodDefinition method_definition = (MethodDefinition)enumerator2.Current;
    AstBuilder ast_Builder = null;
We again go through the types in the main module using a foreach construct and create an AstBuilder for each one; the selected method is later passed to this AstBuilder to disassemble its C# source code as in the following:
    foreach (var typeInAssembly in assembly.MainModule.Types)
    {
        ast_Builder = new AstBuilder(
            new ICSharpCode.Decompiler.DecompilerContext(assembly.MainModule) { CurrentType = typeInAssembly });
In this implementation, only methods are shown in the contents portion. Hence, we compare each method name with the text of the selected node to confirm that a method, rather than some other member of the assembly, has been selected as in the following:
        foreach (var method in typeInAssembly.Methods)
        {
            if (method.Name == tvMembers.SelectedNode.Text)
            {
                ….
            }
        }
    }
Finally, in the if block, we first clear the Rich Text Box control and pass the selected method to the AddMethod() method of the AstBuilder class. Then we generate the output into a StringWriter object and display it in the Rich Text Box control as in the following:
    rtbCsharpCode.Clear();
    ast_Builder.AddMethod(method);
    StringWriter output = new StringWriter();
    ast_Builder.GenerateCode(new PlainTextOutput(output));
    string result = output.ToString();
    rtbCsharpCode.AppendText(result);
    output.Dispose();
Now that we have gone through the code piece by piece, the following listing shows the complete C# disassembling code:
    // Namespaces used: System.Collections, System.IO, Mono.Cecil,
    // ICSharpCode.Decompiler, ICSharpCode.Decompiler.Ast
    private void populateCsharpCode()
    {
        // Load the assembly whose path is in the text box
        var assembly = AssemblyDefinition.ReadAssembly(txtURL.Text);
        IEnumerator enumerator = assembly.MainModule.Types.GetEnumerator();

        // Walk every type in the main module
        while (enumerator.MoveNext())
        {
            TypeDefinition td = (TypeDefinition)enumerator.Current;
            IEnumerator enumerator2 = td.Methods.GetEnumerator();

            // Walk every method of the current type
            while (enumerator2.MoveNext())
            {
                MethodDefinition method_definition = (MethodDefinition)enumerator2.Current;
                AstBuilder ast_Builder = null;

                foreach (var typeInAssembly in assembly.MainModule.Types)
                {
                    // One decompiler context per type
                    ast_Builder = new AstBuilder(
                        new ICSharpCode.Decompiler.DecompilerContext(assembly.MainModule) { CurrentType = typeInAssembly });

                    foreach (var method in typeInAssembly.Methods)
                    {
                        // Decompile only the method selected in the tree view
                        if (method.Name == tvMembers.SelectedNode.Text)
                        {
                            rtbCsharpCode.Clear();
                            ast_Builder.AddMethod(method);
                            StringWriter output = new StringWriter();
                            ast_Builder.GenerateCode(new PlainTextOutput(output));
                            string result = output.ToString();
                            rtbCsharpCode.AppendText(result);
                            output.Dispose();
                        }
                    }
                }
            }
        }
    }
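The listing above walks every method of every type in its two outer loops and, inside them, repeats the whole search again, so the matching method is decompiled once per method in the assembly. A minimal sketch of a single-pass lookup follows, assuming the same Mono.Cecil types used in the listing. The names MethodLocator and FindMethods are hypothetical and introduced only for illustration, and filtering on the type name is an assumption borrowed from the IL listing later in this article, which reads SelectedNode.Parent.Text.

```csharp
using System.Collections.Generic;
using System.Linq;
using Mono.Cecil;

// Hypothetical helper, not part of the original tool.
public static class MethodLocator
{
    // Returns every method called methodName that is declared in a
    // top-level type called typeName. Overloads share one name, so the
    // result may contain more than one method.
    public static List<MethodDefinition> FindMethods(
        AssemblyDefinition assembly, string typeName, string methodName)
    {
        return assembly.MainModule.Types
            .Where(t => t.Name == typeName)       // the selected node's parent
            .SelectMany(t => t.Methods)           // all methods of that type
            .Where(m => m.Name == methodName)     // the selected node itself
            .ToList();
    }
}
```

With such a helper, populateCsharpCode() could create one AstBuilder whose CurrentType is the DeclaringType of the found method and call AddMethod() for each result, instead of nesting four loops.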
IL Code Disassembling

Producing the C# source code was considerably more involved than producing the IL code. In this segment, we produce the MSIL code of the selected method of the current assembly. The process is nearly the same as in the previous section, but this time we do not need the AstBuilder class to disassemble the code. Rather, a couple of Mono.Cecil classes such as ILProcessor and Instruction are sufficient to produce the IL code of the selected method as in the following:
    if (method_definition.Name == tvMembers.SelectedNode.Text &&
        !method_definition.IsSetter && !method_definition.IsGetter)
    {
        rtbILCode.Clear();
        ILProcessor cilProcess = method_definition.Body.GetILProcessor();
        foreach (Instruction ins in cilProcess.Body.Instructions)
        {
            rtbILCode.AppendText(ins + Environment.NewLine);
        }
    }
Here, in the foreach loop, we simply enumerate all the IL instructions of the method and append them to the Rich Text Box control. The following listing presents the entire code of the IL disassembly:
    // Namespaces used: System, System.Collections, Mono.Cecil, Mono.Cecil.Cil
    private void populateILCode()
    {
        // Load the assembly whose path is in the text box
        var assembly = AssemblyDefinition.ReadAssembly(txtURL.Text);
        IEnumerator enumerator = assembly.MainModule.Types.GetEnumerator();

        while (enumerator.MoveNext())
        {
            TypeDefinition td = (TypeDefinition)enumerator.Current;

            // Only look inside the type that owns the selected node
            if (td.Name == tvMembers.SelectedNode.Parent.Text)
            {
                IEnumerator enumerator2 = td.Methods.GetEnumerator();
                while (enumerator2.MoveNext())
                {
                    MethodDefinition method_definition = (MethodDefinition)enumerator2.Current;

                    // Match the selected method, skipping property accessors
                    if (method_definition.Name == tvMembers.SelectedNode.Text &&
                        !method_definition.IsSetter && !method_definition.IsGetter)
                    {
                        rtbILCode.Clear();
                        ILProcessor cilProcess = method_definition.Body.GetILProcessor();

                        // Each Instruction is converted to text by its ToString()
                        foreach (Instruction ins in cilProcess.Body.Instructions)
                        {
                            rtbILCode.AppendText(ins + Environment.NewLine);
                        }
                    }
                }
            }
        }
    }
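One case the listing above does not handle is a method without an IL body, such as an abstract or P/Invoke method: Mono.Cecil returns null for its Body, so the call to GetILProcessor() throws and the user only sees the generic message from the AfterSelect handler. The following is a minimal sketch of a guarded dump that returns the text instead of writing to a control. The names IlDumper and Dump are hypothetical, and reading Body.Instructions directly rather than through ILProcessor is a simplification made for this sketch.

```csharp
using System.Text;
using Mono.Cecil;
using Mono.Cecil.Cil;

// Hypothetical helper, not part of the original tool.
public static class IlDumper
{
    // Returns the IL of a method as text, one instruction per line.
    public static string Dump(MethodDefinition method)
    {
        // Abstract, extern and P/Invoke methods carry no IL body.
        if (!method.HasBody)
            return "// " + method.FullName + " has no IL body";

        var sb = new StringBuilder();
        foreach (Instruction ins in method.Body.Instructions)
        {
            // Instruction.ToString() yields the offset, opcode and operand.
            sb.AppendLine(ins.ToString());
        }
        return sb.ToString();
    }
}
```

In populateILCode(), the result could then be shown with rtbILCode.Text = IlDumper.Dump(method_definition), which keeps the user interface code separate from the disassembly logic.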
Testing

It is important to test both of the implementations described above. We shall start with the C# source generation. For this we need an exe or DLL file whose source code we shall regenerate using this software. The following DumySoftware.exe application is a typical login authentication mechanism that blocks access unless a correct user name and password are entered, as in the following:



Hence, we open this exe file in the Spyware Injector and Decompiler software. It displays the exe file contents along with its origin information. The moment we expand the main modules of this assembly in the Tree View control and select a method, its C# source code appears in the Tab control as in the following:



We can also view the MSIL code, like the code we would see using the ILDASM.exe utility. The process of MSIL disassembly is similar to C# decompilation: we first select the method in the Tree View control and then switch to the IL code tab as in the following:

 

Final Note

This is the second part of “disassembler mechanized”, presenting additional features developed for the custom disassembler. The goal of this article was to show how to build a disassembler that produces code from a .NET assembly in both C# and IL. We have walked through the process of obtaining C# code step by step, along with the generation of MSIL code. In the next part we shall present the development of custom exe or code injection tactics in the form of both a message box and spyware.