JIT Coding

Filip Bulovic

A lesser-known feature of the .NET platform is the ability to invoke a compiler and create code and an assembly from a running instance of an application. There are two ways to do that. The first is a bit simpler and involves the namespaces System.CodeDom and System.CodeDom.Compiler; the second is more efficient and uses the namespace System.Reflection.Emit. Since there are very few examples of how to use System.CodeDom.Compiler, I will start with it.

System.CodeDom.Compiler introductory example

As usual, the introductory example is an assembly that writes another assembly, which in turn writes to the console a string (incidentally "Hello World") received from the first assembly. Very convenient. Here is the code that does all that:

using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.Reflection;

class test
{
    static void Main()
    {
        // Describe what to generate: namespace TheName, an import of System, class Class1.
        CodeCompileUnit compunit = new CodeCompileUnit();
        CodeNamespace TheName = new CodeNamespace("TheName");
        compunit.Namespaces.Add(TheName);
        TheName.Imports.Add(new CodeNamespaceImport("System"));
        CodeTypeDeclaration Class1 = new CodeTypeDeclaration("Class1");
        TheName.Types.Add(Class1);

        // Method Say(string s); its body is inserted as a raw C# snippet.
        CodeMemberMethod Say = new CodeMemberMethod();
        Say.Parameters.Add(new CodeParameterDeclarationExpression(typeof(string), "s"));
        Say.Statements.Add(new CodeExpressionStatement(new CodeSnippetExpression("System.Console.WriteLine(s)")));
        Class1.TypeAttributes = TypeAttributes.Public;
        Class1.Members.Add(Say);
        Say.Attributes = MemberAttributes.Public;
        Say.Name = "Say";

        // Compile the unit in memory.
        CompilerParameters compparams = new CompilerParameters(new string[] { "mscorlib.dll" });
        compparams.GenerateInMemory = true;
        // uncomment the following line if you would like to write the dll to disk
        //compparams.OutputAssembly="HelloWorld.dll";
        Microsoft.CSharp.CSharpCodeProvider csharp = new Microsoft.CSharp.CSharpCodeProvider();
        // CreateCompiler is marked obsolete from .NET Framework 2.0 on; there the
        // provider's own CompileAssemblyFromDom method can be called directly.
        ICodeCompiler cscompiler = csharp.CreateCompiler();
        CompilerResults compresult = cscompiler.CompileAssemblyFromDom(compparams, compunit);
        if (compresult == null || compresult.Errors.Count > 0)
            Environment.Exit(1);

        // Create an instance of the generated class and call Say through reflection.
        object o = compresult.CompiledAssembly.CreateInstance("TheName.Class1");
        Type test = compresult.CompiledAssembly.GetType("TheName.Class1");
        MethodInfo m = test.GetMethod("Say");
        object[] arg = new object[1];
        arg[0] = "Hello World!";
        m.Invoke(o, arg);
    }
}

The code is more or less self-explanatory: you declare the namespace, its imports, the type and its name, the method and its name, and so on. The only unusual thing is:

Say.Statements.Add(new CodeExpressionStatement(new CodeSnippetExpression("System.Console.WriteLine(s)")));

which lets you insert an expression, written in familiar syntax, into the body of the method. Finally we invoke the compiler and, using System.Reflection, create an instance of the type from the newly compiled assembly. I wanted all of that to take place in memory, but if you would like to examine the result of the compilation more closely, uncomment:

//compparams.OutputAssembly="HelloWorld.dll";

If you are not sure that this example is sufficient, don't worry: there is another one later in the article, and you can also try to find on the web the example published by Clemens Vasters - (c) 2001 newtelligence AG. Explaining everything would be quite difficult, so I must ask readers to use the .NET Framework SDK Documentation to find out more about System.CodeDom.
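
A quick way to see what such a CodeDom graph turns into is to ask the provider for source text instead of an assembly. The sketch below rebuilds the graph from the introductory example and returns the generated C#. The names CodeDomDump and GenerateSource are made up for this illustration, and it assumes .NET Framework 2.0 or later, where GenerateCodeFromCompileUnit is available directly on the provider.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

// Hypothetical helper class, not part of the original article.
static class CodeDomDump
{
    // Builds the same graph as the introductory example and returns it as C# source text.
    public static string GenerateSource()
    {
        CodeCompileUnit unit = new CodeCompileUnit();
        CodeNamespace ns = new CodeNamespace("TheName");
        unit.Namespaces.Add(ns);
        ns.Imports.Add(new CodeNamespaceImport("System"));

        CodeTypeDeclaration cls = new CodeTypeDeclaration("Class1");
        ns.Types.Add(cls);

        CodeMemberMethod say = new CodeMemberMethod();
        say.Name = "Say";
        say.Attributes = MemberAttributes.Public;
        say.Parameters.Add(new CodeParameterDeclarationExpression(typeof(string), "s"));
        say.Statements.Add(new CodeExpressionStatement(
            new CodeSnippetExpression("System.Console.WriteLine(s)")));
        cls.Members.Add(say);

        // Generate source text only; nothing is compiled here.
        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        using (StringWriter writer = new StringWriter())
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }

    static void Main()
    {
        Console.WriteLine(GenerateSource());
    }
}
```

Reading the printed source next to the graph-building calls is usually the fastest way to learn which CodeDom class produces which piece of syntax.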

An introductory example for System.Reflection.Emit was supplied by Microsoft; a link to it can be found in the "Microsoft .NET Framework SDK QuickStarts, Tutorials and Samples".

Now that you have gone through it and have an understanding of these two namespaces, it is time to attend to some weaknesses of C#, and of IL in particular.

Mimicking C++ templates from C#

Since IL doesn't support templates, you can't have them in C# or VB. I'm talking about C++ templates here, and this is an elementary example:

#include <iostream>
using namespace std;

template <class T>
T Sum(T a, T b)
{
    return a + b;
}

int main()
{
    cout << Sum(2, 3) << '\n' << Sum(1.1, 2.2) << '\n';
}

The parameter and return types are not fixed in the definition; they are filled in for each call (int for the first call, double for the second), which a C++ compiler does when it instantiates the template. Why are templates important? For many years they have been the basis of generic programming, and because of the popularity of C++ most generic software uses them. Before I start a long discussion, let me mention interfaces. If the function had to return the bigger of two values, we would use the interface IComparable, which is defined in System; but there is no IAddible defined, and I will skip defining it as well. How does C# handle the same problem? Usually by using Object, where the cast to Object always succeeds because all types inherit from it.

But an object doesn't know how to add itself to another object. So you have to roll out a new class whose operator + knows how to add two objects depending on their original type. I leave the conventional solution of this problem to readers and jump straight to an unusual one. The easy one, involving System.CodeDom.Compiler, comes first.

using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.Reflection;

class T
{
    static void Main()
    {
        Console.Write("Value = {0}\tType = {1}\n", Mimic(1, 2).ToString(), Mimic(1, 2).GetType());
        Console.Write("Value = {0}\tType = {1}\n", Mimic(1.1, 3.14).ToString(), Mimic(1.1, 3.14).GetType());
        Console.Write("Value = {0}\tType = {1}\n", Mimic("Hello ", "World!").ToString(), Mimic("Hello ", "World!").GetType());
        Console.Read();
    }

    static object Mimic(object a, object b)
    {
        // Both values must have the same runtime type; it becomes the parameter and return type.
        if (a.GetType() != b.GetType())
            Environment.Exit(1);
        Type rt = a.GetType();

        CodeCompileUnit compunit = new CodeCompileUnit();
        CodeNamespace TheName = new CodeNamespace("TheName");
        compunit.Namespaces.Add(TheName);
        TheName.Imports.Add(new CodeNamespaceImport("System"));
        CodeTypeDeclaration Class1 = new CodeTypeDeclaration("Class1");
        TheName.Types.Add(Class1);

        // Method AddT(rt a, rt b) returning rt; its body returns the snippet a+b.
        CodeMemberMethod AddT = new CodeMemberMethod();
        AddT.ReturnType = new CodeTypeReference(rt);
        AddT.Parameters.Add(new CodeParameterDeclarationExpression(rt, "a"));
        AddT.Parameters.Add(new CodeParameterDeclarationExpression(rt, "b"));
        AddT.Statements.Add(new CodeMethodReturnStatement(new CodeSnippetExpression("a+b")));
        Class1.TypeAttributes = TypeAttributes.Public;
        Class1.Members.Add(AddT);
        AddT.Attributes = MemberAttributes.Public;
        AddT.Name = "AddT";

        // Compile in memory; every call to Mimic runs the compiler again.
        CompilerParameters compparams = new CompilerParameters(new string[] { "mscorlib.dll" });
        compparams.GenerateInMemory = true;
        //compparams.OutputAssembly="factory.dll";
        Microsoft.CSharp.CSharpCodeProvider csharp = new Microsoft.CSharp.CSharpCodeProvider();
        ICodeCompiler cscompiler = csharp.CreateCompiler();
        CompilerResults compresult = cscompiler.CompileAssemblyFromDom(compparams, compunit);
        // Short-circuit || so that Errors is not touched when compresult is null.
        if (compresult == null || compresult.Errors.Count > 0)
            Environment.Exit(1);

        // Create an instance of the generated class and call AddT through reflection.
        object o = compresult.CompiledAssembly.CreateInstance("TheName.Class1");
        Type test = compresult.CompiledAssembly.GetType("TheName.Class1");
        MethodInfo m = test.GetMethod("AddT");
        object[] arg = new object[2];
        arg[0] = a;
        arg[1] = b;
        object result = m.Invoke(o, arg);
        return result;
    }
}

The code is self-explanatory and similar to the first example. Again, if you like, you can write the assembly to disk by uncommenting the following line:

//compparams.OutputAssembly="factory.dll";

Speed is far from desirable, but the option to pass a different code snippet along with the values, if desired, makes this approach even more interesting.

To gain more speed it is necessary to use System.Reflection.Emit; knowledge of IL is also required. The following is my raw code, in which the choice between OpCodes.Add and OpCodes.Add_Ovf (for int types) should be adjusted accordingly. It is also necessary to check that both values making up the pair are of the same type; if they are not, one can try to convert them to the same type, and if that is not possible, an exception can always be thrown. Since this code is intended only as an illustration, I omitted these steps to keep it shorter.

using System;
using System.IO;
using System.Reflection;
using System.Reflection.Emit;
using System.Threading;

public class ASM
{
    public Type build(string currentType)
    {
        // Dynamic assembly and module that live only in memory.
        AssemblyName assemblyName = new AssemblyName();
        assemblyName.Name = "genAssembly";
        AssemblyBuilder assembly
            = Thread.GetDomain().DefineDynamicAssembly(assemblyName, AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("genAssembly");
        //, "genAssembly.dll");

        // public class Gen with a private field _val of the requested type.
        TypeBuilder genClass = module.DefineType("Gen", TypeAttributes.Public);
        FieldBuilder valField = genClass.DefineField("_val",
            Type.GetType(currentType),
            FieldAttributes.Private);

        // Constructor: call Object's constructor, then store the argument in _val.
        Type[] arg = new Type[1];
        arg[0] = Type.GetType(currentType);
        ConstructorBuilder cb = genClass.DefineConstructor(MethodAttributes.Public,
            CallingConventions.Standard, arg);
        ILGenerator ctorIL = cb.GetILGenerator();
        ctorIL.Emit(OpCodes.Ldarg_0);
        Type objectClass = Type.GetType("System.Object");
        ConstructorInfo bc = objectClass.GetConstructor(new Type[0]);
        ctorIL.Emit(OpCodes.Call, bc);
        ctorIL.Emit(OpCodes.Ldarg_0);
        ctorIL.Emit(OpCodes.Ldarg_1);
        ctorIL.Emit(OpCodes.Stfld, valField);
        ctorIL.Emit(OpCodes.Ret);

        // GetVal: return the value stored in _val.
        MethodBuilder mb = genClass.DefineMethod("GetVal", MethodAttributes.Public | MethodAttributes.HideBySig,
            Type.GetType(currentType), null);
        MethodBuilder getValMethod = mb;
        ILGenerator methodIL = mb.GetILGenerator();
        methodIL.Emit(OpCodes.Ldarg_0);
        methodIL.Emit(OpCodes.Ldfld, valField);
        methodIL.Emit(OpCodes.Ret);

        // operator +: read both values through GetVal, add them, wrap the sum in a new Gen.
        arg = new Type[2];
        arg[0] = genClass;
        arg[1] = genClass;
        mb = genClass.DefineMethod("op_Addition", MethodAttributes.Public | MethodAttributes.Static | MethodAttributes.HideBySig | MethodAttributes.SpecialName,
            genClass, arg);
        methodIL = mb.GetILGenerator();
        methodIL.Emit(OpCodes.Ldarg_0);
        methodIL.Emit(OpCodes.Callvirt, getValMethod);
        methodIL.Emit(OpCodes.Ldarg_1);
        methodIL.Emit(OpCodes.Callvirt, getValMethod);
        methodIL.Emit(OpCodes.Add);
        methodIL.Emit(OpCodes.Newobj, cb);
        methodIL.Emit(OpCodes.Ret);

        // Implicit conversion to string through Convert.ToString.
        Type[] argTypes = new Type[1];
        argTypes[0] = Type.GetType(currentType);
        MethodInfo convToStrMethod = typeof(Convert).GetMethod("ToString", argTypes);
        arg = new Type[1];
        arg[0] = genClass;
        mb = genClass.DefineMethod("op_Implicit", MethodAttributes.Public | MethodAttributes.Static | MethodAttributes.HideBySig | MethodAttributes.SpecialName,
            Type.GetType("System.String"), arg);
        methodIL = mb.GetILGenerator();
        methodIL.Emit(OpCodes.Ldarg_0);
        methodIL.Emit(OpCodes.Callvirt, getValMethod);
        methodIL.Emit(OpCodes.Call, convToStrMethod);
        methodIL.Emit(OpCodes.Ret);

        return genClass.CreateType();
    }
}

public class Pair
{
    object a, b;
    string current;

    public Pair(object A, object B)
    {
        a = A; b = B; current = A.GetType().ToString();
    }

    public void Sum()
    {
        // Build the Gen type for the current value type and wrap both values in it.
        ASM asm = new ASM();
        Type t = asm.build(current);
        a = Activator.CreateInstance(t, new Object[] { a });
        b = Activator.CreateInstance(t, new Object[] { b });
        // Call the generated operator +, then unwrap the result with GetVal.
        Object obj = t.InvokeMember("op_Addition", BindingFlags.InvokeMethod, null, a, new Object[] { a, b });
        obj = t.InvokeMember("GetVal", BindingFlags.InvokeMethod, null, obj, null);
        Console.WriteLine("Type is " + obj.GetType() + " and sum={0}", obj.ToString());
    }
}
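
The listing above always emits OpCodes.Add. The type check and the type-dependent opcode choice that it leaves out could look roughly like the sketch below. It is not the article's code: the class CheckedAdder and the helper PickAdd are hypothetical names, and it uses DynamicMethod (available from .NET Framework 2.0 on) only to keep the example short.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical helper class, not part of the original article.
static class CheckedAdder
{
    // Overflow-checked add for the integer types, plain add for floating point.
    static OpCode PickAdd(Type t)
    {
        switch (Type.GetTypeCode(t))
        {
            case TypeCode.Int32:
            case TypeCode.Int64:
                return OpCodes.Add_Ovf;
            case TypeCode.Single:
            case TypeCode.Double:
                return OpCodes.Add;
            default:
                throw new NotSupportedException("No add opcode chosen for " + t.FullName);
        }
    }

    public static object Add(object a, object b)
    {
        // The check the listing skips: both values must have the same type.
        if (a.GetType() != b.GetType())
            throw new ArgumentException("Both values must have the same type.");

        Type t = a.GetType();
        OpCode add = PickAdd(t);

        // Emit: load both arguments, add them with the chosen opcode, return.
        DynamicMethod dm = new DynamicMethod("Add_" + t.Name, t, new Type[] { t, t });
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(add);
        il.Emit(OpCodes.Ret);

        // Invoked through reflection, so arguments and result are boxed.
        return dm.Invoke(null, new object[] { a, b });
    }

    static void Main()
    {
        Console.WriteLine(Add(2, 3));
        Console.WriteLine(Add(1.5, 2.25));
    }
}
```

With int.MaxValue and 1 as arguments, add.ovf raises an OverflowException instead of silently wrapping around; because the method is invoked through reflection, the caller should expect it as the inner exception of a TargetInvocationException.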

Class Pair is a friendly wrapper that saves the user from the intricacies of System.Reflection.Emit. The code is intended to be compiled as a library. Pairs of strings are not supported; support for them could be added inside class Pair in the high-level language. At the very beginning of Sum, insert something like this:

if (current == "System.String")
{
    Console.WriteLine(Convert.ToString(a) + Convert.ToString(b));
    return;
}

Changing "current" from a string to a Type, and the "build" method to accept a Type instead of a string, eliminates a few Type.GetType(currentType) calls, but I didn't find any significant performance improvement. So here is the code from the test app:

Pair p=new Pair(1,2);
p.Sum();
p=new Pair(1.2,2.4);
p.Sum();
p=new Pair(3.2f,5.4f);
p.Sum();
Console.Read();

Don't forget to reference the library: csc /r:<file name> ...

Moving from the comfort of writing code in a high-level language like C# or VB to the difficulties of working in assembly language will probably be a big minus for this approach in terms of the number of potential users (developers, not end users). But the increase in speed is significant: while the first example takes around 5 seconds to execute (on my machine, which is not very fast), the second executes in 0.2 seconds. Significantly changing the content of the method in the second example will probably require work similar to writing a compiler. But if we restrict the number of possible methods, it is always possible to prepare a number of templates for creating the IL code, which would make such changes possible to some degree. Combining that with the factory pattern could result in something that is not that difficult to work with.
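
As a rough sketch of that last idea, a factory could keep a small fixed set of IL "templates" and build each generated method only once per operation and type. Everything below is an assumption made for illustration: the class BinaryOpFactory, its operation names and the BuildCount counter are hypothetical, and DynamicMethod (available from .NET Framework 2.0 on) stands in for the full type-building code shown earlier.

```csharp
using System;
using System.Collections;
using System.Reflection.Emit;

// Hypothetical factory, not part of the original article. Not thread-safe.
static class BinaryOpFactory
{
    // Cache of generated methods, keyed by "operation|type".
    static readonly Hashtable cache = new Hashtable();

    // Counts how many methods were actually generated (for demonstration only).
    public static int BuildCount = 0;

    // The restricted set of "templates": one opcode per supported operation.
    static OpCode Template(string op)
    {
        switch (op)
        {
            case "add": return OpCodes.Add;
            case "sub": return OpCodes.Sub;
            case "mul": return OpCodes.Mul;
            default: throw new NotSupportedException("No IL template for operation " + op);
        }
    }

    public static object Apply(string op, object a, object b)
    {
        Type t = a.GetType();
        if (t != b.GetType())
            throw new ArgumentException("Both values must have the same type.");
        if (t != typeof(int) && t != typeof(long) && t != typeof(float) && t != typeof(double))
            throw new NotSupportedException("Unsupported type " + t.FullName);

        string key = op + "|" + t.FullName;
        DynamicMethod dm = (DynamicMethod)cache[key];
        if (dm == null)
        {
            // First request for this operation and type: emit the method and remember it.
            OpCode code = Template(op);
            dm = new DynamicMethod(op + "_" + t.Name, t, new Type[] { t, t });
            ILGenerator il = dm.GetILGenerator();
            il.Emit(OpCodes.Ldarg_0);
            il.Emit(OpCodes.Ldarg_1);
            il.Emit(code);
            il.Emit(OpCodes.Ret);
            cache[key] = dm;
            BuildCount++;
        }
        return dm.Invoke(null, new object[] { a, b });
    }

    static void Main()
    {
        Console.WriteLine(Apply("mul", 6, 7));
        Console.WriteLine(Apply("mul", 8, 9));   // reuses the method built for the first call
        Console.WriteLine("Methods generated: " + BuildCount);
    }
}
```

The caller only names an operation and passes values; the IL stays hidden behind the factory, and the cost of generating code is paid once per operation and type rather than on every call.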