// TODO: Sorry, this document is still a work in progress.
- Sorry, the IL2C development environment currently supports Visual Studio 2017 only. If 2017 is installed, you can also use a side-by-side installation of Visual Studio 2019.
- Run "init-tools.bat", then run "build-runtime.bat". "build-runtime.bat" takes an optional configuration name argument, either "Debug" or "Release", and implicitly applies "Debug" if you don't give one. (The names are case sensitive.)
- Open il2c.sln in Visual Studio. Your environment needs C#, VC++ and the NUnit3 VSIX add-in enabled.
- Build with "Debug - AnyCPU" configuration.
- If the build shows no errors, start the unit tests from the Test Explorer (Run All).
- The unit tests take a long time on the first run, because they automatically download the MinGW platform and run on it.
- After all tests pass, you are ready to hack!
- Essentially, the unit tests compare the results executed on the .NET CLR against the results executed by the native code compiled with gcc.
- The test code is in the "tests/IL2C.Core.Test.Target" project. The translated code is placed in the "bin/Debug/net462" subfolder of the "tests/IL2C.Core.Test.Fixture" project.
- For details about the CI setup, see the "azure-pipelines.yml" file.
- IL2C currently contains a VC++ project file because it makes debugging easier. You don't need it for building: the "IL2C.Runtime.vcxproj" project file is NOT REQUIRED for any build (manual or CI).
- If you want to see the internals of IL2C, I think these slides will help you: Making archive IL2C #6-55: dotNET 600 2018 session slide
We can use the VC++ debugger from a failed unit test item. It is illustrated below:
// TODO: details
IL2C uses the types from the "stdint.h", "stdbool.h", "float.h" and "wchar.h" headers.
Full type name | Alias |
---|---|
System.Boolean | bool |
System.Byte | uint8_t |
System.Int16 | int16_t |
System.Int32 | int32_t |
System.Int64 | int64_t |
System.SByte | int8_t |
System.UInt16 | uint16_t |
System.UInt32 | uint32_t |
System.UInt64 | uint64_t |
System.IntPtr | intptr_t |
System.UIntPtr | uintptr_t |
System.Single | float |
System.Double | double |
System.Char | wchar_t |
System.Void | void |
IL2C sometimes refers to these types by a "mangled type name." For example, "System.Int32" is mangled to "System_Int32". A simple variable declaration uses the alias name:
void foo()
{
// Using alias name
int32_t i32Value;
}
But when the type is passed to one of IL2C's custom operators, the mangled type name is used:
void foo()
{
int32_t i32Value;
System_ValueType* value;
// The mangled type name must be used here instead of the alias name.
value = il2c_box(&i32Value, System_Int32);
}
Implementation: IL2C.Utilities.GetCLanguageTypeName()
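One plausible reason a macro-style operator needs the mangled name is token pasting. The following is a minimal sketch under that assumption, not IL2C's actual implementation; `My_TypeInfo`, `my_typeinfo` and the `_TYPEINFO__` suffix are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins: a mangled type name and a per-type descriptor. */
typedef int32_t System_Int32;
typedef struct { const char* name; size_t size; } My_TypeInfo;

static const My_TypeInfo System_Int32_TYPEINFO__ =
    { "System.Int32", sizeof(System_Int32) };

/* Token pasting builds the descriptor symbol from the mangled type name.
   An alias such as int32_t would paste into "int32_t_TYPEINFO__",
   which is not defined anywhere, so the alias cannot be used here. */
#define my_typeinfo(typeName) (&typeName##_TYPEINFO__)

/* Looks up the CLR-style name through the pasted symbol. */
static const char* my_demo_type_name(void)
{
    const My_TypeInfo* info = my_typeinfo(System_Int32);
    assert(info != NULL);
    return info->name;
}
```

With this shape, `my_typeinfo(System_Int32)` resolves at compile time, while `my_typeinfo(int32_t)` fails to compile because no matching symbol exists.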
// TODO: details
A System.String never changes its string body. So IL2C places literal string bodies in the rdata section with the "IL2C_CONST_STRING()" macro.
The macro sets the constant flag in the header. The GC ignores constant string instances, but it tracks string instances created at runtime (dynamically generated with the string constructors).
// A constant literal string is placed in the read-only section.
IL2C_CONST_STRING(string1__, L"Hello world IL2C!");
// ...
// Const literal string is assignable.
System_String* str;
str = string1__;
// A string instance created from a UTF-8 string.
// It is placed on the heap (the string body is copied).
System_String* str;
str = il2c_new_string_from_utf8("Hello world dynamic string!");
Macro: IL2C_CONST_STRING(symbol, str)
Dynamic generation: il2c_new_string(pString)
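The difference between a constant string and a runtime string can be sketched as follows. This is a simplified illustration, not the real runtime: `My_String`, `MY_CONST_STRING`, `my_new_string` and `my_gc_must_track` are hypothetical names, and the real header layout is different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical string object: a header flag plus a pointer to the body. */
typedef struct {
    bool isConst;          /* true: literal in static storage, collector skips it */
    const wchar_t* body;
} My_String;

/* Defines a constant instance; both the object and its body are static data. */
#define MY_CONST_STRING(symbol, str) \
    static const My_String symbol##_storage__ = { true, str }; \
    static const My_String* const symbol = &symbol##_storage__

/* Creates a heap instance and copies the body, as a runtime-built string would. */
static My_String* my_new_string(const wchar_t* src)
{
    assert(src != NULL);
    size_t len = wcslen(src) + 1;               /* include the terminator */
    My_String* s = malloc(sizeof *s);
    wchar_t* copy = malloc(len * sizeof *copy);
    if (s == NULL || copy == NULL) { free(s); free(copy); return NULL; }
    wmemcpy(copy, src, len);
    s->isConst = false;
    s->body = copy;
    return s;
}

/* A collector only needs to consider non-constant instances. */
static bool my_gc_must_track(const My_String* s) { return !s->isConst; }

static void my_free_string(My_String* s) { free((void*)s->body); free(s); }

MY_CONST_STRING(my_hello, L"Hello world IL2C!");
```

Both kinds of instance are used through the same pointer type, so calling code does not need to know which one it holds.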
// TODO: details
IL2C aggregates all array types into a single array type named "System.Array". The translated code writes the pseudo type name with the macro il2c_arraytype(elementTypeName) for readability.
It's a pseudo generic implementation: the macros il2c_array_item0ptr__(array) and il2c_array_item(array, elementTypeName, index) access array elements with tiny overhead.
// The array type variable (int[])
il2c_arraytype(System_Int32)* arr;
// Create array instance (arr = new int[100])
arr = il2c_new_array(System_Int32, 100);
// Access array element (arr[10] = 12345)
il2c_array_item(arr, System_Int32, 10) = 12345;
An array instance is created with il2c_new_array(elementTypeName, length).
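A pseudo generic array of this kind can be sketched with a single header type and casting macros. The `my_` names below are hypothetical and only mirror the shape of the IL2C macros. The sketch does no bounds checking and is not the actual runtime code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef int32_t System_Int32;   /* stand-in for the mangled element type */

/* Hypothetical single array type shared by every element type.
   The element storage follows the header in the same allocation. */
typedef struct {
    uintptr_t length;
    size_t elementSize;
} My_Array;

/* The element type only documents intent; every array is a My_Array. */
#define my_arraytype(elementTypeName) My_Array

/* Pointer to element 0, located directly after the header. */
#define my_array_item0ptr(array) ((void*)((array) + 1))

/* Typed lvalue access: cast the storage, then index it. */
#define my_array_item(array, elementTypeName, index) \
    (((elementTypeName*)my_array_item0ptr(array))[index])

#define my_new_array(elementTypeName, length) \
    my_new_array_impl(sizeof(elementTypeName), (length))

/* Allocates header plus zero-initialized element storage. */
static My_Array* my_new_array_impl(size_t elementSize, uintptr_t length)
{
    assert(elementSize > 0);
    My_Array* arr = calloc(1, sizeof *arr + elementSize * length);
    if (arr != NULL) {
        arr->length = length;
        arr->elementSize = elementSize;
    }
    return arr;
}
```

Because `my_array_item` expands to a plain indexed lvalue, the access costs the same as indexing a raw C array.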
// TODO: details
Delegates have a very complex implementation in the .NET Framework/.NET Core. So IL2C simply uses one unified structure:
typedef const struct IL2C_METHOD_TABLE_DECL
{
System_Object* target;
intptr_t methodPtr;
} IL2C_METHOD_TABLE;
struct System_Delegate
{
System_Delegate_VTABLE_DECL__* vptr0__;
uintptr_t count__;
IL2C_METHOD_TABLE methodtbl__[1];
};
A "System_Delegate" holds multiple method targets as "IL2C_METHOD_TABLE" entries. It may surprise you that an IL2C delegate has a variable storage size, the same as System.String: the "methodtbl__" field is stretched. If we combine multiple delegates with the "System.Delegate.Combine()" method, all delegate targets are combined into a single instance.
The System_MulticastDelegate type is the same as the single-cast delegate. That is, an IL2C delegate always has multicast capability.
Type definition: System_Delegate
Macro: il2c_new_delegate(typeName, object, method)
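The variable-size idea can be sketched as below. This is an illustration under assumptions, not the runtime's code: it uses a C99 flexible array member instead of the `[1]` field shown above, a typed function pointer instead of `intptr_t`, and hypothetical names (`My_Delegate`, `my_combine`, `my_invoke`).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void (*My_Method)(void* target, int* acc);
typedef struct { void* target; My_Method method; } My_MethodEntry;

/* Hypothetical variable-size delegate: "count" entries in one allocation. */
typedef struct {
    uintptr_t count;
    My_MethodEntry entries[];   /* C99 flexible array member */
} My_Delegate;

static My_Delegate* my_new_delegate(void* target, My_Method method)
{
    My_Delegate* d = malloc(sizeof *d + sizeof(My_MethodEntry));
    if (d == NULL) return NULL;
    d->count = 1;
    d->entries[0].target = target;
    d->entries[0].method = method;
    return d;
}

/* Combine: one new instance large enough for both entry lists. */
static My_Delegate* my_combine(const My_Delegate* a, const My_Delegate* b)
{
    assert(a != NULL && b != NULL);
    uintptr_t count = a->count + b->count;
    My_Delegate* d = malloc(sizeof *d + count * sizeof(My_MethodEntry));
    if (d == NULL) return NULL;
    d->count = count;
    memcpy(d->entries, a->entries, a->count * sizeof(My_MethodEntry));
    memcpy(d->entries + a->count, b->entries, b->count * sizeof(My_MethodEntry));
    return d;
}

/* Invoke every entry in order, as a multicast delegate does. */
static void my_invoke(const My_Delegate* d, int* acc)
{
    for (uintptr_t i = 0; i < d->count; i++)
        d->entries[i].method(d->entries[i].target, acc);
}

/* Two sample targets used to observe the invocation order. */
static void my_add_one(void* target, int* acc) { (void)target; *acc += 1; }
static void my_double(void* target, int* acc) { (void)target; *acc *= 2; }
```

A single-entry delegate and a combined one share the same layout, which is why no separate multicast type is needed in this scheme.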
// TODO: details
C's enum types (prior to C23) cannot declare a storage size the way C# can. It is a limitation of the language design.
// Enum type in the C Language:
enum Test_Colors /* : uint8_t */ // can't do it
{
None, // These symbols are not scoped by the type name; they are placed in the global scope
Red,
Blue,
Green
};
So IL2C defines enum types and values in a simple way:
// Storage size
typedef uint8_t Test_Colors;
// Mangled with type name prefix.
static const Test_Colors Test_Colors_None = 0;
static const Test_Colors Test_Colors_Red = 1;
static const Test_Colors Test_Colors_Blue = 2;
static const Test_Colors Test_Colors_Green = 3;
The System.Enum type implementation is used only for boxed instances.
IL2C supports boxing and unboxing operations. When a value type is boxed, it is allocated on the heap (malloc) with an instance header, and the value body is copied into it.
The unbox operation simply yields a pointer that refers directly to the value body. Dereference the pointer to read the value:
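A small sketch of how the typedef-based enum behaves in practice. The `Test_Colors` definitions are copied from the example above; `my_color_name` and `My_Pixel` are hypothetical helpers added for illustration. Note that `static const` objects are not integer constant expressions in C before C23, so the sketch compares with `if` instead of using `case` labels.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Storage size is fixed by the typedef, not chosen by the compiler. */
typedef uint8_t Test_Colors;

static const Test_Colors Test_Colors_None = 0;
static const Test_Colors Test_Colors_Red = 1;
static const Test_Colors Test_Colors_Blue = 2;
static const Test_Colors Test_Colors_Green = 3;

/* The enum packs into one byte inside a structure. */
typedef struct { Test_Colors color; uint8_t alpha; } My_Pixel;

/* Maps a value back to its symbol name. */
static const char* my_color_name(Test_Colors value)
{
    if (value == Test_Colors_None) return "None";
    if (value == Test_Colors_Red) return "Red";
    if (value == Test_Colors_Blue) return "Blue";
    if (value == Test_Colors_Green) return "Green";
    return "(unknown)";
}
```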
// Boxing
int32_t value = 123;
il2c_boxedtype(System_Int32)* boxedInt32 = il2c_box(&value, System_Int32);
// Unboxing (with dereference)
int32_t unboxedInt32;
unboxedInt32 = *il2c_unbox(boxedInt32, System_Int32);
The boxed type is declared as a raw pointer in C. But it differs from both a managed object reference (objref) and a managed reference (&). The "il2c_boxedtype()" macro produces a pseudo boxed type, like the "il2c_arraytype()" macro, again for readability.
To call a method with the real (unboxed) this pointer:
il2c_boxedtype(Test_Foo)* boxedFoo;
boxedFoo = ...;
// Unboxing without a dereference, because the pointer is passed to an instance method:
Test_Foo* unboxedFoo = il2c_unbox(boxedFoo, Test_Foo);
// public struct Foo { public void FooMethod(int a, int b); }
// An instance method of a value type may need mutable access; the "this" pointer allows it.
// (Virtual and interface-implemented methods are not mutable: they work on a copy.)
Test_Foo_FooMethod(unboxedFoo, 123, 456);
TIPS: Did you know that "System.ValueType" is an objref? ;) All boxed instances inherit from System.ValueType at runtime.
Type definition: System.ValueType.
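The box and unbox steps described above can be sketched as follows. This is a simplified model, not the runtime's code: `My_Header`, `my_box`, `my_unbox` and `my_free_box` are hypothetical names, the header holds only a size, and the body is assumed to need no stricter alignment than the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int32_t System_Int32;   /* stand-in for the mangled type name */

/* Hypothetical instance header placed in front of every boxed body. */
typedef struct { size_t bodySize; } My_Header;

/* Box: allocate header + body on the heap and copy the value body in. */
static void* my_box_impl(const void* pValue, size_t bodySize)
{
    assert(pValue != NULL);
    My_Header* header = malloc(sizeof *header + bodySize);
    if (header == NULL) return NULL;
    header->bodySize = bodySize;
    memcpy(header + 1, pValue, bodySize);
    return header + 1;   /* the reference points at the body, not the header */
}

#define my_box(pValue, typeName) my_box_impl((pValue), sizeof(typeName))

/* Unbox: no copy, only a typed pointer to the body inside the box. */
#define my_unbox(boxed, typeName) ((typeName*)(boxed))

/* Step back to the header to release the whole allocation. */
#define my_free_box(boxed) free((My_Header*)(boxed) - 1)
```

Because unboxing returns a pointer into the box, writing through it mutates the boxed copy, while the original variable stays independent.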
// TODO: details
A managed reference is almost the same as a raw pointer.
// void Foo(ref int value)
void Test_Foo(int32_t* value)
{
// ...
}
An object reference (objref) already has pointer form, so a managed reference to it becomes a double pointer.
// void Foo(ref string value)
void Test_Foo(System_String** value)
{
// ...
}
The garbage collector doesn't track managed references, because they always refer to a trackable instance (one that is linked from another).
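From the caller's side, both forms look like ordinary pass-by-pointer. A minimal sketch, with `My_String`, `my_increment` and `my_replace` as hypothetical stand-ins for translated code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an objref target type. */
typedef struct { const char* text; } My_String;

/* void Increment(ref int value): the callee writes through the pointer. */
static void my_increment(int32_t* value)
{
    assert(value != NULL);
    *value += 1;
}

/* void Replace(ref string value, string replacement):
   the callee swaps which instance the caller's variable refers to. */
static void my_replace(My_String** value, My_String* replacement)
{
    assert(value != NULL);
    *value = replacement;
}
```

The caller passes `&localInt` or `&localObjref`, so the managed reference always points at storage that is already reachable through the caller.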
// TODO: details // TODO: I'll change the structures in the future.
IL2C requires runtime type information: a structure named "IL2C_RUNTIME_TYPE_DECL" and a pointer to it named "IL2C_RUNTIME_TYPE". It's public, but the body is opaque.
It contains these fields:
struct IL2C_RUNTIME_TYPE_DECL
{
const char* pTypeName;
const uintptr_t flags;
const uintptr_t bodySize; // uint32_t
const IL2C_RUNTIME_TYPE baseType;
const void* vptr0;
const uintptr_t markTarget; // mark target count / custom mark handler
const uintptr_t interfaceCount;
//IL2C_MARK_TARGET markTargets[markTarget];
//IL2C_IMPLEMENTED_INTERFACE interfaces[interfaceCount];
};
You can roughly guess the meaning of these fields. Here are the important ones:
- flags: The field contains the flag values listed in the table below. They describe characteristics of the type. For example, "IL2C_TYPE_REFERENCE" is an object reference (objref) type, and "IL2C_TYPE_VARIABLE" is a variable storage type (only array, string, delegate and thread types).

Symbol | Description |
---|---|
IL2C_TYPE_REFERENCE | An objref type. |
IL2C_TYPE_VALUE | A value type. |
IL2C_TYPE_INTEGER | An integer (numeric but not floating point) type. The boxing operator uses it when narrowing or widening the storage size. |
IL2C_TYPE_VARIABLE | A variable storage type, used only with array, string, delegate and thread types. |
IL2C_TYPE_MARK_HANDLER | The GC traverser uses a custom mark handler for this type. |
IL2C_TYPE_UNSIGNED_INTEGER | An unsigned integer (numeric but not floating point) type. The boxing operator uses it when narrowing or widening the storage size. |
IL2C_TYPE_STATIC | A static type (sealed abstract). It doesn't have VTables. |
IL2C_TYPE_INTERFACE | An interface type. It has fewer fields than other types. |

- vptr0: The primary VTable pointer for the type. The il2c_get_uninitialized_object__(IL2C_RUNTIME_TYPE type) function uses it to set up the instance.
- markTarget, markTargets: The count and the variable-length entries of garbage collector mark targets (below).
- interfaceCount, interfaces: The count and the variable-length entries of implemented interfaces (below).
These variable-length entries use the following structures:
typedef const struct IL2C_MARK_TARGET_DECL
{
const IL2C_RUNTIME_TYPE valueType;
const uintptr_t offset;
} IL2C_MARK_TARGET;
typedef const struct IL2C_IMPLEMENTED_INTERFACE_DECL
{
const IL2C_RUNTIME_TYPE type;
const void* vptr0;
} IL2C_IMPLEMENTED_INTERFACE;
The mark targets ("IL2C_MARK_TARGET") let the garbage collector track object references. The "markTarget" field is used in one of two ways: as a mark target count or as a custom mark handler pointer.
The implemented interfaces ("IL2C_IMPLEMENTED_INTERFACE") are used in two ways:
- To set up the interface VTable pointers in the allocated instance.
- To calculate dynamic casts at runtime.
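A dynamic cast check over this kind of type information can be sketched as a walk along the base type chain plus a scan of each type's interface entries. The structure below is deliberately simplified and hypothetical (`My_RuntimeType`, `my_is_instance_of`): the interface entries are a separate array of type pointers instead of trailing variable fields, and the vptr fields are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef const struct My_RuntimeType_Decl* My_RuntimeType;

/* Hypothetical, reduced runtime type record. */
struct My_RuntimeType_Decl {
    const char* pTypeName;
    My_RuntimeType baseType;           /* NULL at the root of the hierarchy */
    uintptr_t interfaceCount;
    const My_RuntimeType* interfaces;  /* simplified: separate array */
};

/* True if "type" is "target", derives from it, or implements it. */
static bool my_is_instance_of(My_RuntimeType type, My_RuntimeType target)
{
    for (My_RuntimeType t = type; t != NULL; t = t->baseType) {
        if (t == target) return true;
        for (uintptr_t i = 0; i < t->interfaceCount; i++) {
            if (t->interfaces[i] == target) return true;
        }
    }
    return false;
}

/* Sample hierarchy: FileStream : Stream : Object, Stream implements IDisposable. */
static const struct My_RuntimeType_Decl my_object =
    { "System.Object", NULL, 0, NULL };
static const struct My_RuntimeType_Decl my_disposable =
    { "System.IDisposable", NULL, 0, NULL };
static const My_RuntimeType my_stream_interfaces[] = { &my_disposable };
static const struct My_RuntimeType_Decl my_stream =
    { "Test.Stream", &my_object, 1, my_stream_interfaces };
static const struct My_RuntimeType_Decl my_file =
    { "Test.FileStream", &my_stream, 0, NULL };
```

Comparing type records by address keeps the check cheap, since every type has exactly one statically allocated record in this sketch.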
// TODO: details
IL2C's allocation strategy for object reference (objref) types has two steps:
- Allocate memory on the heap (malloc), then initialize the object reference header (IL2C_REF_HEADER). Every objref instance has this header. It forms a linked list and holds the "IL2C_RUNTIME_TYPE" pointer and the "gcMark" field. The initializer sequence also sets up vptr0 and the interface vptrs (see the sections below). This step applies only to objref types.
- Call the constructor (.ctor) method.
// public class Foo {
// public Foo(int num, string str) { ... }
// }
Test_Foo* foo;
foo = il2c_get_uninitialized_object(Test_Foo); // returns "pReference"
Test_Foo__ctor(foo, 123, string1__);
The objref's memory layout is:
// IL2C_REF_HEADER structure
struct IL2C_REF_HEADER_DECL
{
IL2C_REF_HEADER* pNext; // Next instance pointer (all objref instances are linked through this field)
IL2C_RUNTIME_TYPE type;
interlock_t gcMark;
};
+----------------------+ <-- pHeader
| IL2C_REF_HEADER |
+----------------------+ <-- pReference -------
| : | ^
| (Instance body) | | bodySize
| : | v
+----------------------+ -------
"pReference" is the real pointer value. "pHeader" isn't public; it is recalculated from "pReference" at runtime.
Value types are handled differently from objrefs:
- Clear the value's storage (using memset() directly).
- Call the constructor (.ctor) method if available.
// public struct Bar {
// public Bar(int num, string str) { ... }
// }
Test_Bar bar;
memset(&bar, 0, sizeof bar);
Test_Bar__ctor(&bar, 123, string1__);
Of course, value type storage doesn't include the "IL2C_REF_HEADER".
Helper function: il2c_get_uninitialized_object__(IL2C_RUNTIME_TYPE type)
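The objref layout described in this section (header first, body after it, header recovered from the reference) can be sketched as below. This is a reduced model under assumptions, not the runtime's allocator: `My_RefHeader`, `my_alloc_object` and `my_get_header` are hypothetical, the type is a plain name string, and there are no vptrs or interlocked operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced reference header. */
typedef struct My_RefHeader {
    struct My_RefHeader* pNext;   /* links every allocated instance */
    const char* typeName;         /* stands in for the runtime type pointer */
    int gcMark;
} My_RefHeader;

static My_RefHeader* my_first = NULL;   /* head of the instance list */

/* Allocates header + zeroed body, links the header, returns the body pointer. */
static void* my_alloc_object(const char* typeName, size_t bodySize)
{
    My_RefHeader* pHeader = calloc(1, sizeof *pHeader + bodySize);
    if (pHeader == NULL) return NULL;
    pHeader->typeName = typeName;
    pHeader->pNext = my_first;
    my_first = pHeader;
    return pHeader + 1;           /* "pReference": just past the header */
}

/* Recalculates the header address from the public reference. */
static My_RefHeader* my_get_header(void* pReference)
{
    assert(pReference != NULL);
    return (My_RefHeader*)pReference - 1;
}
```

Calling code only ever sees the body pointer, and a collector can reach every instance by following `pNext` from the list head.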