Improve Your Technology

Just another blog for techology

The .NET Profiling API and the DNProfiler Tool




Profiling Definition:


Profiling is a data surveillance technique which is little-understood and ill-documented, but increasingly used. It is a means of generating suspects or prospects from within a large population, and involves inferring a set of characteristics of a particular class of person from past experience, then searching data-holdings for individuals with a close fit to that set of characteristics.


It is rather different from better-known data surveillance techniques such as front-end verification and data matching. It raises rather different issues, and requires rather different regulatory measures. This paper surveys the limited information available, defines and describes the technique and its social implication, and argues the case for action by regulators.


NET Profile:


The CLR runtime is so central to any .NET program, it provides a logical place to insert hooks to see the guts of .NET. Anticipating the needs of tools developers and system-level programmers, Microsoft used the clean slate of .NET to create a very detailed and consistent way to observe the inner workings of a process (and in some cases, change the way it operates). This month, I’ll describe this mechanism, as well as present a program (DNProfiler) for logging the operations of the .NET runtime.


The .NET Profiling API:


 One means by which you can observe the .NET runtime in action is by using the profiling API. The profiling API is badly named, since it’s useful for all sorts of things beyond just profiling. Consider the list of .NET actions that can be seen via the profiling API, shown in Figure 1.


Figure 1 Functions of the Profiling API


CLR startup and shutdown

Application domain creation/shutdown

Assembly loads/unloads

Module loads/unloads

Class loads/unloads

COM interop VTable creation/destruction

JIT-compiles, code pitching, and pre-JITed method searches

Thread creation/destruction/suspends

Remoting activity

Exception handling

Managed/unmanaged code transitions

Garbage collection and managed heap objects

Method entry/exit


Currently, the best documentation for using the profiling API is the Profiling.doc file in the .NET SDK, in the Tool Developers Guide\docs directory. For the Beta 2 time frame, this file is slightly out of sync with the actual implementation. In this column, I won’t exhaustively cover every aspect of the profiling API, but will instead concentrate on the big picture of what it can do.


One of the main selling points of .NET is that it moves away from the need to use COM. It’s ironic, therefore, that the .NET profiling API is COM-based. Using the profiling API involves creating an in-process COM server that implements a single interface. Each interface method represents a single event from one of the categories shown in Figure 1. Inside these callback methods, the code can use another COM interface (provided by the CLR) to retrieve information about the observed events.


Once you have a COM server that implements the required profiling interface, the next task is to coerce the CLR into loading the COM server and invoking the methods. This trick is accomplished by setting environment variables! Mucking around with environment variables does seem so old school compared to the registry or XML files.


The .NET Profiling API Interfaces:


 The primary interface I’ll be examining is ICorProfilerCallback, which is defined in CorProf.IDL from the .NET Framework SDK’s Include directory. An implementation of this interface is required to use the profiling API. Your job is to supply the implementation. While there are dozens of methods, never fear. For the vast majority of them, you can simply return S_OK or E_NOTIMPL. To take specific action for a specific event (such as a method being JITed), write the desired code in the appropriate callback method.


Examining the callback methods, you’ll see that most of them take parameters giving additional information about the event. For instance, the JITCompilationStarted method takes a UINT parameter named functionId. What can you do with this? The answer lies in another interface, ICorProfilerInfo.


ICorProfilerInfo is practically one-stop shopping for any sort of information from the profiling API. Given the functionId just mentioned, you can call ICorProfilerInfo:: GetTokenAndMetaDataFromFunction, which gives you back a metadata interface instance and the metadata token of the function. Using the metadata interface and the token, you can look up the function’s name, its class, and practically anything else you’d want to know.


In short, the profiling API consists primarily of two COM interfaces. The incoming interface, which you implement, is ICorProfilerCallback. Each method in this interface is called when a CLR event occurs. The outgoing interface, ICorProfilerInfo, is provided for you by the common language runtime, and is for you to use while inside event callbacks.


A common pattern in the ICorProfilerCallback parameters is the extensive use of IDs to represent functions, classes, modules, assemblies, and so on. ID values are opaque handles. Meaningful information about them is extracted via the ICorProfilerInfo interface. Also, certain ID values (such as a FunctionId) can change during the execution of the program if the function’s code is unloaded and then re-JITted. There are callbacks to let you know when this happens, however.


The ICorProfilerCallback methods break down into logical groupings. In most cases, events (such as assembly loads) have both a Started and Finished method, and are called in pairs. Let’s go through them now and get a better idea of what you can observe within the CLR. Unless stated otherwise, all methods listed in the next paragraphs belong to the ICorProfilerCallback interface.

Initialize/Shutdown Methods


Initialize is the first method called when a .NET process uses the profiling API. Your code gets its ICorProfilerInfo pointer from this method. The single parameter to Initialize is an LPUNKNOWN, on which you call QueryInterface to get the ICorProfilerInfo pointer. In addition, Initialize is where you tell the profiling API which events you’re interested in.


To indicate interest in events call ICorProfilerInfo::SetEventMask, passing in a DWORD with the appropriate bits set. The flag values are from the COR_PROF_MONITOR enum in CorProf.H. The lower value flags are named COR_PRF_MONITOR_XXX, and tell the CLR which ICorProfilerCallback methods to invoke. For example, if you want the ClassLoadStarted method to fire, you must set the COR_PRF_MONITOR_CLASS_LOADS flag.


The remaining flags to ICorProfilerInfo::SetEventMask alter the behavior of the CLR in one way or another. For example, the COR_PRF_ENABLE_OBJECT_ALLOCATED flag must be set if you want to monitor object allocations at any time during execution. Similarly, the COR_PRF_DISABLE_INLINING tells the CLR not to inline any methods. If a method is inlined, you won’t receive method enter and leave callbacks for it.


You can modify which events are monitored later on by calling ICorProfilerInfo::SetEventMask again. However, certain events are immutable, meaning that once they’re set in Initialize, they can’t be changed later.


The Shutdown method fires when the CLR is shutting down the process. In certain cases it may not fire, but for a normal .NET program with a normal lifetime, it should fire.


Application Domain Creation/Shutdown:


 The methods in this category are AppDomainCreationStarted, AppDomainCreationFinished, AppDomainShutdownStarted, and AppDomainShutdownFinished. Their names are self-explanatory. The primary token for these methods is the AppDomainID. It’s worth noting that the token passed to AppDomainCreationStarted can’t be used during the callback because the AppDomain doesn’t exist yet. However, upon receiving the AppDomainCreationFinished callback, you can pass the token to ICorProfilerInfo::GetAppDomainInfo to get information specific to the new AppDomain.


Assembly Loads/Unloads:


AssemblyLoadStarted, AssemblyLoadFinished, AssemblyUnloadStarted, and AssemblyUnloadFinished are methods that are involved in loading and unloading assemblies. The primary token for these methods is the AssemblyID. Note that the token passed to AssemblyLoadStarted can’t be used during the callback. That’s because the Assembly doesn’t exist yet. However, upon receiving the AssemblyLoadFinished callback, you can pass the token to ICorProfilerInfo::GetAssemblyInfo to get information specific to the new assembly.


Module Loads/Unloads:


The methods specific to loading modules are ModuleLoadStarted, ModuleLoadFinished, ModuleUnloadStarted, ModuleUnloadFinished, and ModuleAttachedToAssembly. The first four methods are self-explanatory. The primary token for these methods is the ModuleID. Like the AssemblyLoadStarted token, the token passed to ModuleLoadStarted can’t be used during the callback. That’s because the module doesn’t exist yet. But when you receive the ModuleLoadFinished callback, you can pass the token to ICorProfilerInfo::GetModuleInfo to get information specific to the new module.


 The last method, ModuleAttachedToAssembly, occurs when the CLR associates a module (a DLL in Win32 terminology) with a specific assembly. While a module and an assembly are often the same file, it’s possible to have assemblies with multiple modules.


Class Loads/Unloads:


The methods concerned with loading classes are ClassLoadStarted, ClassLoadFinished, ClassUnloadStarted, and ClassUnloadFinished. The method names are self-explanatory. The primary token for these methods is the ClassID. The token passed to ClassLoadStarted can’t be used during the callback because the class doesn’t exist yet. However, upon receiving the ClassLoadFinished callback, you can pass the token to ICorProfilerInfo::GetClassIDInfo to get information specific to the new class.


JIT Compilation:


The primary token in use for the JIT compilation methods shown in Figure 2 is the FunctionID. Here, the terms function and method can be used interchangeably. The FunctionID can be passed to ICorProfilerInfo::GetFunctionInfo to retrieve information specific to the function.


The JITCompilationStarted method is interesting, as it gives you the ability to view and modify the IL before it’s JITed. See ICorProfilerInfo:: GetILFunctionBody for details, and don’t expect it to be easy! The JITCachedFunctionSearchStarted method indicates when the CLR is searching for an existing copy of the method that’s already been JITed to native code. You can force the runtime to use a newly JITed version of the method regardless of its pre-JIT status by setting the pbUseCachedFunction output parameter to FALSE.


JITFunctionPitched indicates that a previously JITed method has been removed from memory. This doesn’t normally happen except in low memory situations. Finally, the JITInlining method indicates that the JITer is about to inline a method inside another method. If you’re counting on seeing the enter/leave notifications for that method, you can disable the inlining by setting the pfShouldInline output parameter to FALSE. It’s also possible to disable inlining process-wide via a flag passed to ICorProfilerInfo::SetEventMask.




The threading methods can be seen in Figure 3. The ThreadCreated/Destroyed methods bracket the lifetime of a CLR thread. A CLR thread isn’t exactly the same as a Win32 thread. For example, conceptually a CLR thread can execute on top of more than one Win32 thread during its lifetime. The ThreadAssignedToOSThread indicates which Win32 thread is currently underpinning the CLR thread.


As the CLR executes, it’s necessary to suspend some or all of the threads to perform actions such as garbage collection. The RuntimeSuspend and RuntimeResume set of methods indicate that all CLR threads are being suspended. (It’s actually more complicated than that, but I won’t get into the details here.) A parameter to RuntimeSuspendStarted indicates why the suspension is occurring. Finally, the two RuntimeThread methods indicate that a specific thread is being suspended, and seem to always be bracketed by RuntimeSuspend events.


COM Interop :


When the CLR interfaces with classic COM, it does so via proxy interfaces. These two methods—COMClassicVTableCreated and COMClassicVTableDestroyed—indicate when the proxies are created and destroyed. The information conveyed for the creation method includes the .NET ClassID, the corresponding COM interface IID, a pointer to the proxy’s vtable, and the number of vtable entries.


Managed/Unmanaged Code Transitions:


 UnmanagedToManagedTransition and ManagedToUnmanagedTransition are called when managed code calls into unmanaged native code, and vice versa. Both methods are passed a FunctionID representing the callee, which can be passed to ICorProfilerInfo::GetFunctionInfo for more information.


Garbage Collection and Managed Heap Objects:


Although there is no actual garbage collection method, the firing of the events ObjectAllocated, MovedReferences, ObjectsAllocatedByClass, ObjectReferences, and RootReferences indicates that a garbage collection is in process. The information conveyed by these notifications is sufficiently complex that I won’t attempt to explain it all in this space. Instead, I’ll just highlight the key points.


 The ObjectAllocated method (which requires the COR_PRF_ENABLE_OBJECT_ALLOCATED flag to be set) is called whenever a new object is allocated from the managed heap. The object is identified by an ObjectID token and the ClassID is also given. The class ID can be passed to ICorProfilerInfo::GetClassIDInfo for additional details on the class.


MovedReferences tells you that a particular object (specified by an ObjectID) has been moved in memory. ObjectsAllocatedByClass tells you which classes have had instances created since the last garbage collection. ObjectReferences provides a complete list of objects referenced by a given object. It’s called for every outstanding object. Finally, RootReferences is a list of all the objects that are root references.


Remoting Activity:


The remoting methods shown in Figure 4 detail the various points of execution that occur during remoting calls. The RemotingClient methods fire when the program is acting as a client, while the RemotingServer methods fire when the program is acting as a remoting server. In both cases (client and server), a distinction is made between the invocation of the remote method, and the actual message transmission that carries the request/reply.


Exception Handling:


The profiling API’s support for exception handling is rather complex and quite complete. I’ll defer to the documentation for all the details. In a nutshell, however, the API gives you before and after notifications for each phase of the exception handling sequence. Figure 5 shows the exception methods.


The ExceptionThrown method is the first indication you receive of an exception. The ObjectID parameter can be passed to m_pICorProfilerInfo::GetClassFromObject to get the type of the exception. For every managed method on the stack, these methods tell you when the CLR’s exception code searches for a try block in the method, executes a filter function in the method, or finds a method to handle the exception. It also tells you when the exception code unwinds a method, executes a finally block while unwinding, or executes the handler code.


Receiving Method Entry and Exit Notifications:


One of the coolest features of the profiling API is the ability to be told when any managed method begins executions, and when it is about to return to its caller. This is possible because .NET methods need to be JITed before they’re used. If you tell the profiling API that you’re interested in method enters and leaves, it can communicate that fact to the JITer. The JITer, in turn, inserts calls at the beginning and end of the method to call special functions that you define. There’s also a tailcall notification which is conceptually similar to an optimized function leave. I’ve never seen one generated in .NET, however.


Unlike all the other events I’ve described, the method enter/leave callbacks don’t belong to the ICorProfilerCallback interface. Rather, you use ICorProfilerCallback and ICorProfilerInfo to set up things so that the JITed methods will call the functions you supply. Just to make things interesting, the callbacks have a restriction that they cannot modify any registers. As such, it’s highly likely that you’ll write part of the code in assembly language, and use a __declspec naked function (an example is shown in Figure 6).


A single parameter (a FunctionID) is passed to the enter/leave callbacks. The documentation warns of dire consequences if your callbacks block or spend a significant amount of time before returning. However, in my experience, I’ve not had problems in using ICorProfilerInfo to look up the name of the FunctionID and write it to a file.


With a little effort, you can use the enter/leave capabilities to create a decent spy program that shows you every method as it executes. If you do this, you’ll be astounded at how many methods are called just to start up a trivial .NET program. Because there’s both enter and leave callbacks, you can see the nesting of method calls within other methods.


It’s important to note that the .NET runtime can make use of methods that have been pre-JITed. Unless you take special steps, these methods won’t have the requisite callouts, so you won’t see them with these callouts. In some cases, this can be fixed by monitoring JITCachedFunctionSearchStarted and returning FALSE in the pbUseCachedFunction parameter.


Caveats with the Profiling API:


There are a few important restrictions to know about if you intend to implement something using the profiling API. First, the COM server is assumed to be free-threaded. The .NET runtime doesn’t guarantee any sort of synchronization in calling the ICorProfilerCallback methods. If you’re using global data, you’ll probably want to guard it with critical sections where necessary. In other cases, you might want to use thread local storage, as I did in the DNProfiler code that is shown in Figure 7.


Another restriction is that inside the ICorProfilerCallback methods, you can’t invoke any managed code, directly or indirectly. The profiling API code isn’t intended to be reentrant, and calling managed code, while tempting, would definitely put you in that boat.


The profiling API provides a way to get to the ICorDebug interface. This allows you to get at certain details such as the call stack. However, when you’re executing in-process, not all of the ICorDebug methods are usable. Check the ICorDebug documentation for the list of what’s usable in-process and what’s not.


Finally, the profiling API is so cool that I can imagine lots of tools wanting to use it simultaneously. Unfortunately, only a single profiling COM server can be active at one time. If you’re truly after timing information, you’d only want one profiler to be active. However, for the sake of non-profiling tools, it would be nice if Microsoft could come up with a solution where well-behaved tools could play nicely together and chain along events to one another.


The profiling interface provided by Microsoft for .NET is one of the best things to come along for tool developers in a long time. It’s very comprehensive, fairly flexible, and I think that a lot of .NET tools (besides profilers) will be based on this API. Here, I’ve given a reasonable overview of what it’s capable of, but be sure to check out the documentation and Microsoft sample profilers for even more information.


October 22, 2008 Posted by | Profiling, Technology | , | 3 Comments