Alois Kraus


  Home  |   Contact  |   Syndication    |   Login
  133 Posts | 8 Stories | 368 Comments | 162 Trackbacks



Post Categories


When dealing with application responsiveness issues you often need to get data from the actual users to be sure what is going on. There are plenty of reasons why your application does not respond. Most prominent examples are

  • Network
    • A bad network connection can totally ruin the user experience. Wireshark is your best friend to check if you have a network issue at all.
  • Paging
    • If your application has a high memory footprint you will experience sudden slowdowns because the OS did silently page out unused memory. When you touch it again the memory needs to be read in from the page file again which is about 1000 times slower than plain memory access. A SSD can greatly improve the user experience. (Check performance counter Memory\Pages Input/sec).
  • Disc IO
    • If your app hangs because it is just busy writing 500MB to disc you should never do this on the UI thread although it is much easier to program.
  • Garbage Collection

After you have looked at network (we need that data), paging (we need so much memory), disc IO (we need to read/write so much data) you should also check if garbage collection is an issue. GC times can become quite high if you create a lot of temporary objects. Especially de/serialization is a source of pain for the GC. How do you find out if you have a GC issue? First of all you need to check the usual performance counters which do tell you how often the GC kicks in and how much CPU is burned by cleaning up your mess. If you stay low e.g. 10-20% then you do usually not need to worry much about it except if your application hangs too often during a UI interaction. In that case you need to dig deeper.

The best tool to check managed memory issues including GC induced latency is PerfView from Vance Morrison. When you uncheck everything except "GC Collect Only" you can let it run for a long time on any machine to check how many GCs you have.



When you have pressed Start/Stop Collection you get an etl file that contains a GCStats section:


To complete your analysis you need to double click GCStats to get a nice overview for each process how much memory it did allocate and how much GC pause times your application did experience.


If you want to drill deeper you can export the data to Excel and check every single GC event if you wish to. That is all nice but what if the customer does not want to install strange tools on their machines? In that case you still can get the data and analyze it on another machine. PerfView relies on the CLR ETW providers which can be enabled with the command line tool logman as well. Logman is part of Windows since XP. To check which data the CLR ETW provider can give you you can execute:

logman query providers ".NET Common Language Runtime"   

The most interesting keywords are

Keyword (Hex) Name Description
0x0000000000000001 GCKeyword GC
0x0000000000000004  FusionKeyword       
Binder (Log assembly loading attempts from various locations)
0x0000000000000008  LoaderKeyword        Loader (Assembly Load events)
0x0000000000008000      ExceptionKeyword Exception

The most prominent keyword is 0x1 which is GC. The other also very interesting keyword is Exception (0x8000) which logs every thrown exceptions in all managed applications. If you want to check for GC latency issues and all exceptions I have created a little script to automate this task:


@echo off
REM enable GC Events 0x1
REM Exception tracing is 0x8000
logman start clr -p ".NET Common Language Runtime" 0x1,0x8000 0x5 -ets -o "%temp%\logmanclr.etl" -ct perf -bs 1024 -nb 50 500 -max 500
REM Needed to decode clr events later
logman start "NT Kernel Logger" -p "Windows Kernel Trace"  (process,thread)  -ets -o "%temp%\logmanKernel.etl" -ct perf -bs 1024 -nb 50 500 -max 500
logman -stop clr -ets
logman -stop "NT Kernel Logger" -ets
if "%1" EQU "-Merge" (
xperf -merge "%temp%\logmanclr.etl"  "%temp%\logmanKernel.etl" "%temp%\merged.etl"

You can send your customer this script and tell him to start it and execute the use and then press any key to stop collecting data. The result are two files named logmanclr.etl and logmanKernel.etl which can be zipped and mailed back to you. On your machine you need to merge the two etl files to be able to load the data into PerfView. This is what the script does with the -Merge option on your machine if you have the Windows Performance Toolkit installed. The other logman options are there to prevent loosing events even if very many of them come in and to prevent filling up the hard disc by allowing for each file a maximum size of 500MB.

For GC relevant data PerfView is the best tool to analyze the captured data. If you are after exceptions you can also use WPA to check which exceptions were thrown while your and all other applications where running on this machine. This will give you only the exception type and exception message but no stacks because a very limited set of ETW providers is enabled which is well suited for long running tests. 


Since each garbage collection has a start and stop event which does roughly correlate to the stop times of your application you can also create a regions file to visualize the GC events. This makes it extremely simple to check if your app was hanging because some long GC was running which did (probably) suspend your thread which tried to allocate just more memory. When concurrent GC is enabled you can allocate data and even new GC segments (for Gen 0,1)  but not large objects or Gen 2. Most of the time GC time does correlate with app pause times.


Here is the regions file I came up with to make the graph above.


<?xml version='1.0' encoding='utf-8' standalone='yes'?>
<?Copyright (c) Microsoft Corporation. All rights reserved.?>
     <RegionRoot Guid="{d8d639a0-cf4c-45fb-976a-0000DEADBEEF}" Name="GC" FriendlyName="GC Times">
          <Region Guid="{d8d639a0-cf4c-45fb-976a-000000000001}" Name="GCStart" FriendlyName="GC Start">
              <Event Provider="{e13c0d23-ccbc-4e12-931b-d9cc2eee27e4}" Id="1" Version="0"/>
              <Event Provider="{e13c0d23-ccbc-4e12-931b-d9cc2eee27e4}" Id="2" Version="0"/>

          <Region Guid="{d8d639a0-cf4d-45fb-976a-000000000002}" Name="GCStart_V1" FriendlyName="GC">
              <Event Provider="{e13c0d23-ccbc-4e12-931b-d9cc2eee27e4}" Id="1" Version="1"/>
              <Event Provider="{e13c0d23-ccbc-4e12-931b-d9cc2eee27e4}" Id="2" Version="1"/>

           <Region Guid="{d8d639a0-cf4d-45fb-976a-000000000003}" Name="GCStart_V2" FriendlyName="GC">
              <Event Provider="{e13c0d23-ccbc-4e12-931b-d9cc2eee27e4}" Id="1" Version="2"/>
              <Event Provider="{e13c0d23-ccbc-4e12-931b-d9cc2eee27e4}" Id="2" Version="1"/>

Not bad for such a simple file to make WPA GC aware. To make this file active you need select in the menu Trace - Trace Properties and add the file to it. Then a new Regions of Interest graph is added to your existing graphs which you can add to your analysis pane as usual. 


To check when a GC did start/stop this overview is really nice. If you need a deeper analysis there is no way around PerfView which gives you all the numbers pre analyzed in a per process view. When you have managed memory leaks you should check out the Memory menu of PerfView to take GC Heap dumps from running processes (nice) but also from memory dumps (even better). The latter functionality is essential if you only have a dump from a crashed process but it is not obvious who is holding most objects. I know .NET Memory Profiler from SciTech which also can import dump files but I got lost in the fancy UI. PerfView on the other hand hand looks old fashioned but it does its job much better than the UI might suggest:


Here I do see that I keep 100% of my data (10GB in a static variable in 10003 instances) which is very easy too see. Other profilers can do this also very well. But if you double click on the type you get the shortest roots to all objects holding your data nicely printed in a tree view. Cyclic references are broken up and the relevant nodes are marked bold. Other commercial tools had a hard time to tell me who is holding my data if graph cycles are involved. PerfView was able to show me it directly without much navigation in a complex interdependent object graph. To check who is allocating too much data you can also use PerfView when you enable ".NET SamplAlloc" to sample your allocations with minimum performance impact. Then you get the allocation stacks nicely lined up by object type. If you can afford it performance wise you can also enable ".NET Alloc" to track every managed object allocation which gives you exact numbers. That's all for today. I am constantly amazed how much useful data you can get out of ETW and shocked at the same time that so few people know about it. If your new years resolution is already history how about a new goal? Learn how to get most out of ETW!

posted on Monday, February 17, 2014 6:09 AM


# re: Does Garbage Collection Hurt? 3/20/2014 4:42 AM Haithem
Gab es da nicht ein API um den GC anzustoßen? oder war der nur speziell für uns gemacht?

# re: Does Garbage Collection Hurt? 3/20/2014 5:46 AM Alois Kraus
Yes you can call GC.Collect explictly if you are really sure what you are doing (most probably not). PerfView will mark these collections in the GCStats page as Induced GC. That makes it easy to see if someone is calling GC.Collect too often.

Post A Comment