Memory issues are amongst the worst one to solve because pointing precisely the source is often difficult and painful. Memory leaks are not an exception, especially with real-world application: most of the time, programmers start to worry about it when the application outputs some “out of memory” errors. At this moment, you have to find which one, among thousands of functions and many more allocated blocks, causes the application to leak and eventually to crash.
Let’s summarize what you really need when you have a memory leak (in addition to a way to reproduce the issue):
- You want to find which object(s) are leaking
- You want to know why they are leaking: is there some static reference to it, or maybe they are not freed?
The process described today deals with the first one, which is often the most difficult.
Sometimes bug happens before you have the chance to attach a debugger to the faulting process. Most of the time it’s because it is launched by another process (a service, the compiler used to create a Xml serializer of a .NET software, a batch script, etc.) and you don’t have the time to get the command line with ProcessExplorer. And even if you can get it, a process may expect some context coming from its parent. And obviously, sometimes you don’t have a clue about how a process is launched, all you know is that it crashes and you need to see what’s inside before it do so.
After a few tryout to pause the process (Process Explorer is your friend) before it crashes, or some tentative to slow down your computer so you have the time to attach a debugger, you’re starting to get frustrated. Hopefully I have some solutions for you.
How to break on a function only when a parameter have a specific value (without source code, in WinDbg or… Visual Studio 2010!)
A few days ago, I had to break into a graphic application just after I clicked on a button. Sadly I didn’t have the source code, so my purpose was just to get the name of the applicative function called just after a user event (in my case, a click). Of course, when the function handling an event is called, I expect to see a Windows user mode function in the call stack. So I designed a small MFC application with just a button, made a function named OnBnClickedButton to handle clicks, added a breakpoint on this function, and tried to find on the call stack which function is always called when an application process an event.
I eventually found USER32!SendMessageW, and I was quite happy with it: this function is well-known for every MFC programmer because it allows you to send a Windows message to any application (including yours). A click on a button is of course a Windows message, and I was pretty sure I found my entry function. So I started the former graphic application, attached to it with WinDbg, and try to get the focus back to my application so I can click on the button. Sadly, my debugger broke before I could…why? Well, trying to put the focus on an application that is not visible triggers (at least!) a WM_PAINT message, processed by USER32!SendMessageW. And it is not the only one: a simple graphical continually receives A LOT of various messages. I clearly had to break only on a specific message. Hopefully the prototype of USER32!SendMessageW is well known: the second parameter is an unsigned int containing the message ID. Sounds nice, but how can you break on a function ONLY when a parameter have a specific value?
To cope with the various needs of any kind of software, every operating system offer a generalist memory management system. As it works out-of-the-box most of the time, developers usually don’t dig into it to find how it behaves under the hood. But you may be surprised to know that you can not access all the memory of your process, even when some parts seem to be free. Beyond available memory and fragmentation of user space, there is another barrier that can prevent you to get the remaining free bits : the heap service and the minimum size of a virtual allocation call.
Imagine a software who, after a few hours of uptime, starts to throw out-of-memory exceptions. First thought is that, indeed, you don’t have enough memory left. Let’s say that 1Go is free, and that you just ask for a small block of 30k. If you’re familiar with memory issues, second thought may be: “memory fragmentation! Check the largest contiguous block!”. And if I say that a block of 60k is available ? Well, if you are like me before this issue happens at my work, you’re stuck. But let’s go back to the beginning.
Maybe you already had this kind of details while using the ‘!clrstack‘ command from the SOS debugging extension. And by the rules of the Murphy’s Law, it always happens specifically on the local variable that you really need to see in order to slash the bug you’re working on.
This <no data> stuff can be seen especially in release compilation because the Jitter will use all the optimization tricks it can, among which you can find inlining and massive use of registry for temporary variables, so reading local variables is almost impossible (remember C++ hum ?). Hopefully the CLR offer a simple way to disable any kind of optimization (beware, doing this have a major impact on performances!).
If you ever tried to write a post on a technical blog, maybe you already experienced a strange thing: start with the idea of an article, and ends up with a totally different one. This is definitely the case of this one, as I just tried to have a decent local variable value in a call to the famous “!clrstack” function of the Windbg extension SOS, and I had a lot of trouble trying to break on the main function.
Most internet resources (even the help documentation of SOS itself !) advise the user to subscribe to the CLR load notification, and then load SOS and setup a deferred breakpoint. Well I don’t have any CLR load notification on my computer (Windows 7, and I tried x86/x64 applications targeting 4.0, 3.5 and 2.0), and I remembered experiencing the same behavior on my computer at work. As I’m not the only one having this issue (check this), I decided to write a short post about it.
Maybe you thought when you read this title: “well it’s kind of easy, I just have to attach any debugger to my running service“. And you’re definitely right.
But sometimes you have to debug the very beginning of your service (just after the “Start” control), or even before, when the main() function has just started. Or you’re experiencing a bug that happens only with a specific user, or only in a context of a Windows service (could be environment variables, registry keys, etc.). Hopefully, with a few tricks, you can easily setup a debugger that will attach to a process just after its creation.