The art of reporting bugs

winstep · **Joined:** Thu Feb 26, 2004 8:30 pm **Posts:** 11957

Every program out there has bugs (and yes, as much as I hate to admit it, sometimes even Winstep applications). Most of them will be minor and - hopefully - you will only find them when Saturn is aligned with Mars and the sun is just over the horizon.

If you have never created software you might have a hard time understanding this - why aren't all programs perfect and bug-free?! - but, if you have, you know how complex it is to write even a simple application. If you can sometimes find bugs in code that is just 2 or 3 lines long, imagine the potential for disaster in programs that have hundreds of thousands of lines of code!

Programming is also basically an exercise in futurology: you have to predict everything the user might do and every way each piece of code is going to interact with the other. It's like a juggling act! You have to come up with different scenarios in your mind and then test them one by one to make sure everything is working as it should and that nothing breaks. Of course, programmers are only human, so there is always going to be something we missed or didn't think about.

This is where a good team of testers comes in: they are going to do things with your program you would never have dreamed of, and, in the process, probably uncover some potential problems. You then fix these problems and you pray that those fixes didn't break something else in your code that was previously working fine. And yes, we do a lot of praying. :wink:

So, getting back on topic, what does the art of bug reporting consist of?

Well, first it consists of actually reporting the bug. Yes, pardon me for stating the obvious, but you have no idea how many users run into a problem and then don't report it, either because they can't be bothered at the time or because they think we must know about it already. Well, in the latter case, if it is a common bug then eventually somebody will report it to us, but, if you still see it in the next release, then you can be pretty sure we don't know about it. Winstep takes pride in fixing bugs as soon as they are reported.

The second most important thing about bug reporting is STEPS TO REPRODUCE. Let me say that again: what do I have to do to reproduce that bug? If all you're going to tell me is that the application crashed, then all you'll get in return is a blank stare. What were you doing when the application crashed? Can you make it happen 100% of the time or is it one of those hard-to-fix 'it only happens sometimes' bugs? How can I reliably reproduce the bug here?

To fix a bug, I must be able to reproduce it here so I can at least have an idea of where to look for the source of the problem. It helps a lot if the user does a bit of detective work first, since this will eliminate many possible causes and, sometimes, even provide that 'a-ha!' moment you need to figure it out. For example: 'The application will crash if option x is on and option y is off, but not if option x is also off.'
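The detective work described above boils down to trying each on/off combination of the suspect options and noting which ones crash. Here is a minimal sketch of that idea. The helper name `findCrashingCombinations` and the `crashes` callback are hypothetical, invented purely for illustration; in real life the "callback" is you, toggling options by hand and watching what happens.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: tries every on/off combination of the named options
// and returns a readable label for each combination where `crashes` is true.
std::vector<std::string> findCrashingCombinations(
    const std::vector<std::string>& optionNames,
    const std::function<bool(const std::vector<bool>&)>& crashes) {
    const std::size_t n = optionNames.size();
    assert(n < 16);  // 2^n combinations: only practical for a few options

    std::vector<std::string> failing;
    for (std::size_t mask = 0; mask < (std::size_t{1} << n); ++mask) {
        std::vector<bool> state(n);
        std::string label;
        for (std::size_t i = 0; i < n; ++i) {
            state[i] = ((mask >> i) & 1u) != 0;  // bit i decides option i
            if (i > 0) label += ", ";
            label += optionNames[i] + (state[i] ? "=on" : "=off");
        }
        if (crashes(state)) failing.push_back(label);
    }
    return failing;
}
```

With two options and a crash that only happens when x is on and y is off, the result is the single entry "x=on, y=off", which is exactly the kind of sentence that belongs in a bug report.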

When you are reporting a bug, besides the steps to reproduce, you should also be as specific and provide as much detail as possible. More often than not, the cause of a bug is some obscure feature that I NEVER use here (or used once when I coded it and then forgot all about it). If you don't mention that you're using that feature, I will probably have a hard time figuring out what is causing the problem.

And as far as bugs go, you should be aware that they are not all equal. You can divide them into three distinct categories:

1) The bug that can be reproduced 100% of the time by following the user's directions.

2) The pseudo-random bug. The one that happens sometimes but not all the time.

3) The bug that only happens on your system.

Bug type 1 is the easiest and quickest to fix. 'nuff said about it.

Bug type 2 is a nightmare: it will only happen when very specific conditions are met, most of them out of your control. Since you can't reproduce it reliably, you have no idea what is causing it, so you have to approach the problem by trial and error: you suspect it might be because of y, so you change y and then wait to see if the bug rears its ugly head again. Usually you will only know what was actually causing the problem when one of your several 'blind' attempts to fix it finally works - this will give you the first clue to the real cause.
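To put a rough number on "wait and see": if the bug used to show up in a fraction p of runs, the chance of getting N clean runs in a row by pure luck is (1 - p)^N. The sketch below turns that into a run count. It assumes the runs are independent and that p has been estimated reasonably, neither of which is guaranteed with a real type 2 bug; the function name is made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Number of consecutive clean runs needed so that, if the bug were still
// present and failing with probability `failureRate` per run, seeing that
// many clean runs by luck alone would have probability at most `alpha`.
// ASSUMPTION: runs are independent of each other.
int cleanRunsNeeded(double failureRate, double alpha) {
    assert(failureRate > 0.0 && failureRate < 1.0);
    assert(alpha > 0.0 && alpha < 1.0);
    // Solve (1 - failureRate)^N <= alpha for the smallest whole N.
    return static_cast<int>(
        std::ceil(std::log(alpha) / std::log(1.0 - failureRate)));
}
```

For a bug that fails once in a hundred runs, `cleanRunsNeeded(0.01, 0.05)` comes out at 299 runs, which gives an idea of why these bugs eat so much time.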

A short story to illustrate a type 2 bug (you might have a hard time understanding this if you are not a programmer):

When adding GDI+ and PNG file support to WorkShelf, I suddenly started getting random Access Violation exceptions. An Access Violation exception happens when a program is trying to access memory that doesn't belong to it any more. This exception would happen only when using GDI+ to draw a GDI+ bitmap created from an icon image in memory, but it didn't happen all the time. I could run the same code hundreds of times without a problem, and then suddenly, without any particular reason - bang, Access Violation.

I checked and double-checked the code, but, no matter how hard I looked, I couldn't find anything wrong. I was releasing objects when I should and not before, I had no GDI or memory leaks, everything seemed peachy, but... I was still getting those errors from time to time.

The solution to this problem only occurred to me after solving yet another, apparently unrelated, issue that was also getting on my nerves: any PNG bitmap file used as the background of a Desktop Module would become locked for as long as the associated Desktop Module was visible, i.e., if I made a modification to the original bitmap and then tried to save the result, Photoshop would tell me that I couldn't because the file was locked. I had either to close the desktop module that was using that bitmap or change themes.

Now, this makes no sense because, for performance reasons, when WorkShelf loads a bitmap from disk for the first time, it makes a copy of the original bitmap in memory and uses that copy instead from then on - this way it doesn't have to perform an expensive disk access every time it needs to use that bitmap.

However, to decode PNG files I had to use GDI+. And GDI+ was, for some strange reason, locking the source file, only releasing the lock when the copy of the bitmap in memory was destroyed.

After searching the net for a while, I managed to find an obscure entry in the MS Knowledge Base explaining why. You see, it seems that when you create a GDI+ bitmap from a file, a stream or a memory bitmap, that bitmap will ALWAYS hold a reference to the original source. This is because the developers of GDI+ thought it would be ok for GDI+ to release the memory used by a bitmap whenever it felt like it (i.e. it might, but it might also not, with you having no control over the process), as long as that bitmap was a copy of some other bitmap.

If it did decide to release the memory used by that bitmap, then it would later on refer to the source bitmap to re-construct it when necessary. This explains why GDI+ locks the source file - it might need to access it again if, in the meantime, it decides to destroy the copy it has in memory.

For me, this is absolutely insane and goes against everything you would expect, especially when instead of having a file as the source of a bitmap, you have, say, an icon or another bitmap in memory!

You see, I (as everybody else would) assumed that once you converted an icon image in memory into a GDI+ bitmap, the resulting GDI+ bitmap would be a copy of the original bitmap. And it is. Except that it might also be destroyed at any time too!

Since you don't want two identical bitmaps using up valuable memory space, what you logically do after converting the source bitmap into a GDI+ bitmap, is to destroy the original bitmap. After all, you will only be using the GDI+ version from then on to draw on GDI+ surfaces.

With GDI+ you can't! You MUST keep the original source bitmap lying around until you dispose of the GDI+ bitmap. But where is this very important piece of information stated? In bold letters in every document related to GDI+, as it should be? No, in an obscure one-page KB resource, which you will only find when you've already run into the problem and know what you are looking for!

Anyway, this explains why I was getting those random Access Violation errors. If GDI+ decided to destroy its version of the bitmap, it would then refer to the original source whenever necessary. Since I had already destroyed the source bitmap, GDI+ would be trying to access memory that had already been discarded - instant Access Violation!

Because GDI+ only decides to release the memory used by a bitmap whenever the sun is on the horizon and Saturn is aligned with Mars, most of the time your code runs smoothly and you get a type 2 bug!

Ah, in case you're wondering, the solution to the file locking problem is to create a blank bitmap with the same dimensions as the original, draw the file-based bitmap into it, dispose of the file-based bitmap, thus releasing the lock, and use the copy you made from then on. Nothing but a waste of CPU cycles.

The solution to the Access Violation error is to keep the source bitmap available at all times until you dispose of the GDI+ bitmap, at which time you can safely destroy the source bitmap as well. Nothing but a waste of memory.
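The "keep the source alive until the GDI+ bitmap is gone" rule can be modelled in a few lines of portable C++. To be clear, this is NOT GDI+ code: `SourcePixels`, `DerivedBitmap` and `discardCache` are invented names that only imitate the behaviour described above (a bitmap that may throw away its own copy and rebuild it from the source). The sketch ties the two lifetimes together with a shared pointer so the source cannot be destroyed too early.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for the source pixels (an icon or bitmap in memory).
using SourcePixels = std::vector<unsigned char>;

// Models a bitmap that may drop its private copy at any time and rebuild
// it later from the source it was created from.
class DerivedBitmap {
public:
    explicit DerivedBitmap(std::shared_ptr<const SourcePixels> source)
        : source_(std::move(source)) {
        assert(source_ != nullptr);
        cache_ = *source_;  // initial private copy
    }

    // Simulates the library discarding its copy whenever it feels like it.
    void discardCache() {
        cache_.clear();
        cacheValid_ = false;
    }

    // Returns the pixels, rebuilding them from the source if necessary.
    const SourcePixels& pixels() {
        if (!cacheValid_) {
            cache_ = *source_;  // safe: source_ keeps the source alive
            cacheValid_ = true;
        }
        return cache_;
    }

private:
    std::shared_ptr<const SourcePixels> source_;  // shared ownership of source
    SourcePixels cache_;
    bool cacheValid_ = true;
};
```

Because the derived object shares ownership of the source, the caller can let go of its own handle without ever leaving the bitmap pointing at freed memory. The price is the one already mentioned: the source stays in memory for as long as the derived bitmap lives.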

And talking about waste, these two problems made me waste a lot of my time - I can only hope I never run into one of the Microsoft GDI+ developers, it won't be pleasant for him. :wink:

Anyway, getting back on track: type 3 bugs are, well, unfixable. See, it's not really a problem with the application per se. It's something external, and it can be anything from your video driver to DLL hell (where you have mismatched versions of Dynamic Link Libraries on your system). Unfortunately it's very hard to explain to a user that it is a problem with his machine and not with the application itself - he will only understand that when you finally talk him into trying the application on a friend's machine and he realizes that it doesn't happen there. Nor on his machine at work.

Gdiplus · **Joined:** Sat Mar 25, 2006 11:37 am **Posts:** 4

I have been bitten by this GDI+ bug and just wasted a few hours debugging my threading code. I looked around but found no mention of this "feature" in the MSDN. A bit of googling and I found your article. Anyway, I just registered to thank you for writing it.

winstep · **Joined:** Thu Feb 26, 2004 8:30 pm **Posts:** 11957

Feel free to let me know if you need more help with GDI+, it has *lots* of quirks! For instance:

GdipCreateBitmapFromFile fails to load True Color (32 bit) icons without an alpha channel and also loses the alpha channel of XP type icons.

GdipCreateBitmapFromHICON and GdipCreateBitmapFromHBitmap lose the alpha channel of bitmaps and icons.

The solution to the above means a lot of fiddling with DIBs, mixing LoadImage with GDI+, creating GDI+ bitmaps with GdipCreateBitmapFromScan0, you name it. A real PITA.

Gdiplus · **Joined:** Sat Mar 25, 2006 11:37 am **Posts:** 4

Thank God I only need the most basic features of GDI+ right now. Currently trying to figure out how to do animated GIFs.

I still think GDI+ is an OK library though. With plain GDI I would be totally lost right now.

winstep · **Joined:** Thu Feb 26, 2004 8:30 pm **Posts:** 11957

Well, a buggy GDI+ is better than no GDI+ at all, that's for sure... still, I only wish they had implemented it right. Plus it's a LOT slower than plain old GDI.

The lack of documentation is also appalling. Have you noticed that the quality of information at MSDN has lately been going downhill? No examples, bare-bones descriptions... Yuck. For example, implementing layered windows with per-pixel alpha is pretty straightforward once you know how... but to get there you have to sweat simply because there is NOT ONE example at MSDN on how to use UpdateLayeredWindow.

Gdiplus · **Joined:** Sat Mar 25, 2006 11:37 am **Posts:** 4

GDI+ performance is apparently a result of hiding too much stuff away. Same for the file locking mechanism which is why I got here in the first place.

And yeah, while searching for GDI+ stuff I noticed exactly what you said. It is like bare-bones doxygen pages, only with a different style (that hasn't changed in the last 6 years as far as I remember).

Even worse, most of the methods related to a function are not hyperlinks, just printed in bold black, no links to related topics etc.

Looks like the guys at MS are too busy at the moment getting a new build of Vista ready

winstep · **Joined:** Thu Feb 26, 2004 8:30 pm **Posts:** 11957

Ah-ah! If only!

Just the other day I was reading a blog usually frequented by MS employees, and all hell broke loose in there the day MS said Vista was not coming out before 2007 (huh? what else is new? it's only been 5 years). Lots of inside information from a lot of disgruntled (anonymous) programmers. The basic consensus is that Vista is nowhere near ready for public consumption.

MS is becoming more and more like IBM was in the IBM PC days (meetings about meetings, committee-based decisions on every little thing), i.e., more time is spent on getting authorization to do things than on actually writing code.

And then there is the old vs new school inside MS: the old school believed in and took pride in maintaining backward compatibility (e.g., Raymond Chen) while the new school believes in re-inventing the wheel. The result is monsters like .NET which, for the first time in MS history, gives us a programming language version that is totally incompatible with code written in previous versions (VB).

Unfortunately the new school is winning, and the result is here for all to see.

Gdiplus · **Joined:** Sat Mar 25, 2006 11:37 am **Posts:** 4

On a related note what is your plan to migrate your Winstep product line to Vista? I'm actually not so familiar with Winstep (landed on your forum by accident) nor with the planned features of Vista, but I believe you'll have a ton of work to migrate your code to all the new desktop features etc.?

Will you build a new product line from scratch? Maybe it's too early to ask anyway, so don't bother replying if you don't want to.

winstep · **Joined:** Thu Feb 26, 2004 8:30 pm **Posts:** 11957

Quote:

On a related note what is your plan to migrate your Winstep product line to Vista? I'm actually not so familiar with Winstep (landed on your forum by accident) nor with the planned features of Vista, but I believe you'll have a ton of work to migrate your code to all the new desktop features etc.?

Although Winstep applications may look like shell replacements, they're not. They're regular applications. So, except for the odd change here or there, I expect them to work correctly in Vista.

For instance, they worked in XP 64bit straight out of the box.

Vista by itself, for me, is more of a chance to add new features (because hopefully MS will by then have corrected some bugs and perfected some of the new features you can already find in XP, plus added a ton more).

winstep · **Joined:** Thu Feb 26, 2004 8:30 pm **Posts:** 11957

This article has been indexed by Google and, because it talks about a common problem with GDI+, it led to some people emailing me about it for clarification. As previously stated, the best solution to avoid GDI+ Access Violation errors and source image files being locked, is to make an actual COPY of the original bitmap instead of just a reference to it. This is how you do that in VB 5/6:

Code:

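' API declarations assumed by this function (they were not shown in the
' original post). GdiplusStartup must already have been called.
Private Declare Function GdipGetImageDimension Lib "gdiplus" _
    (ByVal image As Long, ByRef width As Single, ByRef height As Single) As Long
Private Declare Function GdipCreateBitmapFromScan0 Lib "gdiplus" _
    (ByVal width As Long, ByVal height As Long, ByVal stride As Long, _
    ByVal format As Long, scan0 As Any, bitmap As Long) As Long
Private Declare Function GdipGetImageGraphicsContext Lib "gdiplus" _
    (ByVal image As Long, graphics As Long) As Long
Private Declare Function GdipDrawImageRect Lib "gdiplus" _
    (ByVal graphics As Long, ByVal image As Long, ByVal x As Single, _
    ByVal y As Single, ByVal width As Single, ByVal height As Single) As Long
Private Declare Function GdipDisposeImage Lib "gdiplus" (ByVal image As Long) As Long
Private Declare Function GdipDeleteGraphics Lib "gdiplus" (ByVal graphics As Long) As Long
Private Const PixelFormat32bppARGB As Long = &H26200A

Public Function GDICopyBitmap(GDIBitmap As Long) As Long

Dim GDIGraphics As Long
Dim sngWidth As Single
Dim sngHeight As Single

    If GDIBitmap Then

        ' GDI+ keeps a reference to the source of a bitmap (file or memory)
        ' for the life of that bitmap (annoying little bugger)!
        ' If the source is a file, the file remains locked. If the source is
        ' another bitmap or array data and we dispose of it, later we will
        ' probably run into an Access Violation exception when GDI+ tries to
        ' retrieve data again from the original source!
        ' To avoid both problems we create a new blank bitmap with the same
        ' dimensions, get a graphics surface to it, draw the original bitmap
        ' on this surface and then dispose of the original bitmap so the
        ' file lock (or memory reference) is released.

        ' Get bitmap width and height.
        Call GdipGetImageDimension(GDIBitmap, sngWidth, sngHeight)

        ' Create a blank bitmap with the same dimensions as the original.
        Call GdipCreateBitmapFromScan0(Int(sngWidth), Int(sngHeight), _
            0, PixelFormat32bppARGB, ByVal 0&, GDICopyBitmap)

        ' Create a Graphics context for the blank bitmap.
        Call GdipGetImageGraphicsContext(GDICopyBitmap, GDIGraphics)

        ' Draw the original bitmap into our blank bitmap.
        Call GdipDrawImageRect(GDIGraphics, GDIBitmap, 0, 0, _
            sngWidth, sngHeight)

        ' Destroy the original bitmap.
        Call GdipDisposeImage(GDIBitmap)

        ' Dispose of the graphics surface.
        Call GdipDeleteGraphics(GDIGraphics)

    End If

End Function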

The GDIBitmap argument is the original GDI+ bitmap, which is destroyed at the end of the function (thus releasing any references to the original file/memory bitmap), with a handle to the actual copy being returned in GDICopyBitmap.

Another bug with GDI+ that might be worth mentioning here is that a GDI+ session will not recognize newly installed fonts until you re-start the session (or exit and re-start the application).

Hope this helps,

pirate_jonno · **Joined:** Wed Dec 06, 2006 10:22 am **Posts:** 1

Hi everyone

I've come across what appears to be a bug in GDI+ and I have pretty much no idea how to fix it or report it.

I'm using GDI+ to draw two versions of similar things, except that one is right-side up at the bottom half of the window and the other is upside-down at the top. Anyway, I have a function that draws one thing, as if the origin was (0,0), and the WM_PAINT code calls it twice with two transformations: a vertical translation for the bottom half, and a 180-degree rotation plus translation for the top half.

Everything seems to go ok as long as I only call Graphics::FillRectangle, Graphics::FillArc, etc. However, when I try to draw something (e.g. Graphics::DrawRectangle or Graphics::DrawImage) the result is translated (+1, +1) on the upside down portion of the client area but not the rightside up one.

Essentially I'm working with something like this (C++):

Code:

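//inside WndProc
case WM_PAINT: {
    int height = 100;
    int width = 50;
    PAINTSTRUCT ps;
    HDC dc = BeginPaint(hWnd, &ps);  //hWnd is the window handle passed to WndProc
    {
        Graphics g(dc);
        //bottom half: plain vertical translation
        g.TranslateTransform(0, height);
        paintItem(g, ...);
        //top half: 180 degree rotation plus translation
        Matrix rotate(-1.0f, 0.0f, 0.0f, -1.0f, 0.0f, 0.0f);
        g.SetTransform(&rotate);
        g.TranslateTransform(-width, -height);
        paintItem(g, ...);
    }   //the Graphics object is destroyed here, before the DC is released
    EndPaint(hWnd, &ps);
    break;
}

//helper function (a Graphics object cannot be copied, so pass it by reference)
//brush and pen are assumed to be a Brush and a Pen created elsewhere
void paintItem(Graphics& g, ...) {
    //works fine
    g.FillRectangle(&brush, 5, 5, 20, 20);
    //ends up being at (4, 4, 19, 19) on the second call
    g.DrawRectangle(&pen, 5, 5, 19, 19);
}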

The code's a bit improvised but that's the general idea.

Essentially what I would end up with there is a 20x20 box with a 20x20 border that is 1px down and right from the upper-left corner of the box, and a standard 20x20 box with a 20x20 border underneath it.

It seems that GDI+ uses a different point transformation algorithm for filling objects as opposed to drawing them, but I have no idea why.

Any help on this would be greatly appreciated. I have also tested very similar code in Java and it did exactly the same thing.
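One possible explanation, offered as a guess rather than a confirmed answer, is that fills and outlines pick their pixels differently and that the difference only becomes visible once the coordinates are mirrored. The one-dimensional sketch below models that idea. Both rules in it are ASSUMPTIONS made for illustration (a fill covers a half-open range of pixels, a 1px outline covers exactly the pixels its edge lines land on); they are not taken from the GDI+ documentation, and the function names are invented.

```cpp
#include <algorithm>
#include <cassert>

// Inclusive range of pixel columns touched by a shape (1-D model).
struct PixelSpan {
    int first;
    int last;
};

// ASSUMPTION: a fill between two device-space edges covers the half-open
// interval [low, high), i.e. pixels low .. high-1.
PixelSpan fillSpan(int edgeA, int edgeB) {
    const int low = std::min(edgeA, edgeB);
    const int high = std::max(edgeA, edgeB);
    assert(high > low);
    return {low, high - 1};
}

// ASSUMPTION: a 1px outline lights exactly the pixels its edge lines land on.
PixelSpan outlineSpan(int edgeA, int edgeB) {
    return {std::min(edgeA, edgeB), std::max(edgeA, edgeB)};
}

// 180-degree rotation followed by the translation from the post, reduced to
// one axis: x maps to extent - x.
int rotate180(int x, int extent) { return extent - x; }
```

Unrotated, a fill from 5 to 25 and an outline from 5 to 24 both cover pixels 5 to 24, so they line up. After mapping through `rotate180` with an extent of 50, the fill covers 25 to 44 while the outline covers 26 to 45, a one-pixel shift like the one described. If this model is right, the two calls use the same transformation and differ only in which side of an edge they round to.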

