Bernhard's Blog: 2009

Tuesday, 6 October 2009

How many bugs

I learned this method from Duncan Kennedy, a colleague who used to work with me. He explained that if two of us test the same functionality in our product, we can estimate the probable number of bugs by comparing the ones we find. The simple formula requires that two people test independently and then compare their results. Let A be the number of bugs that the first tester found and B be the number of bugs that the second tester found. Let M be the number of bugs that are matched (the same bug found by both testers). The estimated number of bugs in the feature/product under test is:

A x B ÷ M

or Person 1 bugs multiplied by Person 2 bugs divided by the number of bugs that match.
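
Here is a minimal sketch of the formula in Python, mostly so I can check the arithmetic. The function name and the sample numbers are my own invention, not anything Duncan gave me. As far as I can tell this is the same idea as the Lincoln-Petersen capture-recapture estimate used for counting wildlife, and it only gives an answer when at least one bug was found by both testers.

```python
def estimate_total_bugs(a: int, b: int, m: int) -> float:
    """Estimate the total number of bugs as A x B / M.

    a: bugs found by the first tester
    b: bugs found by the second tester
    m: bugs found by both testers (the matches)
    """
    if m <= 0:
        # With no overlap the formula divides by zero, so no estimate exists.
        raise ValueError("need at least one matching bug to form an estimate")
    if m > min(a, b):
        # The overlap cannot be larger than either tester's own count.
        raise ValueError("matches cannot exceed either tester's bug count")
    return a * b / m


# Hypothetical numbers: 20 bugs, 15 bugs, 10 in common.
print(estimate_total_bugs(20, 15, 10))  # 30.0
```

The fewer bugs the two testers have in common, the larger the estimate becomes, which matches the intuition that little overlap means a lot is still undiscovered.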

Thought this might come in handy one day.

Wednesday, 24 June 2009

Source Control and Diff software

For some time now I've been perplexed that most source control systems, although designed and built for source control, are actually nothing more than file control systems. You can put any sort of file in a source control system, and it is treated as a file. Yes, a file; not source code, but a file that might (or might not) contain source code. We need this because there are so many different file types we need to store along with our source these days that it wouldn't make sense to only store code.

Some SCMs have special comparison programs that allow comparison of image files, not just text. That got me thinking: when I use a VCS, I would like it to know that a source code file should be stored in a certain format. This would allow me to use tabs and Allman style, whilst other colleagues could use spaces for indentation and K&R style. I suspect the SCM would store the code in the most efficient format and convert it to the user's style when the file is retrieved from the repository. This would still allow me to use my preferred text-based diff tool to compare code written by myself and others, without having to worry about extra blank lines, braces in the wrong place, or tabs versus spaces for indentation. The focus would not be on code representation, but on the code itself.

I know some source control systems allow hooks to be put in place that would allow this to be done. If only I had, or could make, the time to do this.
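
As a rough illustration of the idea, here is a small Python sketch that normalises two versions of a file before diffing them. It is not a hook for any particular source control system, and the function names and the tab width of 4 are my own assumptions. It only deals with tabs versus spaces, trailing whitespace and blank lines; brace placement (Allman versus K&R) would need a real formatter that understands the language.

```python
import difflib
from typing import List


def normalise(source: str, tab_width: int = 4) -> List[str]:
    """Return the lines of source with layout-only differences removed.

    Tabs are expanded to spaces, trailing whitespace is stripped and
    blank lines are dropped. Brace style is NOT handled here.
    """
    lines = []
    for line in source.splitlines():
        line = line.expandtabs(tab_width).rstrip()
        if line:
            lines.append(line)
    return lines


def style_insensitive_diff(mine: str, theirs: str) -> List[str]:
    """Unified diff of the two sources after normalising both."""
    return list(
        difflib.unified_diff(normalise(mine), normalise(theirs), lineterm="")
    )


# Same code, one version indented with a tab and an extra blank line.
mine = "if (x)\n{\n\tfoo();\n\n}\n"
theirs = "if (x)\n{\n    foo();\n}\n"
print(style_insensitive_diff(mine, theirs))  # []
```

A real implementation would sit in the check-in and check-out path so that the repository only ever sees the normalised form.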

Tuesday, 28 April 2009

The stages of a new technology

I noticed a blog about the phases of Unit Testing and thought that it was a good indication of the stages we tend to go through whenever we take on a new technology, pattern, technique or "thing" in general.

I tend to start with exploration: download, install, poke at it. This leads on to the learning stage: read blogs, read articles, possibly read a book. It's during these first two stages that I'll form my opinion on the worth of the thing and decide whether to proceed or not. It's also during this period that you'll find the most active discussion.

Once adopted, the learning stage tends to continue, increasing knowledge and understanding whilst becoming more familiar with the thing and accepting its drawbacks or failings. Whilst not an authority on the subject, I can hold my own and use most of the features of the thing.

I seldom achieve "authority" status on a thing. Mostly because the next best thing has come along and my energy is focused on stage 1 of that new thing.

Tuesday, 27 January 2009

File Systems

I was reading Coding for a Living: A Pattern for Fluent Syntax and started a reply that went so far off topic, and grew too long for a comment in my opinion, that I decided to post it here.

I couldn't agree more with what Richard wrote about the shortcomings of the file system. Folders for files are fine when you have a few files, but I repeatedly find myself wanting folders organised for different points of view.

For example, say you file quotations and invoices on your server. Are the top level folders the company name, or the status of the client (Archived, Active, Potential)? As we produce more electronic files, there will be a greater need to view the files by their categorisation rather than their location within folders. Most of the time I don't want a Windows Explorer, I want a Windows File Finder where I can say I want to see the files tagged with "Invoice" and "Acme Corporation" and "created between October and December 2007". Sound like SQL to you? That's where I hope the file system will go one day.
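
To show what I mean by "sounds like SQL", here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table layout, file names and dates are all made up for illustration; no real file system works this way as far as I know.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, created TEXT);
    CREATE TABLE tags (file_id INTEGER REFERENCES files(id), tag TEXT);
    """
)

# Hypothetical files and tags.
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        (1, "acme-invoice-0042.pdf", "2007-11-05"),
        (2, "acme-quote-0007.pdf", "2007-10-20"),
        (3, "acme-invoice-0051.pdf", "2008-02-01"),
        (4, "globex-invoice-0003.pdf", "2007-12-01"),
    ],
)
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [
        (1, "Invoice"), (1, "Acme Corporation"),
        (2, "Quotation"), (2, "Acme Corporation"),
        (3, "Invoice"), (3, "Acme Corporation"),
        (4, "Invoice"), (4, "Globex"),
    ],
)


def find_files(conn, wanted_tags, start, end):
    """Names of files carrying ALL wanted tags, created within [start, end].

    wanted_tags must not contain duplicates; dates are ISO strings.
    """
    placeholders = ",".join("?" for _ in wanted_tags)
    query = f"""
        SELECT f.name
        FROM files f JOIN tags t ON t.file_id = f.id
        WHERE t.tag IN ({placeholders})
          AND f.created BETWEEN ? AND ?
        GROUP BY f.id
        HAVING COUNT(DISTINCT t.tag) = ?
        ORDER BY f.name
    """
    rows = conn.execute(query, [*wanted_tags, start, end, len(wanted_tags)])
    return [name for (name,) in rows]


print(find_files(conn, ["Invoice", "Acme Corporation"],
                 "2007-10-01", "2007-12-31"))
# ['acme-invoice-0042.pdf']
```

The folder hierarchy disappears completely here: the same file shows up under any combination of tags and dates you care to ask for.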

Monday, 19 January 2009

Reading XML? Use XmlReader, not SqlDataReader

I just know I'm going to want to look this up one day, so here it is for my reference:

If you have SQL that produces XML, you might be tempted to try to read that XML using a normal SqlDataReader. But doing it that way will not work if the resulting XML is large. In order to read large XML results from SQL Server, you will need to use an XmlReader (for example, the one returned by SqlCommand.ExecuteXmlReader) rather than the SqlDataReader.

Friday, 9 January 2009

How unit testing becomes coding

Recently I was writing unit tests for some non-trivial methods on a few classes. I noticed that a lot of the tests were similar, but varied in terms of their input values and expected result values. That led to some refactoring within the unit tests to produce code with less duplication, which in turn led me to think about how I was going to test the test code I had just refactored. The thought process took me on to think about domains; in particular the fields of mathematics or finance. My thoughts ramble here, so bear with me or leave if you like.
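
For reference, the kind of refactoring I mean looks roughly like the sketch below, written in Python with the standard unittest module. The function under test and the table of values are hypothetical stand-ins, not the code I was actually working on. The duplicated tests collapse into one table of inputs and expected results plus a single loop.

```python
import unittest


def compound_interest(principal, rate, years):
    """Hypothetical function under test: value after compounding yearly."""
    return round(principal * (1 + rate) ** years, 2)


# Each row: principal, rate, years, expected result.
CASES = [
    (1000.0, 0.05, 0, 1000.0),
    (1000.0, 0.05, 1, 1050.0),
    (1000.0, 0.05, 2, 1102.5),
    (0.0, 0.05, 10, 0.0),
]


class CompoundInterestTests(unittest.TestCase):
    def test_known_values(self):
        for principal, rate, years, expected in CASES:
            # subTest reports each failing row separately instead of
            # stopping at the first one.
            with self.subTest(principal=principal, rate=rate, years=years):
                self.assertAlmostEqual(
                    compound_interest(principal, rate, years),
                    expected,
                    places=2,
                )
```

Saved in a file, this can be run with `python -m unittest`. Notice that the loop is now logic in its own right, which is exactly what prompted the question of who tests the tests.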

Let's assume we have a clever mathematician or actuary called Tess. Tess produces a formula for us to code up in a business object. My thought is this: "How would Tess test the formula in her domain?" We would attack a formula such as this with inputs and expected outputs because that's what our unit tests lean towards. But how does Tess validate the function? She steps through every part of the formula, proving with known rules that the formula is sound. She never uses inputs and outputs to verify the formula because she knows that this is statistically insignificant. That is to say, even 100 test values out of the possible hundreds of millions are insufficient to be considered a representative sample.

My conclusion after all this nonsense is that code reviews are underrated and far too infrequently employed. And, and this is what bites, we don't do them well enough. I know that some people do very good code reviews, and I am going to find out what they do that makes their code reviews more effective and complete than the code walkthroughs I've seen in the past.