I've been heavily involved in performance testing for the past few days. We have a home-grown solution that works reasonably well (and happens to have been my pet project). Given the size of our organization, we were more concerned with getting a good "feel" for the results for a reasonable cost, as opposed to the nitty-gritty fine details a product like LoadRunner would have provided (at its enormous fee).
One area I've been looking at is how to visualize test results. We receive as input a set of page load times for a webpage, and are asked "is it fast?" Calculating averages and plotting histograms were the first two things we tried. More recently, I've decided I like this approach:
This graph shows the percentage of requests completed versus the time it took. I like it because we can quickly determine management-friendly statements like "90% of requests are fulfilled within X milliseconds".
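For anyone who wants to reproduce this kind of graph, here is a minimal sketch of the underlying arithmetic in Python. It is not our home-grown tool; the function names and the sample load times are made up for illustration, and the "X milliseconds" figure uses the simple nearest-rank method rather than any interpolation.

```python
import math


def percent_completed_within(load_times_ms, threshold_ms):
    """Percentage of requests that finished within threshold_ms."""
    if not load_times_ms:
        raise ValueError("need at least one load time")
    done = sum(1 for t in load_times_ms if t <= threshold_ms)
    return 100.0 * done / len(load_times_ms)


def time_for_percent(load_times_ms, percent):
    """Smallest observed time within which `percent` of requests completed.

    Uses the nearest-rank method: sort the samples and pick the value at
    rank ceil(percent/100 * n).
    """
    if not load_times_ms:
        raise ValueError("need at least one load time")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(load_times_ms)
    rank = math.ceil(percent * len(ordered) / 100)
    return ordered[rank - 1]


def cumulative_curve(load_times_ms):
    """(time, percent completed) points, ready to hand to a plotting tool."""
    ordered = sorted(load_times_ms)
    n = len(ordered)
    return [(t, 100.0 * (i + 1) / n) for i, t in enumerate(ordered)]


# Hypothetical page load times in milliseconds.
samples = [120, 80, 200, 150, 95, 310, 110, 130, 90, 105]
print("90%% of requests are fulfilled within %d ms" % time_for_percent(samples, 90))
```

Plotting the output of cumulative_curve with time on one axis and percent on the other gives one line per solution being compared.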
Incidentally, the data shown was from a technology evaluation. We were trying to determine which solution worked best under load. The dashed red line was pure ASP.NET, the blue was pure ASP, red was a custom COM component called by ASP and the other two were in-house variations on this theme. I was really impressed by how well ASP.NET held up under load.
Does anyone else have any thoughts or tips for visualizing data like this in a productive manner?
The Daily WTF, in today's post, showed a potentially difficult regular expression. I'm a reg ex geek, so I really liked this one, which was pointed out by an anonymous poster:
For those of you who don't find that inherently obvious, it's used by a Perl module to validate strings against RFC-822 (the standard defining the format of email messages and email addresses on the Internet). Anything that matches the above reg ex is a valid email address.
The author of the module, not without a sense of humour, comments that this reg ex "somewhat pushes the limits of what it is sensible to do with regular expressions."
In today's post, we're going to look at something every tester should be watchful for: bad error messages. This one comes from a well known RDBMS system's GUI admin tool. Without going into too much detail, one essentially does this:
1. Run a query that retrieves results.
2. Modify a result in the middle of the result set.
3. Close the window.
What happens?
At this point, if you're anything like me, you utter a few unspeakable syllables followed by: "Firehose Mode? Firehose mode?! What the @#%! is Firehose Mode??"
And therein lies the problem. This is a bad error message because it doesn't tell the user anything he or she needs to know.
A good error message explains what caused the error, why the operation cannot be completed and how the user can get the application to do what was intended.
This is an acknowledged problem with the product. We're going to need the information in that support article when we rewrite this error message. What we're seeing here is clearly an API-level message that has somehow gotten into the finished application. It's a perfectly fine error if you're producing a database library. It's unacceptable in an end-user application.
What went wrong? The app tried to write to a cursor and it failed. Why won't it work? "Firehose mode" doesn't support writes mid-stream. How can the user fix it? Scroll down to the last row before making the change.
How about
"Your change has not been written to the database due to limitations in the way this application communicates with the database server. To make your change, scroll to the last row in the results and then try again."
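One way to make the three-part rule hard to skip is to have the application build its user-facing messages from exactly those three pieces. The following is only a sketch in Python; build_error_message and its parameters are hypothetical names of mine, not anything from the product in question.

```python
def _as_sentence(text):
    """Trim the text and make sure it ends with sentence punctuation."""
    text = text.strip()
    return text if text.endswith((".", "!", "?")) else text + "."


def build_error_message(what, why, how):
    """Compose a user-facing error from what went wrong, why, and how to fix it.

    Refuses to build a message if any of the three parts is missing, so an
    API-level error cannot slip through to the user unexplained.
    """
    parts = [what, why, how]
    if any(p is None or not p.strip() for p in parts):
        raise ValueError("an error message needs what, why and how")
    return " ".join(_as_sentence(p) for p in parts)


message = build_error_message(
    what="Your change has not been written to the database",
    why="This application cannot write to the middle of a result set",
    how="To make your change, scroll to the last row in the results and then try again",
)
print(message)
```

The point of the ValueError is that a developer who cannot fill in the "how" has to stop and think about it before the message ever reaches a user.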
I've had an idea brewing in the back of my mind for a response to the eXtreme Programming philosophy of "no manual tests" for some time. I disagree with the lack of manual testing for a few key reasons.
In terms of testing accessibility, simulating real-user use of the system, and detecting some types of events (e.g., sounds), automated tests can be time-consuming and difficult to write - especially if one is working from a black-box or purely functional level. I believe that a test is worth automating if it is going to be run more than twice. This law of return does not hold for some UI tests. You can often spend many hours writing test code to verify something that can be ascertained visually in a matter of seconds. Automation is only worthwhile if it saves overall human effort.
In XP, the problem of vague requirements does not exist, as the customer is always on-site for consultation. Writing automated scripts exercises the application carefully and ensures that each piece works as intended along specific paths. What it does not do, however, is act like a real user. When real users finally begin to access the system, they may find that the on-site customer did not understand their requirements, or that the system will be used in novel and unintended ways once its power has been realised. A manual tester, combining real-world exploratory testing with expert knowledge of user interface design, can help to detect some of these issues before they become a reality.
The most compelling reason to have manual tests is that automated tests cannot detect the presence of bugs; they can only prove the absence of specific defects. When an automated test case reports a pass, we know that the application has no bugs in the fields examined along the path specified. This is a highly valuable activity, but it does not find new bugs. If an automated case fails, it is up to a human tester to determine whether the failure represents a bug, a script problem, or a misunderstanding on the part of the person who wrote the test. A human tester, on the other hand, can immediately detect bugs along novel paths. He or she can see an unexpected change and follow up by constructing new tests on the fly to force counterintuitive bugs to the surface. A human tester can explore the system. An automated tester can only prove that it saw what it expected.
There is a place in the development cycle for both automated and manual tests. Automation, particularly in the unit test phase, can provide excellent feedback by proving that certain bugs do not exist. Manual testers can bring their intuition, experience and expertise to the application to find what is truly unknown. Where automation gives us reliability and repeatability, the human tester gives us creativity and intelligence. All four of these attributes are necessary if a software project is to succeed.
The current HTML specification contains, in my opinion, a perfect example of an inconsistent interface. Consider how one places radio buttons and dropdowns onto a web form. Conceptually, dropdowns and radio buttons serve exactly the same purpose; they allow the user to pick one (and only one) item from a list of choices.
To create a dropdown, you write something like this:
This block of code generates a dropdown that looks like this:
You can then talk to this control from your web application by referencing the value of the "mycontrol" field. Details of how this happens vary from implementation to implementation.
Having seen this, I would immediately assume that to create a set of radio buttons, I would do something like this:
From the point of view of the application receiving the data, there is no difference between radio buttons and dropdowns. Either way, the webserver will tell the application "the user picked value X for field Y." So why are there two interfaces to do exactly the same thing?
The answer, of course, is that dropdowns and radio buttons are laid out differently on the screen. A dropdown is always a nice, tight, predictable package that fits into its little rectangle. Radio buttons, on the other hand, can be spread out all over the place on a page. Sometimes they're in a tight group horizontally, vertically, or in a box. Sometimes they're spread almost randomly across the page.
This annoys me because I'm writing an HTML Parser for my Random Walker (more on this later) that needs to understand the controls and doesn't care about the layout. This effectively doubles the amount of work I need to do to handle these (conceptually) identical cases. If I were to remake HTML, this would not be an issue - there would be a separation between what the controls do and how they are presented to the user. I think every application should strive for this separation. User interfaces are much more varied than just a screen, mouse and keyboard. What about test or automation scripts? What about screen readers for the blind? An application should be able to speak to a wide variety of interfaces.
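To show what the doubled work looks like, here is a rough sketch using Python's standard html.parser module. It is not the Random Walker's actual parser; ChoiceFieldParser and parse_choice_fields are names invented for this example, and it assumes every option carries an explicit value attribute. Both kinds of control end up in the same "field name -> possible values" structure, but each needs its own branch to get there.

```python
from html.parser import HTMLParser


class ChoiceFieldParser(HTMLParser):
    """Collects pick-one controls as {field name: [possible values]}."""

    def __init__(self):
        super().__init__()
        self.choices = {}
        self._current_select = None  # name of the <select> we are inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            # Dropdown: the name lives on the container, values on its children.
            self._current_select = attrs.get("name")
            if self._current_select is not None:
                self.choices.setdefault(self._current_select, [])
        elif tag == "option" and self._current_select is not None:
            value = attrs.get("value")
            if value is not None:  # options without a value attribute are skipped here
                self.choices[self._current_select].append(value)
        elif tag == "input" and (attrs.get("type") or "").lower() == "radio":
            # Radio button: every button repeats the name, and the buttons
            # can be scattered anywhere in the document.
            name = attrs.get("name")
            if name is not None:
                value = attrs.get("value")
                self.choices.setdefault(name, []).append("on" if value is None else value)

    def handle_endtag(self, tag):
        if tag == "select":
            self._current_select = None


def parse_choice_fields(html):
    parser = ChoiceFieldParser()
    parser.feed(html)
    parser.close()
    return parser.choices
```

The dropdown branch can rely on nesting to group its values, while the radio branch has to stitch a group together from buttons that share a name, wherever they happen to appear.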
How do screen readers handle HTML radio buttons? Dropdowns are easy. I'm sure they say something like "Please choose one of the following options: [...]". Radio buttons, on the other hand, could be totally chaotic. "You have this option... there might be more later."
My company is, I'm sure, very much like most out there. What do we do before a software release?
Run all test scripts.
Re-test closed tickets.
If all that is successful, launch.
I recently got caught on this and ended up signing off on what turned out to be a production bug.
Why? I didn't re-test my open tickets.
Why would I test something I know is broken? Why would you release something you know is broken?
It's quite simple... we take our software through a staging environment before launch so that we can separate incomplete development on future releases from the current release candidate. What happened here was that something unfinished got out into the real world.
The moral of my story: test your open tickets too to make sure they're not tagging along for a ride.