Friday, April 25, 2008

EDSK ... you won't get rid of all bugs, add all the features or write all the programs

Every developer should know that you won't get rid of all bugs, add all the features or write all the programs you might want to.


Why is this important?

You need to set limits and prioritise.

There isn't enough time to do everything. So you need to focus on what's important and do that!


There isn't the time to fix every bug. So fix the ones that cause the biggest problems or affect the most people (or whatever criteria you use to prioritise bugs) first.  By the time you've made those changes, circumstances may be different.

Make sure you're adding the feature which will bring the most benefit.  Not just the one that is easiest to code, or most fun.

Make sure you're creating a new program that is of use, doesn't already exist and will actually benefit others.
 


What do you do once you know this?
Prioritise!
Make sure you're doing something that is worth doing.
Don't do one thing if there's another that is more important.
Understand the costs and benefits of fixing the bug, adding the feature or creating a new program, before you write the code.

rethinking EDSK

I think it's a suitably big understatement to admit that I'm not meeting my posting deadline on Every Developer Should Know.... In an attempt to address this and help me fit in a bunch of other stuff I'm trying to do, a bit of a restructure and reprioritisation is in order.
From an EDSK point of view I'm gonna start thinking more short term. I had long term plans to extend beyond tips that are generic to all developers and also include content more targeted to developers using specific technologies or targeting specific environments/platforms.
I had started collecting references to relevant pieces on the web, but to help me focus on what I'm working on right now I thought I'd stop. I also thought I'd post up the links so they don't go to waste.

Linux Developers should know:
Ten Commands Every Linux Developer Should Know

.NET Developers should know:
What Great .NET Developers Ought To Know
MSDN Webcast: What Every Developer Should Know About the .NET Framework, But May Have Missed Along the Way (Session 5) - Level 200
What Every Developer Should Know About the .NET Framework, but May Have Missed Along the Way
Visual Studio Add-Ins Every Developer Should Download Now
Ten Must-Have Tools Every Developer Should Download Now
.NET Framework General Reference - Design Guidelines for Class Library Developers
Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries

Database Developers should know:
... about normalisation. Not just how to do it but why.

Java Developers should know:
Ten Things Every Java Developer Should Know About Unix
Best Java book available
I've been using Java since 1995 and have owned this book since 2001 and it's the only Java text I still turn to. I recommend every Java developer, no matter what level you're at, read this book and read it again every year for the remainder of your career.

PHP Developers should know:
10 Tips That Every PHP Newbie Should Know
10 Tips That Every PHP Developer Should Know, Part 2
make life as a PHP developer a whole lot easier

Web Developers should know:
8 Firefox Add-ons every Web Developer should know about!
Speed up your web pages with YSlow
Color Oracle takes the guesswork out of designing for color blindness by showing you in real time what people with common color vision impairments will see.
Advanced JavaScript Debugging Techniques
What Every Web Developer Should Know
... the differences between REST and SOAP.

Windows Developers should know:
... to avoid using the clipboard programmatically
Eight resources every developer should know about
... about COM
... about User Account Control (UAC)

Tuesday, April 15, 2008

Displaying data in a better way

Here's an example of improving the usability of data displayed in a grid.



Here's a part of the original grid:

The data relates to these categories:


Immediately there is a duplication of data which could be reduced. The colour and description can be combined as they show the same thing. That would lead to a grid which looks like:

It can still be better though.

The category name is superfluous information. The colour indicates the category. This is something that the user will either know or can look up when necessary.

A roll over tooltip could also be added to display the category description for each field. This will be a more practical solution for any user who does not know what the colour means and will save them having to look it up.

Thursday, April 10, 2008

Tuesday, April 08, 2008

Mobile UI Tips

Challenges:
-Screen
--Size, orientation, resolution, layout
-Input
--SIP, keyboard, dedicated buttons, stylus
-User Interaction
--Standing up on a moving bus
-Understand System.Windows.Forms
--Compactness
--Form and Control classes

Do not try to create non-full screen forms

On the screen:
-Top strip
--Don’t hide the title bar
--Use the same title in owned forms
-Bottom strip
--Don’t use a toolbar control
--Don’t use more than two menus
--Don’t hide the bottom strip
-Main Area
--Place tappable controls near the bottom
--TextBoxes or anything requiring the SIP, near the top

Screen aware:

  • Size
  • Orientation
  • Resolution
  • Touch enabled?

Input:
-Keyboard
-SIP
-Dedicated Buttons
-HardwareButton
-Stylus or Finger
--Tap
--Tap and Hold (Avoid)

Aim for single handed operation (ideally stylus free: Designing Pocket PC Application for Stylus-Free Usage (one-handed))

Object Thinking : David West




Book based on assumptions:

  • Agility/XP essential for profession/industry to improve
  • XP offers way to create better developers
  • Can't do XP without understanding object thinking
  • Won't appreciate the benefits of XP if don't fully understand 'object thinking'

Simplicity:

  • OT based on problem domain, not potential program
  • OT leads to smallest number of things (classes) possible
  • Objects doing the least amount of work, in the most direct and simplest way
  • Focus on coordination of autonomous objects - not mgmt of unruly modules and passive data structures

Simple design:

  • Fewest no. of classes
  • Fewest number of methods per class
  • Simplest coding of methods
  • Avoidance of control, centralization & mgmt classes
  • Simple scripts to simulate simple stories

Refactoring:

  • Allow 'lazy' objects to give work to other objects

The true differences between programming languages are those that reflect philosophical ideals and values

If you think about design using a language - your design will be enhanced or severely restricted by that language

Object culture:

  • Collaboration rather than mgmt
  • Coordination and cooperation, rather than control
  • Rapid prototyping instead of structured development

Prerequisites to OT:

  • Everything is an object
  • Simulation of problem drives object discovery and definition
  • Objects must be composable
  • Distributed coordination and communication must replace hierarchical centralized control as an organizational paradigm

Programs may be thought of as data and functions - but the real world isn't

Assuming data and functions:

  • Programs are more complicated than need be
  • Complex code is difficult to understand and test
  • Complex code is brittle and difficult to modify when requirements change
  • Resultant code lacks composability - not reusable outside original context

Object principles - software principles:

  • Solve complex problems by solving a series of intermediate, simpler problems
  • Appreciate human cognitive limitations
  • Correctness is unaffected by movement between equivalent contexts
  • Correctness is unaffected by replacement with equivalent components
  • Modular design
  • Portable design
  • Provides compositional flexibility
  • Appropriate use of abstractions
  • Limited set of conceptual forms

Brooks' 4 essential difficulties of software development

  • Complexity
  • Conformity (to the world, rather than the other way round)
  • Changeability (to the world, which changes frequently)
  • Invisibility

Metaphors:

  • Help discovery
  • Help make design decisions
  • Provide handy ways to remember principles of object thinking
  • Help avoid non object thinking

It is convenient to build something large from smaller (but not the smallest possible) components

The complexity of object-oriented programs is in the scripting, not the objects themselves

Hierarchical and centralized control is anathema in the object paradigm.

Ants, not autocrats: - do your thing and react to messages from those around you.

Behaviour is the abstraction to use to differentiate between objects and is the criterion to base taxonomy on.

Creating taxonomies based on internal structure leads to numerous problems

Using a particular programming language does not mean that you are doing object programming

Object vocabulary is first and foremost a technique to help developers avoid the mistake of thinking about solutions using old mental habits

Essential terms:

  • Object
  • Responsibility (task)
  • Message
  • Protocol

Computations must be performed by an object on itself (e.g. a number adds another number to itself, rather than a third object adding two number objects together):

  • Helps enforce simplicity
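
As a minimal sketch of this idea in Python (the Number class and its plus method are hypothetical names chosen for illustration, not taken from the book), the number receives a message asking it to add another number to itself, instead of some third object adding two numbers together:

```python
class Number:
    """A number object that performs computations on itself."""

    def __init__(self, value):
        self._value = value

    def plus(self, other):
        # The receiver adds the other number to itself and answers a new Number.
        return Number(self._value + other._value)

    def __eq__(self, other):
        return isinstance(other, Number) and self._value == other._value

    def __repr__(self):
        return f"Number({self._value})"


three = Number(1).plus(Number(2))
print(three)
```

There is no separate "adder" or "calculator" object here: the only things in play are the two numbers and the message sent between them.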

Multiple inheritance:

  • Is unnecessary
  • There are alternatives
  • Adds needless complication

Methods and model must:

  • Support natural decomposition
  • Recognize 2 complementary processes: domain modeling; application assembly
  • Aid discovery and evaluation
  • Enable measuring progress and 'goodness'

Formal methods do have their value

Blending methods and approaches is hard

Using ideas from both approaches is equally difficult

Need criteria to evaluate:

  • Self
  • Progress
  • Products

Need to understand the domain - how computers work is not part of the domain

Starting a journey with one step in the wrong direction can have an enormous impact on the end result

Mistakes at the beginning of a process are more costly:

  • Have less knowledge, so more likely to make mistakes
  • More tempted to think about what you do know (the computer) rather than the domain

Never: think about what the code will look like and then create objects to support that code.

Set aside your own culture when attempting to understand users and user domains

To understand users: (their domain and their tasks)

  • Go and spend time with them
  • Observe them
  • Talk to them

Can't define everything up front - do one thing (story) at a time

Object definition:

  • Most critical aspect of discovery
  • Define in terms of actual or intended use
  • Define within domain, not just application space
  • Not the same as object specification
  • Specification will involve making design decisions

OT suggests you should generalize responsibilities so that they can be used in any context

Let objects assume responsibility for tasks that are wholly or partially delegated to other objects in cases in which responsibility reflects natural communication patterns in the domain.

Delegate responsibilities to get a better distribution and increase reusability.

Delegation can lead to the temptation of management. - If you delegate, delegate:

  • Don't try and control
  • Don't guess what the result will be
  • Don't do own error checking or evaluation

Responsibilities should be distributed among the community of objects in a balanced manner

Avoid responsibilities that are characteristic specific, that focus on providing a potential user with the value of a single characteristic of the object.

Beware the dangers of GUI-in design!

The two kinds of relationship of interest between objects:

  • is a kind of
  • collaborates with

Single line of descent based only on the behavior of the object

Collaborations are almost always hard coded - due to complexity of relationship between objects

OT: data is information or knowledge that objects need to complete a task

Traditional data modeling: all the data that a system must remember about objects

Model: - comprises objects engaged in the objectives of the application

View: - hierarchically organized collection of objects

Coordinator: - tasks involved in sending messages to other objects, and notifying subscribed objects of a state change

Objects are not and should not be aware of their clients, even when their clients are not other software objects

Scripts as first class objects:

  • Ordered collection of messages

Events as cues to object interaction

Constraints and rules are objects themselves

Rules should not be complex (coz they're objects)

Objects will often have a collection of 'self evaluating rules'

Rules:

  • Evaluate
  • Error handling/recovery
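
One possible reading of rules as small, self-evaluating objects, sketched in Python. Everything here is an assumption made for illustration: the Rule and Thermostat classes, their method names and the temperature limits are hypothetical and do not come from the book.

```python
class Rule:
    """A small, self-evaluating rule object (hypothetical design)."""

    def __init__(self, description, check, recover=None):
        self.description = description
        self._check = check        # callable taking the owning object, returning bool
        self._recover = recover    # optional callable that tries to repair the owner

    def evaluate(self, owner):
        """Return True if the rule holds, attempting recovery once if it does not."""
        if self._check(owner):
            return True
        if self._recover is not None:
            self._recover(owner)
            return self._check(owner)
        return False


class Thermostat:
    """Hypothetical object that carries its own collection of rules."""

    def __init__(self, target):
        self.target = target
        self.rules = [
            Rule("target is at least 5 degrees",
                 check=lambda t: t.target >= 5,
                 recover=lambda t: setattr(t, "target", 5)),
            Rule("target is at most 30 degrees",
                 check=lambda t: t.target <= 30),
        ]

    def broken_rules(self):
        # Each rule evaluates itself; the thermostat only collects the results.
        return [rule.description for rule in self.rules if not rule.evaluate(self)]


print(Thermostat(2).broken_rules())
print(Thermostat(40).broken_rules())
```

Each rule stays simple because it knows only how to evaluate itself and, optionally, how to recover; the owning object holds a collection of them rather than one large validation method.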

XP maturity levels:

  • Out of the box
  • Adaptation
  • Transcendence

Objects are not something you do - objects are something you think

There are circumstances in which it is difficult, if not impossible to apply object thinking

  • e.g. RDBMS, GUI

Database philosophy is almost totally inconsistent with the philosophy behind object thinking

Need to remember the functional advantage of databases as well as their persistence services

XP created users who don’t want large monolithic software but a collection of small, targeted applications which do specific tasks.

  • In contrast to 80/20 law

Object cube:

Side 1: Responsibilities

Side 2: Description and stereotype

Side 3: Contracts (public or private methods)

Side 4: Knowledge required

Side 5: Message protocol (methods)

Side 6: Events


objectionary ?

Friday, April 04, 2008

How I justified the move to unit testing

Update. This memo was written while at a previous job. The IT director had previously claimed there was no value in unit testing and we shouldn't do it. I submitted this memo to all developers at one of the fortnightly update meetings. It did lead to unit testing being adopted by other developers as part of the development process.

The following is from a memo I wrote:



Why we must use unit testing in the future.

or
How I saved myself almost three days work and met a deadline.

Plus saved countless days work in the future.

or

SLA calculation and bank holiday identification was so broken it's ridiculous!

I have a confession to make. I have been creating unit tests for some of the code I have been writing recently.
The purpose of writing these tests has been to make my task easier and to try and ensure that the code I have written does not contain any preventable errors.
I think it has worked:
  • Of the code I have written unit tests for, none has failed testing. (So far.)
  • I have saved time by not having to repeatedly perform manual checks.
  • I have removed a lot of the opportunity for human error in reviewing my manual testing.
  • I have created a resource which will save time in the future, if and when any code in this part of the system is changed and the tests reused.
To show how unit testing has helped me, I'll use CCP15852 (errors in SLA calculations) as an example.

The calculation of dates and times as part of an SLA is non trivial.
(In the last 9 months there have been 6 changes (by 3 other developers) to this code, to try and get it to work correctly. That's a pretty good indication that this isn't simple to get right. – Or test that changes made have fully corrected the problem.)

Please note that this is not a criticism of those who have previously written or tested code in this area. I am simply trying to highlight that this is a complicated area which is difficult to test and our current practices have proved inadequate.

There is a lot to consider in SLA calculation:
  • Non working days
  • 24 hour working days
  • Working days for specific hours
  • Not working on bank holidays
  • Working on bank holidays
  • Starting on a non working day
  • Starting on midnight on a 24 hour day (beginning and end)
  • Starting on a bank holiday
  • Starting before the start time on a day with specific working hours
  • Starting on the time a specific working period begins
  • Starting on the time a specific working period ends
  • Starting after a specific working period ends
  • Calculations limited to the same day
  • Calculations spanning multiple days
  • Combinations of the above (E.g. Starting at midnight on a 24 hour working day, adding 112 working hours, but not working the next two days, then having a bank holiday and then working between 9 and 5 on the two days after that, before going back to a 24 hour working day!)
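
To give a flavour of what tests for a few of these cases might look like, here is a minimal sketch in Python. It is not the code from the system described in this memo: add_working_minutes, the working-week layout (minutes from midnight per weekday) and the dates used are all assumptions made for illustration, and the sketch works at minute resolution only.

```python
from datetime import date, datetime, time, timedelta

FULL_DAY = (0, 24 * 60)              # a 24 hour working day, in minutes from midnight
NINE_TO_FIVE = (9 * 60, 17 * 60)     # a day with specific working hours


def add_working_minutes(start, minutes, working_week, bank_holidays=frozenset()):
    """Return the datetime at which `minutes` of working time have passed since `start`.

    working_week maps date.weekday() (0 = Monday) to an (opens, closes) window;
    weekdays missing from the map are non working days.
    """
    if not working_week:
        raise ValueError("working_week has no working days")
    day = start.date()
    cursor = start.hour * 60 + start.minute
    remaining = minutes
    while True:
        window = None if day in bank_holidays else working_week.get(day.weekday())
        if window is not None:
            opens, closes = window
            begin = max(cursor, opens)        # never start before the window opens
            available = closes - begin
            if available > 0 and remaining <= available:
                return datetime.combine(day, time.min) + timedelta(minutes=begin + remaining)
            if available > 0:
                remaining -= available
        day += timedelta(days=1)              # carry the rest over to the next day
        cursor = 0


WEEKDAYS_9_TO_5 = {d: NINE_TO_FIVE for d in range(5)}
WEEKDAYS_24H = {d: FULL_DAY for d in range(5)}
MONDAY = date(2008, 4, 7)
FRIDAY = date(2008, 4, 4)


def at(day, hour):
    return datetime.combine(day, time(hour, 0))


def test_limited_to_same_day():
    assert add_working_minutes(at(MONDAY, 10), 120, WEEKDAYS_9_TO_5) == at(MONDAY, 12)


def test_starting_before_the_start_time():
    assert add_working_minutes(at(MONDAY, 7), 60, WEEKDAYS_9_TO_5) == at(MONDAY, 10)


def test_starting_after_working_period_ends_skips_weekend():
    assert add_working_minutes(at(FRIDAY, 18), 60, WEEKDAYS_9_TO_5) == at(MONDAY, 10)


def test_spanning_a_bank_holiday():
    tuesday = MONDAY + timedelta(days=1)
    result = add_working_minutes(at(FRIDAY, 16), 120, WEEKDAYS_9_TO_5, {MONDAY})
    assert result == at(tuesday, 10)


def test_starting_at_midnight_on_a_24_hour_day():
    tuesday = MONDAY + timedelta(days=1)
    assert add_working_minutes(at(MONDAY, 0), 24 * 60, WEEKDAYS_24H) == at(tuesday, 0)
```

Each test pins down one line from the list above, so when one fails it is clear which situation the calculation gets wrong.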

All this adds up to a very large number of possible situations to test.

In making the changes for this CCP, I started out by creating some tests to find out where the errors in the code were, which calculations were affected, and so on.

In total I ended up creating 155 tests which performed a total of 580 checks.
(All these tests could be performed in around 10 seconds.)

When running these tests against the original code 94 tests failed. A pass rate of just 39%.

The originally reported problem was with using the '24 Hour day' setting in a 'Working week'.
It was assumed that calculations based on days with a start and end time were being performed correctly.
I identified 56 tests (of the 155) which covered SLA calculations using such days.
Of these 56, 36 failed. (A pass rate of just 35%)


Admittedly, many of these tests are edge cases so it is unlikely that end users would see only 39% of calculations being performed correctly. It is concerning, however, that of all the different situations that need to be accounted for, more than 3 in every 5 will be done incorrectly.

As soon as I started adding tests for dates which are affected by bank holidays it became apparent that the code to determine if a date was a bank holiday was also broken.
How broken? Well...
If there was only 1 bank holiday in the system, it would never be found.
If there were 2, only 1 would ever be found. (The date in position 1)
If there were 3, only 1 would ever be found. (The date in position 2)
If there were 4, only 1 would ever be found. (The date in position 2)
If there were 5, only 1 would ever be found. (The date in position 5)
If there were 6, only 3 would ever be found. (The date in positions 1, 3, 5)
If there were 7, only 3 would ever be found. (The date in positions 2, 4, 6)
If there were 8, only 3 would ever be found. (The date in positions 2, 4, 6)
If there were 9, only 3 would ever be found. (The date in positions 2, 4, 6)
If there were 10, only 2 would ever be found. (The date in positions 3, 5)
If there were 11, only 6 would ever be found. (The date in positions 1, 3, 5, 6, 9, 11)
If there were 12, only 6 would ever be found. (The date in positions 1, 3, 5, 6, 9, 11)
If there were 13, only 6 would ever be found. (The date in positions 1, 3, 5, 6, 9, 11)
If there were 14, only 6 would ever be found. (The date in positions 1, 3, 5, 9, 11, 13)
If there were 15, only 7 would ever be found. (The date in positions 2, 4, 6, 8, 10, 12, 14)
If there were 16, only 7 would ever be found. (The date in positions 2, 4, 6, 8, 10, 12, 14)
If there were 17, only 7 would ever be found. (The date in positions 2, 4, 6, 8, 10, 12, 14)
If there were 18, only 5 would ever be found. (The date in positions 3, 5, 9, 11, 13)

(I only checked with up to 18 bank holidays in the system, but there is no way that it would magically start being able to find all dates if there were more.)

Clearly, both the number of records in the system and the position in the list of the record being searched for affected whether it would be found.

Obviously, small numbers of bank holidays in the system are less likely to be an issue, as we ship with 2 years' worth of values in the database, but as the number of records gets bigger, large numbers of records are still being missed.

To get the above results (on the IsBankHoliday function) I created 172 tests (those listed above plus no bank holidays in the system at all) with each test performing 1 check.
All these tests could be performed in around 20 seconds (It takes longer than the above tests because of all the database changes made in setting up and restoring the database.)
Of these tests, originally 103 failed. Only a 40% pass rate.
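
The shape of such a test set can be sketched in Python as follows. This is an illustration only: is_bank_holiday here is a hypothetical in-memory stand-in for the real IsBankHoliday function, the dates are made up, and the real tests set up and restored database records rather than building lists.

```python
from datetime import date, timedelta


def is_bank_holiday(day, holidays):
    """Hypothetical stand-in for the IsBankHoliday function under test."""
    return day in holidays


def build_cases(max_holidays=18):
    """Yield (holidays, day, expected) for every list size and every position."""
    first = date(2008, 1, 1)
    # No bank holidays in the system at all: nothing should be found.
    yield [], first, False
    for count in range(1, max_holidays + 1):
        holidays = [first + timedelta(days=7 * i) for i in range(count)]
        for day in holidays:
            # Every date at every position must be found.
            yield holidays, day, True


cases = list(build_cases())
failures = [(len(holidays), day)
            for holidays, day, expected in cases
            if is_bank_holiday(day, holidays) != expected]

print(len(cases), "checks,", len(failures), "failures")
```

Generating one check per position for each list size from 1 to 18, plus the empty case, gives the same 172 checks, and a failure immediately reports the list size and the date that was missed.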

Of all 327 tests created, 197 failed when using the original code. A pass rate of only 39%.

I didn't avoid manual testing altogether though. I did still use the UI to test that the results reported by the tests matched what was displayed in the program. But I only had to do this at the end once I was confident all the calculations were correct.

So how much did writing these tests help?

Well, the process of sitting down and listing all the things (and combination of things) to test caused me to identify a large number of situations to test.
Writing some of these tests prompted me to think of other situations the code had to account for, and which I might not have originally considered.
When some of the tests failed, it highlighted other situations which should also be tested.
The act of thinking about writing tests helps identify more tests. This leads to more bugs being found, which leads to fewer bugs being shipped.

(Some) opportunity for human error was removed.
Doing lots of SLA calculations manually can be very mentally taxing. As more are done, the opportunity for error increases.
Creating tests meant that the calculations only had to be performed once and the computer could check that what was returned was what was expected.
Manually checking that lots of similar dates and times are the same and performing lots of similar, but non trivial calculations, are tasks in which it is easy to make mistakes.
Time was saved.

How much time was saved?
As I ran all the tests many times, I would say that I easily performed over 1000 tests (I actually think this is a very conservative estimate.)

To perform these tests manually would involve:
To test SLA calculation:
  • Log a call entering all details as needed, including setting the SLA start time.
  • Save the call.
  • Load the form to view the calculated times.
  • Check the times are as expected.

To test bank holiday identification:
  • Set the right number of bank holiday entries in the database (deleting and adding as required)
  • Perform the test (as above) to test the SLA calculation, but over the required bank holiday.

I estimate it would take an average of 1 minute to do each of the above tests.

That adds up to nearly three days of manual testing.

Even if it was only necessary to do half as many tests manually and they could be done twice as fast, it would still take the best part of a day.

Or look at it this way. Let's say someone spotted something in the code when doing a code review. It's just a minor change but we want to be sure that in making the change nothing else has been broken.

There are now 300+ tests to do to make sure the code runs as intended. If you are going to manually test them (at 30 seconds each) it would take at least 2.5 hours.
Or I can run my unit tests and be done in 30 seconds.
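
The comparison can be checked with a few lines of arithmetic. This sketch assumes exactly 300 tests; the variable names are illustrative.

```python
tests = 300
manual_seconds_per_test = 30
automated_run_seconds = 30   # the whole suite, as quoted above

manual_hours = tests * manual_seconds_per_test / 3600
speed_up = tests * manual_seconds_per_test / automated_run_seconds

print(manual_hours)  # 2.5
print(speed_up)      # 300.0
```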

I have an opinion on which I think is best for the speed and quality of product development. Not to mention tester sanity.

What can we learn from this?

Based on previously shipped software, it is not possible (or practical) to check that SLA calculations are performed correctly, in all circumstances, when just using manual testing. (This will inevitably also apply to other parts of the system.)
Performing unit testing is faster than manual testing.
Having unit tests makes regression testing much faster than relying entirely on manual (re)testing.
Unit testing makes it easier to ensure the accuracy of the code written, leading to fewer bugs being included in released software.
In the short term, unit testing does not greatly add to the developer’s workload.
In the long term, unit testing saves a great deal of time. (For developers and testers)
Unit testing is a tool which (when used appropriately) can help us improve the quality of the shipped product and help us with the issues of increasing workloads and software with increasing quantities of existing code.
Obviously, it is not appropriate to write unit tests for all code, and it is not intended to replace manual testing. It is simply an available tool and I think we are making things harder for ourselves by not using it.

I am aware that I have raised the issue of unit testing before.
I do so again to make sure everyone is clear on the benefits of its use.
If it is decided that we still have no desire to incorporate unit testing as part of our development process I will not raise the subject again.

Thoughts?
Comments?

Wednesday, April 02, 2008