Measuring software productivity

This page is published under the terms of the licence summarized in the footnote.


I interviewed Grant Rule FRSA, Managing Director of Software Measurement Services Ltd.

If you want further information, training, benchmarking, estimating or Scope Management services using COSMIC or any other software measurement methods, try Software Measurement Services Ltd.

Preface by Grant Rule

The interview

Why are Function Points no longer considered a proper measure?



Preface by Grant Rule

The discussion below should not be taken as an argument for avoiding a quantitative approach, nor as advocacy for popular but misconceived ideas such as:

1) ‘story points’, which are defined differently by individuals and teams and are therefore inconsistent between individuals, teams and projects over time;

2) the estimation of project cost based only on the effort input, ignoring outputs;

3) the arbitrary classification of projects based on dubious estimates of effort (e.g. ‘small’ projects take less than 10 workdays of effort, but what is the ‘value’ to be delivered?);

4) the subjective but very popular ‘This is like one we did before’ approach, which ignores the human predilection to view the past through rose-tinted glasses.


Metrics need to be carefully selected.

Using just any old metrics can be worse than using no metrics at all.

The metrics you choose can mislead and trick you into making decisions based on false assumptions.

The only safe basis for decisions is a fact & evidence-based approach to problem solving and decision making.

This applies to developers, engineers, managers, executives, politicians, everyone.

You wouldn’t buy a car, a hi-fi or even a washing-machine without a careful examination of the performance measures.

So why commit to much larger IT investments based on subjective opinion and the promise that ‘It will be all right guv - honest!’?

The interview

Q) I’m very suspicious of published software metrics; so I’d like to ask you some questions.

First, surely real projects tend to be less productive than published averages (e.g. I’ve known < 4 fppm). But organisations don’t publish their bad metrics?
A) Agreed.

Only a very limited number of organisations participate in benchmarks and they often include only the ‘better’ of their projects.

And of course, they only ever include projects that run to completion.

The net effect is to suggest a level of performance across the ‘industry’ that is somewhat better than that actually achieved.

Of course, many agilists would claim the reverse… that good, effective projects are too busy delivering value to waste time & effort participating in benchmarks.

On that view, the published performance levels are below those actually achieved.

You pays your money, and you takes your choice.

Recommendation: firms should use their own, local data whenever possible.
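
The selection effect described in this answer can be illustrated with a small simulation. This is only a sketch under stated assumptions: the productivity figures are randomly generated, not real benchmark data, and the rule that only projects above the median get submitted is a hypothetical stand-in for "organisations include only their better projects".

```python
import random
import statistics


def simulate_benchmark(n_projects=10_000, seed=1):
    """Return (true mean, published mean) for a synthetic project population."""
    rng = random.Random(seed)
    # Hypothetical productivity values (function points per person-month).
    population = [rng.lognormvariate(2.0, 0.6) for _ in range(n_projects)]
    # Assumption: only projects better than the median are submitted.
    cutoff = statistics.median(population)
    published = [p for p in population if p > cutoff]
    return statistics.mean(population), statistics.mean(published)


true_mean, published_mean = simulate_benchmark()
print(f"true mean: {true_mean:.1f}  published mean: {published_mean:.1f}")
```

Under this assumption the published mean is always higher than the true mean. The agilists' counter-claim corresponds to the opposite filter (the best projects never submit), which would push the published mean below the true one.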
Q) I hear large projects are less productive than small ones, but there are not enough metrics from very large projects to trust them.
A) Barbara Kitchenham showed this to be a myth (at an ESCOM Conference some years ago).

Some large projects are more productive than small projects.

Actually, there would appear to be diseconomies … and also some economies… of large and small scale.

Performance is actually governed by the end-to-end network of process steps… not by project size nor by individual capabilities. (Ref: Dr. W. Edwards Deming.)
Q) I observe that distributed programming tends to be less productive than local programming.

A) Agreed that programming for a distributed system can be less productive (constrained by more non-functional requirements).
Q) Do estimation methods take this into account?
A) Some estimating methods (COCOMO.II.2000 for instance) account for this and other non-functional challenges.

Q) It also seems to me that server-side (business/data layer) coding is generally less productive than client-side (UI layer) coding. Do estimation methods take this into account?
A) Estimating methods (COCOMO.II.2000 for instance) can be used in such a way as to take this into account.

But it does require intelligent use of the method… and separating components that are built using radically different technologies and processes (and teams?) to be estimated separately.

Pretending a system is monolithic when it is not, obviously hides much detail and makes the results more uncertain (less predictable) than not hiding such detail.
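
The point about estimating components separately can be sketched with a COCOMO-style effort equation of the form effort = a × size^exponent × multiplier. Every number below is an illustrative assumption (the component names, sizes, exponent and multipliers are hypothetical, and this is not a calibrated COCOMO.II.2000 model).

```python
def effort_person_months(size, exponent=1.10, multiplier=1.0, a=2.94):
    """COCOMO-style effort: a * size**exponent * multiplier (illustrative values)."""
    return a * size ** exponent * multiplier


# Hypothetical components: (name, size, effort multiplier).
components = [
    ("ui_layer", 20.0, 0.8),      # assumed easier than nominal
    ("server_layer", 30.0, 1.4),  # assumed harder than nominal
]

# Estimate each component with its own multiplier, then add the results.
separate = sum(effort_person_months(size, multiplier=m) for _, size, m in components)

# Pretend the system is monolithic: one size, one nominal multiplier.
total_size = sum(size for _, size, _ in components)
monolithic = effort_person_months(total_size)

print(f"separate: {separate:.0f} PM  monolithic: {monolithic:.0f} PM")
```

The two figures differ because the monolithic estimate hides the fact that the two layers face different challenges, which is the loss of detail the answer warns about.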

Q) It seems to me OOPLs have been less productive than 3GL procedural languages or SQL, but people don’t like to admit this.
A) This is difficult to say one way or another.

I’ve seen Smalltalk be tremendously productive; I’ve also seen Assembler teams be very productive.

Language is only one issue… and by no means the most important.

Where I suspect OOPLs, or more to the point OOA/D principles, win is during the support & maintenance of long-lived systems.

The ‘whole life’ costs are usually multiples of the initial development cost.
Certainly good design is very important to productivity.

But OO-specific principles are less important than general design principles discussed in Software Architecture papers at
Q) Many metrics were gathered when software was hand coded.

Much is now generated using configuration or inheritance devices.

And this generates hugely more lines of code for the same function.
A) Absolutely agreed.

That’s one reason why Capers Jones has said for many years that ‘anyone using source lines of code measures should be charged with professional misconduct’.

The number of SLOC produced depends on methods, style, language, the use of automated tools, etc.

Source lines of code are not the value the customer wants delivered… they are only a means to an end, not the end (the desired outcome).

So counting SLOC focuses attention on activities, not on results.

That’s why functional size measures are the preferred surrogate of ‘customer value’.
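
A small worked example shows how SLOC-based productivity can mislead. The two deliveries below are hypothetical: both provide the same functional size for the same effort, but one uses a code generator that emits five times as many lines.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    name: str
    functional_size: int  # functional size delivered (e.g. function points)
    sloc: int             # source lines of code produced
    effort_pm: float      # effort in person-months

    @property
    def sloc_per_pm(self) -> float:
        return self.sloc / self.effort_pm

    @property
    def size_per_pm(self) -> float:
        return self.functional_size / self.effort_pm


# Hypothetical figures, chosen only for illustration.
hand_coded = Delivery("hand_coded", functional_size=100, sloc=8_000, effort_pm=10.0)
generated = Delivery("generated", functional_size=100, sloc=40_000, effort_pm=10.0)

for d in (hand_coded, generated):
    print(f"{d.name}: {d.sloc_per_pm:.0f} SLOC/PM, {d.size_per_pm:.0f} size units/PM")
```

Measured in SLOC per person-month, the generated delivery looks five times more productive; measured in functional size per person-month, the two are identical, which matches what the customer actually received.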
Q) For all these reasons, do you agree it is wise to treat published productivity metrics with caution, if not suspicion?
A) With caution, yes. With suspicion… possibly… there is an especial need to consider the source and the alternate agenda of the source (if any).

But firms need to be encouraged to collect, analyse, use, and publish their data because we need software professionals to take a more data-based, fact-based approach to problem solving and decision making.

Often people want public-domain data before they are prepared to collect their own local data, as for some reason this seems to build confidence that measuring process performance is worthwhile.

Currently, the majority of software development is conducted as a craft, with a mediaeval craftsman’s personalised approach to developing one-off, bespoke solutions.

Engineering it ain’t.

As a result, most (75-80%) organisations (and their projects) score one or less on a 5-point scale of effectiveness.

The remaining 20-25% form a long tail to the right on the performance distribution graph, achieving x4, x5 or even better performance over ‘the norm’.

The endemic poor effectiveness & efficiency exhibited by most firms & projects represents a massive unnecessary cost imposed on all customers and citizens.

It’s time we stopped putting up with this.

It’s time for consumers and citizens to insist on improved performance.

If you are interested in achieving such improvement, then do please join the UK Rightshifting Network on LinkedIn.

You can do so by visiting:


Why are Function Points no longer considered a proper measure?

Q) The cognoscenti tell me Function Points (FPs) were superseded (many years ago) by the COSMIC FSM Method. What is wrong with function points?


A) Allan Albrecht invented the concept of FP Analysis in 1977/78.

His concept and desired outcome were excellent; no-one argues otherwise.

But the method has a number of flaws, not least being the fact that it breaks basic measurement rules by adding together categories of different things.

The worst problem is that the scale is non-linear and built from ordered categories.

Unhappily, apparently equal steps on the IFPUG FPA scale do not amount to equal steps in functional size!
The use of the weighting tables means that the size of any one function type has to be one of a small number of 'steps', and the method puts an upper limit on the size of any specific function type.
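
The 'steps' and the upper limit can be seen in a sketch of how an IFPUG External Input might be sized. The band boundaries and weights below are the ones commonly quoted for External Inputs; treat them as an assumption and check them against the current IFPUG Counting Practices Manual before relying on them.

```python
# Assumed weights for an External Input (EI) by complexity category.
EI_WEIGHTS = {"low": 3, "average": 4, "high": 6}

# Assumed complexity matrix: rows are bands of file types referenced (FTRs),
# columns are bands of data element types (DETs).
EI_MATRIX = [
    ["low", "low", "average"],      # 0-1 FTRs
    ["low", "average", "high"],     # 2 FTRs
    ["average", "high", "high"],    # 3 or more FTRs
]


def ei_function_points(ftrs: int, dets: int) -> int:
    """Return the function points for one External Input."""
    ftr_band = 0 if ftrs <= 1 else 1 if ftrs == 2 else 2
    det_band = 0 if dets <= 4 else 1 if dets <= 15 else 2
    return EI_WEIGHTS[EI_MATRIX[ftr_band][det_band]]


for ftrs, dets in [(1, 3), (2, 10), (3, 16), (10, 100)]:
    print(f"FTRs={ftrs:>2} DETs={dets:>3} -> {ei_function_points(ftrs, dets)} FP")
```

Only three sizes are possible (3, 4 or 6), and an input touching 10 files and 100 data elements scores the same 6 points as one touching 3 files and 16 data elements.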


The result is that:



Q) So what is better about COSMIC FSM?


A) As you would expect over 30 years, many practitioners and academics have put a lot of work into removing the flaws from Albrecht's original concept.

Following an international effort by representatives from some 19 countries, the result was the launch, ten years ago in 1998, of the COSMIC Functional Size Measurement method.

This is a product of a design authority, the COmmon Software Measurement International Consortium.


The COSMIC FSM Method adheres to all of Albrecht's basic precepts and objectives and more.

It provides a linear, ratio scale of functional size in which equal divisions on the scale give equal differences in functional size.

Additionally, the COSMIC FSM Method does define a recognisable 'FP', that is one Data Movement.
The smallest conceivable and practical change in functional size is 'one COSMIC FP' i.e. one Data Movement.
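
As a minimal sketch of this idea, the size of a functional process can be obtained by counting its data movements, each of which contributes one COSMIC FP. The four movement types (Entry, Exit, Read, Write) are those defined by the COSMIC method; the 'create order' process and its list of movements are hypothetical.

```python
DATA_MOVEMENT_TYPES = {"Entry", "Exit", "Read", "Write"}


def cosmic_size(data_movements):
    """Return the functional size: one COSMIC FP per data movement."""
    unknown = set(data_movements) - DATA_MOVEMENT_TYPES
    if unknown:
        raise ValueError(f"not a COSMIC data movement type: {sorted(unknown)}")
    return len(data_movements)


# Hypothetical functional process: create a customer order.
create_order = ["Entry", "Read", "Read", "Write", "Exit"]
print(cosmic_size(create_order))

# Adding one more data movement (e.g. an extra error message Exit)
# increases the size by exactly one unit.
print(cosmic_size(create_order + ["Exit"]))
```

Because every data movement adds the same single unit and there is no upper limit per functional process, equal steps on the scale correspond to equal differences in size.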

Furthermore, a key objective for the COSMIC FSM Method was a functional size measurement method that was applicable to a wider range of software-intensive system domains.

This was achieved, and COSMIC is currently being used for business information systems (e.g. in finance, banking, insurance, retail, etc), for real-time embedded systems (e.g. in aerospace, automotive, defence, and combined s/w h/w products of many kinds), for telecommunications systems, for operating systems, etc.


The COSMIC FSM Method has been made available in the public domain.

The Measurement Manual is free and is (or will shortly be) available in 8 or more languages (European, Arabic, Chinese, Japanese, and Turkish).

The method is standardised as ISO/IEC 19761 and is recognised as a National Standard in Spain and Japan.
The British Computer Society in July 2006 recognised the COSMIC FSM Method as a 'Technology Award Medallist' in the 'Services' category.


For anyone starting to use functional size measurement, or wanting to get to grips with estimating and/or Outcome-Based Contract Management, I strongly recommend the COSMIC FSM Method.

You can even participate, for free, in the COSMIC/ISBSG Benchmark ( see: ).
Only if your organisation already has a significant investment in old IFPUG FPA data could I recommend the use of IFPUG FPA.

And even then I suggest you consider the true value of old data from old projects and its relevance to future work.
The University of Montreal in Canada provides resources for COSMIC Size Users at .
There is a COSMIC Size Users Group on LinkedIn (join at: ).



Footnote: Creative Commons Attribution-No Derivative Works Licence 2.0

Attribution: You may copy, distribute and display this copyrighted work only if you clearly credit “Avancier Limited:” before the start and include this footnote at the end.


No Derivative Works: You may copy, distribute, display only complete and verbatim copies of this page, not derivative works based upon it.

For more information about the licence, see