Featured Article I
Saving the World One Project at a Time: Planning by the Numbers
By: Jim Mayes (IT Metrics Strategies, Cutter, December 2000)
"Your history will help you better manage the future." -- Michael Mah [1]
Recently, as I
described a software project risk assessment that I had done, Michael Mah
remarked rather facetiously: "Jim, you're just saving the world, one
project at a time!" I may not be "Saving the world before bedtime!" as
the "Powerpuff Girls" do, but I do believe that project level metrics,
when applied to project planning, can have a significant impact on the
outcome of individual software projects. "Planning by the Numbers" (PBN)
actually illustrates how the principles outlined in Michael Mah's ITMS
article "Sizing Up Your Promises and Expectations" [1] can be put into
practice.
PBN is the concept of using historical project level metrics to aid the
project manager (PM) in discovering facts about software projects. While it
is not as simple or cookbook as "painting by the numbers," the PM, with
support from a Metrics Specialist, can use this information for project
planning, validation, and risk assessment. In many ways I believe that the
Metrics (or Estimation) Specialist performs the role of "Chief Memory
Officer"; that is, someone who remembers the past so that we can avoid
the same mistakes in the future. [1] As a software development manager, I
became interested in software metrics as a source of decision support
information for my projects. Software metrics provide many benefits for
such things as managing outsourcing contracts, process improvement,
productivity benchmarking, and balanced scorecards; however, you would be
missing one of the greatest benefits if historical software metrics were
not also used for making decisions and planning projects.
The analysis of project level data has taught me some lessons over the
years, which I will share. All of this analysis was done using QSM’s SLIM™
tools. The primary components of the PBN process are to:
1. Collect, slice and dice internal historical project data
2. Normalize the software project data for comparison
3. Trend the data
4. Analyze the data to discover facts
5. Use the analysis results for planning software projects
Collect, Slice, and Dice the Data
To begin with, data must be collected for internal historical projects.
This can be done via a productivity benchmark [2, 3] or by less formal
methods; however, the data must include the SEI CMM core measures of size,
effort, schedule, and defects [1]. Data consistency is the key to having a
good decision support database. In order to ensure consistency, document
the rules in a data dictionary. Also, when building an internal historical
project repository, the project data points should be categorized. Allow
for enough stratification (by project type, environment, user
organization, development organization, language, etc.) that apples to
apples project comparisons can be made when needed.
Next, the Metrics Specialist needs a "slicer and dicer," with the
capability to extract and analyze the data statistically. This can be
accomplished using software metrics tools such as QSM’s SLIM-Metrics™ and
DataManager™. Specific project data can be compared to the entire
database, specific subsets of data, or both. Ideally, if you have access
to industry data (QSM, IFPUG, etc.), benchmarking specific project data
against industry data is also beneficial as an additional decision point.
However, the benefit of having internal historical project data cannot be
overemphasized.
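As a rough illustration of a categorized repository, the sketch below stores the four core measures alongside a few category fields and pulls out an "apples to apples" subset. The record layout, the field names, and the three data points are hypothetical; they are not taken from the SLIM tools or from any real repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectRecord:
    """One historical project: the four core measures plus category fields."""
    name: str
    size_esloc: int              # size
    effort_staff_months: float   # effort
    schedule_months: float       # schedule
    defects: int                 # defects
    project_type: str            # category fields used for stratification
    dev_org: str
    language: str


def select(projects, **criteria):
    """Return the projects whose category fields match every criterion."""
    return [
        p for p in projects
        if all(getattr(p, field) == value for field, value in criteria.items())
    ]


# Hypothetical data points, for illustration only.
repository = [
    ProjectRecord("P1", 44_000, 95.0, 7.0, 40, "enhancement", "Team A", "COBOL"),
    ProjectRecord("P2", 6_000, 47.0, 5.2, 25, "enhancement", "Team B", "COBOL"),
    ProjectRecord("P3", 120_000, 300.0, 14.0, 90, "new development", "Team A", "C++"),
]

# Compare against a specific subset, or against the entire database.
similar = select(repository, project_type="enhancement", dev_org="Team A")
everything = select(repository)
print([p.name for p in similar], len(everything))
```

Keeping the category values consistent (for example, always "enhancement" and never "Enhancement" or "enh.") is exactly the kind of rule that belongs in the data dictionary.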
Normalize the Data
One of my
metrics discoveries was the importance of normalizing project size
data for statistical trending. This is not to argue the merits of Source
Lines of Code (SLOC) versus Function Points (FP), since I use both
measures. However, my objective is to describe how I have been using
these measures, continuously improving the results as I refined my
techniques by using discovery metrics. [7] There is no perfect software
sizing measure; therefore, I use the best aspects of both the SLOC and FP
measures, which in my opinion achieves better results, as long as you use
consistent counting methods. I primarily use FP's for sizing project
estimates. Because FP's reflect the user view of a software project, a FP
count can be accurately determined based upon the software requirements.
As the project requirements change, the FP count can be updated to reflect
changes in project size.
I first tried using Function Points as the size component for
statistically trending historical project data. For this purpose, however,
I discovered that there is not a strong convergence of data unless the
project size values are normalized based upon programming languages and
code mix. Therefore, I convert FP's to SLOC. This can be done using
industry factors such as those provided by SPR (Capers Jones) or as
illustrated in the September 2000 issue of ITMS [1], but preferably you
should use SLOC/FP factors that have been internally calibrated. I have
also found that it is important to normalize project size in relation to
the effort required to produce the code (new, modified, reused, tested,
etc.) when determining the productivity rate to use for a project
estimate.
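The conversion from FP's to SLOC for a mixed-language project can be sketched as a weighted sum. This is a minimal illustration only: the language names and SLOC/FP factors below are placeholders, not SPR or ITMS figures, and in practice they would be replaced with internally calibrated values.

```python
def fp_to_sloc(function_points, language_mix, sloc_per_fp):
    """Convert a Function Point count to SLOC.

    language_mix maps each language to its share of the code (shares sum to 1).
    sloc_per_fp maps each language to its SLOC/FP conversion factor.
    """
    total_share = sum(language_mix.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("language mix shares must sum to 1.0")
    return sum(
        function_points * share * sloc_per_fp[language]
        for language, share in language_mix.items()
    )


# Placeholder factors and mix, for illustration only.
factors = {"LanguageA": 100.0, "LanguageB": 50.0}
mix = {"LanguageA": 0.6, "LanguageB": 0.4}

sloc = fp_to_sloc(1_000, mix, factors)
print(f"{sloc:,.0f} SLOC")
```

Because the FP count is updated as requirements change, rerunning the same conversion keeps the SLOC figure used for trending in step with the current scope.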
In the early 1980's, Robert Tausworthe of the Jet Propulsion Laboratory of
the California Institute of Technology determined a relationship between
the effort to develop new code and the effort to modify code. Basically,
the theory suggests that if the effort to develop a line of new code is
taken as unity, then the effort to modify a line of existing code is some
fractional value. The values that Tausworthe found on a number of JPL
rehosting contracts are shown in Table 1. [4, 5] I do not always use the
full range of Tausworthe factors; most of the time I simplify to factors
of 1.0 for new/conversion code and .24 for modified/deleted code (.24 is
the average of all of the effort ratios except for "New Code"). Also, if
the project involves a substantial amount of regression testing, as is the
case with COTS (Commercial-Off-The-Shelf) software customization and
integration projects, then it is important to apply the "Tested" ratio of
.12 as well, to all of the unmodified code.
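The simplified factors can be checked, and an effective size (ESLOC) computed, with a few lines of code. This is a sketch under simplifying assumptions: it applies the ratios directly to line counts, the function name `esloc` is my own, and the example line counts are hypothetical.

```python
from statistics import mean

# Effort ratios from Table 1 (effort to modify relative to effort to code as new).
TAUSWORTHE_RATIOS = {
    "new": 1.0,
    "reused": 0.27,
    "added": 0.53,
    "changed": 0.24,
    "deleted": 0.15,
    "removed": 0.11,
    "tested": 0.12,
}

# Simplified factor: the average of every ratio except "new", to two places.
MODIFIED_RATIO = round(
    mean([ratio for kind, ratio in TAUSWORTHE_RATIOS.items() if kind != "new"]), 2
)


def esloc(new_sloc, modified_sloc, tested_sloc=0):
    """Effective SLOC using the simplified factors.

    new_sloc      -- new or converted lines (factor 1.0)
    modified_sloc -- modified or deleted lines (simplified factor)
    tested_sloc   -- unmodified lines that still need regression testing
    """
    return (
        new_sloc * TAUSWORTHE_RATIOS["new"]
        + modified_sloc * MODIFIED_RATIO
        + tested_sloc * TAUSWORTHE_RATIOS["tested"]
    )


print(MODIFIED_RATIO)
# Hypothetical COTS customization: little new code, a large tested base.
print(round(esloc(10_000, 20_000, tested_sloc=100_000)))
```

In the hypothetical example the regression-tested base contributes more effective size than the modified code does, which is why the "Tested" ratio matters for COTS integration work.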
I apply the
Tausworthe factors to the SLOC/FP language conversion factors and not to
the Function Points. This allows a normalized ESLOC comparison for
trending a wider range of projects (new development, enhancement, and COTS
integration) as shown in the scatter charts later in this article. It
also allows for additional analysis using Function Points, which have not
been normalized, for analyzing the cost of the functionality delivered.
Based upon my experience, this is where I believe the distinction lies
between using FP's versus ESLOC. FP's provide a better measure of the
functionality delivered, and the related cost of that functionality for
metrics such as Cost per FP. ESLOC provide a better productivity view to
be used for estimation, risk analysis, and project data trending. Others
will have their own views on this matter based upon their own experience,
just as mine are based upon my experience. Decide what works best for you,
and make sure that you can substantiate it using your own discovery
metrics.
This table shows six ways in which existing code can be modified. For each
type of modification, the ratio of the effort-to-modify to the
effort-to-code-as-new is given. Note: The numbers of lines of code added,
changed, and deleted are subsets of the number of lines reused. Therefore,
their sum must be less than or equal to the number of reused lines. [5]

| Type of Modification | Effort Ratio |
| New Code: subject to the entire development process. | 1.0 |
| Reused: the lines of code in modules that will be reused, but will be modified by additions, changes, and deletions. | .27 |
| Added: the lines of code to be added to reused modules. | .53 |
| Changed: the lines of code in the reused modules to be changed. This effort is typically less than the effort to add lines. | .24 |
| Deleted: the lines of code to be deleted (line by line) from reused modules. | .15 |
| Removed: the lines of code to be removed in modules or programs as whole entities. Testing must take place to check reused modules that interface with the removed modules. | .11 |
| Tested: the lines of code in reused modules that required no modifications but still exist and require testing with new and modified software. | .12 |

Table 1: Software Project Size Normalization Effort Ratios
Trend the Data
For longer than I care to admit, I used averages for productivity
calculations related to software project estimates. Then, as I began
benchmarking my estimates against project data trended for size, I noticed
a great disparity between the project productivity averages and the
project size trends.
For example: As shown in Figure 1, I selected data for 19 projects from my
repository that were similar to the project that I was estimating. (NOTE:
The charts and values used in this article are for illustrative purposes
only, and are not related to any specific data set; however, they are
based upon actual findings.) The average productivity rate for these 19
projects, using the QSM Productivity Index (PI) scale, was 12.9. When I
plotted my project estimate on a productivity graph trended for size, I
noticed that the average productivity was 33% lower than it should have
been for a project of that size. The three projects (A, B, and C)
highlighted in Figure 1, which were of similar size and type and were
completed by the same development team, had an average PI of 19.1.
Therefore, I used the trended PI of 19.1 for my estimate instead of 12.9.
It was still an average, but it was a trended average.
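The difference between a flat average and a value trended for size can be illustrated with an ordinary least-squares line fitted against the logarithm of size. This is only a sketch of the general idea: it is not the trending method used by the SLIM tools, and the three data points are hypothetical.

```python
import math


def fit_log_trend(sizes, values):
    """Least-squares fit of value = intercept + slope * log10(size)."""
    xs = [math.log10(size) for size in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def trended_value(size, intercept, slope):
    """Value read off the trend line at the given size."""
    return intercept + slope * math.log10(size)


# Hypothetical (size in ESLOC, productivity index) pairs.
sizes = [1_000, 10_000, 100_000]
pis = [10.0, 14.0, 18.0]

intercept, slope = fit_log_trend(sizes, pis)
flat_average = sum(pis) / len(pis)
trended = trended_value(100_000, intercept, slope)
print(f"flat average: {flat_average:.1f}, trended at 100K ESLOC: {trended:.1f}")
```

For a project at the large end of this hypothetical data set, the flat average understates the productivity that the trend suggests, which is the same kind of gap described above.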
There are many
reasons why project size affects productivity. One reason is that there is
a smaller percentage of effort allocated to overhead for larger projects
than there is for smaller projects. Larger projects also usually take
longer, which is also a factor, due to the effort/schedule ratio.
However, there is a limit that is reached where bigger projects are no
longer better, and there is an optimal effort/schedule ratio, as we learn
in the following section.

Figure 1: Using Trended Data Relative
to Project Size
Analyze to Discover Facts
Data analysis can be very enlightening and exciting as we discover new
information. We can discover the answers to such questions as: How can we
get more "bang for the buck" from our software projects? How do we
increase business value? How do we staff our projects relative to
schedule, in order to optimize the business value? With regard to the
first two questions, one of the ways to do this is to increase
productivity. Therefore, we may want to discover the effect that project
size has on productivity. This can be analyzed and illustrated as shown in
Figure 2.
The trend
shown in Figure 2 illustrates the "Bell Shaped Curve" that I discovered in
relation to project size versus productivity. The data is on a log-log
scale, which causes the trend lines to appear linear. However, you will
notice that the data points form a curve represented by the dashed line
that was added to Figure 2 to reflect this tendency. As the size
increases, the productivity increases; however, it reaches a size where
productivity starts declining. Specifically, prior to the 30K ESLOC mark,
the data points consistently trended upward, then they leveled off. After
the 150K ESLOC mark the data points displayed an overall downward
tendency. With this analysis, I discovered that the projects sized
between 30K and 150K ESLOC provided the best "bang for the buck" and
business value. This size range also has the highest convergence of data
points within +/-1 SD.
Another fact
that we may want to discover is the optimal relationship between effort
and schedule associated with the projects in the optimum size range.
Using the Putnam Manpower Buildup Index (MBI) scale, where MBI = Total
Effort / (Development Time)^3 [5, 6], Figure 3 illustrates how we
can analyze the appropriate ratio of effort to schedule. The MBI is a
measure of schedule compression where projects with lower MBI's are less
compressed than projects with higher MBI's, and less compression equates
to a lower cost and higher quality. In this case, using internal project
data, the optimum MBI is in the 4 to 6 range. The MBI also seems to
converge more for the projects in the optimum size range.
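Because development time is cubed in the ratio behind the MBI, schedule compression has a steep effect. The sketch below computes only the raw ratio of effort to time cubed; the units and the mapping from this ratio to the MBI index values quoted in this article are not given here, so the numbers are meaningful only as a relative comparison. The function name is my own.

```python
def buildup_ratio(total_effort, development_time):
    """Raw ratio: total effort divided by development time cubed."""
    if development_time <= 0:
        raise ValueError("development time must be positive")
    return total_effort / development_time ** 3


# Same effort, with the schedule cut in half (hypothetical plan).
baseline = buildup_ratio(95.0, 7.0)
compressed = buildup_ratio(95.0, 3.5)
print(f"compression multiplies the ratio by {compressed / baseline:.1f}")
```

Halving the schedule for the same effort multiplies the ratio by eight, which is why a compressed plan stands out so clearly on a chart like Figure 3.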

Figure 2: The Bell Shaped Curve
Related to Project Size and Productivity

Figure 3: Analyzing Optimal
Effort/Schedule Ratio Using Putnam MBI Scale
Use the Results for Planning
The primary
benefit of PBN is for project planning. We can put to use the information
that we learned during data analysis for developing software release
strategies, determining whether estimates are reasonable, and for risk
assessment. The PBN process supports the use of trended data for this
purpose. When I prepare a project estimate for an internal project or
validate a vendor estimate, I want to see how it compares to the
historical project data trends. For example: I had a client tell me that
my estimate seemed high for a 250K ESLOC project; however, when I plotted
the data point on a trend chart it was within the ballpark for a project
of this size. In fact, the cost was right in the middle of the trend.
Therefore, the client was much more inclined to accept my estimate. This
concept would also apply to a vendor estimate that was being validated in
this manner; such that if the vendor cost was in the high range on the
chart, it could be considered unreasonable. Estimate validations and risk
assessments are done in much the same way. The following examples
illustrate how the PBN concepts are used for project planning;
specifically, Example 1 illustrates planning software releases, and
Example 2 illustrates estimate validations and risk assessments:
Example 1 - Project Planning: Strategy Analysis
Figures 4 and 5 compare monthly versus bi-monthly release strategies for
an internal client. These illustrations were used to encourage the client
to consider alternatives that would provide greater business value. Table
2 documents the results of this analysis, which indicate that, for the
same yearly cost (effort) and a 35% longer schedule per release, the
yearly code output can be increased by 267% and the quality by 75%, just
by changing from monthly releases to bi-monthly releases. The quality
improvement is significant since it reduces the resources required for
error correction between releases. Notice that the bi-monthly release
strategy uses a project size within the optimal size range illustrated in
Figure 2, and that the MBI is in the optimal MBI range illustrated in
Figure 3.
Release Strategy Analysis

Assumptions - Monthly Releases (Figure 4)
* Monthly releases are based on an average of historical small releases;
i.e., 5.2 months duration, 6,000 ESLOC, 6.7 MBI, 47 Staff Months (SM) of
effort, and a Mean Time to Defect (MTTD) of .63 days, including the
Planning, Analysis, and Main Build Phases.
* This illustrates 12 monthly releases that complete during a calendar
year.

Assumptions - Bi-Monthly Releases (Figure 5)
* Bi-monthly releases are based on an average of historical larger
releases; i.e., 7 months duration, 44,000 ESLOC, 5.4 MBI, 95 Staff Months
(SM) of effort, and a Mean Time to Defect (MTTD) of 1.1 days, including
the Planning, Analysis, and Main Build Phases.
* This illustrates 6 bi-monthly releases during a calendar year.

Conclusion
Six bi-monthly releases during a calendar year could produce 267% more
code than 12 monthly releases, for the same yearly effort/cost, with a 35%
longer schedule per release and 75% better quality:
* Twelve monthly releases produce 72,000 ESLOC and require 564 SM of
effort.
* Six bi-monthly releases produce 264,000 ESLOC and require 564 SM of
effort.
* Individual bi-monthly releases are 7 months in duration each, versus 5.2
months each for monthly releases.
* MTTD (at deployment) for each bi-monthly release is one defect every 1.1
days, versus one defect every .63 days (1.6 defects per day) for each
monthly release.

Table 2: Release Strategy Analysis (Figures 4 and 5)
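The percentages in Table 2 can be reproduced from the per-release averages. The sketch below uses only the figures stated in the table; the variable and function names are my own. Multiplying the rounded per-release efforts gives 564 SM for twelve monthly releases and 570 SM for six bi-monthly releases, about 1% apart, which the table treats as the same yearly effort.

```python
# Per-release averages taken from Table 2.
monthly = {"releases": 12, "esloc": 6_000, "effort_sm": 47,
           "duration_months": 5.2, "mttd_days": 0.63}
bimonthly = {"releases": 6, "esloc": 44_000, "effort_sm": 95,
             "duration_months": 7.0, "mttd_days": 1.1}


def yearly_total(strategy, measure):
    """Per-release value multiplied by the number of releases in a year."""
    return strategy["releases"] * strategy[measure]


def percent_increase(new, old):
    """Percentage by which new exceeds old."""
    return (new - old) / old * 100


code_gain = percent_increase(yearly_total(bimonthly, "esloc"),
                             yearly_total(monthly, "esloc"))
schedule_gain = percent_increase(bimonthly["duration_months"],
                                 monthly["duration_months"])
quality_gain = percent_increase(bimonthly["mttd_days"], monthly["mttd_days"])

print(f"yearly code output: +{code_gain:.0f}%")
print(f"schedule per release: +{schedule_gain:.0f}%")
print(f"MTTD (quality): +{quality_gain:.0f}%")
print("yearly effort (SM):",
      yearly_total(monthly, "effort_sm"), yearly_total(bimonthly, "effort_sm"))
```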

Figure 4: Release Strategy Analysis -
Monthly Releases

Figure 5: Release
Strategy Analysis - Bi-monthly Releases
Example 2 -
Project Planning: Estimate Validation and Risk Assessment
This example
is divided into two parts, illustrating how estimate validations and risk
assessments were performed for the SRO and GPA projects. But first, it is
important to understand the concepts of validation and standard
deviation. Validation is the concept of assessing the reasonableness or
risk of an estimate, using trended historical project data. The same
concept would apply when validating an internal estimate or a vendor
estimate. [8] With both validation and risk assessment we use the
statistical trending measure called standard deviation (SD), which is the
measure of variation equal to the square root of the variance. Figure 6
shows the percentage of projects that would be within the SD trend lines;
that is, +2% (+3 SD), +14% (+2 SD), +34% (+1 SD), -34% (-1 SD), -14% (-2
SD), and -2% (-3 SD). Therefore, roughly 68% of projects are within +/-1
SD, 96% are within +/-2 SD, and 100% are within +/-3 SD. Average values
fall on the "mean" or center trend line. The values outside +/-3 SD
would normally fall within what Putnam and Myers call the "Impossible
Zone" [5, 6].
When estimates are plotted for validation and risk assessment as shown in
Figure 6, an estimate within the +3 SD band would have a value higher than
98% of the projects being compared. Likewise, an estimate within the -3 SD
band would have a value lower than 98% of the historical projects; +2 SD
would be higher than 84%; -2 SD would be lower than 84%; +1 SD would be
higher than 50%; and -1 SD would be lower than 50%. These values are
usually best compared on a scatter chart with a logarithmic size (x) axis
and a linear (y) axis, depending on the data set size and value range.
Project estimates within the +/-3 SD range could be considered extremely
high or low risk, depending on the values being compared. Likewise,
estimates that fall within +/-2 SD could be considered high risk.
Estimates falling within +/-1 SD would stand the highest probability of
success, depending on how close to the mean line they fall. Another factor
is how the estimate compares to projects of the same size, type, and
development team, plotted on the trend chart.
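The band logic described above can be sketched as a small classifier. It assumes that the trend mean and the standard deviation at the estimate's size are already known (for example, read from the trend chart); the function names, labels, and example values are my own illustration rather than any tool's output.

```python
import math

# Share of projects in each band on one side of the mean, per the text:
# 34% in the 1 SD band, 14% in the 2 SD band, 2% in the 3 SD band.
BAND_SHARE = {1: 34, 2: 14, 3: 2}


def sd_band(value, trend_mean, sd):
    """Signed band (+1, +2, +3, ...) that the value falls in."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (value - trend_mean) / sd
    band = max(1, math.ceil(abs(z)))
    return band if z >= 0 else -band


def share_exceeded(band):
    """Percent of projects the estimate is higher than (or lower than, for
    negative bands): 50 for the 1 SD band, 84 for 2 SD, 98 for 3 SD."""
    size = abs(band)
    if size > 3:
        return 100
    return 50 + sum(BAND_SHARE[k] for k in range(1, size))


def risk_label(band):
    """Risk wording that follows the text above."""
    size = abs(band)
    if size == 1:
        return "within +/-1 SD: highest probability of success"
    if size == 2:
        return "high risk"
    if size == 3:
        return "extremely high or low"
    return "impossible zone"


# Hypothetical estimate: required value 18.5 against a trend mean of 15.0, SD 2.0.
band = sd_band(18.5, 15.0, 2.0)
print(band, share_exceeded(band), risk_label(band))
```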
SRO Project
Validation and Risk Assessment
Figures 6, 7,
and 8 illustrate the risk assessment of an estimate based upon
customer-constrained requirements and schedule. This was done to validate
concerns of the development team, determine the project's viability, and
illustrate the risk to the customer. The data points and trend lines
shown in Figures 6, 7, and 8 are based upon projects similar to the SRO
project. Also, three projects (D, E, and F) of similar size and type are
highlighted, which were previously completed by the development team
assigned to the SRO project. The data analysis indicated that the
productivity required to complete the project, as constrained by the
customer, would be 39% higher than achieved on the 3 similar projects
completed by the development team. Also, Figure 6 shows that the estimate
falls within the +2 SD range; i.e., the project must achieve a production
rate higher than that of 84% of all similar projects.
Figure 7 shows
that the Main Build Phase (Detailed Design, Construction, and Testing)
must be completed faster than 84% (-2 SD) of all similar projects, and 81%
faster than the three highlighted projects completed by the development
team. This is also confirmed in Figure 8, which illustrates that the SRO
project estimate MBI of 9 is higher than 84% (+2 SD) of all similar
projects and 34% higher than the highlighted projects. This indicates that
the project has an extremely compressed effort/schedule ratio. All of the
analysis results illustrate that the project is high risk; however, as the
saying goes, "The greatest risk, is not taking one." In this case, the
project had to be done. The difference is that with the results of this
analysis, the customer and the PM were now in agreement as to the risk,
and could plan strategies for mitigating and managing the risk.
If you or the
customer choose to accept the challenge of a high risk project, it is best
to know what you're in for, so that you don't wind up in the "impossible
zone". If at all possible, it is best to plan projects such that the
project values fall within +/-1 SD, in order to allow some room for the
unexpected. It is the same concept as not planning projects with a lot of
overtime already factored into the effort when you have a limited staff.
This would not allow any overtime to catch up, or handle tasks that were
inadvertently omitted from the project plan. It is also important to
leave some room for scope changes, as more is known about the project.

Figure 6: SRO
Project Estimate Productivity Comparison to Similar Projects
Figure 7: SRO Project Estimate Main Build Duration
Comparison to Similar Projects

Figure 8: SRO
Project Estimate MBI Comparison to Similar Projects
GPA Project
Validation and Risk Assessment
Figures 9 and
10 illustrate the validation of a vendor estimate. The GPA project
validation estimate was created using a PI for the MB Phase based upon the
three projects highlighted in Figure 1, of similar size and type that
were completed by the same development team. The MB phase was considered
reasonable. However, the vendor estimate had a Planning Phase duration of
3.5 months and an Analysis (detailed requirements and high-level design)
Phase duration of 3.5 months, running consecutively with no overlap of the
MB phase. This was actually done because the customer wanted Planning and
Analysis completed within 3.5 months. This project was 8154 FP's or 430K
ESLOC. My historical project data showed that projects of similar size,
type, and development team, normally required a Planning duration of 26%
of the time required to complete MB, and an Analysis duration of 48% of
the time required for MB.
Figure 9
illustrates that the GPA vendor estimate Planning phase duration is faster
than 84% of all similar projects and twice as fast (3.5 months versus 7
months) as the GPA validation estimate, based on the development team's
historical project data. Figure 10 illustrates the same trend; that is,
the GPA vendor estimate Analysis phase duration is faster than 84% of all
similar projects and twice as fast (4 months versus 8 months) as the GPA
Validation Estimate. Also, 75% of the Planning phase normally overlaps the
Analysis phase and 68% of the Analysis phase normally overlaps the MB
phase. Planning the project with Planning and Analysis phases of only 3.5
months with no overlap of MB, would indicate a High Risk for completion
within a 3.5-month time frame.
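The historical phase ratios can be turned into a rough phase plan once a Main Build (MB) duration is known. This sketch rests on two assumptions of mine: that "75% of the Planning phase overlaps the Analysis phase" means Analysis starts when 25% of Planning has elapsed, and likewise that MB starts when 32% of Analysis has elapsed. The 25-month MB duration is hypothetical and is not the GPA project's figure.

```python
PLANNING_SHARE_OF_MB = 0.26            # Planning duration as a share of MB
ANALYSIS_SHARE_OF_MB = 0.48            # Analysis duration as a share of MB
PLANNING_OVERLAP_WITH_ANALYSIS = 0.75  # share of Planning that overlaps Analysis
ANALYSIS_OVERLAP_WITH_MB = 0.68        # share of Analysis that overlaps MB


def phase_plan(mb_months):
    """Phase durations and start offsets (in months) derived from MB duration."""
    planning = PLANNING_SHARE_OF_MB * mb_months
    analysis = ANALYSIS_SHARE_OF_MB * mb_months
    # Assumption: each phase starts once the non-overlapping part of the
    # previous phase has elapsed.
    analysis_start = planning * (1 - PLANNING_OVERLAP_WITH_ANALYSIS)
    mb_start = analysis_start + analysis * (1 - ANALYSIS_OVERLAP_WITH_MB)
    return {
        "planning_months": planning,
        "analysis_months": analysis,
        "analysis_start": analysis_start,
        "mb_start": mb_start,
        "project_end": mb_start + mb_months,
    }


plan = phase_plan(25.0)  # hypothetical MB duration
for key, months in plan.items():
    print(f"{key}: {months:.2f}")
```

Under these assumptions detailed design can start well before Planning and Analysis are complete, even though each of those phases is given its full historical duration.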
This was
discussed with the customer and vendor. It was determined that the
Planning and Analysis Phases would be planned based upon historical
project data. The PM developed a risk mitigation strategy such that
detailed design could still begin within 3.5 months, which was the primary
concern and the reason that the Planning and Analysis phases were being
cut short in the first place. This was done by using an iterative
development methodology; that is, detailed design could begin on each
module as soon as the detailed requirements (Analysis) were completed for
that module.

Figure 9: GPA Project Planning Phase Validation and Risk
Analysis

Figure 10: GPA Project Analysis Phase Validation and Risk
Analysis
Conclusion
The one thing
that "planning by the numbers" and "painting by the numbers" do have in
common is that you must have the numbers in order to start. It is
extremely important to collect internal historical project data, which is
a critical success factor for the PBN process. Using the premise of
discovery metrics, where we discover facts about our world and use those
discoveries, we are better prepared to deliver successful software
projects. [7]
At a
conference several years ago, I heard a parable that is particularly
appropriate. It goes like this: There were two hunters who chartered a
pontoon plane to take them moose hunting near a lake in Alaska. The pilot
told them as he dropped them off: "I'll pick you up in one week, but you
can only bring one moose on the plane, due to the weight limit." A week
later, the hunters were waiting and had two moose. The pilot exclaimed:
"I specifically told you that you could only bring one moose on the
plane!" To which the hunters replied: "We know, but the pilot last year
let us bring two moose." Not to be outdone, the pilot reluctantly agreed;
so they stuffed the two moose aboard. The plane took off, barely clearing
the trees on the other side of the lake, and then "BAM", they hit the
middle of a mountain. It was a mess, but luckily everyone survived. As
the hunters pulled themselves up off of the ground, they looked at each
other and said: "Well, at least we made it further up the mountain than we
did last year!" The moral of this parable is to stop running into
mountains with your software projects. By learning from the past, you too
can "save the world one project at a time."
References
[1] Michael Mah, "Sizing Up Your Promises and Expectations," IT Metrics
Strategies, Cutter Information Corp., September 2000.
[2] Michael Mah, "IT Organization, Benchmark Thyself," IT Metrics
Strategies, Cutter Information Corp., March 2000.
[3] Michael Mah, "IT Organization, Benchmark Thyself [Part 2]," IT Metrics
Strategies, Cutter Information Corp., April 2000.
[4] Robert C. Tausworthe, "The Work Breakdown Structure in Software
Project Management," The Journal of Systems and Software, 1980.
[5] Lawrence H. Putnam and Ware Myers, "Measures for Excellence," Yourdon
Press, 1992.
[6] Lawrence H. Putnam and Ware Myers, "Industrial Strength Software,"
IEEE Computer Society Press, 1997.
[7] Jim Mayes, "Achieving Business Objectives II: Building a Software
Metrics Support Structure," IT Metrics Strategies, Cutter Information
Corp., June 2000.
[8] Jim Mayes, "Achieving Business Objectives: Balancing Time, Cost and
Quality," IT Metrics Strategies, Cutter Information Corp., March 2000.
Mayes Consulting, LLC
Quantitative Software Engineering
Copyright (c) 2001 Mayes Consulting
1605 Kinsmon Lane Marietta, GA 30062 770-649-8599 or 404-754-2707
jimmayes@bellsouth.net