In agile software development you want to travel as light as
possible, and the easiest way to do that is to choose
the best artifact to record information. I use the term
"artifact" to refer to any model, document, source code,
plan, and so on created during a software development
project. Furthermore, you want to record information as
few times as possible, ideally only once. For example,
if you describe a business rule in a
use case, then describe it in detail in a
business rule specification, then implement it in
code, you have three versions of the same business rule
to maintain. It would be far better to record the
business rule once, ideally as human-readable but
implementable code, and then reference it from any other
artifact as appropriate.
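As a minimal sketch of what "human-readable but implementable" might look like, the Python below records a business rule in exactly one place. The discount rule, its thresholds, the `BR-1` identifier, and every name in the code are assumptions invented for illustration; they are not part of the example above. Other artifacts would reference `qualifies_for_discount` by name rather than restate it.

```python
from dataclasses import dataclass

# Hypothetical rule, invented for illustration: orders of 100.00 or more
# from customers with at least 2 years of tenure earn a discount.
MIN_ORDER_TOTAL = 100.00
MIN_TENURE_YEARS = 2


@dataclass(frozen=True)
class Order:
    total: float
    customer_tenure_years: int


def qualifies_for_discount(order: Order) -> bool:
    """BR-1 (hypothetical id): an order qualifies for a discount when its
    total is at least MIN_ORDER_TOTAL and the customer has been with us
    for at least MIN_TENURE_YEARS years."""
    return (
        order.total >= MIN_ORDER_TOTAL
        and order.customer_tenure_years >= MIN_TENURE_YEARS
    )
```

A use case or test plan would then cite the rule by its identifier or function name, so a change to the thresholds is made once.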
Why do you want to record a concept once? Three
reasons:
- Reduce your maintenance burden. The
more representations that you maintain the greater the
maintenance burden because you'll want to keep the
versions in sync with one another.
- Reduce your traceability burden. The more
copies the greater your traceability needs will be
because you'll need to relate each version to its
alternate representations, otherwise you'll never be
able to keep them synchronized when a change does occur.
Yes, AM advises you to update only when it hurts but
the more copies you have of something the more likely it
is that it will start to hurt earlier.
- Increase consistency. The more
copies you have the greater the chance you will have
inconsistent information because you very likely won't
be able to keep the versions synchronized.
Figure 1 depicts a typical
approach to traditional software development
artifacts. The letters in each artifact
represent a piece of information stored within it.
For example, information A (perhaps our business
rule mentioned above) is captured in the
requirements document, test model, and source code.
Figure 1. Traditional
software development artifacts.
It's interesting that traditional processes typically
promote recording technical information multiple times,
such as representing business rules three different ways. At
the same time they'll also prescribe design concepts
such as normalization and cohesion which lead you to
develop a design which implements concepts once. For
example, the rules of data normalization motivate you to
store data in one place and one place only. Similarly,
in object-oriented and component-based design you want
to build cohesive items (components, classes, methods,
and so on) that fulfill only one goal. If this is ok
for your system design, shouldn't it also be ok for your
software development artifacts? We clearly need to
rethink our approach.
Views, Not Copies
It is clear that you should store system information
in one place and one place only, ideally in the most
effective place. The same idea has counterparts in
software development generally and in the programming
world, and in the technical documentation world it is
called single sourcing. With single sourcing the idea is
to record technical information only once and then
generate various documentation views as needed from that
information. In the case of the business rule example
above it would be recorded using some sort of business
rule definition language. A human readable view would
be generated for your requirements documentation (this
is easier said than done, by the way, but for the sake
of argument let's assume that it's possible) and an
implementation view generated which would either be run
by your business rule engine or compiled as application
code.
Figure 2 depicts a strategy
for single sourcing all of the information contained in
Figure 1, then automatically
generating the original artifacts of
Figure 1 through the use of a generator. The
views are generated on an as-needed basis from the most
recent source information, ensuring that they are
up to date as of the generation time.
Figure 2. An ideal approach
to single sourcing information.
An important implication of Figure
2 is that although the information is stored in a
single place, it can be rendered in multiple ways for
different audiences. This is called the
Locality of Reference Documentation Principle.
End users will need to see information in a different
format than programmers, for example. Some people
prefer to see diagrams whereas others prefer information
in textual form. Just because information needs to be
viewed, and worked with, in multiple ways doesn't mean
that it needs to be stored multiple times. Just as
you build working software from your source code base,
you would "build" your documentation views from your
single-sourced information base.
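The idea of "building" views from one information base can be sketched as follows. This is only an illustration under stated assumptions: the `Rule` structure, the `BR-1` rule, and the `render_text` and `compile_rule` helpers are hypothetical stand-ins for a business rule definition language and its generator, not a real tool.

```python
import operator
from dataclasses import dataclass
from typing import Callable

# Supported comparisons: symbol -> (implementation, human-readable phrase).
_COMPARISONS = {
    ">=": (operator.ge, "is at least"),
    "<": (operator.lt, "is less than"),
}


@dataclass(frozen=True)
class Rule:
    """The single source: one structured definition of a business rule."""
    rule_id: str
    field: str
    comparison: str
    threshold: float
    outcome: str


def render_text(rule: Rule) -> str:
    """Generate the human-readable view for requirements documentation."""
    _, phrase = _COMPARISONS[rule.comparison]
    return f"{rule.rule_id}: when {rule.field} {phrase} {rule.threshold}, {rule.outcome}."


def compile_rule(rule: Rule) -> Callable[[dict], bool]:
    """Generate the implementation view: a predicate over a record."""
    compare, _ = _COMPARISONS[rule.comparison]
    return lambda record: compare(record[rule.field], rule.threshold)


# Hypothetical rule used only to exercise the two generators.
RULE = Rule("BR-1", "order_total", ">=", 100.0, "apply the loyalty discount")
```

Both views come from `RULE`, so changing the threshold there changes the documentation text and the executable check together.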
Figure 3 depicts a far more
realistic approach. Although you would like to
store information in one place and one place only, the
reality is that your toolset may not allow you to.
Also, because you are only human you are going to make
mistakes and record the same information twice.
Also, Figure 3 shows a common
situation in software development: although a lot of
your documentation can be represented in source code,
for example using JavaDoc comments in Java, some
critical information will still be stored as external
documentation. It is very common for agilists to
have concise system overview documentation, release
notes, and user guides for example.
Figure 3. A realistic
approach to single sourcing information.
Traditional Single Sourcing
To make the traditional single sourcing vision work you need a
common way to record information. The
Darwin Information Typing Architecture (DITA) is an
XML-based format which is promoted for single sourced
technical documentation. There is nothing stopping
you from creating your own storage strategy: single
sourcing is often approached in a top-down manner with
the data structure for the documentation typically
defined early in a project. The primary challenge with
traditional single sourcing is that it requires a
fairly sophisticated approach to technical
documentation. This is perfectly fine, but
unfortunately many organizations aren't yet able to
achieve this vision and find that they need to back away
from the approach. This doesn't mean that you need to
throw out the baby with the bath water: you should still
strive to normalize all of your software development
artifacts.
Agile Single Sourcing
There is no reason why you couldn't take a more agile
approach where the structure of your system artifacts
emerges over time. This is where the AM practice
Single Source Information comes in. When you
are modeling you should always be asking the questions
"Do I need to retain this information permanently?", "If
so, where is the best place to store this information?"
and "Is this information already captured elsewhere that
I could simply reference?". Sometimes the best
place to store information is in an
agile document, often it's in source code.
There are several AM principles and practices which
support agile single sourcing. They are:
- Executable specifications. This is one of
the easiest approaches to single sourcing
information to understand, and often one of the most
productive. By taking a
test-driven development (TDD) approach at
the requirements level your
customer acceptance tests are not only tests,
they are also requirements specifications.
Similarly, with TDD at the design level your
developer tests form the majority of your detailed
design specification.
- Apply the right artifact(s). Technical
information should be captured using the most
appropriate artifact, be that a hand-drawn sketch, a
detailed data model, a use case specification, or
source code.
- Model with a purpose. You should know why
you are creating an artifact, know who it is for,
and how they're going to work with it. If you don't
understand these three factors then you are very
likely to record more information than you need out
of fear that someone, somewhere, and at some time
may need it. This leads not only to over
documentation but very likely to the unintentional
capture of the same information several times over.
- Collective ownership. Only when everyone
has access to a shared collection of artifacts is it
possible to capture information in the right place
once and once only. If some people do not have full
access then they are motivated to capture their own
version of the information.
- Build teams of
generalizing specialists. When teams are
made up of specialists who only know how to work
with a small subset of artifacts you will often
capture the same information in several places. For
example, if your team has expert business analysts,
expert coders, and expert testers then each of these
groups will capture a business rule using their own
approaches - perhaps as a UML activity diagram, as
source code, and in the test specification. When
people are generalizing specialists who have one or
more specialties plus a general understanding of the
complete software lifecycle they can work with a
wide range of artifacts, reducing the need to
capture the same information in several places.
- Model with others. Effective software
development teams work together in a co-operative
and collaborative manner. When people work alone
they will capture their own version of the technical
information, a version which may be slightly
different than that maintained by their co-workers.
By modeling with others you not only work together
to develop a model or document, you also spread
skills and knowledge throughout the team, improving
the chance that your models and documents will be
both consistent and normalized.
- Maximize stakeholder ROI. Everyone
on a development team should want to ensure that
stakeholders' money is spent wisely. Not only is
this a good thing to do, it increases the chance
that your stakeholders will want to continue working
with you in the future. Is it really effective to
capture the same information several times, to
increase your maintenance burden, and to increase
your traceability needs? I don't think so.
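To illustrate the executable specifications point from the list above, here is a small sketch in which the tests double as the requirement statement. The shipping rule, the amounts, and the names `shipping_fee` and `ShippingFeeSpecification` are assumptions made up for this example; only the standard `unittest` module is real.

```python
import unittest


def shipping_fee(order_total: float) -> float:
    """Hypothetical rule: orders of 50.00 or more ship free, others pay 5.00."""
    return 0.0 if order_total >= 50.00 else 5.00


class ShippingFeeSpecification(unittest.TestCase):
    """Each test name reads as one requirement; the body is its check."""

    def test_orders_of_50_or_more_ship_free(self):
        self.assertEqual(shipping_fee(50.00), 0.0)
        self.assertEqual(shipping_fee(75.25), 0.0)

    def test_orders_under_50_pay_a_flat_fee_of_5(self):
        self.assertEqual(shipping_fee(49.99), 5.00)
```

Saved as a module, this can be run with `python -m unittest`. The requirement is stated once, in the test, and it fails loudly if the implementation drifts from it.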
Other agile techniques which support single sourcing,
at least at the detailed design level, are code
refactoring and database refactoring. With code refactoring you
make a small change to your source code to improve its
design; similarly with database refactoring you make a
small change to your database schema to improve its
design. Many refactorings, such as Extract Method or
Introduce Lookup Table, explicitly increase the
normalization of your system's underlying object or data
schema.
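A brief sketch of how Extract Method removes a duplicated concept is shown below. The tax calculation, the 8% rate, and all function names are hypothetical and chosen only to make the duplication visible.

```python
# Before: the same tax calculation (hypothetical 8% rate) is written twice.
def invoice_total_before(subtotal: float) -> float:
    return round(subtotal + subtotal * 0.08, 2)


def quote_total_before(subtotal: float, discount: float) -> float:
    discounted = subtotal - discount
    return round(discounted + discounted * 0.08, 2)


# After Extract Method: the calculation is recorded in one place only.
TAX_RATE = 0.08


def add_tax(amount: float) -> float:
    """The single home of the tax calculation."""
    return round(amount + amount * TAX_RATE, 2)


def invoice_total(subtotal: float) -> float:
    return add_tax(subtotal)


def quote_total(subtotal: float, discount: float) -> float:
    return add_tax(subtotal - discount)
```

Behavior is unchanged, but a change to the rate or the rounding now happens in `add_tax` alone.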
It is interesting to observe that when an agile team is
made up of generalizing specialists, or at least people
striving to become so, and when they are actively trying
to do the best job possible, that the best place to
store information often proves to be your source code.
This is exactly what many extreme programmers claim,
although in my opinion they've struggled to convey this
message in terms which are palatable to traditional IT
professionals. Many traditionalists claim that if your
documentation is in the source code then it's
effectively lost. What they're really saying is that
it's lost to people who are unable to read source code,
or at least people who don't have tools (such as JavaDoc
perhaps) that can extract critical information from the
source and present it in an alternative format.
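The kind of extraction tool mentioned above can be approximated in a few lines. The sketch below uses Python docstrings and the standard `inspect` module in place of JavaDoc; the `late_fee` rule and the `render_rule_summary` helper are hypothetical names introduced for illustration.

```python
import inspect


def late_fee(days_overdue: int) -> float:
    """Charge 1.50 per day overdue, capped at 30.00 (hypothetical rule)."""
    return min(days_overdue * 1.50, 30.00)


def render_rule_summary(functions) -> str:
    """Build a plain-text view of the documentation held in source code."""
    lines = []
    for fn in functions:
        signature = inspect.signature(fn)
        doc = inspect.getdoc(fn) or "(undocumented)"
        lines.append(f"{fn.__name__}{signature}: {doc}")
    return "\n".join(lines)
```

Calling `render_rule_summary([late_fee])` yields a one-line description that someone who does not read code could review, while the source remains the only place the rule is recorded.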
Just as it's extremely rare to find a perfectly
normalized relational database I suspect that you'll
never truly be able to fully single source all of your
software artifacts. In the case of databases,
performance considerations and, to be fair, design
mistakes made by project teams result in
less-than-normal schemas. Similarly, not everyone is
going to be able to work with all types of artifacts -
it isn't realistic to expect business stakeholders to be
able to read program source code and the uber-tools
required to support this vision continue to elude us
(and likely always will).