The state of the architecture repository art

This page is published under the terms of the licence summarized in the footnote.


How to manage the structure and behaviour of business systems if you don’t know that structure or behaviour?

How to plan changes that are optimal for the enterprise (rather than suboptimal and localized) if you have no enterprise-wide repository of the system estate?

On the other hand, why is building and maintaining an enterprise-wide architecture repository more challenging than EA frameworks and CASE tool vendors tell you?

(CASE = Computer-Aided System Engineering).


The value of an architecture repository is discussed in other papers, along with some advice.

It is one thing to populate an architecture repository for a discrete solution or migration planning exercise.

It is an entirely different thing to maintain an EA-wide repository of the kind envisaged by so many gurus since the 1980s.

In practice, do many enterprises really do this?


Size and complexity

Repository federation and alignment

Other concerns

The state of the art?


Size and complexity

Our kind of architecture is very unlike building architecture.

The planning and architectural design of business systems is an odd industry to be in.

The end product of design is not a concrete thing; it is merely a description of human and computer activity systems.

The architecture defines system elements (roles, processes, data structures, rules, etc.) and how they relate to each other.

People and computers follow the given descriptions, and perform activities accordingly.


The size and complexity of an enterprise’s business systems – as documented in bottom-level descriptions – is staggering.

A large enterprise may rely on a thousand applications (each offering, say, 30 use cases) containing a billion lines of code.

It may use hundreds of databases, containing thousands of data entities, and tens of thousands of data item types.
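
To give a feel for those numbers, here is a back-of-envelope sketch. The inputs are the illustrative figures quoted above, not measurements, and the variable names are my own.

```python
# Illustrative figures from the text; none of them are measurements.
applications = 1_000
use_cases_per_application = 30
lines_of_code = 1_000_000_000

# Derived averages across the whole estate.
total_use_cases = applications * use_cases_per_application
average_lines_per_application = lines_of_code // applications
average_lines_per_use_case = lines_of_code // total_use_cases

print(total_use_cases)                # 30000
print(average_lines_per_application)  # 1000000
print(average_lines_per_use_case)     # 33333
```

Even at this crude level, the estate implies tens of thousands of use cases, each backed by tens of thousands of lines of code on average.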


A related paper contains a description by David Eddy of what is contained in a legacy system dictionary/repository.

Let me draw conclusions from comparing this with the Zachman Framework and a TOGAF-style EA repository.


Zachman’s conception was and is of a 6 by 6 taxonomy that maps:

· “primitive interrogatives”: the 6 analysis questions (what, how, where, who, when and why) that head the columns

· “reification - the transformation of an abstract idea into an instantiation”: labelled in the row headers.


The upper rows hold 5 successively more abstract system descriptions of the instances in the bottom row.

The bottom row is the realisation of the higher-level idealised descriptions, as operational system components and processes.

The Zachman Framework is only semi-rational. The 6 column headers are naive.

What Zachman means by abstraction from row 6 to row 1 has been poorly expressed and poorly understood.

There is no industry agreement on what abstraction means, or on what words like conceptual, logical and physical mean.

However, if you’ll allow me to rename the 6 rows as below, that does give us a framework for discussion here.

1 Scope of the system(s)

2 Conceptual models of the system(s)

3 Logical models of the system(s)

4 Physical models of the system(s)

5 Bottommost engineer’s description of the system(s)

6 Instances of elements in the operational system(s)





Points in time



David says the best way to document the reality of systems in row 6 is by reverse-engineering.

That is true. Try it, and you discover how vast and complex system documentation can be.

How to manage this size and complexity?


We take for granted roughly 4 billion years of biological evolution.

And about 20 years of each human’s learning and education.

We don’t document the parts of people, and write down very little of what people need to know to do a job.

We expect people to work from vague and inaccurate directions, and make it up as they go along where necessary.


We take 50 years of software infrastructure evolution for granted.

We don’t document the parts of platform applications (operating systems or database management systems).

And we rarely document the parts of COTS packages.

At most, we partly document interfaces to those platform and COTS applications.


So what of the business system estate should be documented in detail?

Surely, all the bespoke roles and rules of systems that a business controls?

Where to find bespoke code that does work for a business and must be maintained by employees or contractors of that business?

You’d hope to find each application’s source code in some kind of software configuration library.

You’d hope to find what executables are derived from the source code and how they are packaged for deployment to computing devices.


The executables and the source code are far too detailed and too obscure for change impact analysis.

So, by reverse engineering from those, David abstracted a higher level dictionary/repository.

In terms of the Zachman Framework, he created a row 4 physical model, for use by software engineers.


This abstract dictionary/repository still contained 1.7 million inter-related elements.

David says there is no way to create or maintain this documentation without automated discovery and population.
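
To illustrate why a dictionary/repository of inter-related elements is useful for change impact analysis, here is a minimal sketch. It assumes the repository can be reduced to "uses" relations between named elements. The element names and the `impacted_by` function are hypothetical; they are not taken from David's repository or from any tool.

```python
from collections import defaultdict, deque


def build_reverse_index(uses):
    """Map each element to the set of elements that directly use it."""
    used_by = defaultdict(set)
    for source, target in uses:
        used_by[target].add(source)
    return used_by


def impacted_by(changed, uses):
    """Return every element that directly or indirectly uses `changed`."""
    used_by = build_reverse_index(uses)
    impacted, queue = set(), deque([changed])
    while queue:
        element = queue.popleft()
        for dependant in used_by[element]:
            if dependant not in impacted:
                impacted.add(dependant)
                queue.append(dependant)
    return impacted


# Hypothetical (source, target) pairs, each meaning "source uses target".
uses = [
    ("program:BILLING01", "copybook:CUSTREC"),
    ("program:INVOICE02", "copybook:CUSTREC"),
    ("job:NIGHTLY", "program:BILLING01"),
    ("copybook:CUSTREC", "data-item:CUST-ID"),
    ("program:PAYROLL03", "data-item:EMP-ID"),
]

# Which elements might be affected if the CUST-ID data item changes?
print(sorted(impacted_by("data-item:CUST-ID", uses)))
```

With five relations the answer is obvious by inspection. With 1.7 million elements it is not, which is why both the population of the repository and the queries against it have to be automated.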


TOGAF’s EA description framework features the Enterprise Continuum, which maps 4 levels of idealisation to 4 degrees of generalisation.

TOGAF’s equivalent of Zachman’s row 4 is called the solution continuum below.

Degrees of generalisation (columns): Foundation; Common Systems (fairly generic); Industry (fairly specific); Organisation (uniquely configured)

Requirements and Context

Architecture Continuum (Logical Models): Foundation Architecture; Common System Architecture; Industry Architecture; Organisation Architecture

Solution Continuum (Physical Models): Foundation Solutions; Common System Solutions; Industry Solutions; Organisation Solutions

Deployed solutions

TOGAF says its physical models of system elements are “considerably abstracted from solution architecture”.

OK, suppose those elements are described at a level 10 times more abstract than those in David’s physical models?

A TOGAF-style EA repository would still contain, in its solution continuum, 170 thousand elements.

But is anybody actually maintaining an EA repository that detailed? I doubt it.

Which suggests there is something of a disconnect between where the rubber hits the road and where the rubber hits the sky.


If David’s enterprise dictionary/repository is 10 times more abstract than the implemented code, then the highest level architectural documentation might be a million times more abstract.

The Catch 22 is this: the more detailed the documentation, the more expensive it is to maintain - and the more obscure it is to anybody but a technician.

The more abstract the documentation, the less useful it is when it comes to doing change impact analysis or productive work.
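
The arithmetic behind these suppositions can be sketched as follows. Two assumptions are mine: that "times more abstract" can be read as a simple division of element counts, and that the million-fold figure is relative to the implemented code. The factor of 10 per step is the supposition made above, not a measured ratio.

```python
# Figure quoted in the text for David's dictionary/repository.
physical_model_elements = 1_700_000
# Supposed ratio between adjacent levels of abstraction (an assumption).
abstraction_factor = 10

# Implied element count at the level of the implemented code.
code_level_elements = physical_model_elements * abstraction_factor
# TOGAF-style solution continuum, 10 times more abstract than David's model.
togaf_elements = physical_model_elements // abstraction_factor
# Documentation a million times more abstract than the code.
top_level_elements = code_level_elements // 1_000_000

print(code_level_elements)  # 17000000
print(togaf_elements)       # 170000
print(top_level_elements)   # 17
```

On these assumptions the top-level description has a handful of elements, which is easy to maintain but says almost nothing about what a given change will touch.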

Repository federation and alignment

Maintaining several levels of documentation in step, in one repository, is hard enough.

Zachman’s proposal to maintain 5 levels of documentation of operational systems (at increasing levels of idealisation) looks impractical.

Further, David Eddy sees not one but a family of partly-related repositories.


The size and complexity of the business system estate is vast.

There are many ways to abstract descriptions from the instances in operational systems.

Different players need repositories with different meta models, not just different levels of abstraction.


Besides David’s bespoke software dictionary/repository, what other repositories might you find?

A CMDB recording computing and network devices, and artefacts deployed to them?

An EA repository documenting, as David put it, "where the rubber hits the sky"?

These two repositories are distant cousins to David's dictionary/repository.

And what else?

Does your HR department maintain a list of roles, mapped to employees?

Does your identity management system record business apps, related to employees who can use them?

Imagine the effort to map the things in each repository together, and then maintain those mappings.
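
To make that effort concrete, here is a minimal sketch of mapping three small repositories together and reporting where the mappings break. All the data, the key names and the two functions are hypothetical. Real repositories rarely share clean identifiers like these, which is a large part of the problem.

```python
# Hypothetical extracts from three separately maintained repositories.
hr_roles = {"E100": "Claims Handler", "E200": "Underwriter", "E300": "Actuary"}
app_access = {
    "E100": {"ClaimsApp"},
    "E200": {"PolicyApp", "ClaimsApp"},
    "E999": {"PolicyApp"},
}
cmdb_deployments = {
    "ClaimsApp": "server-01",
    "PolicyApp": "server-02",
    "LegacyApp": "server-03",
}


def roles_per_application(hr_roles, app_access):
    """Join HR roles to applications via the shared employee id."""
    result = {}
    for employee_id, apps in app_access.items():
        role = hr_roles.get(employee_id)
        if role is None:
            continue  # no HR record; reported by mapping_gaps instead
        for app in apps:
            result.setdefault(app, set()).add(role)
    return result


def mapping_gaps(hr_roles, app_access, cmdb_deployments):
    """List the places where the three repositories fail to line up."""
    used_apps = set().union(*app_access.values())
    return {
        "access_without_hr_record": sorted(set(app_access) - set(hr_roles)),
        "employees_without_access": sorted(set(hr_roles) - set(app_access)),
        "deployed_but_unused": sorted(set(cmdb_deployments) - used_apps),
        "used_but_not_in_cmdb": sorted(used_apps - set(cmdb_deployments)),
    }


print(roles_per_application(hr_roles, app_access))
print(mapping_gaps(hr_roles, app_access, cmdb_deployments))
```

Every gap reported here is something a person has to investigate and correct, and the gaps reappear whenever any one of the repositories changes.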

Other concerns


Only the bottom level operational systems are directly testable.

How do we know higher level documentation reflects operational systems?

Reverse-engineering is a good idea, but is mostly impractical today.
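
Where some automated discovery is possible, the check on higher level documentation can be as simple as comparing two inventories. The sketch below assumes you can obtain a list of element names from operations. The function name and the sample data are hypothetical.

```python
def documentation_drift(documented, discovered):
    """Compare a documented inventory with one discovered from operations."""
    documented, discovered = set(documented), set(discovered)
    matched = documented & discovered
    return {
        "missing_from_docs": sorted(discovered - documented),
        "no_longer_running": sorted(documented - discovered),
        # Share of running elements that the documentation knows about.
        "coverage": len(matched) / len(discovered) if discovered else 1.0,
    }


# Hypothetical inventories: what the repository says and what is running.
documented = ["ClaimsApp", "PolicyApp", "QuoteApp"]
discovered = ["ClaimsApp", "PolicyApp", "RatingService", "BatchLoader"]

print(documentation_drift(documented, discovered))
```

This only checks that named elements exist. Checking that the documented relationships and behaviour match the operational systems is far harder.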


EA needs repository, content and configuration management technologies and processes.

Traditional CASE tool technology does what it does.

But still, people run into the barriers of the intellectual effort, labour time and cost needed.

Automation of repository maintenance and integration seems to be what is needed.


A CASE tool vendor told me of the need to employ tens of people to maintain the EA repository.

Wikipedia is maintained by perhaps 30,000 unpaid volunteers.

Wikipedia is not coherent or accurate enough for EA, and nobody is running operational systems that correspond to its content.

But two lessons from Wikipedia might be applicable to EA CASE tools.

The tool should be open to all, readable by all, and updateable by all.

There must be a human editorial control / governance system - better than Wikipedia's.


The repository technology surely needs somewhat more Wikipedia-like governance.

Until everybody turns to the repository, and all are motivated to keep it accurate, it won’t be maintained.

The best people (under pressure to do “proper work”) may need sponsorship to maintain it.

The state of the art?

The size and complexity of the business system estate is vast.

There are many ways to abstract descriptions from the instances in operational systems.

Different players need repositories with different meta models, not just different levels of abstraction.


You cannot depend on dedicated repository maintainers; the data volumes are too great and their data entry is not reliable enough.

Employees can be expected to maintain some EA-relevant data, such as role name, manager name and base location.

But to populate and integrate the various dictionaries/repositories, we need more discipline and automation than the current state of the art offers.


The mountainous repository Zachman envisages seems impractical today.

In practice, the spotlight is shone on some part of the mountain that needs to be changed.

After the changes, that documentation decays while the spotlight shines elsewhere.

With luck, when we come to revisit the first part, we'll find some helpful documentation.

If not, we have to rebuild that part of the mountain to suit the task at hand.


Still, it does help to understand system theory, and to have tool support for system description.



Copyright conditions

Creative Commons Attribution-No Derivative Works Licence 2.0             10/06/2015 20:49

Attribution: You may copy, distribute and display this copyrighted work only if you clearly credit “Avancier Limited:” before the start and include this footnote at the end.

No Derivative Works: You may copy, distribute, display only complete and verbatim copies of this page, not derivative works based upon it.

For more information about the licence, see