A set and type theory for business system architects
READERS OF THIS OLDER PAPER SHOULD ALSO READ THE NEWER PAPER TYPES AND TOKENS
This paper contrasts static sets and strict types (as in basic maths) with the dynamic sets and fuzzy types that appear in business systems.
It proposes that every system description can be seen as a very large, complex and fuzzy type.
Three things to know about basic set theory
Dynamic sets in softer sciences
Sets and types in information systems
More thoughts about sets and types
There are many different set and type theories.
Encyclopaedia Britannica defines set theory thus.
“Set theory is a logic of classes: i.e., of collections or aggregations of objects of any kind, which are known as the members of the classes in question.”
Here (though not in other places) you can think of “class”, “set” and “collection” as having the same meaning.
Encyclopaedia Britannica highlights two ways to define a set.
“A particular class may be specified either by listing all its members or by stating some condition of membership… the class consists of all and only those things that satisfy that condition.”
Extensional definition by enumeration – listing set members
For example:
· Planets {Sun, Moon, Mars, Mercury, Jupiter, Venus, Saturn}
· Weekdays {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}
· Rainbow colours {red, orange, yellow, green, blue, indigo, violet}.
· Diatonic scale {do, re, mi, fa, so, la, ti}.
"The precise requirements for an enumeration (for example, whether the set must be finite, or the list may contain repetitions) depend on the branch of mathematics and the context." Wikipedia
Intensional definition by typification – conditions for set membership
A "type" is an intensional definition – distilling the concept of a set member.
E.g. Planet: a massive body that orbits a star.
It is an abstraction from one or more things we perceive or describe as being similar.
Idealisation triangle: Typifiers <form> Types; Types <abstract concepts from> Set members; Typifiers <observe and envisage> Set members.
People loosely say a type defines a set, but that is misleading, since it defines only a set member.
The solar system is a set; it has attributes such as total weight, breadth and number of planets.
The description of a planet is a type, which says nothing about the attributes of the whole solar system.
Types are named.
Set members | are instances of | The type named
2, 3, 4, 5, 6 | instantiate the qualities of an | Integer
2, 4, 6 | exhibit the properties of an | Even number
Venus, Mars etc. | manifest the idea of a | Planet
The type’s name is a sign for the type itself.
The definition gives us at least a partial idea, model or description of a discrete thing we are interested in.
It describes what is true of an instance, by defining one or more properties to be found in a member of a set.
The conventional way to form an intensional definition is by genus and difference.
Sign | Type: more general type | Type: more specific properties
Number | A mathematical object | used to count, measure, and label
Integer | A number | with no fractional part
Even number | An integer | that is divisible by two
Planet | A massive body | that orbits a star
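The genus-and-difference pattern can be sketched in code.
This is only an illustration in Python: the `TypeDef` class, its field names and the predicates are assumptions made for the example, not part of any established type theory.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class TypeDef:
    """A sign-type pair: a name plus a genus-and-difference definition."""
    sign: str                              # the name of the type
    genus: Optional["TypeDef"]             # the more general type, if any
    difference: Callable[[object], bool]   # the more specific properties

    def describes(self, thing: object) -> bool:
        # A thing matches only if it matches the genus and the difference.
        if self.genus is not None and not self.genus.describes(thing):
            return False
        return self.difference(thing)


# Hypothetical definitions that follow the table above.
number = TypeDef(
    "Number", None,
    lambda x: isinstance(x, (int, float)) and not isinstance(x, bool),
)
integer = TypeDef("Integer", number, lambda x: x % 1 == 0)
even_number = TypeDef("Even number", integer, lambda x: x % 2 == 0)

print(even_number.describes(4))       # matches genus and difference
print(even_number.describes(2.5))     # fails at the genus "Integer"
print(even_number.describes("four"))  # fails at the genus "Number"
```

Each definition reuses the one above it, so the difference only has to state what is new.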
In mathematics, a type is strict, meaning it:
· is a necessary and sufficient definition of a set member’s properties.
· defines properties that belong to all and only the members of a set.
· exactly denotes or specifies the necessary and sufficient conditions for being a set member.
It might provide rules that enable members of a set (e.g. even numbers) to be generated by an algorithm.
Note that members of a set might be typified using a variety of expressions:
· even number: an integer that is evenly divisible by two
· even number: an integer that when divided by two, yields no remainder
· even number: an integer that is one less than an odd number.
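The point can be checked over a finite sample.
The sketch below (Python; the function names and the sample range are assumptions) writes the three expressions as three predicates and shows that they pick out the same members.

```python
def divisible_by_two(n: int) -> bool:
    # "an integer that is evenly divisible by two"
    return n % 2 == 0


def no_remainder(n: int) -> bool:
    # "an integer that when divided by two, yields no remainder"
    return divmod(n, 2)[1] == 0


def one_less_than_odd(n: int) -> bool:
    # "an integer that is one less than an odd number"
    return (n + 1) % 2 == 1


sample = range(-10, 11)
extensions = [
    frozenset(n for n in sample if typifies(n))
    for typifies in (divisible_by_two, no_remainder, one_less_than_odd)
]

# Three different typifications, one and the same extension (within the sample).
print(extensions[0] == extensions[1] == extensions[2])
```

A test over a sample is not a proof, but it shows how one set can sit behind several intensional definitions.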
Enumeration (rather than typification) of elements is the primary way to define a set.
Consider these two sets.
· Set A = {0, 1}.
· Set B = {a, b, c}.
The enumeration defines them; there is no type.
Typification of set members is important to a lot of mathematics.
Yet basic set theory is about the set elements/members rather than the types that define the properties of a member.
References for your interest:
"Two sets are equal if they contain the same elements. I.e., sets A and B are equal if ∀x[x ∈ A ↔ x ∈ B]." https://www.cs.utexas.edu/~schrum2/cs301k/lec/topic06-sets.pdf
The “axiom of extensionality” says a set is determined uniquely by its members (rather than by any particular way of describing the set). https://en.wikipedia.org/wiki/Axiom_of_extensionality
The “principle of extension” says two sets are equal if and only if they have the same elements (rather than elements of the same type). http://www.math.northwestern.edu/~mlerma/courses/cs310-04w/notes/dm-sets.pdf
“Traditional set theory defines sets by their contents -- not relevant to types” (Bill Kent)
A set is static, meaning that if an element is added or removed it becomes a new/different set.
Consider these sets:
· The set of prime numbers never changes.
· The set of rainbow colours might be redefined, but that would make a new and different set.
By enumeration, the set of people alive at midnight is different every night.
By typification alone, the set of people alive at midnight is the same every night, since a type defines only the properties of a member and says nothing about the total membership.
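The contrast can be sketched in Python, where a `frozenset` behaves like a static set.
The `Person` record, the names and the dates are invented for illustration.

```python
from datetime import date
from typing import NamedTuple, Optional


class Person(NamedTuple):
    name: str
    born: date
    died: Optional[date] = None


def is_alive(person: Person, on: date) -> bool:
    """The type: it describes one member and says nothing about the whole set."""
    return person.born <= on and (person.died is None or person.died > on)


def alive_on(people, on: date) -> frozenset:
    """The enumeration: the names of everyone who matches the type on a date."""
    return frozenset(p.name for p in people if is_alive(p, on))


people = [
    Person("Ann", date(1950, 1, 1), died=date(2020, 1, 2)),
    Person("Bob", date(1980, 5, 5)),
    Person("Cy", date(2020, 1, 2)),
]

first_night = alive_on(people, date(2020, 1, 1))
second_night = alive_on(people, date(2020, 1, 2))

# Same type on both nights, but two different sets by enumeration.
print(sorted(first_night), sorted(second_night))

# "Adding" a member yields a new set; the original is unchanged.
bigger = first_night | {"Cy"}
print(bigger != first_night)
```

The function `is_alive` never changes; only the result of applying it to the population changes.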
Enumeration often (always?) enumerates the names of set members, rather than actual set members
Consider these set enumerations
· Set A = {0, 1}.
· Set B = {a, b, c}.
· Set C = {cat, dog}.
· Exam grades: {A, B, C, D, E}.
· Rainbow colours {red, orange, yellow, green, blue, indigo, violet}.
Do they list the actual set members, or only the names of set members?
The trouble: we have no option but to discuss things and concepts using words, using their names.
In many discussions and contexts, names represent things, names serve as things for the purposes of discussion.
A set theory expressed only in terms of enumerable sets is limited.
Mathematicians have developed type theories that rigorously define types (of numbers and other mathematical objects) using other types.
Challenges arise when we step from defining mathematical objects using mathematical objects to describing real-world entities using words.
Types range from provable by mathematical logic and/or testing to the uncertainty or ambiguity of words in natural language.
Source | Defined type | Generic type | Defining properties
Mathematics | Even number | an integer | divisible by two
Softer science | Planet | a massive body | orbiting a star
Softer science | Bachelor | a male bird or mammal | prevented from breeding by a dominant male
Legislation | Bachelor | a man | who is not and has never been married
Academic convention | Bachelor | a person | who holds a first degree from an academic institution
Natural language | Customer | a person | who pays for goods or services
Natural language | Customer | a person | whom one has to satisfy in some way
Which came first?
First we developed the ability to perceive and deal with reality in terms of things that are discrete.
We developed the ability to name things and group them (fuzzily) into families.
Mathematicians refined names and fuzzy types into numbers and strict types.
A biologist may see these as the end-product of an evolutionary process in which our survival depends on modelling reality accurately enough.
Even the language used to discuss types is messy
A rose bush is a plant that
· instantiates the types “thorny”, “flowering” and “bushy”
· embodies the concepts “thorny”, “flowering” and “bushy”
· exhibits the properties “thorny”, “flowering” and “bushy”
· has or possesses the qualities, features or attributes “thorny”, “flowering” and “bushy”.
The term "property" is used ambiguously to mean
· property type (“height in metres”) and/or
· property instance (1.74 metres).
Softer sciences speak of “sets” when talking about dynamic collections of things.
In the natural, social and engineering sciences, it is common to speak of a set that can gain and lose members.
Here it is called a dynamic set: the ever-changing collection of objects that currently manifest the properties of a type.
Set | Extensional definition | Intensional definition
Mathematical set | must be enumerable, even if the enumeration continues to infinity | usually typifiable, but not always; e.g. {1, 7, 53} lists members of a set with no definable type
Dynamic set | cannot be enumerated, since members may join and leave while enumeration proceeds | must be typified by the attributes and relations that qualify a thing as a set member
Dynamic set members must be typified, so that if you add or remove a member, the dynamic set remains the same.
The type can be considered as a concept, an abstraction, a model, a description, a specification, or a set of conditions or rules for the member of a set.
The set of | Is | Might be called | Is definable by intensional definition as
Even numbers | static | Mathematical set | An integer divisible by two.
Diatonic scale notes | static | Artificial set | A sound pitched to match one in the diatonic scale
Lions | dynamic | Evolved set | An animal whose DNA conforms to the lion genome (can breed with another)
Roses | dynamic | Artificial set | A plant that is flowering, woody, thorny, and perennial
Bachelors | dynamic | Artificial set | A man who is unmarried.
The current members of a dynamic set might be definable by extension, by pointing to all the instances right now.
But the set is better defined by intension, by typifying a set member, by listing properties shared by instances of the type.
Type ontologies
An ontology defines a type using other (defining) types.
Defining a type gives meaning to a type instance; each defining type connotes a concept that the type evokes.
An ontology relates types in a domain of knowledge into a coherent and consistent structure.
“An ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse.” Wikipedia
Ontologists strive to minimise complexity and circularity by imposing a hierarchy on types.
It could be a pyramid, a strict hierarchy of types in which every type inherits qualities from the types above it, up to a single “base type”.
Or a directed graph, with two or more base types that a type may inherit from.
Natural language
Just as an ontology defines a type, a dictionary defines a word using other (defining) words.
A dictionary forms a complex network in which words are defined by other words.
Surely, this reflects the complex and imperfect network of concepts in our mental models?
It may be circular and unprovable by logic.
But it evolved because it works well enough to help us model the world, survive and thrive.
More formal data dictionaries
A business must monitor and direct things in its environment.
It distinguishes different kinds of real-world entity and event using Signs.
It distinguishes individuals using instance names and numbers.
It records the instances in databases.
A database schema is an ontology, often associated with a data dictionary.
A definition that sounds satisfactory at first may need to be refined for business purposes.
The set of | Defined in | Definition
Bachelors | Natural language | A man who is not and has never been married
Bachelors | Information system | A man in the registry of births, but not in the registers of deaths and marriages
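The information-system definition reads naturally as set difference.
A minimal sketch in Python, assuming each register is simply a set of unique identifiers for men (here invented names):

```python
def bachelors(births: set, deaths: set, marriages: set) -> frozenset:
    """Men in the register of births, but not in the registers of deaths or marriages."""
    return frozenset(births - deaths - marriages)


# Hypothetical registers, keyed by a unique name for each man.
births = {"Adam", "Brian", "Carl", "Dave"}
deaths = {"Carl"}
marriages = {"Brian"}

before = bachelors(births, deaths, marriages)
print(sorted(before))

# A marriage is registered: the type is unchanged, the current membership is not.
marriages.add("Dave")
after = bachelors(births, deaths, marriages)
print(sorted(after))
```

Note that the definition only works because every instance has an identifier that is the same in all three registers.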
How would you define a member in the set of ballerinas?
· “A dancer who is female and specialises in ballet?” “A female who is a ballet dancer?” Now, or ever?
· “A female who has practiced the art of ballet?” Even if they were inept?
· “A female employed in a ballerina role by one or more companies recorded in our register of ballet companies”.
Sometimes we need and use business information systems to tell us what things fit a type, belong to a dynamic set.
“An enumeration is a complete listing of all the items in a collection.
The term is commonly used in mathematics and theoretical computer science to refer to a listing of all of the elements of a set.” Wikipedia
Relational databases
It is commonly taught, as though a fact, that relational database design is based on set theory.
“The theory of relational databases is built upon the mathematical theory of sets.” Wikipedia
The table below is typical of how relational databases are described; notice the use of the terms “set” and “type”.
SQL term | Relational database term | Typically described as | Data modelling term
Table | Relation | A set of rows | Entity type
Row | Tuple or record | A set of attribute values for an individual entity, e.g. an “Employee” | Entity instance
Column | Attribute or field type | An attribute type, e.g. "Address" or "Date of birth" | Attribute type
People have come to believe that business database structures are based on the mathematical notion of a set.
But the truth is that business data analysis starts with finding or choosing “primary keys”.
People need unique names/identifiers to distinguish between instances of any type that is important in their business domain.
So it would be better to say that the modelling of business systems is implicitly based on some kind of type theory.
Once types have been identified, and instances can be enumerated, then set theory is useful for defining database operators.
Some use the term relation for the current collection of rows in a table (it starts empty, then gains and loses rows as they are created and deleted).
A relation can be defined intensionally, as a type that defines the possible extensions of a dynamic set of rows.
Of course, a query on a dynamic database can generate a static set of elements at the moment the query is made. (See footnote 4.)
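A small sketch using Python's built-in `sqlite3` module illustrates this.
The `employee` table, its columns and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table definition plays the part of the type: it says what any row must look like.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO employee (id, name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob")],
)


def snapshot(connection: sqlite3.Connection) -> frozenset:
    """Run a query and freeze its result: a static set taken from a dynamic one."""
    return frozenset(connection.execute("SELECT id, name FROM employee"))


before = snapshot(conn)

# The dynamic set of rows gains and loses members.
conn.execute("DELETE FROM employee WHERE id = ?", (1,))
conn.execute("INSERT INTO employee (id, name) VALUES (?, ?)", (3, "Cy"))

after = snapshot(conn)

print(sorted(before))  # the earlier snapshot is unaffected by later changes
print(sorted(after))
```

The two snapshots are different static sets; the table definition that typifies their rows is the same throughout.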
OOP
Object-oriented software design also borrows the language (classes and objects) of set theory, but the vocabulary is very confusing.
In Java there are concepts called:
· “Type”
· “Concrete class”: a type attached to a dynamic set of objects.
· “Abstract class”: a type only; it cannot be attached to a set of objects.
· “Class types” (I give up)
You can read papers that regard OO design as applied set theory, such as the two listed in the references at the bottom of this paper.
The paper on “Visualization of Object Oriented Modeling from the Perspective of Set theory” includes this, clearly showing that sets are dynamic.
“For a given class [of objects] {o1, o2, o3}… the size of such sets is time dependent… depends on dynamic objects creation.”
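The distinction can be sketched in Python rather than Java.
The class names and the `extent` list are assumptions for illustration; neither language keeps such a registry of live objects automatically.

```python
from abc import ABC, abstractmethod


class Account(ABC):
    """Abstract class: a type only; it cannot be instantiated directly."""

    @abstractmethod
    def balance(self) -> int:
        ...


class SavingsAccount(Account):
    """Concrete class: a type with a dynamic set of objects attached."""

    extent = []  # hypothetical registry of the objects that currently exist

    def __init__(self, amount: int) -> None:
        self._amount = amount
        SavingsAccount.extent.append(self)

    def balance(self) -> int:
        return self._amount

    def close(self) -> None:
        # Remove this object from the dynamic set of current instances.
        SavingsAccount.extent.remove(self)


sizes = []
a = SavingsAccount(100)
sizes.append(len(SavingsAccount.extent))
b = SavingsAccount(50)
sizes.append(len(SavingsAccount.extent))
a.close()
sizes.append(len(SavingsAccount.extent))

print(sizes)  # the size of the object set depends on when you look
```

Calling `Account()` raises a `TypeError`, which is the sense in which an abstract class is a type with no set of objects of its own.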
Conclusion
A relation is a type - an intensional definition that defines the possible extensions of a dynamic set of rows.
An OO-style class is a type - an intensional definition that defines the possible extensions of a dynamic set of objects.
In mathematics, a type is strict; it is a necessary and sufficient definition of a set member’s properties.
But the softer the science, the less strictly instances conform to types.
Is a botanists’ definition of a rose wholly necessary and sufficient to define every rose?
What if some roses don’t have thorns?
Biological species are loose types; sociological types are looser still.
The notion of an ideal type is associated with the sociologist Max Weber.
It is abstracted by idealizing the properties of one or more things.
But those properties may be neither necessary nor sufficient.
“An ideal type … is not meant to correspond to all of the characteristics of any one particular case.
It is not meant to refer to perfect things… but rather to stress certain elements common to most cases of the given phenomena.
Max Weber refers to the world of ideas (German: Gedankenbilder "thought pictures") and not to perfection.
These “ideal types” are idea-constructs that help put the seeming chaos of social reality in order.” Wikipedia
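One way to picture the difference between a strict type and an ideal type is to compare a yes/no test with a degree of match.
The scoring rule below is an assumption made for illustration (in Python); it is not Weber's method.

```python
ROSE_IDEAL = frozenset({"flowering", "woody", "thorny", "perennial"})


def strict_match(properties: set, ideal: frozenset) -> bool:
    """Strict type: every defining property is necessary."""
    return ideal <= properties


def degree_of_match(properties: set, ideal: frozenset) -> float:
    """Ideal type: the share of idealised properties that the thing exhibits."""
    return len(ideal & properties) / len(ideal)


# A hypothetical rose without thorns.
thornless_rose = {"flowering", "woody", "perennial", "fragrant"}

print(strict_match(thornless_rose, ROSE_IDEAL))     # rejected by the strict type
print(degree_of_match(thornless_rose, ROSE_IDEAL))  # still close to the ideal type
```

Where to draw the line between "rose" and "not a rose" is then a judgement, which is what makes the type fuzzy.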
Related papers propose
Ideal (fuzzy) types are products of biological evolution
Fuzzy types preceded firmer scientific ones.
One type can describe, be manifested in, many different things
One thing can manifest, be described by, many different types.
Every type is a description that is manifestable in many things
Every description is a type that is manifestable in many things.
An architectural system description is a fuzzy and complex type that is manifestable in many operational systems.
The intensional definitions below are expressed as sign-type pairs.
Sign | Type
Integer | A number with no fractional part.
Lion | An animal whose DNA conforms to the lion genome (can breed with another)
Rose | A plant that is flowering, woody, thorny, and perennial
Bachelor | A man who is unmarried.
Signs give us the shorthand we need for practical discussion.
Mathematicians immediately understand what “integer” means.
Nevertheless, they had first to learn that from a longer definition made of other words.
The Sign only conveys a meaning to people who know the Type.
The Stanford Encyclopedia of Philosophy is mostly beyond me and my audience, but I note this sentence in it.
“Church uses a notion he calls a concept, where anything that is the sense of a name for something can serve as a concept of that something.”
I read that as saying the definition of a term represents our idea of the thing the term names.
It implies an intensional definition takes the form of a name and a definition giving the sense or concept of the named thing (here a set member).
Compared with mathematical terms, business terms (e.g. strategy, policy, customer, stock) are less universally agreed and understood, more ambiguous.
It is normal, before and during the definition of a business system, to define the business vocabulary, and the rules for types, in some kind of data dictionary.
But we would not get far discussing a very complex system or type, if we had to refer to it by mentioning all its necessary and sufficient properties.
So, we invent words to label complex multi-propertied types (and invent identifiers to distinguish between instances of a type).
At the highest level of definition, a type is the label or name we use for a kind of thing, presuming a detailed definition is shared.
Sign as shorthand
Mathematicians may only name a set temporarily, for the purpose of a specific argument, using no more than a single letter.
Outside of pure mathematics, especially in information systems, types are usually given persistent names.
To know the meaning of a sign (say, integer, or bachelor) you must know the type.
The sign serves to remind you of an intensional definition you already know.
Sign-type pair
If you don’t know the meaning of a Sign, then you need the intensional definition as well.
Sign | Type
Rosa | A genus of plant, with the properties flowering, woody, thorny and perennial.
Colour of the rainbow | A hue, a sensory perception evoked in a human observer by radiant energy falling between defined upper and lower wavelengths in the visible spectrum.
Football league member | An organisation or club that must meet a long list of qualifying criteria determined by the league’s governing body.
Complex types
Not only is a type a description, but a description is a type!
If a type is long and complicated (with many qualifying properties) then several levels of definition might be developed.
The notion of recursive, multi-level description is an important tool in complex system specification and enterprise architecture.
Proving a type
How to prove a simple type in mathematics? You may present a deductive argument showing that a statement is always true.
How to prove a complex type such as an abstract business system design? You test that an operational system matches the design.
Agile methods encourage getting to user acceptance testing as fast as possible, with the risk that little or no definition is left behind for future intelligibility.
Enumeration as a process using a type
An intensional definition or type can be used to enable enumeration.
And that is a purpose of the types defined in business information systems.
Enumerated means listed; e.g. {red, orange, yellow, green, blue, indigo, violet}.
Enumerable means a set’s members can be listed, even if it is never actually done.
In mathematics, a set (e.g. positive integers, or even numbers) may be computably enumerable (aka recursively enumerable, provable or Turing-recognizable).
This means there is a process that can enumerate set members, can generate them from an intensional definition.
The process might have to run forever, because the set is infinite (though countably infinite).
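A Python generator gives a small picture of such a process.
The function name is an assumption, and the sketch starts from zero for simplicity.

```python
from itertools import count, islice


def even_numbers():
    """Enumerate the even numbers from the type 'an integer divisible by two'."""
    for n in count(0):      # 0, 1, 2, 3, ... without end
        if n % 2 == 0:      # the intensional definition, applied as a test
            yield n


# The full enumeration never finishes, so take only the first few members.
first_five = list(islice(even_numbers(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```

The generator is a finite description; the set it can enumerate is infinite.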
In a computer activity system, the software written by programmers is nothing more or less than description.
A procedural program is a type used to generate process execution instances.
An object-oriented class is a type used to generate object instances.
A database relation is a type used to generate rows in a database table.
Higher level specifications of software systems contain abstractions from types in code.
In a human activity system, the same general principles apply to process specification.
We may give people a process to list the instances of a type, using the definition we give them.
E.g. look at plants, check what you see against the rose type, and list whatever matches as a rose.
E.g. talk to men, ask if they are unmarried, and list whoever says yes as a bachelor.
We expect humans to make sense of slight, inadequate, ambiguous, inconsistent complex types.
We depend on humans to interpret and follow vacuous and inaccurate intensional descriptions.
Can you define a type for every enumerable set?
The test of a good type is that you can check that only set members match it.
You can define colours of the rainbow by enumeration {red, orange, yellow, green, blue, indigo, violet}.
Can you define a set member by a type?
You might try: “A hue; a sensory perception evoked in a human observer by radiant energy in the range between 380nm and 750nm wavelength.”
But you can match any division of the visible spectrum to that definition.
You might try: “One of seven non-overlapping divisions of the visible spectrum, named by Newton.”
But that type definition depends on a reference to the enumerated set and its enumerator.
You can define a collection of random numbers by enumeration {1, 37, 482, 9, 8}.
Can you define a set member by a type?
You might try: “A number selected at random for set X by the set enumerator”.
But that type definition depends on a reference to the enumerated set and its enumerator.
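Both difficulties show up when the attempted types are written as tests.
In this Python sketch the function names and the example divisions of the spectrum are assumptions.

```python
RAINBOW = ("red", "orange", "yellow", "green", "blue", "indigo", "violet")
RANDOM_NUMBERS = frozenset({1, 37, 482, 9, 8})


def is_division_of_visible_spectrum(low_nm: float, high_nm: float) -> bool:
    """First attempt: any band inside 380-750 nm matches, not just the seven colours."""
    return 380 <= low_nm < high_nm <= 750


def is_rainbow_colour(name: str) -> bool:
    """Second attempt: the 'type' only looks the name up in the enumeration."""
    return name in RAINBOW


def is_member_of_x(n: int) -> bool:
    """Same pattern for the random numbers: membership is the only 'property'."""
    return n in RANDOM_NUMBERS


# Two quite different divisions of the spectrum both satisfy the first attempt.
two_bands = [(380, 565), (565, 750)]
ten_bands = [(380 + 37 * i, 380 + 37 * (i + 1)) for i in range(10)]

print(all(is_division_of_visible_spectrum(lo, hi) for lo, hi in two_bands))
print(all(is_division_of_visible_spectrum(lo, hi) for lo, hi in ten_bands))
print(is_rainbow_colour("indigo"), is_member_of_x(37))
```

The first test is too loose to pick out the seven colours; the other two are exact, but only because they carry the enumeration inside them.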
Seeing an object as both set member and type instance
People loosely say a type defines a set, but that is misleading, since it defines only a set member.
The solar system is a set; it has attributes such as total weight, breadth and number of planets.
The description of a planet is a type, which says nothing about the attributes of the whole solar system.
In modelling business systems, the focus is on dynamic sets rather than static sets.
A dynamic set is the ever-changing collection of objects that currently manifest the properties of a type.
Dynamic sets and associated types are two different things.
The members of 1 dynamic set are described by at least 1 type.
1 type describes the members of at least 1 dynamic set.
Murder incidents took place before there was a murder type, described in a law.
Murder convictions started only after there was a murder type, described in a law.
There are two sets, murder incidents and murder convictions, different and unequal in number.
Both sets, or rather their members, instantiate the descriptive murder type, but in different ways.
A murder incident manifests the law as a behaviour: a process execution.
A murder conviction manifests the law as a structure, or in UML terms an artefact, a physical piece of information.
Dynamic set members and instances are two different ways of looking at one thing.
Every member of a dynamic set is at the same time an instance of at least one type.
Every instance of a type is at the same time a member of at least one dynamic set.
A concrete real-world object can be a member of many dynamic sets, and an instance of many types.
Such an object has more properties than any type it instantiates.
E.g. you (being a human) instantiate the type “mammal with heart” and the type “mammal with lungs”.
But you, as a concrete living entity, have many more properties than either type declares.
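The same example can be sketched in Python by treating a type as a set of required properties.
The property names are invented for illustration.

```python
# Each type declares only the properties it requires of an instance.
TYPES = {
    "mammal with heart": frozenset({"mammal", "has_heart"}),
    "mammal with lungs": frozenset({"mammal", "has_lungs"}),
}

# A concrete thing has more properties than any one type declares.
you = frozenset({"mammal", "has_heart", "has_lungs", "speaks", "reads_papers"})


def types_of(thing: frozenset) -> list:
    """Every type whose required properties the thing exhibits."""
    return sorted(sign for sign, required in TYPES.items() if required <= thing)


def undeclared(thing: frozenset, sign: str) -> frozenset:
    """Properties of the thing about which the named type says nothing."""
    return thing - TYPES[sign]


print(types_of(you))
print(sorted(undeclared(you, "mammal with heart")))
```

One thing instantiates both types, and each type leaves most of the thing undescribed.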
The paper highlights ways that set theory relates to enterprise architecture description.
The web site has a summary selection of relevant points.
Our interest is in forming a description that specifies the properties of a complex operational system (or several systems of the same type).
The following suggestions could be made:
· type = specification
· system specification = type
· humans manage a complex system specification by using recursively elaborated specifications
· a sign has no meaning on its own, sign users must know the type.
You may deliberately name a type using a meaningless letter or word such as zonga, or foo (used as meaningless name in computing).
The longer and more often that label is used, the more meaningful it becomes to its users.
Eventually a sign (e.g. rosa, city, billing system) conveys meaning to us without recourse to an intensional definition.
Note: it seems the softer the science, the less strictly an instance must conform to its type.
The web site has other papers on type theory.
“The traditional ways of classifying plants have been based on the visible physical characteristics of the plant.
However, since the discovery of DNA, plant scientists have been trying to classify plants more accurately, and to group them according to the similarities of their DNA.
This has led to major changes in plant classification”
The elements of OO solution space such as modules, packages and classes are visualized as individual sets which frames the static design of an application.
The module M in a solution space can be expressed as a set: M= {p1, p2, p3, p4} where p1, p2, p3, p4 are packages within a module M.
A package P in turn is a set of classes which is mathematically represented as P= {c1, c2, c3} where c1, c2, c3 are classes in P.
And at the dynamic design level, a collection of objects constitutes an object set for a given class C. C= {o1, o2, o3}.
The size of such sets is time dependent; set M size varies as a result of module enhancement and set C size depends on dynamic objects creation.
Again, one purpose of a business information system is to tell us what things fit a type, belong to a dynamic set.
There are various theoretical and practical issues with this.
Bill Kent wrote about databases as models of reality, or rather, our perceptions of reality.
"We are not modeling reality, but the way information about reality is processed, by people."
On tracking the continuity of an object’s identity over time. (E.g. are “applicant” and “member” two states of one object, or two objects?).
“Once a concept is represented in software, what is the effect of the passing of time on it?
At what point is it appropriate to introduce a new representative into the system, because change has transformed something into a new and different thing?
The problem is one of identifying or discovering some essential invariant characteristic of a thing, which gives it its identity.
That invariant characteristic is often hard to identify, or may not exist at all.”
On the type of an object being determined by its use. (E.g. one “person” may appear to us in two roles, as distinct “employee” and “customer” entities).
“The category of a thing (i.e., what it is) might be determined by its position, or environment, or use, rather than by its intrinsic form and composition.
In the set of plastic letters my son plays with, there is an object which might be an "N" or a "Z", depending on how he holds it.
Another one could be a "u" or an "n", and still another might be a "b", "p", "d", or "q".”
On the difficulty of sharing one “real-world” or business model ever more widely.
“In an absolute sense, there is no singular objective reality.
But we can share a common enough view of it for most of our working purposes, so that reality does appear to be objective and stable.
But the chances of achieving such a shared view become poorer when we try to encompass broader purposes, and to involve more people.
This is precisely why the question is becoming more relevant today: the thrust of technology is to foster interaction among greater numbers of people, and to integrate processes into monoliths serving wider and wider purposes.
It is in this environment that discrepancies in fundamental assumptions will become increasingly exposed.”
A query on a dynamic database can generate a static set of elements at the moment the query is made.
A reviewer writes:
“From a mathematical perspective, it should be possible to convert a dynamic set to a static set by the inclusion of a time function:
· Create a static set S from the dynamic set D by creating the elements of S from a snapshot of the elements of D over time.
· i.e S contains an element s1, which is a tuple consisting of t1 (time of snapshot) and the set D1 (which consists of elements of D at time t1).
Now S is static and can be enumerated.
I think countable vs uncountable infinite sets are a red-herring here.
But even if not, it is easy to side-step the issue for my example here by assuming a certain quantum to the unit of time – or else assuming a certain maximum rate of change in elements of D – which would then make the set S countable.
The above means that in theory at least, the statement ‘members of a dynamic set cannot be enumerated’ may be challenged – though think for all practical purposes your point holds.”
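The reviewer's construction can be sketched in Python.
The sketch assumes whole-number time units (the reviewer's "quantum") and invented member names.

```python
def record(history: frozenset, t: int, members: set) -> frozenset:
    """Return a new static set S that also holds the snapshot (t, D at time t)."""
    return history | {(t, frozenset(members))}


D = {"Ann", "Bob"}   # the dynamic set: it changes in place
S = frozenset()      # the static set of time-stamped snapshots

S = record(S, 1, D)

D.add("Cy")
D.discard("Ann")
S = record(S, 2, D)

# Each element of S is a tuple (time of snapshot, elements of D at that time).
for t, members in sorted(S, key=lambda snap: snap[0]):
    print(t, sorted(members))
```

Each call to `record` returns a new static set, so S itself obeys the rule that adding an element makes a different set.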
Footnote: Creative Commons Attribution-No Derivative Works Licence 2.0 24/02/2018 01:47
Attribution: You may copy, distribute and display this copyrighted work only if you clearly credit “Avancier Limited: http://avancier.co.uk” before the start and include this footnote at the end.
No Derivative Works: You may copy, distribute, display only complete and verbatim copies of this page, not derivative works based upon it.
For more information about the licence, see http://creativecommons.org