Language (in social and software systems)

Copyright 2016 Graham Berrisford. One of about 300 papers at http://avancier.website. Last updated 17/10/2017 22:47

 

Enterprise architecture is about business system planning.

It can be seen as applying the principles of general system theory – which we’ll get to later.

First, what theory underpins general system theory?

This paper is one of several on theories of information, communication, language, knowledge and description.

 

Contents         

Preface

Using words to name and typify things

Natural language dictionaries

Formal domain-specific ontologies

Disambiguating the use of terms by actors in a business system

Using categories to define terms

Using predicate statements to define terms

Defining the data structure in a shared memory structure

Defining the data structure in a message

In conclusion

Footnote on ambiguities

 

Preface

We are looking at things from Charles Darwin's viewpoint.

Animal perceptions of reality (which may be called mental models) preceded verbal languages.

Concepts that describe reality were around long before one actor first communicated a concept to another.

 

In humans, it seems likely that many or most concepts emerge in the brain before being verbalised.

In evolutionary terms, oral language was one giant leap for mankind, and documented language was another.

But both are secondary in the sense they are evolutionary steps from earlier forms of description and communication.

 

So, description does not depend on verbal language.

And basing a philosophy of description and reality on verbal language would be misguided.

However, this paper is about the use of language in particular.

 

Since humans acquired verbal language, it seems thinking and language have co-evolved, each reinforcing the other.

We are creatures of language, and language influences how we conceive things.

It isn't just that "the structure of a language affects its speakers' world view or cognition" <https://en.wikipedia.org/wiki/Linguistic_relativity>

I write what I think, I read it back and see it is wrong; it is like being able to listen to oneself talking.

Like a face-to-face discussion, writing helps us to consider, correct and refine what is said, homing in on a better description of reality.

 

Humans can translate mental models into and from verbal language statements.

Humans and their machines can translate descriptions in one form into other descriptive forms.

First, a system architect must be able to conceptualise and form a system description on their own.

After that, communication of descriptions using languages is vital to system architects, since their job involves creating a shared understanding.

 

The Communication paper discussed how social system actors exchange information using messages and memory structures.

This paper discusses the use of words in two contexts:

·         in the (run time) communication acts of a system

·         in the (design time) definition of a system.

 

Business systems have evolved over millennia by formalising the information exchanges of social systems.

Figure: A business system, in which a Customer and a Supplier exchange four messages: Place order, Send invoice, Send payment, Send receipt.

 

Formalisation involves standardising the meanings of words used in messages and memory structures.

Above, “customer” and “supplier” are type names - the names of roles instantiated by many different actors.

“Order”, “invoice”, “payment” and “receipt” are also type names - the names of communication acts instantiated countless times.

 

Disambiguating, standardising and formalising the use of language is important to business system success.

This paper discusses the definition of words used in communications, and data structures contained in messages and memory structures.

It starts at the beginning, with how natural language works.

Using words to name and typify things

An infinity of unknown, unnamed and unclassified realities existed before people observed them.

When people first noticed some wandering stars, they named them Venus, Mars and so on.

We use names not only to identify particular things but also to describe their qualities.

Consider “Venus is a planet”.

·         “Venus” is the name of a particular thing.

·         “Planet” is the name of a generalisation, a type (a concept, quality, property or attribute) used to describe things like Venus and Mars.

 

Notice that to use a descriptive word is to go beyond the immediate thing and typify it.

“In describing a situation, one is not merely registering a [perception],

one is classifying it in some way, and this means going beyond what is immediately given.”

Chapter 5 of “Language, truth and logic” A J Ayer.

 

That’s how language works; we identify particular things and describe them with reference to general types.

We use words to name things and describe them using type names such as “planet”, “dangerous”, “tall”, “blue” and “beautiful”.

We could use other kinds of symbol in place of words, but words come so naturally to us, we can’t get them out of our minds.

 

Don’t run ahead with the idea that all types are named verbally, or defined by humans.

Descriptive types were not invented by man; they were invented by biological evolution.

E.g. Animals recognise and react to a situation that is of the general type we call “dangerous”.

Some animals communicate this to others by sounding an alarm, a signal that symbolises all situations of that type.

 

How do we know an animal recognises a given type?

We ask: Does it repeatedly react to particular things of that type in the same or appropriate way?

If yes, then that type must somehow be encoded/symbolised in the animal’s biochemistry (the “how” is irrelevant).

 

Have you ever engaged a kitten by wiggling a piece of string?

Then you know the type we might call “mouse tail like” is somehow encoded in a cat’s biochemistry.

It may be relevant that Hubel and Wiesel (1959) showed cats’ brains have cells dedicated to detecting movement in slit-shaped spots of light.

 

Honey bees must recognise things of the type we might name “pollen source”.

You may know they also communicate about pollen sources by performing and observing dances.

For a particular pollen source, honey bees communicate particular values for its “distance” and “direction”.

They encode/decode values for the universal types “distance” and “direction” in/from the form of a dance.

The proof that bees “know” the symbols for these two types is that bees do find pollen where it was described.

The dance-form symbols that bees use for these types must correspond to models encoded in their internal biochemistry.

We don’t know, and it doesn’t matter, what form those internal mental models take.

 

Far beyond other animals, humans create and use mental models.

Our ability to do this depends on our uniquely well-developed ability to symbolise types using words.

We say that particular things instantiate (manifest, exemplify, or give values to) universal concepts or qualities.

Shouting “Danger!” informs others that the current situation is an instance of the general type “dangerous”.

Saying “The sky is blue” informs people that the sky exhibits the color widely known as “blue”.

 

We humans evolved to communicate information by using words to name and describe things.

Then, we learnt to record speech, and formalise it, in written records (and later, in digital messages and memories).

Moreover, we play with the very idea of a type; we use words to create new types and change them.

 

The recursive nature of description

There are two varieties of abstraction, which might be called:

·         idealisation from physical matter and energy to description (thing to type)

·         generalisation from description to description (type to type).  

 

Note that a description can be described, and so on.

A type may be defined using types.

A language may be defined using the same language.

 

The two kinds of abstraction processes might be rather similar mental processes.

The two opposing processes are very different from each other, since the latter involves making something:

·         refinement or elaboration from description to description (type to type)

·         concretion from description to physical matter and energy (type to thing).

Natural language dictionaries

We naturally define words in a circular way using other words.

We try to define a word using widely-understood other words.

The convention used in ordinary dictionary definition is “genus plus difference”.

This table contains two examples.

Term | Genus | Difference
Type | Super type | Additional properties/types
Word | More generic concept | More specific characteristics
Star | A body in space | Large, remote, incandescent
Planet | A body in space | Largish, orbits a star

 

Thus, we use words to define words, types to define types.

Though note that the generalisation “A body in space” is probably not in the dictionary.
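
The “genus plus difference” convention can be sketched as a small data structure. This is only an illustration in Python: the class name Definition and its sentence method are assumptions made for the example, not part of any dictionary standard.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Definition:
    """A dictionary entry in 'genus plus difference' form (hypothetical class)."""
    term: str                    # the word being defined
    genus: str                   # the more generic concept
    difference: Tuple[str, ...]  # the more specific characteristics

    def sentence(self) -> str:
        # Compose the definition from words that are themselves defined elsewhere.
        return f"{self.term}: {self.genus} that is {', '.join(self.difference)}."


star = Definition("Star", "a body in space", ("large", "remote", "incandescent"))
planet = Definition("Planet", "a body in space", ("largish", "orbits a star"))
print(star.sentence())
print(planet.sentence())
```

Both entries share one genus, so the difference alone has to tell a star from a planet.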

 

It is in the nature of human discourse to use language in circular, fuzzy and fluid ways.

Dictionaries are designed for ease of use, meaning each entry should be understandable on its own.

They must continually be updated and modified to reflect changes in how words are used.

 

Our brains connect ideas loosely, akin to a natural language dictionary.

We continually work with and around the synonyms and ambiguities of natural language.

But natural language is too ambiguous and inconsistent for business systems that formalise communications, perhaps with millions of customers.

So, when formalising systems we have to be more precise about the meanings of words used in those communications.

Formal domain-specific ontologies

Within a stable domain of knowledge, it may be possible to build a formal hierarchical ontology.

At the top are a few axiomatic words/types.

Second-level words are defined as subtypes that inherit from the axiomatic words.

And so on, the end product being a top-down hierarchical vocabulary of this kind.

 

Thing: a describable element of the universe.

Event: a thing that is a transient behavior, occurs in time.

Entity: a thing that is a persistent structure, locatable in space.

Organism: an entity that is an individual life form.

Plant: an organism that consumes carbon dioxide, produces oxygen.

Rose: a plant that is flowering, woody, thorny.
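
The hierarchy above can be sketched in code. This is a minimal illustration in Python, assuming each term maps to a class and “is a subtype of” maps to class inheritance; the lineage helper is hypothetical.

```python
class Thing:
    """A describable element of the universe."""


class Event(Thing):
    """A thing that is a transient behavior, occurs in time."""


class Entity(Thing):
    """A thing that is a persistent structure, locatable in space."""


class Organism(Entity):
    """An entity that is an individual life form."""


class Plant(Organism):
    """An organism that consumes carbon dioxide, produces oxygen."""


class Rose(Plant):
    """A plant that is flowering, woody, thorny."""


def lineage(term):
    """Return the names of a term and its super types, most specific first."""
    return [cls.__name__ for cls in term.__mro__ if cls is not object]


print(lineage(Rose))
print(issubclass(Rose, Entity), issubclass(Rose, Event))
```

Each class inherits from exactly one super type here; that single-parent assumption is exactly what multiple inheritance calls into question.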

 

One trouble is that multiple inheritance is common, meaning a type inherits properties from several super types.

Another is that the position of a term in the hierarchy can change over time (cf. fragile inheritance trees in OO programming).

Even the top level classifying types may be called into question.

E.g. entities and events may appear to be mutually exclusive, yet sometimes an event is seen as an entity, or vice-versa.

 

Outside of mathematics and the like, language is often too fuzzy and fluid for a deep classification hierarchy.

And it is difficult to impose one hierarchical ontology across several domains of knowledge.

The next section discusses building a semi-formal dictionary of business terms.

Disambiguating terms used by actors in a business system

In formal documents, especially legal ones, it is important that terms are used in a precise, unambiguous, consistent and coherent way.

To this end, terms are defined in some kind of preface or glossary.

Enterprise data/information managers build vocabularies and data dictionaries with the same aim.

 

Semi-formal vocabularies

To begin with, business people may have various (private, internal) mental models of what a business term means.

Terms like “Customer”, “Account” and “Policy” can mean surprisingly different things to different people.

So, how to build a semi-formal dictionary of terms?

One approach might be summarised along these lines:

 

1.      Form a panel of business people.

2.      Define some categories - some axiomatic/base/root terms - say Party, Person, Process, Product, Place, Point in time and Purpose.

3.      Introduce a term (say Customer) and perhaps give it a draft definition.

4.      Ask the panel to vote on the heading that best fits the new term.

5.      Place the word under the category the majority agree it fits best (though it may inherit also from other base terms).

 

At step 4 (if only intuitively) people look for properties of the new term that overlap with properties of the base terms.

Taking Customer as an example: perhaps a minority vote to classify it under Person and the majority vote for Party.
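
Steps 4 and 5 might be sketched as follows. This is an illustration only: the function name place_term and the shape of its result are assumptions. It places the term under the base term with the most votes, and ties go to the base term voted for first, which is an arbitrary choice.

```python
from collections import Counter

# The base terms suggested at step 2.
BASE_TERMS = ("Party", "Person", "Process", "Product", "Place", "Point in time", "Purpose")


def place_term(term, votes):
    """Place a term under the base term that wins the panel's vote (hypothetical helper)."""
    if not votes:
        raise ValueError("The panel cast no votes")
    unknown = sorted(set(votes) - set(BASE_TERMS))
    if unknown:
        raise ValueError(f"Not base terms: {unknown}")
    tally = Counter(votes)
    category, _ = tally.most_common(1)[0]
    # Base terms that lost the vote are kept, since the term may also inherit from them.
    also = sorted(t for t in tally if t != category)
    return {"term": term, "category": category, "also": also}


print(place_term("Customer", ["Person", "Party", "Party"]))
```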

 

The result is a semi-formal dictionary organised under a shallow and informal ontology.

The meaning of a term does not require exploration of a deep hierarchical ontology.

The term’s definition should be understandable on its own, supplemented by reference to its base term heading.

 

Data models and dictionaries

Suppose you build a dictionary of terms used within one coherent business domain.

And the business needs data to help it monitor and direct things described by those terms.

Then, the domain-specific language can be further formalised in the definition of data structures.

 

Data structure definition is discussed later; briefly:

·         The data structure of a shared memory structure can be defined in an entity-attribute-relationship (EAR) model.

·         The data structure of a message can be defined using a regular expression.

 

It should be possible to build a separate EAR model for each coherent business domain.

But it is usually impossible to build an enterprise-wide EAR model.

(Unless one retreats from business-specific language to the level of generalisations not used in practice.)

The difficulty is that different business domains have developed their own domain-specific languages.

Any two domains may use homonyms (one term with two meanings) and synonyms (two terms with the same meaning).

 

The EA concern is data exchanged between business domains and/or aggregated in some way for management information.

To define that, some kind of cross-domain dictionary is needed.

Consideration must then be given to how “name spaces” are distinguished and compound terms are used.

 

Naming terms

Data managers usually develop a naming standard for terms.

It likely dictates how compound terms are formed using separators etc. E.g. Supplier-Purchase-Order-Number.
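
A naming standard of this kind might be sketched as follows. The hyphen separator comes from the example above; the “Domain:Term” form for name spaces, the domain names Sales and Billing, and both function names are assumptions made for illustration.

```python
def compound_term(*words, separator="-"):
    """Join words into a compound term, capitalising each word."""
    return separator.join(word.strip().capitalize() for word in words)


def qualified_term(name_space, term):
    """Prefix a term with its domain name space, so that homonyms in two domains stay distinct."""
    return f"{name_space}:{term}"


print(compound_term("supplier", "purchase", "order", "number"))
print(qualified_term("Sales", "Customer"), qualified_term("Billing", "Customer"))
```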

 

Defining the meanings of terms

There may be a standard to ensure definitions are precise, unambiguous, consistent and coherent.

A definition may be supplemented and formalised by reference to the place of that term in an EAR model.

 

Creating new terms

Business people may work with and around the synonyms and ambiguities of natural language.

But when digitising a business system, to disambiguate business language, new terms are sometimes needed.

 

Data management

Data stewards may be appointed not only to maintain the definitions of terms within designated knowledge domains, but also to manage the confidentiality, integrity and availability of data values held by the organisation.

Using categories to define terms

Type theory leads us to define one term using other terms thus:

Type name: “Actor”. Type definition: a “role taker” and “change maker”.

Notice how a type is defined by listing other types, each being a category that things of the first type belong to.

 

There are many ways to typify or categorise the use of words in the grammar of a language.

Take this particular sentence: The cat sat on the mat last night, in the living room, for a rest.  

The table below maps elements of that sentence to universal types in different schools of thought.

Object of thought | Schools of thought
A sentence | Grammar | Informal analysis | System theory | Predicate logic
The cat | Noun | Who? | Active structure | Subject
sat on | Verb | How? | Process or behavior | Predicate
the mat | Noun | What? | Passive structure | Object
last night | Adverb | When? | Position in time |
in the living room | Adverb | Where? | Position in space |
for a rest. | Adverb | Why? | Purpose or aim |

 

Note that adverbs include phrases that qualify the meaning of a verb, e.g. by expressing its time, place and purpose.

 

Among the various ways to categorise terms, the concepts of general system theory are particularly useful.

This table defines element types generally used to define systems.

Informal analysis | System theory | Definition
Who? | Active structure | An actor (a person, organization or component) that can take a role, respond to events and perform behaviors. It has a current state and relationships to other system elements.
How? | Behavior | A process that inspects or changes the state of one or more structures. It runs from start to end according to some rules (of logic or law).
What? | Passive structure | An object that is acted on; it may be created, moved, changed or destroyed. It has a current state and relationships to other system elements.
When? | Position in time | A date, time or event that triggers an actor to perform a behaviour.
Where? | Position in space | A place where a structure is found or a behaviour occurs.
Why? | Purpose or aim | A motivation for one or more behaviors.

 

Every abstract system description features general types of things.

Every concrete incarnation of a system features particular things.

It is common to use the same word for type and thing, but it is possible to use different words - as in the table below.

 

Universal type | Particular thing | which each
Space-bound structures:
Active structure type, or role | An actor, component or node | take a role, respond to events and perform behaviors.
Passive structure type | An object that is acted on | may be created, moved, changed or destroyed.
Time-bound behaviors:
Event | An occurrence | triggers an actor to perform a behaviour.
Behavior | A performance | inspects or changes the state of one or more structures.

Using predicate statements to define terms

We are far from Vulcans (like Spock) who think purely logically.

Experiments have shown people do not instinctively or readily understand formal logic.

Biological evolution did not create that capability in human beings.

Natural language is loose, uncertain and ambiguous.

 

Nevertheless, it is possible to form natural language sentences that are testable as either true or false.

Predicate logic gives us a simple grammar for forming such statements.

A predicate is a statement that may be true or false depending on the values of its variables.

 

In mathematics, the predicate on X may be expressed as P: X→ {true, false}.
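
As a small illustration of P: X → {true, false}, take X to be the set of rectangles and P to be “is square”. The Rectangle class and the is_square function below are hypothetical names chosen for this sketch.

```python
from typing import NamedTuple


class Rectangle(NamedTuple):
    width: float
    height: float


def is_square(x: Rectangle) -> bool:
    """A predicate on rectangles: true or false depending on the values of x."""
    return x.width == x.height


print(is_square(Rectangle(2, 2)), is_square(Rectangle(2, 3)))
```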

In natural language, a predicate is a verbal phrase, with or without an object, which states something about a subject.

The statement may be expressed in the form of a proposition or definition called a triple.

·         Subject: the particular thing or general type that is described.

·         Predicate: a verb or verbal phrase that either stands alone or relates the subject to the object.

·         Object: a particular thing or general type that is related to the subject, by the predicate.

 

A predicate statement can be used to describe a particular thing on its own or its relationship to another thing.

·         Subject <predicate> object.

·         Mary <was driving>.

·         Mary <was driving> the car.

·         He <stole> the phone Mary bought last week.

·         My mother <is associated with> my father.
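
The statements above can be held as data. This is a minimal sketch, assuming a triple is a record of three text fields with an optional object; the class name Triple and its printed form are illustrative choices.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: Optional[str] = None  # absent when the predicate stands alone

    def __str__(self) -> str:
        parts = [self.subject, f"<{self.predicate}>"]
        if self.object is not None:
            parts.append(self.object)
        return " ".join(parts) + "."


statements = [
    Triple("Mary", "was driving"),
    Triple("Mary", "was driving", "the car"),
    Triple("My mother", "is associated with", "my father"),
]
for statement in statements:
    print(statement)
```

Holding the three parts separately lets a program ask, for example, for every statement that has a given subject.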

 

In the statements below, quotation marks are used to distinguish general types from particular things, thus:

Where the subject/object name = name, that is the name of a particular thing.

Where the subject/object name = “name”, that is the name of a type called “name”.

In spoken language, the distinction rarely troubles us; and in written language it can be distracting.

So these papers mark type names in this way only where it seems helpful.

 

A predicate statement can be used to relate general types to each other.

·         “Subject” <predicate> “object”.

·         “Mother” <is logically associated with> “father”.

·         “Customer” <is logically associated with> “invoice”.

·         “Person” <is a subtype of> “actor”.

·         “Rectangle” <is a subtype of> “quadrilateral”.

·         “Square” <is a subtype of> “rectangle”.

 

A predicate statement can be used to relate a particular thing to a general type.

·         Subject <predicate> “object”.

·         The man on the moped <is a> “thief”.

·         This rectangle <is> “square”.

·         This rectangle <is a> “square”.

·         The current situation <is> “dangerous”.

·         Pluto <is a> “planet”.

 

By the way: How to decide whether any statement listed above is true or false?

You may choose from the following options:

·         test by logical analysis that the subject relates as described to the predicated object

·         test by physical measurement that the subject relates as described to the predicated object

·         devise test cases, run tests, and compare results with predictions

·         ask a judge or jury to examine the subject, and give a verdict.

Defining data structures used in communications

Defining the data structure in a shared memory structure

A shared memory structure can contain data that can be read from any starting point, in any direction.

You can use predicate logic to define the logical data structure in a shared memory structure of that kind.

 

The semantic web

The semantic web is based on using predicate logic to relate web resources.

The idea is that the subject, the predicate (and perhaps the object also) are identified using domain names or URIs.

 

The vision of a world-wide semantic web (implying a stable global vocabulary or dictionary) seems incredible.

“The reality isn't a linked data web of interconnected resources.

More real is a set of linkable data - marked up or stored in some queryable format, selectively findable and accessible via tools”

https://www.informationweek.com/software/information-management/semantic-web-business-going-nowhere-slowly/d/d-id/1113323

 

Entity-attribute-relationship models (aka logical data models)

An EAR model embodies the conventions of predicate logic to describe stored data.

The model is nothing more or less than a collection of predicate statements that relate general types to each other.

·         An entity type <has or relates to> an attribute type or entity type.

·         “Customer” <has> the attribute type “address”.

·         “Customer” <is a subtype of> “business party”.

·         One “customer” <may place> many “orders”.

·         One “order” <is placed by> one and only one “customer”.

 

A predicate statement can also be used to relate a particular thing to a general type:

·         An entity instance <has or relates to> an attribute type or entity type.

·         The customer 999 <has> the attributes “name” and “address”.

·         The customer 999 <pays> “invoices”.

Or to describe a particular entity:

·         An entity instance <has or relates to> an attribute value or entity instance.

·         The customer 999 <has> the address 30 High Street.

·         The customer 999 <is due to pay> invoice 9999.
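
The type-level and instance-level statements above might be sketched as follows. This is an illustration, not a prescribed mapping: the Python dataclasses stand in for entity types, the field names other than “name” and “address” are assumptions, and the customer name and order number are invented values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Customer:
    number: int   # identifies the entity instance, e.g. 999
    name: str
    address: str
    # One "customer" <may place> many "orders".
    orders: List["Order"] = field(default_factory=list, repr=False, compare=False)


@dataclass
class Order:
    number: int
    customer: Customer  # One "order" <is placed by> one and only one "customer".


def place_order(customer: Customer, order_number: int) -> Order:
    """Create an order and record it on both sides of the relationship."""
    order = Order(number=order_number, customer=customer)
    customer.orders.append(order)
    return order


customer_999 = Customer(number=999, name="Example Ltd", address="30 High Street")
first_order = place_order(customer_999, 1)
print(first_order.customer.address, len(customer_999.orders))
```

The class definitions carry the statements about types; the objects created at the end carry the statements about particular things.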

Defining the data structure in a message

A message is a serialised data stream, a sequence of elements, which can be pushed down a communication channel.

You can use the grammar of a “regular expression” to define the logical data structure in a message.

The logic can be limited to three simple constructs – sequence, selection and iteration.

 

Construct | A simple notation | Meaning that the element X
Sequence | Element X [Element A > Element B > Element C] | Contains three elements in that order.
Selection | Element X [Element A OR Element B OR Element C] | Contains one of three optional elements.
Iteration | Element X* | Is repeated zero, one or more times.

 

The following example illustrates the definition of a message with all three constructs:

Message [Single Name OR Long Name [Forename > Middle Name* > Surname]].
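
The example message might be checked with a regular expression along these lines. This is a sketch under stated assumptions: each name is taken to be one capitalised alphabetic word, names are separated by single spaces, and the function name parse and the group names are hypothetical.

```python
import re

NAME = r"[A-Z][a-z]+"  # assumption: a name is one capitalised alphabetic word

MESSAGE = re.compile(
    rf"(?P<single>{NAME})"  # selection, option 1: Single Name
    # selection, option 2: Long Name, a sequence with an iterated Middle Name
    rf"|(?P<forename>{NAME})(?P<middles>(?: {NAME})*) (?P<surname>{NAME})"
)


def parse(message):
    """Return the elements of a valid message, or None if it does not fit the grammar."""
    match = MESSAGE.fullmatch(message)
    if match is None:
        return None
    if match.group("single"):
        return {"single_name": match.group("single")}
    return {
        "forename": match.group("forename"),
        "middle_names": match.group("middles").split(),
        "surname": match.group("surname"),
    }


print(parse("Cher"))
print(parse("Mary Ann Louise Jones"))
```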

In conclusion

Disambiguating, standardising and formalising the use of language is important to business system success.

This paper has discussed the definition of words used in communications, and data structures contained in messages and memory structures.

It started at the beginning, with how natural language works.

Footnote on ambiguities

Remember the convention used in this paper.

Where the subject/object name = name, that is the name of a particular thing.

Where the subject/object name = “name”, that is the name of a type called “name”.

 

The “is a” relationship

The ambiguity of the <is a> relationship is notorious, since it appears in two kinds of proposition.

·         A thing <is an> instance of a type.

·         A type <is a> subtype of another type.

 

Consider: This rectangle <is a> “square”. Here, <is a> means <is an instance of> or <embodies> or <gives values to the properties of> the type named “square”.

Consider: “Square” <is a> “rectangle”. Here, <is a> means <is a kind of> or <is a subtype of> or <shares or extends the properties of> the type named “rectangle”.
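
Some programming languages separate the two meanings. In Python, for example, isinstance tests whether a thing is an instance of a type, and issubclass tests whether a type is a subtype of another. The two classes below are a minimal sketch written for this illustration.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width, self.height = width, height


class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)


this_rectangle = Square(2)

print(isinstance(this_rectangle, Square))  # a thing <is an instance of> a type
print(issubclass(Square, Rectangle))       # a type <is a subtype of> another type
print(isinstance(Square, Rectangle))       # the type itself is not a particular rectangle
```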

 

The “is” relationship

The <is> relationship is also ambiguous since it appears in two kinds of proposition.

·         A Thing <is> Value (meaning – the thing has an attribute with that value, instantiates an attribute type with that value).

·         A Thing <is> Type (meaning – the thing embodies that type).

 

Consider: The sky <is> blue. Here, <is> means the sky has a colour attribute with the value blue.

Consider: This rectangle <is> “square”. Here, <is> means this rectangle instantiates the type named “square”.

 

On states of a finite state automaton as types

Consider a natural thing you know that has four states typified as: egg, caterpillar, pupa and butterfly.

This is a Finite State Automaton (FSA); meaning it has one state at a time, but two or more states over time.

We can define this as one type with four different states.

Or else define it as four different types (so one thing changes from one type to the next).

And if we design an FSA, then modify it during the life of a thing that instantiates it, then that thing must adopt the behavior of the second, different, type.
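
The first option, one type with four states, might be sketched as follows. The names Stage, NEXT_STAGE, Insect and develop are assumptions made for the illustration.

```python
from enum import Enum


class Stage(Enum):
    EGG = "egg"
    CATERPILLAR = "caterpillar"
    PUPA = "pupa"
    BUTTERFLY = "butterfly"


# The permitted state changes; a butterfly has no next state.
NEXT_STAGE = {
    Stage.EGG: Stage.CATERPILLAR,
    Stage.CATERPILLAR: Stage.PUPA,
    Stage.PUPA: Stage.BUTTERFLY,
}


class Insect:
    """One type with four states: one state at a time, several states over time."""

    def __init__(self):
        self.stage = Stage.EGG

    def develop(self):
        if self.stage not in NEXT_STAGE:
            raise ValueError("A butterfly is the final state")
        self.stage = NEXT_STAGE[self.stage]
        return self.stage


insect = Insect()
print([insect.develop().value for _ in range(3)])
```

The second option would replace the Stage values with four classes, so that a change of state becomes a change of type.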

 

 
