Wisdom, Knowledge, Information and Data (WKID)

Copyright 2017 Graham Berrisford. One of about 300 papers at http://avancier.website. Last updated 25/03/2018 16:34

 

A role of enterprise architects is to observe and envisage information systems.

So, you might assume it is universally agreed what "information" is; but this is far from the case.

Is your telephone number data, information, or knowledge? Or all three?

Contents

Preface: What is information? (repeated from system ideas)

Communication

Data

Knowledge

Wisdom

Conclusions

Appendix: Eight other ways people propose to differentiate data from information

 

Preface: What is information? (repeated from system ideas)

“… connected with system theory is… communication.

The general notion in communication theory is that of information.” Bertalanffy

Our main interest is in social and business systems in which animate and/or computer actors exchange information.

 

The Oxford English Dictionary lists more than half a million words.

Consider data, information, knowledge and wisdom; also signal, symbol, description, representation, meaning and model.

Given those ten words, how many clearly distinct concepts are there?

 

You may have come across something called a WKID triangle, pyramid or hierarchy.

This version is compatible with the information and communication theories that follow.

 

Wisdom | The ability to respond effectively to knowledge
Knowledge | Information that is accurate or true enough to be useful
Information | Meaning created or found in a structure by an actor
Data | A structure of matter/energy in which information has been created or found

 

All communication utilises a structure

The medium for information storage or communication is a matter/energy structure of some kind.

To communicate, animals use sound waves (calls), smells, gestures, etc.

Humans use sound waves, written text, flags, etc.

Computers use electronic signals, radio waves, etc.

 

Every structure has information potential

There are infinitely many structures in the matter/energy of the universe.

Some equate structure with information.

Here, we say a structure has information potential to actors.

There is actual information when actors use some information potential to create or obtain a meaning.

 

There is information potential in the variable | There is actual information when
angle of the sun’s rays | a human reads the time from the shadow on a sundial; a sunflower perceives the position of the sun and turns to face it.
nerve impulses (electrical charges) | an actor responds by removing its hand from a hot plate.
bending of a bi-metal strip | a thermostat responds by switching a heater on or off.
movements of a honey bee | honey bees dance to communicate a location of pollen.
open or closed state of an office door | actors share a vocabulary in which an open door means “you have permission to enter”.
lengths of dots and dashes (in sound, light, braille…) | actors use Morse code to communicate.
quantity in a number | an actor says 20 in reply to a request for a fact (say, the speed of a bicycle in miles per hour).

 

Information is meaningful to its sender and/or receiver

Senders encode meanings in data structures, and receivers decode meanings from them.

The meanings include descriptions, directions, decisions and requests for them.

Descriptions are usually divided into facts (tasty, tall, scary) about things (say, food, friends and enemies) that actors perceive as discretely identifiable.

 

Information has at least one sender and/or receiver

A sender (a voice crying in the wilderness) may create information in a data structure that no receiver inspects.

A receiver may find some information in a data structure that was not intentionally sent.

E.g. The sun radiates a flow of light towards a rotating earth.

A sunflower finds a direction to turn its face to optimise its energy consumption.

 

Different actors can find different information in the same data structure

E.g. The sun radiates a flow of light towards a rotating earth.

A sunflower finds a direction to turn its face to optimise its energy consumption.

One man reads the shadow on a sundial as describing the hour of the day.

Another concludes that the sun rotates around the earth; another that the earth spins on its axis.

 

E.g. the data structure in a DNA molecule may be decoded by a biological cell as instructions for making proteins.

And decoded by a human reader of the genome as carrying a gene for some life-shortening condition.

Neither actor can read and act on the data structure as the other does.
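A computing analogy may make this concrete. The sketch below is illustrative only: the two readers and their function names are hypothetical, not taken from the discussion above. It treats one two-byte sequence as the data structure, and lets one reader decode it as text and another decode it as a number.

```python
# One data structure: a fixed sequence of two bytes.
data = bytes([0x48, 0x69])


def read_as_text(structure: bytes) -> str:
    """Hypothetical actor 1 decodes the bytes as ASCII characters."""
    return structure.decode("ascii")


def read_as_number(structure: bytes) -> int:
    """Hypothetical actor 2 decodes the same bytes as a big-endian integer."""
    return int.from_bytes(structure, byteorder="big")


text_meaning = read_as_text(data)      # "Hi"
number_meaning = read_as_number(data)  # 18537
```

The bytes are the same in both cases; the information differs because each reader applies a different language to them.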

 

To communicate requires sharing a data structure and a language

First, the data structure of a message must be preserved (a concern of Shannon’s theory).

Second, creators and users must share a language for encoding and decoding that data structure.

 

Two things can go wrong.

 

First, the data structure is distorted between sender and receiver.

E.g. Speaker says: “Send reinforcements we are going to advance.”

Listener hears: “Send three and four pence we are going to a dance.”

The intended signal is distorted at some point between sender and receiver.

Shannon’s information theory is about preserving the integrity of a data structure.
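As a rough illustration of protecting a data structure against distortion, here is a minimal sketch of a repetition code with majority voting. It is one of the simplest error-correcting schemes and is offered as an assumption-laden example, not as a statement of Shannon's theory; the message and the flipped bit are invented for the purpose.

```python
from collections import Counter


def encode(bits: str, repeats: int = 3) -> str:
    """Repeat every bit so the receiver can out-vote isolated errors."""
    return "".join(bit * repeats for bit in bits)


def decode(received: str, repeats: int = 3) -> str:
    """Recover each original bit by majority vote over its repeated copies."""
    chunks = [received[i:i + repeats] for i in range(0, len(received), repeats)]
    return "".join(Counter(chunk).most_common(1)[0][0] for chunk in chunks)


message = "1011"
sent = encode(message)                 # "111000111111"
# Assumed distortion: the channel flips the second bit of the signal.
distorted = sent[:1] + "0" + sent[2:]  # "101000111111"
recovered = decode(distorted)          # "1011"
```

The scheme only copes with one flipped bit per group of three; heavier distortion would still corrupt the message.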

 

Second, creators and users use a different language to encode and decode a data structure.

Or the ambiguity of natural language disables communication.

E.g. Speaker says: “He fed her cat food.”

Listener 1 hears: He fed her cat – food (He fed a woman’s cat some food).

Listener 2 hears: He fed her - cat food (He fed a woman some food that was intended for cats).

Listener 3 hears: He fed - her cat food (He somehow fed the cat food that a woman owned).
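The second failure can be sketched in code too. In the example below, the sender encodes a word using a small excerpt of Morse code. One receiver decodes with the same codebook; another uses a hypothetical private codebook (invented here for illustration) in which two symbols have swapped meanings. The signal arrives intact in both cases, yet the second receiver finds a different meaning in it.

```python
# A small excerpt of International Morse code, shared by sender and receiver.
MORSE = {"S": "...", "O": "---", "E": ".", "T": "-"}

# Hypothetical private codebook in which two symbols have swapped meanings.
OTHER = {"S": "---", "O": "...", "E": ".", "T": "-"}


def encode(text: str, codebook: dict) -> str:
    """Sender: translate each letter into its symbol, separated by spaces."""
    return " ".join(codebook[letter] for letter in text)


def decode(signal: str, codebook: dict) -> str:
    """Receiver: translate each symbol back into a letter."""
    reverse = {symbol: letter for letter, symbol in codebook.items()}
    return "".join(reverse[symbol] for symbol in signal.split(" "))


signal = encode("SOS", MORSE)           # "... --- ..."
same_language = decode(signal, MORSE)   # "SOS"
other_language = decode(signal, OTHER)  # "OSO"
```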

 

In business systems, the presumption is that things do not go wrong.

Data structures are preserved perfectly.

Senders and receivers apply the same language to writing and reading them, or perfect translations are made.

 

Information is a subjective view of a data structure

The information in a data structure depends on senders and/or receivers and the languages they use.

E.g. I leave my office door open.

Case 1: I do it deliberately, to signal that I am open to visitors; you read the door as saying I am open to visitors, and enter my office.

Case 2: I do it by accident, but am open to visitors anyway; you misread the door as saying I am open to visitors, and enter my office.

Case 3: I do it by accident, but am not open to visitors; you misread the door as saying I am open to visitors, and enter my office.

 

Any meaning created or found in a message or memory structure is information to that actor

An actor can change their mind about the information found in a message.

E.g. I say the swimming pool is warm; you hear and act on that information by diving in.

It turns out the swimming pool is cold, and you now recall the information as a lie.

What a sender considers true, a receiver may consider false, and vice versa.

 

Knowledge is information that is true enough to be useful.

The accuracy or truth of information is a matter of degree.

Knowledge is information that is true enough to be useful (e.g. Newton’s laws of motion).

Sometimes what we say can be tested by measurement of meaning against reality.

But all measurement has a degree of accuracy, and even Newton’s laws of motion are approximations.

 

Read Information for a longer discussion.

Read Knowledge and truth for exploration of that topic.

Communication

The universe may be continuous, but animals decode apparently continuous signals into discrete facts.

How our brains and computers work (at the neuron or electronic level) is not important here.

We make sense of the world by chunking perceptions of the continuous universe into discrete entities (you, me, your lunch, or a bicycle ride).

We describe the world in terms of discrete facts (you are tall, I am old, lunch is tasty, and this bicycle is speeding at 20 mph).

These facts are models or coded representations of things we perceive as discrete entities.

 

Social system members have two mechanisms for sharing the language, rules and state of a social system.

 

·         Communication - by messages sent from senders to receivers

·         Recording - by storage in a shared memory that all actors can access.

 

Social systems range from the informal to the formal.

Business systems evolved from social systems by the formalisation of social communication.

Gradually, the transactions of government and commerce were standardised.

The language and behavioural rules of a business system, once agreed, are stable for a system generation.

The business system is directed according to those rules, until they are changed in a new system generation.

Data

Data | A structure of matter/energy in which information has been created or found. Any feature or part of a signal that is mappable to a language or data model.

 

A structure becomes information when it is mapped to a language.

Honey bees use the language of dance to signal the locations of pollen sources; situation-specific data is found in particular dance movements.

The bees’ language has to be generalised, independent of particular bees, particular dances and particular pollen sources.

 

Analysing an example signal

The sentence below is one particular physical signal, formed of characters and spaces that you can read on your screen.

“Jack Jones has booked seat 35F in coach E on the fast train to Newcastle Central from London Kings Cross leaving at 18.00 hours tomorrow evening.”

You can surely read all the meanings that I encoded in this signal, because you and I share the same general-purpose language.

 

The sentence contains at least seven separable facts, which each associate a thing with a property or another thing.

·         There is a train to Newcastle Central from London Kings Cross.

·         The train leaves at 18.00 tomorrow evening.

·         The train is a fast train.

·         The train has a coach E.

·         Coach E has a seat 35.

·         Seat 35 is forward facing.

·         Jack has booked a train seat.

 

The sentence implies some further facts:

·         London has a station called Kings Cross.

·         Newcastle has a station called Central.
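The seven separable facts listed above can be written down as simple thing/relation/value triples. This is only a sketch: the `Fact` type and the wording of the relations are assumptions made for illustration, and other decompositions are equally possible.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """Hypothetical record that associates a thing with a property or another thing."""
    thing: str
    relation: str
    value: str


facts = [
    Fact("the train", "runs from London Kings Cross to", "Newcastle Central"),
    Fact("the train", "leaves at", "18.00 tomorrow evening"),
    Fact("the train", "is", "fast"),
    Fact("the train", "has coach", "E"),
    Fact("coach E", "has seat", "35"),
    Fact("seat 35", "faces", "forward"),
    Fact("Jack", "has booked", "a train seat"),
]

# Select every fact whose subject is the train.
about_train = [fact for fact in facts if fact.thing == "the train"]
```

Each triple can be read, stored or queried on its own, which is what makes the facts separable.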

 

Given some sentences similar to the one above, you can abstract at least ten domain-specific types.

And if you like to generalise further, you can abstract super types from those types.

Note that you can’t classify all things under one super type only, because there is multiple inheritance. E.g. a Station is both a Place and a Passive object.

 

Particular thing or property | Domain-specific type | More generic type | Even more generic type
Jack Jones | Customer | Person (or Party) | Entity
E | Coach | Passive object | Entity
seat 35 | Seat | Passive object | Entity
Newcastle Central | Station | Place (or Passive object) | Entity
London Kings Cross | Station | Place (or Passive object) | Entity
F (= forward facing) | Seat direction | Property type | Attribute
Fast train | Train journey duration | Property type | Attribute
To/from (= station role in journey) | Arrival/Departure Place | Property type | Attribute
has booked | Booking | Process | Event
leaving at 18.00 tomorrow evening | Departure time and date | Point in time | Event
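The entity rows of this classification, including the multiple inheritance of Station, might be sketched as a class hierarchy. The class names follow the table; the constructor and the `name` attribute are assumptions added so the example runs.

```python
class Entity:
    """Even more generic type: anything perceived as discretely identifiable."""

    def __init__(self, name: str) -> None:
        self.name = name


class Person(Entity):
    """More generic type for Customer."""


class PassiveObject(Entity):
    """More generic type for Coach, Seat and Station."""


class Place(Entity):
    """Alternative more generic type for Station."""


class Customer(Person):
    """Domain-specific type."""


class Seat(PassiveObject):
    """Domain-specific type."""


class Station(Place, PassiveObject):
    """Domain-specific type with two super types (multiple inheritance)."""


jack = Customer("Jack Jones")
kings_cross = Station("London Kings Cross")

is_place = isinstance(kings_cross, Place)                    # True
is_passive_object = isinstance(kings_cross, PassiveObject)   # True
```

A single-inheritance hierarchy would force a choice between Place and Passive object; here Station keeps both classifications.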

 

Data models
The domain-specific language used in a digital information system is commonly defined in what is known as a data model.
A data model is composed of generalised data types (e.g. Customer, City), which are instantiated as data items (e.g. Jack Jones, London) in particular messages and records created and used by actors.

A data model is a language for communication of information between members of a specific business/social system.
This language stands apart from, is independent of, particular actors, particular signals, and particular information.
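A minimal sketch of a data model acting as a shared language follows. The `Booking` type, its field names and the choice of JSON as the message format are all assumptions made for illustration. The point is that the type is defined once, independently of any particular message, and both sender and receiver rely on it.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Booking:
    """Hypothetical generalised data type shared by senders and receivers."""
    customer: str
    coach: str
    seat: str
    origin: str
    destination: str
    departure_time: str


def encode(booking: Booking) -> str:
    """Sender: write a data item as a message, using the shared model."""
    return json.dumps(asdict(booking))


def decode(message: str) -> Booking:
    """Receiver: read the message back into the same shared model."""
    return Booking(**json.loads(message))


sent = Booking("Jack Jones", "E", "35F", "London Kings Cross",
               "Newcastle Central", "18.00")
message = encode(sent)      # the data structure that travels
received = decode(message)  # equal to `sent` because the model is shared
```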

Knowledge

Knowledge | Information that is accurate or true enough to be useful

 

Consider these reasonable uses of the verb “to know”.

1.      You know your son’s name

2.      You know your son likes ice cream.

3.      You know to remove your hand from a hot plate (that knowledge is in your spinal cord).

4.      You know that removing your hand is a reflex action (completed before any signal reaches your brain).

5.      You know you know all the discrete facts above; that is to say, you are self-aware.

 

Knowledge is an ambiguous term

French and German languages have different words for the knowledge of recognition and the knowledge of understanding.

Wisdom

Wisdom | The ability to respond effectively to knowledge

 

Wisdom ought to help actors and/or the society they live in to flourish.

Is wisdom transient or persistent? Can an elephant be wise?

There is evidence that chimpanzees and elephants have considerable self-awareness.

Surely wisdom is the ability to generate directions in new situations by introspection of remembered information?

 

So far, we haven’t needed to discuss actors having conscious aims, having free will or making choices.

However, wisdom implies an ability to be consciously introspective, to recall and analyse remembered information.

Conclusions

This WKID table is compatible with the information and communication theories above.

 

Having | Means
Wisdom | The ability to respond effectively to knowledge
Knowledge | Information that is accurate or true enough to be useful
Information | Meaning created or found in a structure by an actor. Meaning encoded in a signal by a signal creator or decoded from a signal by a signal user.
Data | A structure of matter/energy in which information has been created or found. Any feature or part of a signal that is mappable to a language or data model.

 

The language used to encode/decode signals may be shared by signal senders and receivers, but is independent of them.

 

So, what about the data/information ambiguity in enterprise architecture?

Enterprise architecture is about business roles and processes that create and use information. Or should that be data?

 

Enterprise data architects typically define their domain along these lines.

"Data architecture defines business data in terms of relationships between the following data elements:

·         Data stores and data flows created and used by business activities.

·         Data structures contained in data stores (usually defined in terms of data entities).

·         Data structures contained in data flows (such as messages).

·         Data qualities (meta data) including data types, confidentiality, integrity and availability.

Architects may relate these data elements to business activities and to business applications."

 

However, data architects do also speak of information and information systems.

They use the term "data" variously, sometimes as a synonym for “signal” or for “information”.

And when they speak of data in storage, they are usually thinking of the information, not the raw signals.

 

Suppose we agree data = physical signals, and information = meanings given to signals by actors?

OK, then which is the correct term: data store or information store? Database or information base?

Both are correct, since a database does store signals in a physical form.

And what it stores does represent facts meaningful to actors who read/write those signals.

 

And so, the data/information terminology ambiguity will persist in EA frameworks.

Asking enterprise architects to distinguish data from information in a disciplined way simply doesn’t work.

It is easier to let people use the terms “data” and “information” interchangeably, choosing whichever suits their audience.

Appendix: Eight other ways people propose to differentiate data from information

The proposal above: data is any feature or part of a signal that is mappable to a language or data model.

However, many other distinctions have been drawn between data and information.

This section outlines many and diverse ways in which people think to distinguish data from information.

Some distinctions are human-centric, some are about chunking, and some are about idealisation.

Three human-centric distinctions

Our domain is the use of information in business systems.

But there was information before business, and before humans.

So the first three distinctions seem arrogant, since they elevate humans over machines, humans over animals, and managers over operators.

 

1. Data is computerised: information is human?

Data architects distinguish data at rest (in stores) and data in motion (in flows).

Some say data flows created or used by computers are data, whereas data flows created or used by humans are information.

And thus, some enterprise architecture sources distinguish between data and information architecture.

But is there a sound basis for treating information as the preserve of humans?

Computers mimic social systems in sharing vocabularies, grammars and rules for reading and acting on signals.

The first business computers played roles formerly played by humans; and computers still act as proxies for humans.

 

2. Data is binary digits: information is verbal?

This distinction might be seen as a rephrasing of the computer / human distinction above.

But it is misleading, since information is conveyed in other ways than by words.

And humans invented binary digits long before computers were invented.

Ones and zeros are meaningful information to any humans who manipulate them in binary arithmetic.

 

3. Data is facts used by operators: information is facts aggregated or analysed from operational facts?

This distinction assumes a kind of information stack.

It treats operators as bottom level actors, and managers as the topmost actors.

The distinction is subjective, as can be seen in the fact that one person’s atomic information is another’s summary data.

The total of available stock items can be seen as an atomic fact, or an abstraction from thousands of stock items.

Your current account balance can be seen as an atomic fact, or a total abstracted from thousands of credit and debit transactions.

Every cardinal number (amount of stock available) and value judgement (heavy, good, nearly finished) is both an atomic fact and summary data.
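The account balance example can be sketched in a few lines. The transaction amounts are invented for illustration. To the account holder the balance is a single fact; to whoever computes it, it is a summary of many other facts.

```python
from decimal import Decimal

# Hypothetical credit (+) and debit (-) transactions on one current account.
transactions = [
    Decimal("250.00"),
    Decimal("-40.25"),
    Decimal("-9.75"),
    Decimal("100.00"),
]

# One actor's atomic fact is an aggregation of another actor's facts.
balance = sum(transactions, Decimal("0"))  # Decimal("300.00")
```

Nothing in the number itself marks it as "data" or "information"; the label depends on which actor is reading it.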

 

Information is created and used by actors at different levels of management for different reasons.

“Management reports typically aggregate individual cost transactions into a total for (say) a chemical plant in a factory.

This is of no interest to the junior accountant responsible for successful business transactions, but very useful to the factory accountant.

However, the CEO sees even plant totals as bits of data – needs data aggregated to the whole company and trended over time.

Good BI reports let you drill down from very top to very bottom to provide assurance the high level number is accurate.

And they enable analysis of exceptional behaviour or actors, such as, which plant costs the most to run.”

(See “Big Data and Business Intelligence” on the avancier.website for more from Rick Anderson.)

Three chunking distinctions

These three distinctions are about how signals are divided into facts, or atomic facts are aggregated into complex facts.

 

4. Data is singular: information is plural?

At your local railway station, you read an ordered list of train departure times on a notice board.

At the same time, you hear one train departure time announced over the Tannoy.

Surely, the single item of news you hear gives you more accurate information than the list on the notice board?

 

5. Data is the atoms of (molecular) information?

In reply to the question "How good is Jack's diet?" comes the one word answer "Poor."

That one word sentence conveys one atom of information to the questioner.

The atom of information is not the word, but rather the fact that Jack’s diet is poor.

 

"Jack is short and has a poor diet".

This sentence contains two discrete facts: Jack is short; Jack has a poor diet.

Analysis of many sentences of the same kind may reveal a third fact: Poor diet tends to stunt height.

Words give us a vocabulary for communicating facts about discrete entities and events, in signals.

Surely each discrete fact is information in its own right? To call it data is entirely arbitrary?

 

6. Data is discrete: information is continuous? (Or the reverse)

To send or perceive information, actors need to make or find some distinguishable variety in the flow or structure of a signal.

Some signals are continuously varying (the angle of the light from the sun).

Some signals are chunked into discrete elements (the light flashes in Morse code).

 

Information can be extracted from a continuous signal, such as the time of day on a sundial.

The angle of a bi-metal strip is another continuously-varying variable.

So the information read from these signals can be continuous also.

 

However, actors have evolved to make sense of the world by chunking signals into discrete elements.

Actors that can match new entities and events to a remembered pattern or type have an advantage in life.

Nobody knows how the brain works, but we do know it can store and retrieve discrete facts, e.g. telephone numbers.

And business systems clearly divide information into discrete facts, e.g. names, addresses, and account balances.

They assume people can write, read, remember, consider and act on discrete facts about business entities and events.
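Chunking a continuous signal into a discrete fact can be sketched as follows. Both functions are simplifications under stated assumptions: the first assumes the sun's hour angle is 0 degrees at solar noon and changes by 15 degrees per hour (ignoring the geometry of any real sundial), and the second assumes a thermostat with a single hypothetical setpoint and no hysteresis.

```python
import math


def hour_from_sun_angle(hour_angle_degrees: float) -> int:
    """Chunk a continuous hour angle into a discrete clock hour (0-23).

    Assumes 0 degrees at solar noon and 15 degrees of change per hour.
    """
    hours_after_noon = hour_angle_degrees / 15.0
    return math.floor(12 + hours_after_noon) % 24


def heater_should_run(temperature_c: float, setpoint_c: float = 20.0) -> bool:
    """Chunk a continuous temperature into a discrete on/off decision."""
    return temperature_c < setpoint_c


noon = hour_from_sun_angle(0.0)            # 12
mid_afternoon = hour_from_sun_angle(37.5)  # 14
morning = hour_from_sun_angle(-45.0)       # 9
cold_room = heater_should_run(18.5)        # True
```

Many different angles map to the same hour, so the discrete fact deliberately discards some of the variety in the continuous signal.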

Two idealisation-from-reality distinctions

 

7. Data is objective: information is subjective? (Or the reverse)

Many modern philosophers and scientists take the view that all is subjective; perception is reality.

The matter and energy of the universe is not directly knowable; it is knowable only in a subjective measurement or description of it.

 

“If you accept quantum physics at face value then at least one of two dearly held principles from the classical world must give....

One is realism, the idea that every object [particular] has properties [instances of typical characteristics] that exist without you measuring them."

Anil Ananthaswamy “New Scientist” 13 December 2014

 

“A supposed “reality” that is “outside” of every logical possibility of empirical or logical interaction with “it” can play no direct role in the sciences.

Science can deal only with phenomena, that is to say, only with what can “appear” somehow in experience.

All scientific concepts must somehow be traceable back to phenomenological roots [in the study of subjective experience].”

http://plato.stanford.edu/entries/peirce/index.html

 

If all is subjective, then how can actors use the matter and energy of the universe to communicate?

Sending and receiving actors must share a common vocabulary and grammar for writing/reading the information in signals.

You might reasonably say the meaning of that information is objective within the bounds of the social system the actors are members of.

 

8. Data is physical: information is logical?

This equates data with signals, and information with meaning created or used in signals.

It says data (in its energy or matter form) is physical or concrete, whereas information (meanings) is logical or abstract.

 

Whatever data/information distinction you prefer, you can test it on the exercises below.

Exercises for the unconvinced reader

Do you distinguish the terms “data” and “information” in one of the many ways listed above? Or another way?
Then whatever your chosen data/information distinction, you can test it on these exercises.

 

Exercise 1: How would you reword the definition of enterprise data architecture below?

 

"Data architecture: defines business data in terms of relationships between the following data elements:

·         Data stores and data flows created and used by business activities.

·         Data structures contained in data stores (usually defined in terms of data entities).

·         Data structures contained in data flows (such as messages).

·         Data qualities (meta data) including data types, confidentiality, integrity and availability.

Architects may relate these data elements to business activities and to business applications."

 

Exercise 2: In any sentence below, at any point, does substituting “data” for “information” change the sentence’s meaning?


A human receives information from another human and stores the received information.
A human receives information from a computer and stores the received information.
A computer receives information from a human and stores the received information.
A computer receives information from another computer and stores the received information.

 

Exercise 3: When and where in the story below do you see your friend’s address as data, information, or knowledge?


You ask your new friend for his address.
He brings his address to mind, and tells it to you.
You copy it into your address book.
Before you visit him, you open the address book and use it to find his house on a map.
Then you use the map to find his house.
Next time you visit, you don't need the address book or map, because you remember the way.
But you get into an argument about what data means, and he doesn't invite you back for ten years.
Now, you have forgotten his address and have to use the address book and map again.

 

 

Footnote: Creative Commons Attribution-No Derivative Works Licence 2.0      18/08/2013 11:18
Attribution: You may copy, distribute and display this copyrighted work only if you clearly credit “Avancier Limited: http://avancier.website” before the start and include this footnote at the end.
No Derivative Works: You may copy, distribute, display only complete and verbatim copies of this page, not derivative works based upon it.
For more information about the licence, see http://creativecommons.org