Saturday 12 September 2009

Can a taxonomy be true?

In a previous posting I argued that what distinguishes knowledge from mere opinion is that the arguments and evidence in its favour make it seem more likely than not to be true. It’s not necessary, because it’s generally not possible, to be sure that it is true.

This is fairly clear for the simple propositions considered by philosophers and for scientific theories but I’ve argued, in Types of model, that knowledge includes other things, such as taxonomies. But what does it mean for a taxonomy to be true?

The need for clarity
The first requirement is that the taxonomy is clearly defined. Specifically, the definitions of the categories must ensure that the people using the taxonomy can fairly readily agree on how to categorise a typical thing to which the taxonomy is supposed to apply.
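One rough way to check this requirement is to have two people classify the same items independently and count how often they agree. The sketch below is only an illustration: the function name and the sample labels are my own assumptions, and raw agreement makes no correction for agreement that would occur by chance.

```python
def agreement_rate(labels_a, labels_b):
    """Fraction of items that two classifiers put in the same category."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both raters must label the same items")
    if not labels_a:
        raise ValueError("no items to compare")
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return matches / len(labels_a)


# Hypothetical categories given by two raters to the same five cases.
rater_1 = ["anxiety", "depression", "phobia", "anxiety", "depression"]
rater_2 = ["anxiety", "depression", "anxiety", "anxiety", "obsession"]
rate = agreement_rate(rater_1, rater_2)  # the raters agree on 3 of the 5 cases
```

A taxonomy whose users agree only a little more often than they would by guessing fails the clarity test, however carefully its categories are worded.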

The need for similarity
The second requirement is that things allocated to one category should be more similar to one another than they are to things in other categories. The similarity can apply to any property of interest. If I classify things on the basis of some property of interest to you then you will be able to locate them subsequently. This works for filing systems, library classification systems, classification systems for businesses, etc.

The best taxonomies pass the similarity test for many properties of the things in question and often work because of some underlying property of those things. The Periodic Table of the Elements is an excellent example. Elements in the same group have many similar physical and chemical properties because of similarities in the orbits of their electrons.
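The similarity requirement can be sketched for a single numeric property: the average gap between members of the same category should be smaller than the average gap between members of different categories. The function below is a minimal illustration under that assumption. The figures are approximate first ionisation energies in kJ/mol for three alkali metals and three halogens, and the single-property test is my simplification.

```python
from itertools import combinations
from statistics import mean


def passes_similarity_test(items):
    """items: list of (category, value) pairs for one numeric property."""
    within, between = [], []
    for (cat_a, val_a), (cat_b, val_b) in combinations(items, 2):
        gap = abs(val_a - val_b)
        (within if cat_a == cat_b else between).append(gap)
    if not within or not between:
        raise ValueError("need pairs both within and between categories")
    return mean(within) < mean(between)


# Approximate first ionisation energies (kJ/mol), grouped as in the Periodic Table.
elements = [
    ("alkali metal", 520), ("alkali metal", 496), ("alkali metal", 419),  # Li, Na, K
    ("halogen", 1681), ("halogen", 1251), ("halogen", 1140),              # F, Cl, Br
]
periodic_grouping_ok = passes_similarity_test(elements)
```

A good taxonomy would pass the same test for many properties, not just one.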

Testing for falsifiability
As Karl Popper showed, it is generally impossible, outside logic and maths, to prove a claim true but often possible, and generally more useful, to try to prove it false. So can a taxonomy be proved false?

Indeed it can.

Consider the system of diagnoses used by psychiatrists. There are some differences between psychiatrists but most divide mental illnesses into neuroses and psychoses:
  • Neuroses are further divided into anxiety, depression, obsession and various phobias.
  • Psychoses are further divided into schizophrenia, bipolar disorder, paranoia, etc.
Richard Bentall, a professor of clinical psychology, has examined the evidence for this diagnostic approach in Madness Explained. He's shown that neither of the two requirements above is met. Psychiatrists do not agree on diagnoses, so the clarity test fails, and diagnoses are poor guides to treatment or outcomes, so the similarity test fails. Research also shows that doctors can get better guidance on treatment by considering symptoms alone and ignoring diagnostic labels.

These findings, which are not particularly new, have profound implications for psychiatry. It has been going down a blind alley for over 100 years!

It’s also significant for my model of knowledge since it shows that a taxonomy can be seen as an hypothesis that meets Popper’s test of falsifiability. So a taxonomy is a form of knowledge.

Tuesday 14 July 2009

Creating the message

In my last post I showed how an extended semiotic framework could resolve the old data/information chestnut. I showed that during decoding the recipient must use several kinds of knowledge, eg of grammar and coding, that were also used by the sender. Now I want to look at the process of composition. In this, as with decoding, the levels are applied successively but from top to bottom. At each level the sender makes one or more choices on the basis of what he believes that the recipient also knows. Specific steps, technical or otherwise, may be needed to ensure that the recipient does know it.
The starting point is always an intention. This may be to transfer information but is often to produce action, eg giving a command or placing an order. In either case the sender needs to know what the intended recipient already knows about the topic.

For instance, if I wish to explain collateralised debt obligations to you I need to know how much you already know about markets and derivatives. Similarly, if I wish to order a garden hammock from you I should first establish that you sell such hammocks. At the pragmatic level this provides the shared context within which communication will occur.

(Human beings routinely negotiate these issues but IT systems lack this flexibility. Therefore, where the sender and recipient are IT applications, such as an ERP or online order taking system, these matters must, unless relevant standards already exist, be negotiated between the parties. The pragmatic context may include agreements on the legal significance of messages, the speed with which orders should be honoured, etc. At least some of the context may be stated in commercial terms and conditions, in contracts or in national law.)

From the context and his intent the sender must decide what information he needs to transfer. In the general case of inter-personal communication, including negotiating, teaching, etc., this is, in fact, the most difficult step. In IT systems it’s often strongly constrained by obvious needs, eg to specify the product being ordered, and constraints, eg everything has to be typed by a telesales agent.

This information is now passed to the semantic level where the sender selects the natural language and/or coding system to be used. The natural language should, obviously, be one that the recipient understands and this also applies to any technical terms used. In the case of coding systems there is often an obvious choice, eg the Gregorian calendar for dates, WGS 84 for latitude and longitude, but other systems remain in use so it may be necessary to specify the system in use. The choice of units of measurement is also part of the semantic level.

(Where the sender and recipient are IT applications these decisions may be taken by the designer of one or the other or negotiated between them. They will not generally be taken by any individual user. They may be documented in a data dictionary but as natural language text as there are no commonly used notations for expressing semantic choices.)

At syntactic level the sender will apply his knowledge of grammar to create grammatically correct sentences in the chosen natural language. The grammar generally follows from the choice of natural language.

At lexical level the words are converted into letters and punctuation marks. In most cases the lexical rules follow from the choice of natural language but there are a few languages, eg Serbo-Croat, that are written in more than one script.

At coding level the symbols are converted into bytes (this is almost always automatic).

Finally, the bytes are inserted into the fields in a pre-agreed structure (the format level).

(Where the sender and recipient are IT applications the format may be decided by the designer of one or the other or negotiated between them. There are several commonly used notations for defining such structures. It may be documented in a data dictionary and also stored in a database schema.)

We now have a sequence of bytes that can be sent electronically as a message with confidence that the recipient will be able to recover the intended meaning.

Tuesday 2 June 2009

Information IS data in context

There has been a lot of discussion about the meaning of data, information, knowledge, wisdom and metadata. Though often entertaining this discussion usually generates more heat than light. Here I will use an expanded version of the semiotic framework to resolve the confusion. I find that information IS "data in context" and data IS an encoded form of information. However, the context has several layers, each of which contributes to the encoding of information as data. Much of the confusion is due to ignoring the layered nature of the context.

I will consider the steps needed to fully understand a message (or stored record). I’ll assume that the message is digitally encoded – though this is almost irrelevant to the analysis. Initially, let’s assume that the message is English text.

a) I receive a message comprising a string of bytes.

b) I can convert this into printed, or displayed, characters. But to do this properly I must know which character coding system, eg EBCDIC, was used to create the message.

c) Next I use my lexical knowledge to recognise words and punctuation marks.

d) And then my syntactic knowledge to parse it. I now see the text as a structure, specifically a sentence, with a subject, clauses, etc.

e) Adding my semantic knowledge (of what the words mean) gives me the meaning of the sentence.

f) Finally I relate this meaning to other relevant knowledge (the context for the sentence) and see the significance of the message.

At each step the decoder, whether human or electronic, must apply information it already knows. This information may be called metadata. Here’s a summary:

Semiotic level | What is added by the receiver       | The result
---------------+-------------------------------------+--------------------------------
6 Pragmatic    | Contextual knowledge                | Significance of the information
5 Semantic     | Meanings of words (from dictionary) | Meaning of the sentence
4 Syntax       | English grammar                     | Parsed sentence
3 Lexical      | Lexical rules                       | Words
2 Coding       | Character coding                    | Characters
Input          | n/a                                 | Bytes
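The two lowest decoding steps can be sketched in a few lines. This is only an illustration: the function names and the sample sentence are my own, and cp500 is one of the EBCDIC code pages that Python ships with. It shows that the same bytes yield different characters if the receiver assumes the wrong character code.

```python
import re


def decode_characters(raw, encoding):
    """Coding level: bytes -> characters, using the agreed character code."""
    return raw.decode(encoding)


def split_words(text):
    """Lexical level: characters -> words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


# A message created by a sender who used an EBCDIC code page.
message = "The mast is tall.".encode("cp500")

characters = decode_characters(message, "cp500")  # receiver knows the right code
tokens = split_words(characters)                   # words plus the full stop
garbled = decode_characters(message, "latin-1")    # receiver assumes the wrong code
```

The syntactic, semantic and pragmatic levels need far more knowledge than a sketch can hold, which is rather the point of the table.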

Now let’s generalise.

Numbers

Suppose the message consists of several distinct fields, some textual and some numerical. Then we’ll need to divide the message into fields before we apply the character code for text since numbers may use non-text coding. This will be level 1. And we’ll still need level 2 to turn the bytes into numbers.
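As a sketch of levels 1 and 2 for such a message, suppose sender and receiver have agreed a record of one 10-byte text field followed by one 8-byte floating point number. That layout, the field contents and the function name are assumptions made for illustration only.

```python
import struct

# Hypothetical pre-agreed record structure: 10-byte text field, then an 8-byte
# big-endian floating point number.
RECORD_FORMAT = ">10sd"


def read_record(raw):
    # Level 1: divide the message into fields. For the numeric field the
    # format code also performs level 2, turning its bytes into a number.
    name_bytes, value = struct.unpack(RECORD_FORMAT, raw)
    # Level 2 for the text field: apply the character code.
    return name_bytes.decode("ascii").rstrip(), value


raw = struct.pack(RECORD_FORMAT, b"mast".ljust(10), 17.63)
name, height = read_record(raw)
```

Without knowledge of the record structure the receiver has 18 bytes and no way of telling where the text ends and the number begins.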

There’s no lexical or syntactic level for numbers but we will need to know the semantics. Many numbers are measurements or predictions of measurements and for these we need to know what is measured and the units. We may even need to know how the measurement was made and by whom.

Other numbers are ratings or rankings and this also needs to be known.

Some of this may, of course, be given explicitly by other data items. Taken together this information allows us to convert the number into a sentence of known meaning, eg

  • 17.63 => The height of the mast is 17.63 m.
  • 623 => This MP’s expenses were the 623rd largest.

Finally, at the pragmatic level, we add contextual information to see the significance of the information, eg

  • The boat is too tall to pass under the bridge.
  • This MP is probably honest.
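The mast example can be sketched as follows. The semantic step attaches the quantity and unit to the bare number. The pragmatic step compares it with something the receiver already knows, here a bridge clearance of 15.0 m, which is a figure I have invented for the illustration.

```python
def to_sentence(value, quantity, unit):
    """Semantic level: a bare number plus its meaning gives a sentence."""
    return f"The {quantity} is {value} {unit}."


def significance(mast_height_m, bridge_clearance_m):
    """Pragmatic level: relate the meaning to contextual knowledge."""
    if mast_height_m > bridge_clearance_m:
        return "The boat is too tall to pass under the bridge."
    return "The boat can pass under the bridge."


sentence = to_sentence(17.63, "height of the mast", "m")
verdict = significance(17.63, bridge_clearance_m=15.0)  # clearance is hypothetical
```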

Image

Now suppose that we have fields containing image data. Processing is similar to numbers. The image coding, eg JPEG, is used at level 2 and there is no lexical or syntactic processing. To get the meaning of the image (semantic level) we need to know what sort of image it is, eg an aerial photograph or an X-ray image, its scale and perhaps other details about the equipment and settings used.

Finally, as before, the pragmatic level adds contextual information to let us see the significance of the image.

Composite model

We can put all this together in a composite model as shown in the table. If necessary the Images column can be generalised to cover the result of any sensing system, eg video, radar images, seismograph output.


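Level              | Text                                     | Numbers                          | Images
-------------------+------------------------------------------+----------------------------------+--------------------------------------------------
6 Pragmatic        | Add contextual knowledge to get significance (all three columns).
5 Semantic         | Add dictionary knowledge to get meaning. | As for text, and units.          | As for text, and scale, part of spectrum sensed.
4 Syntax           | Add NL grammar to get a parsed sentence. | n/a                              | n/a
3 Lexical          | Add lexical rules to get words.          | n/a                              | n/a
2 Coding           | Add character coding to get characters.  | Add format rules to get numbers. | Add format rules to get 2D image.
1 Record structure | Add record structure knowledge to divide message into fields (all three columns).
Input              | Bytes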

I started by noting the muddle about the words data, information, etc. In fact these words are often used interchangeably and in different ways by people with different backgrounds and interests. IT people, however, should say ‘data’ when we want to discuss bits and bytes and their decoding and processing at lexical and syntactic levels. We should say ‘information’ when discussing the semantics and significance of the data.

Tuesday 12 February 2008

Knowledge and pseudo-knowledge

It’s often easy to distinguish prejudice from informed opinion. Prejudice is an opinion ‘without visible means of support’ whilst the holder of an informed opinion can back it with arguments and examples. The holder may be part of a professional community with its own journals, models, prizes, etc.

But is every subject with these features a form of knowledge? It turns out that quite a lot of them are not.

One relatively easy way to determine whether a subject is really knowledge is to ask whether the people practising it can make valid predictions. By valid I mean that their predictions are correct significantly more than half the time and much more often than the predictions made by non-experts.
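This test can be written down as a small check on two track records. The sketch below is only illustrative: the margin of 0.1 standing in for "significantly" and "much more often" is an arbitrary assumption, and a serious version would use a proper statistical test on a large sample.

```python
def hit_rate(outcomes):
    """Fraction of predictions that turned out correct (True)."""
    return sum(outcomes) / len(outcomes)


def makes_valid_predictions(expert_outcomes, lay_outcomes, margin=0.1):
    """Experts must beat both a coin toss and non-experts by the given margin."""
    expert = hit_rate(expert_outcomes)
    lay = hit_rate(lay_outcomes)
    return expert > 0.5 + margin and expert > lay + margin


# Made-up track records: True means the prediction was correct.
experts = [True] * 8 + [False] * 2      # right 80% of the time
laymen = [True] * 5 + [False] * 5       # right 50% of the time
is_valid = makes_valid_predictions(experts, laymen)
```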

[Some subjects, eg theology, media studies, contain many theories that make no predictions. I have discussed three of these separately.]

Pseudo-subjects are rare, possibly unknown, in science and engineering because these subjects have institutionalised methods for testing their theories. They are much commoner in medicine largely because of the power of the placebo effect. For instance, Homeopathy makes predictions about the effects of its treatments. When carefully tested to exclude placebo effects, eg in double-blind clinical trials, these predictions are generally wrong.

They are commoner still in the subjects that study human activity, eg Politics, Economics and the humanities.

In The Black Swan: The Impact of the Highly Improbable (Allen Lane, 2007) Nassim Taleb looks at the evidence on the accuracy of predictions made in security analysis, political science and economics. In each case he finds there are few studies but that those that exist show the predictive power of these fields to be poor.

Security analysis

Securities are tradeable financial instruments such as shares, options, collateralised loan obligations. The job of the securities analyst is to advise investors which to buy and which to sell.

T Tyszka and P Zielonka (Expert Judgements: Financial analysts versus weather forecasters. J. of Psychology and Financial Markets, 2002, vol 3(3), p 152-160) found that, compared to weather forecasters, security analysts are worse at predicting but have more faith in their predictions. An analysis communicated by Jean-Philippe Bouchaud showed that predictions in this area are on average no better than assuming this period will be like the last period, ie worthless.

This is despite the analysts’ extensive knowledge of the firms and sectors they study!
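The comparison with "this period will be like the last period" is easy to make concrete. In the sketch below the series and the analyst's forecasts are invented purely to show the arithmetic. They are not taken from any of the studies cited.

```python
def mean_absolute_error(forecasts, actuals):
    """Average size of the forecasting error, ignoring its sign."""
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


def naive_forecasts(history):
    """Forecast each period as equal to the one before it."""
    return history[:-1]


actuals = [100, 104, 101, 107, 110]   # made-up values for periods 1 to 5
analyst = [109, 97, 110, 104]          # made-up forecasts for periods 2 to 5

naive_error = mean_absolute_error(naive_forecasts(actuals), actuals[1:])
analyst_error = mean_absolute_error(analyst, actuals[1:])
analyst_adds_value = analyst_error < naive_error
```

An expert whose error is no smaller than that of the naive rule has added nothing, whatever the quality of the accompanying analysis.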

Political science

Philip Tetlock (Expert Political Judgment: How Good Is It? How Can We Know?, Princeton University Press, 2005) asked about 300 supposed experts to judge the likelihood of certain events within the next five years. He collected c. 27,000 predictions. He found:

· Experts greatly overestimated their accuracy

· Professors were no better than graduates.

· People with strong reputations were worse predictors than others.

Economics

Taleb found no systematic study of the accuracy of economists’ predictions. Such evidence as exists suggests that they are just slightly better than random. For instance Makridakis and Hibon (The M3-Competition Results, Int. J. Forecasting, 2000, vol 16, p 451-476) ran competitions comparing more and less sophisticated forecasting methods. They found that sophisticated methods, like econometrics, were no better than very simple ones.

Common threads

In each case:

· The experts know a lot more facts than the layman.

· They also know more theories about their fields.

· There is little published research about the accuracy of these theories.

· Supposed experts are not familiar with what little there is.

· Predictions cluster – experts copy each other.

· Predictions do not generally cluster around the true value.

These fields may have value as sources of analogies and models. Sometimes even remote analogies and very simple models can be helpful.

However, their predictions are generally worthless.

Theories that make no predictions

It’s interesting to look at three subjects (ie sets of facts and theories) that don’t seem to make predictions – and certainly give little thought to either predictions or testing. They are Theology, critical theory and business studies.

Analysis shows surprising parallels between them.

Theology

Theology does not seem to make predictions. Most theology is aligned with a single major religion, or even a denomination or sect, though some ideas are shared. Theology has existed as a discipline for at least 1,500 years and has shown some evolution. However it doesn’t seem to show much progress.

Theology has three fundamental problems:

· There’s doubt as to whether its subject matter – God – actually exists. Even saints and eminent churchmen have expressed doubts on this matter (though not usually in public).

· There’s no agreement as to how theological ideas are to be tested. Ideas may be tested against holy books (but which?), or tradition (which?) or personal revelation (whose?). Without this agreement progress is probably impossible.

· The views of scholars can be overruled by the official utterances of religious leaders.

Theology is thus the projection of religious feelings into the realm of reason and morals. There seems no reason to expect that such feelings will be good guides to the nature of reality and thus it should not be judged by its ability to produce predictions.

How it should be judged is another matter – and not one to which I have an answer. However, unless theologians can produce an answer it seems difficult to see why anyone should pay it any attention.

Critical theory

Critical theory is a collection of theories used by scholars in the humanities. These theories include Marxism, psycho-analysis and post-structuralism. An introduction to critical theory will mention several dozen theories. It’s immediately clear to an interested outsider that this isn’t at root a quest for truth in the sense that the sciences are. For while science has many controversies – and occasional feuds – these eventually get resolved with the successful theories being consolidated into the larger body of science. This hardly happens in critical theory – though some theories have become unfashionable.

If critical theory is not a quest for truth what is it? Critical theory is, I believe, a form of politics. Its theories reflect, often explicitly, political movements in society. They are the projection of those movements and issues into the analysis of cultural products such as books, films and clothes. Now to the degree that this analysis is right it should not be judged by its ability to produce predictions but by its ability to produce social change. That’s not a judgement that I shall attempt here.

Business studies

I use the term business studies to cover research and analysis published by business schools and management consultancies. There is a great deal of such material and the quality varies greatly.

· Some is rigorously empirical. It treats businesses as phenomena whose behaviour can be studied. This can lead to predictions. The results of testing such predictions are reported VERY occasionally.

· Some of this material is usefully analytical. That is, it dissects a significant business problem in order to identify threats, opportunities and constraints. It can be useful to managers without making predictions.

· Much, however, is weak. It takes a small number of examples – often selected on no clear basis – and draws conclusions that seem largely subjective.

As with theology and critical theory there’s plenty of change but little clear progress. There’s little evidence of an accumulation of agreed facts and theories and little referencing of the work of others. Often the same ideas appear at different times under different names.

Furthermore, names are often used by consultancies and research houses as sub-brands and there are real material rewards for consultants and analysts who create names that are adopted by the market. Thus the published material often reflects the competition between business schools, for students, and between consultancies, for clients. (Such competition is not absent in theology and critical theory but it is more marked here.)

If business studies is judged by the volume of valid predictions it performs poorly but many analysts and consultants would say that that is not its main purpose. Its purpose, they would say, is to recommend actions. How, then, should such recommendations be judged?

It can be done: medicine has the same purpose. Recommendations should be judged by their outcomes. This, however, is typically very difficult. The number of companies adopting any given recommendation may be small, and they may be taking other initiatives in parallel. And they may well treat all their initiatives as confidential.

Friday 18 January 2008

Why knowledge is NOT “justified true belief”

In Theaetetus, Plato’s Socrates argues that knowledge is “justified true belief”. That is, it is a belief for which the believer has a justification and that is, in fact, true (see en.wikipedia.org/wiki/Epistemology). This doesn’t really work.

Firstly, defining knowledge as belief excludes tacit knowledge, such as knowing how to ride a bicycle. My ability to ride a bicycle does not depend upon my having any particular beliefs about riding or, indeed, bicycles. So Socrates’ definition can only apply to explicit knowledge.

To see why it doesn’t even work for explicit knowledge consider the following truth table for beliefs.

Case                               | 1 | 2 | 3 | 4
-----------------------------------+---+---+---+---
Is the belief justified?           | Y | Y | N | N
Is the belief true?                | Y | N | Y | N
Does it meet Socrates’ definition? | Y | N | N | N

Justification: I agree with Socrates in wanting to distinguish knowledge from mere guesswork. To count as knowledge a belief needs to be justified. Suppose you believe that Hungary is an attractive market for your latest product. This is a justified belief if you can produce evidence for it. I would accept market research and the success of similar products as sufficient evidence so if you have that data I’ll accept your belief as justified. I’m prepared to agree that a belief is justified even if the evidence is not conclusive, as it won’t be in this case until you’ve tried to sell it in Hungary.

Truth: The problem comes in distinguishing cases 1 and 2 in the table. Is your belief about the Hungarian market true? The only way that I can know this is to wait for the results of your Hungarian launch. And if you never launch your product in Hungary then neither of us will ever know if your belief was true – and therefore knowledge. This is inconvenient but not absurd.

How about the market research? Is that knowledge, ie is the data you have accurate and appropriate? You believe so and you have reasons for doing so, eg that you used a reputable firm with experience of the Hungarian market. You certainly have a justified belief. But can you know that it’s true? You cannot be certain, so this, also, does not meet Socrates’ definition of knowledge.

Not truth but confidence

The reality here is that beliefs can be held with varying degrees of confidence. We can rarely be certain but are often, justifiably, confident. And we determine the proper degree of confidence using just those arguments that constitute the justification for the belief.

So knowledge, I suggest, is those beliefs where the arguments and evidence in favour make them seem more likely than not. Beliefs where the arguments and evidence in favour fall short of that are rumours, opinions or prejudices.

This view has two immediate consequences:

  • In applying knowledge we should consider how sure we are of its truth. We should not claim certainty where the evidence is doubtful.
  • Systems that store knowledge should include assessments of reliability and/or pointers to the sources of the supposed knowledge.
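A store of this kind might hold each belief with an estimate of its reliability and pointers to its sources. The sketch below is a minimal illustration. The class, its field names and the confidence figures are all my own assumptions. The 0.5 threshold simply encodes "more likely than not".

```python
from dataclasses import dataclass, field


@dataclass
class Belief:
    statement: str
    confidence: float                              # estimated probability of being true
    sources: list = field(default_factory=list)    # pointers to the supporting evidence

    def is_knowledge(self):
        """Knowledge: the evidence makes the belief more likely than not."""
        return self.confidence > 0.5


# Hypothetical entries with invented confidence values.
hungary = Belief(
    "Hungary is an attractive market for the product",
    confidence=0.7,
    sources=["market research report", "sales of similar products"],
)
rumour = Belief("A competitor is about to withdraw from Hungary", confidence=0.3)
```

Anyone applying the first belief can see both how far to trust it and where to look if they want to check.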

Thursday 17 January 2008

Now ‘models’ are replacing ‘views’

Think about accounts. All businesses have to produce financial accounts for their auditors, shareholders and regulators. Back in the bad old days of BC – Before Computers – those were often the only accounts produced. Now, of course, almost every business also produces a variety of routine management accounts plus ad hoc analyses as needed.

These are possible both because we have computers to do the work and because we keep lots of financial data in databases. This data constitutes a model of the business (see my post on Types of Model). To the degree that it’s a good model all the required accounts can be derived from it. The accounts are views of the model and there are an unlimited number of valid views.
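The relationship between a model and its views can be sketched with a handful of transactions. Everything here is invented for illustration: the records, the field names and the function. The point is only that two quite different accounts fall out of one body of data.

```python
from collections import defaultdict

# The model: a (tiny, made-up) table of transactions.
transactions = [
    {"month": "2008-01", "account": "sales", "amount": 1200.0},
    {"month": "2008-01", "account": "rent", "amount": -400.0},
    {"month": "2008-02", "account": "sales", "amount": 900.0},
    {"month": "2008-02", "account": "rent", "amount": -400.0},
]


def view_by(key, rows):
    """Derive a view: total the amounts, grouped by the chosen field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


by_account = view_by("account", transactions)   # one view: totals per account
by_month = view_by("month", transactions)       # another view: net result per month
```

Because both views are derived from the same data they cannot fall out of step with one another.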

This change from creating a few predefined views to creating a model is not restricted to accounting. In fact it’s pretty general.

The trend to model building
In the past when people wanted to communicate a design or understand a thing or process they created a view of the thing or process. These views included maps, accounts and blueprints and required special materials, tools and skills. Usually sets of these were needed to define a territory, business or design and it was difficult to keep them in synch. Each kind of view was defined by a list of allowed elements or features (a meta-model); other elements and features being either ignored or indicated by annotations.

Now an organization is increasingly likely to build a digital model of the thing of interest from which it can derive any number of views. The model is also defined by a list of allowed elements or features but a longer list than for any view. From the model we can produce both familiar and novel views and there is no synchronization problem.

Examples include:

  • Maps: Many maps are possible for any territory. For instance they may show or omit roads, railways, and contours. Nautical charts show almost nothing on land but a great deal about the sea. Ordnance Survey has digitised its map data and derives actual maps from this resource. Many companies now have Geographical Information Systems that allow them to combine their own data with that available publicly.
  • Accounts: Databases of assets and transactions support many kinds of financial and management accounts.
  • Engineering design: Traditionally engineers produced plans, front and side elevations and cross-sections. CAD models can yield both blueprints and lists of parts and jobs.
  • Building design: Construct IT at Salford Univ. has proposed that building projects should be based on a shared database that fully defines the building.

Being digital these models support many kinds of analysis and processing that were either impossible or very expensive when only views were available, eg calculations of load, simulation of performance or experience.

  • Civil engineers can show what their constructions will look like when complete.
  • Aeronautical engineers can simulate airflow and thus calculate performance and fuel efficiency.

Back to accounting

However, most accounting ‘models’ are not good enough to simulate the consequences of changes in processes or trading conditions. Some organizations have built good enough models but not, generally, as part of their accounts.