A Grammar of Questions.

Quistic grammar is based on what I believe to be the biological basis of language in the brain: a set of models of our internal and external environment, specifically including the social environment, with the models being labelled with tokens for purposes of logical manipulation of ideas.

... it is my contention that the development of language is a direct consequence of our brain’s data organization function which gives rise to a data framework, or belief system.
Gerhard Adam

Language and the thirst for knowledge.

A language universal is a statement that is true for all known human languages. An absolute universal is a statement of the form: all x have y, and an implicational universal is a statement of the form: IF x is present THEN y is present. For example, since we all have bodies, it is no surprise that all human languages have names for body parts. However, not all languages divide the body into parts in the same way: a language may have words for 'hand' and 'arm', but lack a word for 'wrist'. The naming of body parts is a language universal, but the precise identification of parts is not.

If there is one true semantic universal in language it is this: all human languages have a means for formulating questions.   I am referring to the semantic aspect of questions, rather than the syntactic aspect - the meaning or intent of a query rather than its particular grammatical form.

Children notoriously ask a lot of questions once they have a grasp of language. Perhaps language has evolved from having a primary function of command to having a primary function of conveying information. In terms of the way information is conveyed - the semantic component of language - any instance of language use can be analysed as if it were designed to answer an unasked question. There appears to be a social aspect of knowledge sharing in language - we tell each other things on the unspoken assumption of a desire to know, a need, a query.

Grammars for computational linguistics.

There are three broad categories of grammar. Prescriptive, or normative, grammar is derived from the writings of prominent authors and attempts to define best practice for the users of the relevant language. Descriptive grammar is more scientific - it attempts to describe how ordinary people actually use a language. It is a strange fact that the language of a small, remote community with no writing is often more accurately recorded than the spoken language of users of the world's major languages. This comes from a historic over-reliance on literary language as a resource for the study of the grammars of major languages.

The third category is the class of 'scientific grammars': sets of rules for the formulation of sentences. The term 'scientific grammar' is my own. The science aspect comes from the use of mathematical procedures to model aspects of language. Most commonly these grammars have been called structural, generative or transformational grammars.

In general, such grammars are based heavily on syntax. Especially since the publication of Noam Chomsky's Syntactic Structures in 1957, most grammar-based approaches to language have been almost entirely syntactic. Indeed, for many researchers, the terms 'grammar' and 'syntax' seem to have been used synonymously. Although this approach has been quite productive and instructive, it has limitations. In computational linguistics it seems that we must keep adding new rules of grammar to cater for the exceptions which keep cropping up in natural speech. Another problem is that these grammars treat the analysis and the synthesis of sentences as two distinct operations.

Natural Language Processing (NLP) covers the use of computers to analyse human language, with two major objectives: the understanding of how language works, and the building of a machine or computer program that can interact with humans using plain, everyday human language.

The central problem for NLP is the transformation of potentially ambiguous utterances into an unambiguous internal representation of data structure the computer can work with to produce the desired output.
Klaus K. Obermeier, Natural Language Processing Technologies in Artificial Intelligence, 1989

The problem with syntax-based grammars is that they do not, indeed cannot, represent semantic data structures. Starting from the notion of human ideas as held in neural structures, I have worked out a semantic grammar which may be a model of how the brain stores information and of how that information is linked to language.

Quistic grammar.

Quistic grammar is based on the observation that all languages support queries, and that all small children ask lots of questions.    I suggest that our language mechanism is inextricably bound up with our thirst for knowledge.    Quistic grammar is based on the information requested by different query types: the who, what, where and when etc. of life.

I keep six honest serving-men
(They taught me all I knew);
Their names are What and Why and When
And How and Where and Who.
I send them over land and sea,
I send them east and west;
But after they have worked for me,
I give them all a rest.
Rudyard Kipling, The Elephant's Child.

I suggest that the construction of a sentence is like the threading of a string through a maze. In the maze there are simple ideas, and the string connects ideas in a specific sequence according to the logic of semantics, rather than syntax. The ideas are stored in neural structures and the 'string' is a set of temporary interconnections. Once the string has been constructed as a structure in a semantic grammar, it is used as the groundwork on which to build the most appropriate corresponding syntactic structure, which is then transmitted as speech.
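As a concrete illustration, here is a minimal sketch of that threading operation in Python. Everything in it - the Idea class, the thread_string function and the toy store of three ideas - is my own illustrative invention, intended only to show temporary links being laid between stored ideas in semantic order.

```python
from dataclasses import dataclass, field

@dataclass
class Idea:
    label: str                                  # token naming a stored idea
    links: list = field(default_factory=list)   # temporary 'string' links

def thread_string(store, sequence):
    """Lay temporary links between ideas in semantic order."""
    path = [store[name] for name in sequence]
    for a, b in zip(path, path[1:]):
        a.links.append(b)       # one segment of the 'string' through the maze
    return path                 # the semantic structure, ready for syntax

store = {name: Idea(name) for name in ("dog", "chase", "cat")}
semantic = thread_string(store, ["dog", "chase", "cat"])
print(" -> ".join(idea.label for idea in semantic))   # dog -> chase -> cat
```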

I suggest also that the hearer takes the syntactic structure in the speech and 'threads' a semantic structure to match it, within the hearer's knowledge and experience.  In the syntactic structure there are multiple cues which, together with the communicators' shared knowledge, can be used to predict the next idea in the sequence. The ability to predict a syntactic or semantic element suggests that the brain, in analysing a speech input, is actually constructing a speech output string - the brain of a hearer acts as if it were the speaker. There is, I suggest, an element of 'how would I phrase that?' in our reception of speech.

I suggest that the brain's stored structures, or ideas, are labelled with a simple set of category-tokens, and that such labelling is necessary for the building of syntactic-semantic strings. The labels or tokens identify the structures by their role in the answering of questions. This is necessary but not sufficient for human language. The stored ideas are also richly interconnected by relations of similarity and difference, co-occurrence, emotional trigger and other associations. Such a core of labels, together with that richness of interconnection, is, I suggest, a necessary and sufficient set of conditions for the formation of a semantic template onto which the components which eventually emerge as speech can be grafted.
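A small sketch may make the proposed data structure clearer. The role tokens, the relation names and the IdeaNode class below are assumptions of mine, chosen to show one way that category labels and rich interconnections could coexist in a single store of ideas.

```python
from dataclasses import dataclass, field

ROLES = {"WHO", "WHAT", "WHERE", "WHEN", "WHY", "HOW", "ACTION"}

@dataclass
class IdeaNode:
    name: str
    role: str                                         # the question it answers
    associations: dict = field(default_factory=dict)  # relation -> neighbours

    def __post_init__(self):
        assert self.role in ROLES      # every idea carries a category-token

    def associate(self, relation, other):
        # rich interconnection: similarity, co-occurrence, emotion and so on
        self.associations.setdefault(relation, []).append(other)

dog = IdeaNode("dog", "WHO")
wolf = IdeaNode("wolf", "WHO")
park = IdeaNode("park", "WHERE")
dog.associate("similarity", wolf)
dog.associate("co-occurrence", park)
```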

Using quistic grammar, complex sentences can be analysed or synthesised using a single mechanism. There is no need for two mechanisms, or even a bi-directional mechanism. A path is followed by which words are chosen from 'word buckets'. The analysis or synthesis is somewhat stochastic: words in the 'word buckets' are ranked by frequency of occurrence and by recency of recall. This is why the average person is more likely to say 'walk to the shops' than 'perambulate to the emporia'.
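Here is a sketch of such a stochastic choice, under my own assumptions about scoring: each entry in a 'word bucket' carries a frequency and a time of last recall, and the weighting formula below is purely illustrative, not the author's.

```python
import random

def choose_word(bucket, now, recency_weight=50.0):
    """Pick a word, biased toward frequent and recently recalled entries."""
    def score(entry):
        word, frequency, last_recalled = entry
        return frequency + recency_weight / (1 + now - last_recalled)
    weights = [score(entry) for entry in bucket]
    chosen = random.choices(bucket, weights=weights, k=1)[0]
    return chosen[0]

# (word, frequency of occurrence, time of last recall)
bucket = [("walk", 950, 99), ("stroll", 40, 90), ("perambulate", 2, 10)]
print(choose_word(bucket, now=100))    # almost always 'walk'
```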

Language is used to talk about the who, what, where, when, why and how of our world. Quistic grammar focusses on that aspect to categorise all parts of speech, not as verbs or nouns, but as people, places, objects and actions. Analysis of a string of words, whether a phrase, sentence, paragraph or even a book, is thus simply a matter of asking questions such as: "Who did what to whom, and where and when and why and how?"
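As a toy example of that style of analysis, the sketch below assumes a sentence already segmented into tagged constituents; the slot names and the example mapping are mine, for illustration only.

```python
def quistic_parse(constituents):
    """File each tagged constituent under the question it answers."""
    return {question: phrase for question, phrase in constituents}

analysis = quistic_parse([
    ("who",   "the dog"),
    ("did",   "chased"),
    ("whom",  "the cat"),
    ("where", "in the park"),
    ("when",  "yesterday"),
])
# 'Who did what to whom, and where?' becomes a simple lookup:
print(analysis["who"], analysis["did"], analysis["whom"], analysis["where"])
```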

The topic of quistic grammar is further developed in A Science Of Human Language.