WRITING ABOUT LITERATURE
There are many ways to write about literature, many ways of reading, interpreting, and appreciating literature. Assignments in literature can range from close reading of passages to very broad discussion of a work's themes and ideas. The following are some of the ways instructors may ask you to approach and understand literature:
- The summary, that is, a critical retelling of the story and examination of its interior logic
- Passage analysis, a close reading of a selected or shorter passage
- Structural analysis, or an examination of plot elements and why the author has so arranged them
- Character analysis, for example, discussing characters' motivations or how they externalize themes and ideas
- Point-of-view analysis, or who tells the story and how this affects the telling
- Metrical analysis, that is, looking at how the rhythms and patterns of language communicate ideas
- Style analysis in prose, or examining how an author says what he says
- Tone analysis, or discerning the overall mood of a work
- Historical or cultural analysis, that is, how a piece reflects the beliefs and values of the society that produced it, or how history can shed light on a work for modern readers
- Comparing and contrasting to demonstrate similarity, difference, or superiority
- Imagery analysis, or the sensory impact created by words
- Symbolism analysis, or how the things in a story simultaneously represent something else
- Ideas or themes analysis, or detailed discussion and evaluation of the author's ideas
- Evaluation, or a judgment on the overall quality and value of a work
The different ways of examining literature frequently overlap; for example, characters or authors' styles can be compared or contrasted, a passage analysis might focus on ideas, or an author's imagery can be found to contribute to the tone of a composition. Bearing in mind this overlap, the following are suggestions for ways to organize papers that are commonly assigned in literature classes. Adapt these suggestions as needed to suit your particular situation.
1. The summary
There are two elements to consider when writing a summary:
- The story or plot: the main events as they appear in chronological order. The main events are those that propel the story forward.
- The causes, motivations, or logic underlying the story.
In pre-college or lower division work, a fairly simple thesis may be satisfactory:
Jurassic Park by Michael Crichton is a fast-moving, suspenseful story.
For transfer level or upper division work, the fact that you are summarizing an existing work will not excuse you from developing a thesis statement. Generally, college instructors look for an interesting, provocative, and supportable thesis, even for a "mere" summary. The thesis statement for a summary paper makes a claim about the events, motivations, and interior logic of the work:
J.R.R. Tolkien's The Lord of the Rings shows that renunciation of power can lead to individual salvation.
Alternatively, you may theorize that the book shows that individual evil cannot be overcome without help from other people or that divine intervention aids right effort. Whatever your claim, you should logically support your thesis while summarizing the story.
In general, organize a summary paper as follows:
- In the introduction, identify the work, the author, the most significant characters, and the general situation. State your central idea in a thesis sentence.
- In the body, summarize the events and evaluate the logic of the story. Keep in touch with the original work, citing significant words, phrases, or passages to illustrate.
- End with a restatement of your thesis and a conclusion that goes beyond the points you have already made.
2. Passage analysis
If you are asked to analyze a passage of a longer work, don't be tempted to cheat–first read the entire work to make sure that you understand the relation of the part to the whole. What are the central ideas or themes of the whole work? Then study the passage you are to write about. What is its central idea?
The chapter "The Mirror of Galadriel" in Tolkien's The Lord of the Rings shows that the elves' ability to handle uncorrupted power (as illustrated by the Mirror and the elvish rings of power) does not guarantee their ability to handle corrupted power as it is embodied in the One Ring. Rather, Galadriel, a heroic figure, refuses the offer of the One Ring knowing that its destruction will diminish her as well. Wisdom to refrain from using evil is a central struggle throughout The Lord of the Rings.
In general, organize a passage analysis paper as follows:
- In the introduction, describe the particular circumstances of the passage, placing it in the context of the longer work. Who speaks? What is the setting? State a general reaction to the passage.
- In the body, combine the results of your close reading of the passage with one of the central ideas you have discovered in the work as a whole. Do they reinforce each other, or does tension exist between the two? Use examples from the passage to back up your claims. Always explain how the material you quote supports your point.
- End with a restatement of your thesis and a conclusion based on the points you have made.
3. Structural analysis
Structure is the organization of a literary work: how and why the author has arranged plot elements or ideas. The structure of a work should support not only the plot but the ideas the work explores; therefore, an analysis of structure must begin with the understanding of these ideas and then go on to examine the work's organization in terms of its success in supporting or communicating them.
The structure of D.H. Lawrence's short story "The Rocking Horse Winner" supports one of the story's themes–that objects are no replacement for love. For example, rather than engendering content, the advent of money only worsens the household situation: "The voices in the house suddenly went mad,...[they] simply trilled and screamed in a sort of ecstasy: 'There must be more money!'"
In general, organize a structural analysis paper as follows:
- In your introduction, state the most important ideas in the work and explain how the work's organization contributes to them.
- In the body, provide selections from the text to back up your claims about themes. Then, discuss the way in which these ideas influence the form. Also, describe how the form influences the ideas–that is, how the parts relate to the main ideas.
- In the conclusion, evaluate the structure: Are all the parts of the work necessary? Are all parts equally important? Would the work be damaged if any part were omitted or transposed? Are the parts successful in creating the author's intended impression?
4. Character analysis
A character analysis is a detailed examination of some aspect of a character or characters. For example, you might describe the behavior of characters as it reveals their motivation. Or, you might discuss how the author develops a character to embody themes or ideas. You may compare similar characters or contrast diverse characters; the possibilities for character analysis are probably endless.
There are several ways an author can reveal character in his principal actors:
- By what the person himself says (or thinks, in the first-person or third-person omniscient point of view)
- By what the person does
- By what other characters say about him or her
- By what the author says about him or her, speaking as the storyteller or as an observer of the action
Sometimes you may find that the setting reflects characters' feelings, values, or states of mind. To illustrate, certainly the storms in Shakespeare's King Lear mirror both the action and the inner states of various characters.
Much of character analysis involves your impressions, supported by citations from the work, of a character's nature. For example, in examining Michael Cunningham's The Hours, you might find that Clarissa is motivated by love for material objects and by too much of a need to be accepted by others, and yet acknowledge that, ironically, she is the only healthy character in the novel. Your essay could contrast some of her opinions, experiences, and emotional reactions with Laura's and Virginia's, the other principal characters.
In general, organize a character analysis paper as follows:
- In your introduction, clearly state a central idea about the character or characters.
- In the body, develop and support your claim. Choose a pattern of organization: around primary characteristics, around central incidents that reveal primary characteristics, or around various sections of the work.
- In the conclusion, state how the selected traits of these characters relate to the work as a whole.
5. Point-of-view analysis
Point of view is the personal pronoun (I/we, you, he/she/they) from which a story is told. Mark Twain's The Adventures of Huckleberry Finn is told from the first-person, or "I" point of view:
The widow she cried over me, and called me a poor lost lamb, and she called me a lot of other names, too, but she never meant no harm by it.
On the other hand, John Steinbeck's Cannery Row is told from an omniscient third-person, or "he" point of view:
The bums who retired in disgust under the black cypress tree come out to sit on the rusty pipes in the vacant lot. The girls from Dora's emerge for a bit of sun if there is any. Doc strolls from the Western Biological Laboratory and crosses the street to Lee Chong's grocery for two quarts of beer.
The speaker's attitudes and the scope–or limitations–of his knowledge contribute to the development of the work. Point of view should help make the work seem authentic and contribute to the overall mood. In Poe's "The Tell-Tale Heart," for example, the first-person point of view contributes greatly to the agitated, claustrophobic feel of the story:
Meantime the hellish tattoo of the heart increased. It grew quicker and quicker, and louder and louder every instant. The old man's terror must have been extreme! It grew louder, I say, louder every moment!–do you mark me well? I have told you that I am nervous: so I am.
Organize a point-of-view analysis as follows:
- In your introduction, describe the work, answering the following kinds of questions:
  - Who is the speaker?
  - What is the character and background of the speaker?
  - What is his function in the story: protagonist, antagonist, or supporting character?
  - What is his relationship to the person listening to him?
  - Does he speak directly to the reader, or in such a way that the reader is a witness or eavesdropper?
  - Does the speaker rely on others for information?
  - Is he/she affected by the action?
- In the body, analyze the effect of the speaker on the situation and vice versa, pursuing any of the above questions that appear promising. What is produced by the perception, ideas, and language of the narrator?
- In the conclusion, evaluate the success of the point of view: Is it consistent? Effective? Truthful? Does it succeed in making events and motivations more probable and believable?
6. Metrical analysis
Metrical analysis, or prosody, gives you the opportunity to develop your sensitivity to the sound of language and to become aware of the power of sound and rhythm to communicate and amplify meaning.
William Carlos Williams' descriptive poem "The Dance" illustrates how patterns of stressed and unstressed syllables help convey the writer's meaning, in this case suggesting the relentless rhythm and whirling dance of fair-goers. The author also uses onomatopoeia in the stressed words chosen to approximate the sounds of the bagpipes:
In Breughel's great picture, The Kermess,
The dancers go round, they go round and
Around, the squeal and the blare and the
Tweedle of bagpipes, a bugle and fiddles
Tipping their bellies...
And in the lines below from Edna St. Vincent Millay, the longer vowels and flowing cadence of the first line contrast with the crisper sound and slightly choppy rhythm of the second to suggest the movement, look, and sound of the waters:
The larger streams run still and deep,
Noisy and swift the small brooks run...
A metrical analysis may include discussions of foot (a unit of accented and unaccented syllables), meter (a predominant pattern of feet), rhythm (which, as in music, drives a sense of movement), rhyme (an echoing of sound and structure), and stanza (group of lines).
In general, organize a metrical analysis paper as follows:
- In the introduction, include a brief discussion of the rhetorical or dramatic situation of the poem as it leads into a consideration of prosody:
  - Is the poem narrative, descriptive, argumentative, or expository?
  - What is the principal idea of the passage?
  - What is the dominant mood?
- In the body, discuss the rhythm of the passage and the basic metrical pattern and variations (where they occur). Look at the relationships of the syntactic units to the meter–is there a conflict, or is there agreement between sentence structure and metrical emphasis? Discuss the sound of the passage, including quality and length of sounds. Also discuss assonance, alliteration, consonance, onomatopoeia, and patterns of consonant and vowel sounds.
- In the conclusion, evaluate the success of the passage. Are the metrical devices appropriate? Do they augment the idea of the passage? Or do they contrast or conflict with it? Do they give the passage more power than it otherwise would have?
7. Style analysis in prose
Style means all the ways in which a writer uses words, phrases, and sentences to achieve his desired results. In particular, the connotative and symbolic values of words, their rhythm and sound, and the complexity or simplicity of grammar combine to reveal the style of a writer.
Diction, or word choice, might be the easiest element to observe in style analysis. Words do not exist by themselves, but jostle each other in a context in which one word affects another. The connotation, or implied "color" of a word, the symbolic value of some words, and words' functions in similes, metaphors, and other figures can all be fruitful to examine.
Despite the difference between poetry and prose, various "poetic" devices such as alliteration, assonance, and onomatopoeia may be at work in prose. Read the passage aloud and listen for its rises and falls, its rhythms and lengths of utterance.
Grammatical analysis can reveal complexity or simplicity in an author's style. Sentences may fall into patterns, or an author may use recurring rhetorical devices. A mere description of grammar, however, can be deadly dull if it leads to no generalizations.
John Steinbeck, for example, is an American author generally straightforward in his style and not given to florid expressions or hyperbole. This holds true for his book Tortilla Flat. But here, in addition, the author uses grammar skillfully (mostly short, simple clauses alone or in strings of run-ons) to suggest at once the childlike simplicity and occasional duplicity of his characters. He sprinkles their dialogue with Spanish expressions and diminutive endearments. He even succeeds in approximating the respectful form of address found in Spanish, which has no equivalent form in English:
"Ai, Pilon, amigo!...Pilon, my little friend! Where goest thou so fast?...I looked for thee, dearest of little angelic friends, for see, I have here two great steaks from God's own pig, and a sack of sweet white bread. Share my bounty, Pilon, little dumpling."
Pilon shrugged his shoulders. "As you say," he muttered savagely.... Pilon was puzzled. At length he stopped and faced his friend. "Danny," he asked sadly, "how knewest thou I had a bottle of brandy under my coat?"
In general, organize a style analysis paper as follows:
- In your introduction, briefly describe the author and the example of his work (the composition or passage) you will discuss.
- In the body, discuss one or more of the three areas of analysis listed above: grammar, rhythm and sound, or diction. Cite your source to illustrate.
- In the conclusion, evaluate how, and how well, the rhythm and connotations of the author's words and sentences contribute to the author's intended effect as you understand it.
8. Tone analysis
Tone is the overall atmosphere, or mood, of a work. An author's tone suggests his or her attitudes (though tone and attitude are often used synonymously). Tone is not so much seen as sensed, not so much stated as implied. Tone pervades the entire work, not just individual parts, settings, or characters.
An author uses every story element to reveal tone–diction, character development, point of view, grammar, structure–everything. Words carry connotative or emotional overtones in addition to their denotative or dictionary meaning; characters can be nefarious or sincere; even grammar can be redolent of meaning or sparse, austere, direct. In addition, the structure of a work can be linear or complex, reinforcing or contrasting with the dominant mood.
What tone do you feel in this opening paragraph of D.H. Lawrence's "The Rocking Horse Winner"?
There was a woman who was beautiful, who started with all the advantages, yet she had no luck. She married for love, and the love turned to dust. She had bonny children, yet she felt they had been thrust upon her, and she could not love them. They looked at her coldly, as if they were finding fault with her. And hurriedly she felt she must cover up some fault in herself.
Although the passage is as straightforward as a fairy tale, the way good things turn bad gives it a bitter tone, and the story remains bitter when the mother is present. In her absence, the tone is often anxious:
The Grand National [horse race] had gone by: he had not "known," and he had lost a hundred pounds. Summer was at hand. He was in agony for the Lincoln [horse race]. But even for the Lincoln he didn't "know" and he lost fifty pounds. He became wild-eyed and strange, as if something were going to explode in him.
"Let it alone, son! Don't you bother about it!" urged Uncle Oscar. But it was as if the boy couldn't really hear what his uncle was saying.
The shift in tone is central to the dynamics between mother and son: she is bitter and unhappy, and he anxious and isolated.
Describe tone with adjectives: simple, straightforward, complex, forceful, gentle, ironic, sarcastic, understated, sympathetic, humorous, angry, analytical, evasive, sardonic, neutral, hostile, grand, serious, ghoulish, mournful, comic, jovial, friendly.
In general, organize a paper on tone as follows:
- In the introduction, describe the tone and state the role tone plays in the piece. Your thesis should enumerate those areas you plan to investigate.
- In the body, expand on your central idea. If there is unity of tone, treat various sections to show this. If there is a shift in tone, or complexity of tone, demonstrate this.
- In your conclusion, relate your conclusions about the tone to your overall understanding of the work.
9. Historical/cultural analysis
Though most literary artists aim at universality, their works are also products of their time and place. Historical and cultural analysis examines the beliefs and values of the society that produced a piece, or seeks to clarify the work itself for a modern reader by shedding light on its historical milieu. Historical or cultural analysis can also compare or contrast beliefs or values expressed in a work to modern beliefs or to those of other cultures.
If, for example, you were doing an historical analysis of Virgil's The Aeneid, your introduction would explain the forms of government, social hierarchies, polytheism, and art of ancient Rome. You would speak of city-states, tyrants, the custom of slavery and how slaves were treated; you would discuss the way armies were conscripted.
When you engage in historical or cultural analysis, observe the effects on the composition of its time and place in history as well as the effects the work may have had on readers of its time. To evaluate, you may ask and answer these questions: What elements of the times and culture are present in the work? Are there elements that challenge or diverge from the norm? In historical analysis, discuss to what extent history has bypassed the problems delineated in this work.
In general, organize an historical or cultural analysis paper as follows:
- In your introduction, discuss the period in which the work was written and the relationships of the work to that period–to what extent does this work reflect the beliefs and values of its time, or challenge them?
- In the body, refer to the author's style, structure, main ideas, and preconceptions. Consider the setting and events of the story, as well as the clothing styles, machinery, speech habits, topics of conversation, and habits of thought of the characters.
- In the conclusion, evaluate the work according to its success in giving a picture of life at a particular period. Does this work remain relevant to readers today?
10. Comparing and contrasting
You may compare (show similarity with) or contrast (show differences between) almost any of a great many elements within a work or between two works–characters, motivations, point of view, tone, or themes, for example. You may compare or contrast the works or characters of two different authors or of two by the same author. You might compare or contrast two authors' styles. Whatever you compare or contrast, your purpose is to demonstrate their similarities and/or differences, or the superiority of one over the other.
For example, you might compare and contrast the characters of Bilbo, Frodo, and Gollum in The Lord of the Rings. You may decide that the ring symbolized fate for the three characters. You explain how Bilbo was fated to find the ring, how Frodo was fated to destroy it, and how Gollum was fated to be destroyed by it. Further, you may argue that these fates reflect basic character traits of the three: Bilbo the adventurer, who grew up in an innocent age; Frodo the adult, who had responsibility thrust upon him; Gollum the fallen, who couldn't resist temptation. If you wished, you could even argue further that these three facets actually are part of everyone, and that the three characters are really one character.
In general, organize a compare/contrast paper as follows:
- In your introduction, state what you are comparing or contrasting and formulate your claim about similarities, differences, or superiority.
- In the body, support your claim by reference to the works, citing examples as needed.
- In the conclusion, reiterate your main idea. Acknowledge the limitations of your treatment of these works. Point out the implications of your treatment, drawing conclusions, if possible, beyond (but still based upon) the points you have already made.
11. Imagery analysis
Imagery communicates meaning on a level even more basic, perhaps, than words. It is the use of figurative language to evoke sensory, emotional, psychological, or intellectual responses in a reader by showing rather than by telling. Figures of speech, including allusion, metaphor, simile, allegory, personification, and symbolic embodiment, all use sensory details–seeing, smelling, hearing, touching, tasting–to conjure sympathetic feelings in the reader.
Meaningful, thoughtfully constructed imagery creates mood, externalizes thought, and increases dramatic effects (especially if there are abrupt changes in imagery). Imagery might exploit the etymology, or origin and history, of words, to subtly revive their original meanings.
In general, organize a paper about imagery as follows:
- In your introduction, state your central idea about the imagery of the work you will examine, for example, that it is mainly visual, or that it makes the poem more powerful, or that it is ineffectual.
- In the body, describe the imagery–you might group images by type, such as animal imagery, food imagery, or sexual imagery. Discuss the kinds of responses elicited by the imagery: What feelings or ideas does it evoke? Does this work rely on a particular device or technique to generate images? You may also address some of the following:
  - the frames of reference, or sources of the imagery.
  - the effect of one image, or series of images, upon other images and ideas in the work.
  - the way imagery causes suggestions and implications to appear in the work.
- In the conclusion, state how effective the imagery was in communicating the author's ideas or strengthening the intended effects. Was the imagery meaningful, or merely decorative?
12. Symbolism analysis
When a thing in a story–a person, place, thing, or action–represents more than itself, either by association, resemblance, or convention, it is a symbol. To some extent, symbols are cultural icons, allusions to important historical or mythical events or religious customs. A certain amount of cultural immersion may be necessary to pick out and appreciate the symbols in a work. Other symbols seem almost universal, or at least frequently recurring. For example, water often represents life no matter when or where you live. And the image of the ubiquitous trickster-hero appears in many cultures (think Coyote, Loki, Br'er Rabbit, Bugs Bunny).
To identify literary symbols, look for something that recurs in a work. Does this "something" appear when the author is making related points? The interpretation of literary symbols is fluid, as objects can carry meaning at several levels. Take the symbol of a ring, which in western culture symbolizes commitment, and possibly entrapment. In The Lord of the Rings, the master ring symbolizes the ultimate entrapment: power, corruption, fear, hate, amorality, hubris, domination, addiction, slavery, seduction.
In general, organize a paper discussing symbolism as follows:
- In your introduction, identify the main idea of the piece and briefly describe the symbol or symbols as they relate to this idea.
- In the body, develop the relationship between the ideas and the symbols, citing examples to illustrate your points. Are the symbols particular to a certain culture or do they appear to be universal? Are the symbols essential to understanding the ideas?
- In the conclusion, evaluate the success of the symbolism. Do the symbols enrich the story or are they merely embellishments?
13. Ideas and themes analysis
While the theme of a work is usually implied by the events, characters, and other devices of the work, the ideas typically are explicitly stated. You might find the ideas put into the mouth of a principal character, or they may be revealed by a third-person narrator. If the author's own voice is heard overtly in the work (in the person of an observer, perhaps, or in a first-person narrative), it may be the one to articulate the ideas.
To identify ideas and themes, look for some of these:
- Direct statements by the author
- Direct statements by the author's persona, the narrator
- Dramatic statements made by the characters in the work
- Characters who stand for ideas or themes
A central idea stated in Victor Hugo's Les Misérables, for instance, is put forth by a third-person narrator commenting on the actions of characters in the story: "...Those are rare who fall without becoming degraded." As there may be many ideas embodied in a work, it is usually best to write about one important idea rather than several.
Philosophical novels, such as James Redfield's The Celestine Prophecy, interconnect many ideas because the main purpose of such novels is to explore ideas rather than develop character. Especially when discussing this genre, you may need to choose a main idea and write about how well some of the other ideas logically connect with it. For instance, The Celestine Prophecy proposes that a deep purpose underlies seeming coincidences. The book connects this idea to the complex idea that we must be nonjudgmental in our attitude to experience the deeper meaning of coincidences. This is supported, the book points out, by experiments in quantum physics: expectations affect results. The author proposes that expectations are energy. Further, the book posits that an unmanifest source of energy exists from which humans can learn to draw, leading eventually to expansion of human consciousness–and the reason we have coincidences. In a paper about the ideas in this book, you could conclude that the disparate ideas really form a cohesive unit.
In general, organize a paper about ideas as follows:
- In your introduction, name the idea you intend to discuss. You might also show how you arrived at your decision to write about that particular idea: why, briefly, have you chosen to discuss this idea?
- In the body, show the ways in which the writer brought out the idea in his work. How forcefully is the idea presented? Use illustrations from the text that are clearly relevant and reinforce your point.
- In your conclusion, evaluate the idea and its relevance and function in the work. How convincing is it in the story?
14. Evaluation
Too often students will avoid making judgments or commitments about literature, though they may describe beautifully the metrical structure of a poem. Yet the ultimate goal of all literary study is evaluation, the act of deciding what is good, bad, or mediocre.
While personal preference may guide what you read, it is valueless as the basis for literary evaluation if it is purely whimsical, without any basis in thought or knowledge. But by what standards may a work be judged good, bad, or indifferent? At the risk of oversimplifying, we will classify these standards as truth, vitality, and beauty.
First, we can try to evaluate a work in terms of truth or honesty. A work should have interior logic and be true to its own purpose; that is, every aspect of a story should contribute to achieving the main purpose. In addition, a work should reflect life and offer insight into humanness. Ask these questions: Is this work believable? If it is a fantasy, does it stay true to its own fantastical framework? Do you believe in this universe, or are you at least willing to believe in it? Does this work touch upon universal truths or is it merely escapist? Does it seem to express authentically the realities of human nature? Does it reflect the world as you know it?
A good work of literature appears to have vitality, or a life of its own. To evaluate the vitality of a work, ask these kinds of questions: Are the work and its characters complex and rounded, or are they flat and unbelievable? Is the point of view convincing? Does the author show emotions through action, or merely tell about them? Does the work keep your interest? Are there surprises in the plot, or is this a formula story? Is there a dilemma or paradox that is true to life?
Last, you can try (if you are brave) to discuss a work in terms of its beauty. Bear in mind that the beauty of a literary work is not found in pretty scenery or attractive characters but rather transcends all of a work's individual elements. (A bad composition about a very attractive woman, for example, is still a bad composition.) Discuss proportion and balance, coherence and unity, apt imagery and language. If you do not subscribe to such classical definitions of beauty, look for asymmetry, vertigo, contradiction, and grittiness! Define beauty in your own terms, and go on to support your claim about the work.
In general, organize an evaluation paper as follows:
- In the introduction, briefly describe your central idea and the points by which you expect to demonstrate your idea. State on what grounds you are evaluating this work, and describe your criteria. (This is where, for example, if you have chosen to define beauty in highly personal terms, you would set out your criteria for beauty.)
- In the body, demonstrate the grounds for your evaluation. Show the good points (or deficiencies) of the work you are evaluating. Such points might be qualities of style, idea, structure, character portrayal, logic, point of view, and so on. Describe the probability, truth, or force with which the work demonstrates your claim.
- The conclusion should be a statement on the overall impression of the work you are evaluating. Did you find it good? Bad? Mediocre? Truthful? Vital? Beautiful?
Richard P. Gabriel
This paper was published in 1983
Deliberate writing: When I sit down to write several paragraphs of English text for an audience which I cannot see and which cannot ask me questions, and if I care that the audience understands me perfectly the first time, I am engaged in 'deliberate writing.' Such writing is careful and considered, and for a computer to write deliberately many of the outstanding problems of artificial intelligence must be solved, at least partially.
I want to contrast deliberate writing with spontaneous writing and with speech. For the remainder of this presentation I will use the term 'writing' to refer to deliberate writing, and I will use the term 'speech' to refer to both casual writing and speech.
Vivid and Continuous Images
Whether you are writing fiction or non-fiction, good and careful writing has two important qualities: it must be vivid and it must be continuous. I have borrowed these terms from John Gardner [Gardner 1984], who wrote of them in the context of fiction writing, but I think they are appropriate in non-fiction writing as well.
In a vivid piece of writing the mental images that the writer presents are clear and unambiguous; what the writer writes about should appear in our 'mental dream' exactly as if we ourselves were thinking the thoughts he is describing. When the writing produces this clear image we can absorb what he writes with little effort.
In a continuous piece of writing there are no gaps or jumps from one topic to another. The image that is produced by the writing does not skip around. In non-fiction, especially in technical writing, the problems and questions we have about the subject are answered as soon as we formulate them in our minds. That is, as we read a piece of technical writing we are constantly imagining the details of the subject matter. Sometimes our image is confused because we are not sure how some newly presented detail fits in, or we are uncertain of the best consistent interpretation. At this point the writer is obligated to jump in and settle the matter or provide a clarification. This way we do not have to stop and think, or go back to re-read a passage or some passages.
Insofar as our image must be vivid, it must also be continuous. If our image is discontinuous it cannot be vivid - it is blurred or muddy at the point of discontinuity. Similarly, if our image is not vivid it must be discontinuous - we are apt to stop and wonder about the source of blurriness, and at that point our image stops being continuous.
Computers and Writing
I believe that writing is the ultimate problem for artificial intelligence research. Among the problems that must be solved for a computer to write well are: problem-solving, knowledge representation, language understanding, world-modeling, human-modeling, creativity, sensitivity, and judgment.
Problem-solving is important because some aspects of writing require the writer to place in a linear order facts or other statements which describe an object or an action that is inherently 'multi-dimensional.' Choosing the order of the facts, and the techniques that prevent reader-worry about yet-unpresented facts, is as difficult a planning task as any robot planning problem.
Knowledge representation is important for being able to find and refer to facts about a topic rapidly and accurately. The interconnections in the writer's mind between facts must be such that connections that the reader will see are apparent to the writer. If a detailed and complex search must be undertaken by the writer to discover relevant connections, it may be that they will be missed, and the vivid and continuous image will be lost.
Language understanding is important because a human writer will re-read his writing in order to test its effectiveness. Later in this presentation I will talk about this more thoroughly and speculatively.
World-modeling is important because a writer must understand the consequences of his statements; if he talks about some aspect of the world or chooses to use a metaphor or an analogy, he must think carefully whether the correspondence between his subject and the metaphor or analogy is accurate, and whether consequences of his metaphorical or analogical situation are impossible or ridiculous.
Creativity, sensitivity, and judgment fall into the category of things that artificial intelligence has never really looked at seriously. To write with good taste, and, hence, effectively, requires the writer to write in new and interesting ways, to be sensitive to the sore spots that his reader might have, and to judge what is important and useful for his readers.
Aspects of Good Non-Fiction Writing
If I expect you to understand my non-fiction writing without problems, I must do two things: I must anticipate what you know about the topic of discussion, and I must anticipate the problems you will have comprehending how my sentences and paragraphs are constructed. As you read from left-to-right, every word must fit in properly; you must never be forced to re-read parts already seen, and you must never have to reflect on my sentences. The text must be transparent.
These two aspects form the ends of a spectrum of concerns that a writer who cares about good writing must consider each time he writes. At one end is the correct decision about what is shared information, and at the other end is the effortless transmission of new information and relationships between facts. I will illustrate these two aspects with an example.
Consider writing the directions on how to get from one place to another in a car. When I tell you how to get to my house, I must know how much you know about the area; I must be certain you know where the Locust Street Eisner's is. If you do not live in the area, then perhaps the specific landmarks I use will be impossible for you to recognize. But if you do live in the area, I can use phrases like, "go to the stadium on Welch Road, and then ...." In short, I must carefully reason about what shared information we have about the area and also about what information you will learn while you are traveling through the area following my directions.
If I have tried to explain the directions to you in the past, then I can refer to that conversation or to that document. In short, there can be some common context and shared information about my explanation. My writing of the directions to you must accurately refer to the knowledge I am sure you have. If I refer to something that you don't know or to something that you could find out with some difficulty as if it were something you knew, then my directions would be bad.
At the other end of the spectrum, I must anticipate where along the trip you will become uncertain that you are on the right track. If there is a long stretch of road to traverse after several tricky turns, I must tell you sights that will alert you that all is well. If I say to turn right at the third stop sign, and it is behind a bush, I must warn you of that, or else you will likely have to re-do that part of the trip.
My directions will not be less accurate for this extra information, but this information will help make them better directions.
If you are not certain that you understand my directions, then you will perhaps become confused and begin to doubt that landmarks that you see correspond to landmarks I describe in my directions. You will think, "would he describe this tree like that?" or "could this red house be the pink one to which he refers; his directions are so confused that maybe he's simply being sloppy here?"
If my decisions about what is shared information are bad enough, then you - the reader - will find that my writing is difficult to read; you will try to find the correct reading of the text that makes it all clear. And, if my text is simply confusing, then you will wonder whether we agree on the facts; you will think that, if you could only know what I - the writer - knew, then the text would become crystal clear.
Pragmatics of Good Non-Fiction Writing
There are many ways that shared information comes into play in good non-fiction writing. Obviously facts that I assume that the reader knows ought to be facts actually known to the reader. If the facts I assume the reader knows are not clear to the reader - if they are difficult concepts, or if the implications of the facts as they bear on my discussion are difficult to grasp - then it is my obligation as a writer to make the facts clear, even if that requires repetition and tutoring.
My text may introduce information that is crucial to understanding the rest of the piece. Not only must I carefully present that material, but in my subsequent references to it I must be sensitive to the fact that the information was recently learned - perhaps it was forgotten or even skipped over. I should never treat information that I have introduced the same way that I treat assumed facts. For one thing, if I treat the information I have introduced exactly as the information I assume the reader has known for a while, then the reader may believe that I am talking over his head by falsely assuming his knowledge is greater than it actually is; and maybe the reader skimmed the presentation of the new material and doesn't realize that the later, confusing reference to it is a reference to new and not old information.
It is often helpful for the reader if the writer, when he refers to possibly puzzling information, refers to the information in a clarifying way. If every reference adds to the comfort the reader has about the material, the new material will be better understood.
The writer has an obligation to the reader: The reader chooses to read the piece. It is rarely the case that a reader is truly forced into reading a piece of writing from beginning to end. The writer's obligation is to make the reader's task easy enough that the reader will want to read the entire piece.
The Language of Good Non-Fiction Writing
Beyond what I assume my reader to know, and beyond what I tell him, there are the actual words, phrases, sentences, and paragraphs with which I choose to pass that information to him. In bad writing the 'mental dream' is interrupted or chafed by some mistake or conscious ploy of the writer. Whenever a reader is forced to think about the writing, the words, the sentence structure, or the paragraph structure, or whenever the reader has to re-read a section of writing to understand how the words relate to each other, it is at this point that the transfer of information from the writer to the reader is stopped, and the dream that accompanies this transfer dies. The dream must be re-established, and this can take extra time that could be better spent continuing a line of thought.
A second effect of such bad writing is that if a sentence has incorrect syntax, or if it is clumsy and difficult to understand, then the reader is justified in losing respect for the writer, in questioning the intelligence of the writer and his judgment, and in lowering his estimate of the importance, significance, and correctness of the entire piece of writing.
Finally, all non-fiction, and especially technical writing, requires examples and concrete details to be understandable. When we write about a computer program, we probably have thought about that program for a long time, and we have internalized its characteristics to help our own mental processes. When the reader reads our description of it, he wants to build a mental image of the program and its operation, and we hope that his mental image is similar to ours. Without specific details the reader cannot imagine the program accurately, and it is even possible that his image is inconsistent with ours. In this case, the reader will have to adjust to the newer image once he discovers the discrepancy, if he ever discovers it.
Many common writing errors make text difficult to read with respect to the aspects mentioned earlier. The most common errors that a writer makes are errors in the basic skills of writing. I will briefly catalogue some of these errors.
Many writers use the passive voice excessively. In the passive voice, the agent of the action is either placed at the end of the sentence ("His finger was bitten by the parrot") or else it is left out altogether ("His finger was bitten"). Perhaps the writer intends to focus the reader's attention on the injury, but the natural tendency of the reader is to imagine this action, agent and all. If the agent is missing or introduced at the end of the sentence, the image is hazy or it is wrong; a second attempt at the image must be made by the reader, and that is when the distraction away from the dream occurs.
Beginning a sentence with an infinite-verb phrase is often a mistake; either the reader can have trouble understanding the time relationships between the elements of the sentence or else he may have difficulty understanding the logic of the statement. For example, in "taking an interrupt, the program executes the terminal handler," there may be some question about whether taking an interrupt happens concurrently with the program's executing the terminal handler or whether one event causes the other to occur. In "rapidly switching contexts, the scheduler carefully considers the next job request," there is a hint of illogicalness because 'rapidly' and 'carefully' are dissonant. The reader will pause over this statement, even if it is ultimately understandable.
Problems with diction are common. Diction refers to the choice of words and the appropriateness of that choice. Diction is the hallmark of many styles of writing. 'High diction' refers to high-brow, intellectual, or even snobbish writing. I have heard some people say that scholastic writing should not be fun to read. This reflects their attitude about the proper diction for scholastic writing. The main problems people have with diction are deciding on the proper tone for their writing task and then sticking consistently to that tone. The sentence "the essential ingredient of an efficient topological sorting procedure is a fine set of cute hacks and clever bums" shows an inconsistent level of diction - scientific and refined at the start and casual or computer-gutter at the end. Either level of diction may be fine within a specific context, but to mix them in this way is unforgivable.
Sentence variety is important because when sentences of the same type are strung together, the result is boredom and a lessening of the dream that keeps readers reading and understanding. Anticlimactic sentences - sentences with relative clauses at the end - can be distracting because they seem to taper off rather than getting to some point.
Unintentional rhymes and rhythms can also be distracting by causing the reader to stop absorbing the content of the writing and to focus on the writing itself. No reader of a serious paper on garbage collection would pass over this sentence without a chuckle: "Collecting spare records and gaining some space happens quite regularly and at quite a high pace."
Similarly, unintentional puns can be a problem. "The input/output bottleneck was broken by adding a separate DMA channel" contains such a pun. When the reader sees a pun or thinks of an interpretation of a sentence that is humorous, the writer is in trouble: The reader has lost the mental dream and is thinking about the pun.
Explaining events out of their natural order is a serious error: The reader must stop reading and piece together the actual sequence. This error is at the border of the language errors and the pragmatic errors, but it shares the theme of all of these errors: The reader must abandon his dream and concentrate on the writing.
Writing and Manners
Good writing is an act of communication between a writer and an unseen reader. Good writing is a courtesy that is expected by the reader, and if a reader puts my paper away because he cannot handle the writing, I have failed my duty to that reader. Similarly, I have no respect for a writer, regardless of his professional stature, if he will not take the time to think carefully about how he presents his work and results to me.
How can you write well? The key is to use yourself as a model of your audience. The reader does not know as much about the topic you are presenting as you do. To understand what people know you must be able to forget things you know. This forgetfulness is reasoned: You carefully reason about what you know that is not common knowledge. Then you use this reduced knowledge to see what in your text is unclear because it requires the knowledge you have that your reader may not have.
And you must be able to forget the structure of your text, so that as you read it you can successfully be a model of a reader coming upon your text afresh. One way to do this is to put time between you, the writer, and you, the later reader.
As we read, each new word causes us to move forward in parsing the text - we must understand the relationships between words in order to piece together the picture that the writer is presenting. It is possible to carefully choose words so that the reader can progress along an obstacle-free path through the words. Word choice is done when the first draft is being written and also during revisions.
You could accomplish this by reasoning about where parsing choices (and confusions) can arise and by picking words that tend to guide the reader one way over others. Sometimes it isn't possible to eliminate problems for the reader with such local choices, and global re-planning is necessary: You may have to rewrite an entire paragraph to avoid confusion in one part of a sentence.
But I want to talk about computers doing deliberate writing. The above cautions are only a small fraction of the advice that could be given to human writers: What of computer writers? A program that writes deliberately must be able to plan and re-plan, to debug errors and inconveniences, to reason about knowledge, and to understand itself well enough to reason about why it makes certain decisions.
Writing well is difficult for a person to do, and it is also very difficult for a computer to do. As we have seen, there is an intimate relationship between the writer and his reader, although these two individuals may be separated by many miles and years.
A computer that has such a relationship with a reader is difficult to achieve. Because the writer has to share a common background with a reader (at some level) to be a successful writer, the computer must have this common background built in by the author of its writing programs. Artificial intelligence has a range of techniques for reasoning about shared knowledge and, to some degree, about plausible inferences from that knowledge.
However, the careful writer also is able to reason about the problems that the reader will have in parsing his writing. The writer is able to use himself as a model reader, after he has put his writing aside for a time. The computer cannot do this as easily, because there is no conception of 'forgetting' in current artificial intelligence paradigms, nor is there an easy way to use a natural language parser to find out where the parser has trouble with a sentence and how to correct the difficulty.
Nevertheless, there are programs that can write. I have written such a program, called Yh. Yh is a program that writes text, and it is one of the first attempts at a deliberate writing program. The texts it produces are explanations of the operation of simple programs. These programs have been synthesized from a description of an algorithm provided by a person during a conversation with an automatic programming system - in this case the PSI system [Green 1980].
To accomplish the writing behavior I described earlier, this program generates text from left-to-right, making locally good decisions at each step. As the generation proceeds, other parts of the program observe the generation process, and, because these parts of the program are able to make connections between distant parts of the text, they are able to criticize the result.
In other words, after the initial version of the text has been produced, further reasoning about the global nature of the choices is performed, the text is transformed, and complex sentence structure is introduced or eliminated.
There are several mechanisms in Yh designed to produce clear writing, but these mechanisms require detailed and extensive knowledge both about writing and about the subject matter of the writing to be effective. Yh has only a small amount of knowledge, but, even with this limited depth of knowledge, Yh has generated a great deal of text of good quality. However, Yh does not write uniformly well; as more knowledge is added or existing knowledge is refined, the quality of its writing will improve.
In this section I will explain some of the philosophy behind the design of Yh. Yh is a fairly large program and has a complex control structure. Because the behavior of Yh is dependent on this structure and complexity, and because the structure and complexity are a result of this philosophy, I think it is important to spend a little time understanding it.
How to Make Computers Write
Researchers in artificial intelligence have been theorizing for many years about the mechanisms necessary for intelligent behavior in restricted domains, especially domains that are the realm of specialists and not laymen. One hope is that a uniform structure among these mechanisms will emerge and that this uniform structure will generalize into something which can perform a wide range of tasks.
The effect of this generalization, it is hoped, would be the creation - theoretical or actual - of a computer individual, a program or machine that has some of the qualities of a human mind and which encompasses nearly the full range of abilities of a normal, average person. Such a computer individual would comprise an immense body of machinery.
Rather than looking for this uniform structure within the individual mechanisms, perhaps the proper place to look for it is within the organization of an entire system. That is, perhaps individual solutions to specific problems of intelligence need be constrained to work only within their intended domain; the responsibility for selecting and relating various solutions would be left up to this uniform, overlying organization.
The driving force behind the ideas presented herein is the fluid domain, which will be introduced shortly.
Complexity versus Simplicity
One of the prevailing notions in all scientific endeavors is that simplicity is to be favored over complexity where there is a choice. When there is no other choice but a complex alternative, of course the complex alternative must be chosen.
Years ago, Herbert Simon [Simon 1969] gave us the parable of the ant on the beach, in which the complex behavior of the ant as it traverses the sand is viewed simply as the complexity of the environment reflecting in the actions of the ant. He says:
"An ant, viewed as a behaving system, is quite simple. The apparent complexity of behavior over time is largely a reflection of the complexity of the environment in which it finds itself."
He goes on to substitute man for ant in the above quote and attempts to justify that statement. The goal of creating an intelligent machine has evolved, historically, from the initial sense that programs can demonstrate interesting behavior even though those programs are simple. This sense comes, I think, from the speed of the machine in executing the steps in a program: Even though a computer may not deliberate deeply in its processing, it can explore very many alternatives, perhaps shallowly. For many problems this extensive, but shallow, analysis may substitute effectively for deep deliberation.
I want to counter Simon's parable above with a quote from Lewis Thomas's "The Lives of a Cell" [Thomas 1974]; he says, when discussing the variety of things that go on in each cell:
"My cells are no longer the pure line entities I was raised with; they are ecosystems more complex than Jamaica Bay."
He later goes on to compare the cell as an entity to the earth.
Each cell is incredibly complex, and our brains are composed of very large numbers of them, connected in complex ways. A conclusion consistent with this is that programs that behave like people, even in small domains, must be rather large and complex - certainly more complex than any program written by anyone so far. And to write such a large program requires an organizing principle that makes the creation of such a program possible.
Two Aspects of Programming
There is a useful dichotomy to help us understand how artificial intelligence programs differ from many other programs: algorithmic programming versus behavioral programming.
In algorithmic programming the point is to write a program that solves a problem that has a single solution; steps are taken to solve the problem in an efficient manner. One example of an algorithmic program is one that finds the largest prime pair smaller than 1,000,000. To be sure, writing algorithms is not simple, but it is quite different from what I call behavioral programming.
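As an illustration of the algorithmic style, the prime-pair example can be written in a few lines. This is only a sketch under one assumption: 'prime pair' is read here as a pair of twin primes (p, p + 2), which the text does not spell out. The function name and the choice of a sieve are mine.

```python
import math


def largest_twin_prime_pair(limit):
    """Return the largest (p, p + 2) with both primes below `limit`, or None.

    Assumption: 'prime pair' means twin primes; the text does not define it.
    """
    if limit < 6:
        return None  # the smallest twin pair is (3, 5)
    # Sieve of Eratosthenes over the integers 0 .. limit - 1.
    is_prime = bytearray([1]) * limit
    is_prime[0] = is_prime[1] = 0
    for n in range(2, math.isqrt(limit - 1) + 1):
        if is_prime[n]:
            multiples = range(n * n, limit, n)
            is_prime[n * n::n] = bytearray(len(multiples))
    # Scan downward so the first pair found is the largest one.
    for p in range(limit - 3, 1, -1):
        if is_prime[p] and is_prime[p + 2]:
            return (p, p + 2)
    return None


print(largest_twin_prime_pair(1_000_000))
```

The problem has one answer, and the only design question is how to reach it efficiently; nothing the program did earlier affects what it does next.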
In behavioral programs the point is to produce a program that behaves in certain ways in response to various stimuli. A further requirement on such a program might be that the response is not simply a function of the current stimuli but also of all previous stimuli and responses. Examples of behavioral programs are operating systems and some artificial intelligence programs.
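A toy contrast may help here. The sketch below is not an operating system or an artificial intelligence program; it is a hypothetical responder (the class name and its replies are invented for illustration) whose answer to a stimulus depends on the whole record of earlier stimuli and responses, not on the current stimulus alone.

```python
class Responder:
    """Toy behavioral program: each reply depends on the history so far."""

    def __init__(self):
        # Every (stimulus, response) pair seen so far, oldest first.
        self.history = []

    def respond(self, stimulus):
        times_seen = sum(1 for past, _ in self.history if past == stimulus)
        if times_seen == 0:
            response = f"Noted: {stimulus}."
        elif times_seen == 1:
            response = f"You already mentioned {stimulus}."
        else:
            response = f"That makes {times_seen + 1} mentions of {stimulus}."
        self.history.append((stimulus, response))
        return response


r = Responder()
for word in ["ping", "ping", "pong", "ping"]:
    print(r.respond(word))
```

The same stimulus produces three different responses over the run, which is the property that separates this style from a function with a single correct output.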
In writing, one could possibly write an algorithmic program to generate single sentences, but to generate a paragraph or some longer piece requires a program that can react to its previous prose output, much as people do when they write. I say 'requires,' but that obviously isn't correct, because it is certainly possible to write an algorithmic program to write paragraphs within selected domains - I simply mean that writing prose-generating programs is easier if there is a structure that supports the activities necessary for prose writing, and I believe that structure is a loosely connected network of experts.
Fluid versus Essential Domains
I want to make a distinction between two of the kinds of domains that one can work with in artificial intelligence research: fluid domains and essential domains. The qualifiers, fluid and essential, are meant to refer to the richness of these domains.
In an essential domain, there are very few objects and operations. A problem is given within this domain, and it must be solved by manipulating objects using the operations available. Generally speaking, an essential domain contains exactly the number of objects and operations needed to solve the problem, and usually a clever solution is required to get the right result.
As an example of an essential domain, consider the missionaries and cannibals problem. In this problem there are three missionaries, three cannibals, a boat, and a river; and the problem is to get the six people across the river. The boat can hold three people, and if the cannibals ever outnumber the missionaries in a situation, the result is dinner for those cannibals.
If this problem actually were to occur in real life, it probably would be solved by the missionaries looking for a bridge, calling for reinforcements, or making the cannibals swim next to the boat.
An important feature of a problem posed within an essential domain is that it takes great cleverness to solve it; an essential domain is called essential because everything that is not essential is pruned away, and we are left with a distilled situation.
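Because an essential domain has so few objects and operations, its whole state space can be searched mechanically. The following is a minimal sketch, not anything from the original work: a breadth-first search over the missionaries and cannibals problem. It assumes that the safety rule applies to the two banks only (not to the boat), and it leaves the boat capacity as a parameter, since the text above gives three while the common textbook formulation uses two. All names are illustrative.

```python
from collections import deque


def is_safe(m_left, c_left, total_m, total_c):
    """Both banks are safe: missionaries are absent or not outnumbered."""
    m_right, c_right = total_m - m_left, total_c - c_left
    left_ok = m_left == 0 or m_left >= c_left
    right_ok = m_right == 0 or m_right >= c_right
    return left_ok and right_ok


def solve(missionaries=3, cannibals=3, boat_capacity=3):
    """Return a shortest list of states from start to goal, or None.

    A state is (missionaries on start bank, cannibals on start bank, side),
    where side is 0 when the boat is on the start bank and 1 otherwise.
    """
    start = (missionaries, cannibals, 0)
    goal = (0, 0, 1)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        m, c, side = state
        # People standing on the bank where the boat currently is.
        avail_m = m if side == 0 else missionaries - m
        avail_c = c if side == 0 else cannibals - c
        sign = -1 if side == 0 else 1
        for dm in range(avail_m + 1):
            for dc in range(avail_c + 1):
                if not 1 <= dm + dc <= boat_capacity:
                    continue  # the boat needs a rower and has limited room
                nxt = (m + sign * dm, c + sign * dc, 1 - side)
                if nxt not in parent and is_safe(nxt[0], nxt[1],
                                                 missionaries, cannibals):
                    parent[nxt] = state
                    queue.append(nxt)
    return None


for capacity in (3, 2):
    path = solve(boat_capacity=capacity)
    print(capacity, "seats:", len(path) - 1, "crossings")
```

Under these assumptions the search finds a five-crossing plan for a three-seat boat and an eleven-crossing plan for a two-seat boat. The exhaustive search works only because everything inessential has been pruned away; nothing comparable is available in a fluid domain.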
In a fluid domain, there are a large number of objects and a large number of applicable operations. A problem that is posed within the context of a fluid domain is typically the result of a long and complex chain of events. Generally, there are a lot of plausible-looking alternatives available, and many different courses of action can result in a satisfactory solution. Problems posed in this type of domain are usually open-ended and sometimes there is no clearly recognizable goal.
A typical fluid domain is writing. In this domain there are a large number of ways of expressing things, beginning with inventing phraseology out of whole cloth and progressing towards idioms. As I noted earlier, writing is a process of constant revision, and often that revision is centered on word choice and how those choices affect the overall structure of a piece of writing. Therefore, it does not seem likely that a computer program could avoid doing the same sorts of revisions and still be as effective as a program which also did post-word-choice revision.
A key feature of writing, and of fluid domains in general, is that judgment is often more important than cleverness, and frequently the crux of solving a difficult problem is recognizing that a situation is familiar. As we write more and more, situations in which wording or fact-introduction is a problem become easier for us to spot and to repair. We use our judgment to improve the clarity of our writing.
A successful approach to take in order to solve problems in fluid domains is to plan out a sequence of steps that lead from where we start towards what appears to be the goal. These steps are islands, where each island is a description of a situation that we believe can be achieved adequately. We will refer to a description of the situation as the situation description. At each island we then apply the best techniques available for achieving that situation, given the previously achieved situations.
There are two problems we need to solve to make this method work: 1) We must be able to plan these islands without doing very much backtracking, and during the planning stage we must not be required to perform any actions that might need to be performed during the later execution stage; and 2) once the plan is completed and we are executing it, we must be able to effectively bring to bear the appropriate techniques at each island so that the island is actually achieved.
I will consider each problem a little more carefully; as I do so, I will illustrate points concerning the general problems with their realization in the writing domain.
The first problem is to build a path, perhaps a graph, of nodes where each node is a situation that we wish to establish. We hope that if we traverse the graph, establishing the corresponding situation by taking some actions, then the final situation matches the goal towards which we were aiming. Building this graph is called coarse planning.
In writing, each node - one of the islands above - could be a sentence's worth of facts that must be conveyed, or it could be several sentences' worth of facts. The point is that each island represents a set of facts which ought to be expressed as a unit - locally in some section of the text, if possible. We hope that if we could adequately express in sentences each fact in the plan, then the entire text would adequately express all the facts.
Because we may need to consider many possible graphs of nodes before a plan emerges and because we may not wish to take all of the actions to establish the situations at each node while planning, we will need to operate on abstract descriptions of the possible actions that can be taken to decide whether a node can possibly be established by future actions.
In writing, this planning stage involves making a list of the propositional contents of sentences that might be written - the graph that is produced is simply a linear list. During this planning stage a sequence of sets of predicate calculus formulas is created, where each set of formulas represents the propositional content that should be conveyed at that stage in the text. The propositional content in each set might be expressed in a single English sentence or in several. We will want to consider whether saying these sentence-contents in a given order will convey the meaning we intend before we commit ourselves to the plan.
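One way to picture such a plan is as a list of sets, one set of propositions per stage of the text. The sketch below is only an assumed representation, not the one Yh uses: propositions are plain tuples, the predicates and arguments are invented, and the ordering check is a simple stand-in for the question of whether a given order will convey the intended meaning.

```python
# Each proposition is a tuple (predicate, arg1, arg2, ...); names are invented.
READS = ("reads", "P", "input-list")
SORTS = ("sorts", "P", "input-list")
USES = ("uses", "P", "insertion-sort")
PRINTS = ("prints", "P", "input-list")

# A coarse plan: a linear list of sets, one set per stage of the text.
plan = [{READS}, {SORTS, USES}, {PRINTS}]


def covers(plan, facts):
    """True if every fact to be conveyed appears in some stage of the plan."""
    return set(facts) <= set().union(*plan)


def stage_of(plan, proposition):
    """Index of the first stage that states the proposition, or None."""
    for index, stage in enumerate(plan):
        if proposition in stage:
            return index
    return None


def order_violations(plan, must_precede):
    """Return the (earlier, later) pairs whose required order the plan breaks."""
    violations = []
    for earlier, later in must_precede:
        i, j = stage_of(plan, earlier), stage_of(plan, later)
        if i is None or j is None or i > j:
            violations.append((earlier, later))
    return violations


constraints = [(READS, SORTS), (SORTS, PRINTS)]
print(covers(plan, [READS, SORTS, USES, PRINTS]))
print(order_violations(plan, constraints))
print(order_violations(list(reversed(plan)), constraints))
```

Both checks run on the abstract plan alone; no words or sentences have been chosen yet, which is the point of doing this stage before committing to the plan.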
The second problem is to find those actions that will actually establish the situations called for in our plan. Fine-grained planning may be required to establish smaller islands - or islets - within the larger given islands. The actions that are determined to be appropriate at each island are executed in order to flesh out the plan, establishing the particulars of the plan. However, once these particulars have been established, we may find that the fleshed-out plan does not work well, and then we are faced with the problem of modifying what we have so as to accomplish our goal as nearly as possible.
In writing, we will actually propose words and sentences to accomplish the propositional contents in our plan. After this stage there exists a first draft text. We might find that the words chosen do not fit well with the planned structure of the text, and that a different structure might be better. Or it might be that the structure of an earlier part of the text prevents the structure for the later part from being realized.
In order to determine that the executed plan accomplishes our goal, we must be able to observe and criticize the actions of the system as it performs the steps of the plan. This is a good way to discover the inadequacies of the techniques brought to bear at each island.
In writing, we will observe our word and sentence-structure choices to determine whether we are effectively conveying intended meaning or whether we have unintentionally expressed an unwanted meaning or connotation. One might say that the program 'reads' what it has written, although in Yh there is no parser - the program reviews a representation of the text, which is simply an elaborate parse tree for that text.
Thus there are three essentials to our method: 1) draw up a coarse plan; 2) implement the details of the plan as best as can be done; and 3) observe the processes carrying out the details of the plan in order to criticize its effectiveness and to propose changes to correct any deficiencies.
Intelligence and Communication - Object-oriented Programming
Suppose we had a program that exhibited a degree of intelligence; whence would this exhibited intelligence emerge? Certainly the program code by itself is not 'intelligent,' although the intelligence of the system must emerge from that code somehow. The intelligence emerges from the program code as it is running. But, to go one step further, what in the running of that code is the source of the intelligence?
Consider a team of specialists. If a problem the team is working on is not entirely within any one person's specialty, then one might expect that they could solve it after a dialogue. This dialogue would be an exchange of information about strategies, techniques, and knowledge as well as an exchange of information about what each specialist knows, why it might be important, and why some non-obvious course of action might be appropriate.
One could say that the collective intelligence of this team emerges from their interactions as much as from each individual's expertise; some particular expertise may not be able to address very much of the problem directly, but the combination of expertise plus an overall organizing principle might better address the problem as a whole.
Also, from a practical programming point of view, if a system can be expanded mainly through the addition of another individual piece of knowledge or expertise, with responsibility for organizing that new piece of knowledge left up to the system somehow, then a large system composed of many pieces of knowledge could be created and managed.
The question, then, becomes one of supporting communication well. In Lisp, for example, in order to use a function written for a specific purpose, one has to know the name of the function and its calling sequence. This will not do for the scenario I have outlined above: Being able to address the correct or most appropriate function or expert must be accomplished flexibly.
Object-oriented programming addresses some of the needs of the system I have outlined. In object-oriented programming one builds systems by defining objects, and the interactions between these objects are in terms of messages these objects send to each other.
A standard example of this style of programming is the definition of 'addition.' In a traditional programming language we can define addition as an operation performed on two numbers. The function that adds two numbers might be able to look at the types of numbers (integers, floating-point, or complex, for instance) and then decide how to add the numbers, perhaps by coercing one number into the type of the other (we add a floating-point number to a complex number by coercing the floating-point number to a complex number where the real part is the given floating-point number and the imaginary part is 0).
An orthogonal - the object-oriented - way of doing this is to consider numbers as objects which know how to add other numbers to themselves. Therefore a complex number might be sent a message saying to add to itself a floating-point number and to return the value to the sender. The complex number would then look at the type of the number sent to it and take the correct steps.
When we want to modify these systems to be able to add new types of numbers, we do different things in each system. In the traditional system we need to improve the addition program so it knows about the new types of numbers that can be added. In the object-oriented system we need to create a new type of object - the new type of number - and to provide information about how to add other sorts of numbers to it.
To make this work easily, though, it is necessary to have provided a fallback or error handler to each object. For example, if a number, x, is sent a message requesting that another number, y, be added to that number, then if x does not know what to make of y, x could send y a message to add x to it, but cautioning y that x has already tried. If y is also puzzled, then y can try another error procedure rather than simply throwing the question back to x.
With such a fallback position, we can add new data types to an object-oriented system more easily than to a traditional system.
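As a rough illustration, the fallback protocol described above might be sketched in Python as follows. The class names (`Number`, `Float`, `Complex`) and the `already_tried` flag are hypothetical names chosen for this sketch; they are not taken from Yh.

```python
class Number:
    """Base object: knows the fallback protocol but not how to add anything."""

    def add(self, other, already_tried=False):
        result = self._add_known(other)
        if result is not None:
            return result
        if not already_tried:
            # x does not know what to make of y, so x asks y to add x to
            # itself, cautioning y that x has already tried.
            return other.add(self, already_tried=True)
        # Both objects are puzzled: use another error procedure rather
        # than throwing the question back again.
        raise TypeError(
            f"cannot add {type(self).__name__} and {type(other).__name__}")

    def _add_known(self, other):
        return None  # subclasses return a result for the types they know


class Float(Number):
    def __init__(self, value):
        self.value = value

    def _add_known(self, other):
        if isinstance(other, Float):
            return Float(self.value + other.value)
        return None


class Complex(Number):
    def __init__(self, re, im):
        self.re, self.im = re, im

    def _add_known(self, other):
        if isinstance(other, Complex):
            return Complex(self.re + other.re, self.im + other.im)
        if isinstance(other, Float):
            # Coerce the float: real part is its value, imaginary part is 0.
            return Complex(self.re + other.value, self.im)
        return None


# Float knows nothing about Complex, so the request is passed on once.
total = Float(1.5).add(Complex(2.0, 3.0))
```

In this sketch, adding a new kind of number means writing one new class; the existing classes stay untouched because the fallback routes unfamiliar requests to the newcomer.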
Yh is an object-oriented system, and I will call the objects in it experts.
Overview of the System
Yh, the writing program, is organized as expert pieces of code that can perform certain tasks in writing: Some can construct sentences, some can construct phrases, some can supply words or idioms, and some can observe and criticize the operation of the rest of the system. These expert pieces of code are objects in an object-oriented system.
These experts are capable of taking some action, and each one has an associated description of what it does. This description is in a description language which can express features and attributes, as well as their relative importances.
Let me be a little more specific. Yh comprises a number of experts held in a database, each of which is a small program. When Yh has a task to do, it finds an expert to invoke. Each expert has an associated description of what sorts of tasks it can do, and this description is used as an index for that expert. To find an expert to do a certain task, Yh formulates a description of the task and matches that description against the description of each expert in its database of experts. The description that best matches the task description corresponds to the expert that Yh will invoke.
Yh uses these descriptions when it is planning: Yh can use the descriptions of each expert to simulate the actions of that expert and can thereby propose a sequence of experts to invoke that will accomplish some goal. The descriptions are not represented procedurally, but they are structured in such a way that an 'interpreter' can be applied to a situation description and an expert description, and produce the situation description that would result if the expert were applied in a context where the first situation description held. This interpreter assumes that the expert would do exactly what its description claims it would.
In summary, these descriptions are used during coarse planning to help determine islands and during the execution of the plan to find appropriate experts to invoke. During both of these activities a pattern matcher is used to identify the appropriate experts.
Yh is agenda-driven using a priority agenda. That is, the agenda contains items that have priorities attached to them. Periodically Yh scans this agenda to determine what to do next, invoking the highest-priority agenda item.
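A minimal sketch of a priority agenda, written in Python for illustration, is given here. The `Agenda` class, its method names, the item names, and the numeric priorities are all assumptions made for this sketch; the source does not say how Yh stores or scans its agenda.

```python
import heapq
import itertools


class Agenda:
    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: earlier items first

    def post(self, priority, name, action):
        # heapq is a min-heap, so the priority is negated to make the
        # highest-priority item come out first.
        heapq.heappush(self._heap, (-priority, next(self._order), name, action))

    def run(self):
        """Repeatedly invoke the highest-priority item; return the order run."""
        log = []
        while self._heap:
            _, _, name, action = heapq.heappop(self._heap)
            log.append(name)
            action(self)  # an item may post further items while it runs
        return log


agenda = Agenda()
agenda.post(1, "observe", lambda a: None)
agenda.post(3, "coarse-plan",
            lambda a: a.post(2, "execute-plan", lambda b: None))
order = agenda.run()
```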
Descriptions are matched against other descriptions. The matching process is soft or hybrid, which I define to mean that matches result in pairings of attributes, bindings of variables, and a numeric measure of the strength or closeness of the match.
More specifically, each description is a set of ordered pairs called descriptors; the first element of each descriptor is a sentence in a simple first-order logic, and the second element is the 'measure of importance' of that sentence. This simple first-order logic contains constants, variables, functions, predicates, and some quantifiers. I will refer to the first element of a descriptor as the propositional content of the descriptor. The propositional content of the description is the concatenation of the sentences within the description.
Here is a partial list of the sorts of entries in the description of an expert and how they affect a match:
GOALS: These are the main actions performed by the expert, expressed as sentences with associated measures of strength. The primary matching operations consider only these sentences.
PRECONDITIONS: These are the pre-conditions the expert expects to be true when it is invoked. These are also expressed as sentences with associated measures of strength.
CONSTRAINTS: These are predicates that must be true in order for the expert to be invoked.
PREFERENCES: These are predicates with associated measures of strength. For each predicate that is true in the context of a potential match, the associated measure of strength is added to the strength of the match. If the measure of strength is a number, it can be periodically decayed.
ADDED-GOALS: These are the new goals that the actions of an expert may post when that expert is invoked. The goals are stated as sentences and have associated measures of strength. For each goal, the associated measure of strength indicates how important it is to achieve that goal.
SOFT-CONSTRAINTS: These predicates are exactly like CONSTRAINTS above, but each has an associated measure of strength that affects only the strength of the match.
INFLUENCES: These are GOALS-like entries that only affect the strength of a match. These entries are unified against all descriptors (entries in a description) in the description being matched. If an entry unifies with another, then the measure of strength is added to the strength of the match.
COUNTERGOALS: These are like GOALS above, but they represent things that are undone by the action of an expert when invoked.
The expert description as well as the situation description are expressed in this language.
There are two major intentions behind this style of description: inexact matching and influencing a match.
This first intention was formulated after observing that it may not always be possible to find experts with descriptions that match perfectly, and that experts whose descriptions are relevant to the goals may be able to help accomplish those goals. For example, an expert that can write a passive sentence is able to accomplish the following two goals: The expert can put the direct object at the front of the sentence, making that object more prominent; and it can keep the agent of the sentence anonymous, which is useful if the writer doesn't know the agent, for instance. One can argue that neither of these goals is more important than the other, and it can be the case that if one wanted to accomplish one or the other of these goals, the passive sentence expert might represent the best means.
The second intention was formulated after observing that there may be very many experts whose descriptions match a given situation description and which could be used to take some useful actions. For instance, there will be quite a few words that could be used to express a concept or an object, perhaps equally well. We want to be able to influence which expert is invoked. In the word-choice example, perhaps we want to avoid recently used words (using PREFERENCES), or we want to encourage the use of words with certain connotations (using INFLUENCES). If the writing program wishes to avoid sentence constructs that have been used recently in a passage, SOFT-CONSTRAINTS can be used to influence the choice of sentence constructs.
This section describes the pattern-matching process in a medium degree of detail; and, in particular, the mechanisms in the pattern matcher which give rise to the behavior described above will be outlined. The casual reader can skip this section.
To match two descriptions, a pairing of the descriptors of one with the descriptors of the other is produced. A pair of descriptors, d1 and d2, is placed in the pairing if the propositional content of d1 unifies with the propositional content of d2. During this pairing process, the entries labelled GOALS are the only ones considered. It may or may not be the case that each descriptor of each description is paired with one from the other, but no descriptor can be paired with more than one other descriptor. A match exists if there is a non-empty pairing.
More formally, suppose we have two descriptions, P and D, and suppose that the GOALS part of the description of P is:
$(p_1, s_1), \ldots, (p_n, s_n)$
where each $p_i$ is a sentence in the first-order logic and each $s_i$ is an integer. Suppose that the GOALS part of the description of D is:
$(d_1, t_1), \ldots, (d_m, t_m)$
where each $d_j$ is a sentence and each $t_j$ is an integer. Let U be a predicate on sentences where $U(f_1, f_2)$ is true iff $f_1$ and $f_2$ unify. Then, P and D match iff:
there exist $i, j$ with $1 \le i \le n$ and $1 \le j \le m$ such that $U(p_i, d_j)$
Let Pairing be a set of pairs that result from a match. If $[(p_i, s_i), (d_j, t_j)]$ belongs to Pairing and $[(p_i, s_i), (d_k, t_k)]$ belongs to Pairing, then $j = k$.
For every two descriptions there may be several pairings of GOALS entries.
Once the pairing is produced, the strength of the match is computed using the measures of importance. The basic strength of the match is computed from the pairings obtained as described above. If the pairing is:
$[(p_{i_1}, s_{i_1}), (d_{j_1}, t_{j_1})], \ldots, [(p_{i_k}, s_{i_k}), (d_{j_k}, t_{j_k})]$
then the basic strength of this match is a function of $s_{i_1}, t_{j_1}, \ldots, s_{i_k}, t_{j_k}$.
Define the strength of match between P and D, Strength(P,D), to be the maximum of the strengths of match of all the possible pairings of descriptors in P and D.
The remainder of the entries (such as INFLUENCES) are also paired with entries from the other description, and the measures of importance are used to modify the strength of the match.
To be more specific in the case of INFLUENCES, let $(I, s) \in P$ be an influence, where I is a sentence and s is a measure of strength. If
there exists a $(d, t) \in D$ such that $U(I, d)$
then the strength of the match is altered by a function of s and t.
The effect of all but the GOALS portion of the description is to affect the strength of the match and not the validity of the match.
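The matching process described in this section can be sketched in Python under several loudly stated assumptions. Sentences are represented as nested tuples, variables are strings beginning with `?`, each pair of sentences is unified independently (with no occurs check), measures of strength are positive integers, and the basic strength of a pairing is taken to be the sum of the products of the paired strengths. The source says only that the basic strength is "a function of" these numbers, so that formula, like every function name below, is an assumption of this sketch.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def walk(term, bindings):
    while is_var(term) and term in bindings:
        term = bindings[term]
    return term


def unify(a, b, bindings=None):
    """Return variable bindings making a and b equal, or None if none exist."""
    bindings = {} if bindings is None else bindings
    a, b = walk(a, bindings), walk(b, bindings)
    if a == b:
        return bindings
    if is_var(a):
        return {**bindings, a: b}
    if is_var(b):
        return {**bindings, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            bindings = unify(x, y, bindings)
            if bindings is None:
                return None
        return bindings
    return None


def best_pairing(p_goals, d_goals):
    """Strongest one-to-one pairing of GOALS descriptors that unify."""
    def search(i, used):
        if i == len(p_goals):
            return 0, []
        best = search(i + 1, used)  # option: leave p_i unpaired
        p, s = p_goals[i]
        for j, (d, t) in enumerate(d_goals):
            if j not in used and unify(p, d) is not None:
                score, pairs = search(i + 1, used | {j})
                score += s * t  # assumed strength function
                if score > best[0]:
                    best = (score, [(i, j)] + pairs)
        return best
    return search(0, frozenset())


def strength(p, d):
    """Strength(P, D), or None when there is no non-empty pairing."""
    score, pairs = best_pairing(p["GOALS"], d["GOALS"])
    if not pairs:
        return None
    # INFLUENCES change only the strength of a match, never its validity.
    for sentence, s in p.get("INFLUENCES", []):
        if any(unify(sentence, other) is not None
               for entries in d.values() for other, _ in entries):
            score += s
    return score


task = {
    "GOALS": [(("prominent", "?obj"), 3), (("anonymous", "agent"), 1)],
    "INFLUENCES": [(("voice", "passive"), 2)],
}
passive_expert = {
    "GOALS": [(("prominent", "direct-object"), 2), (("anonymous", "agent"), 2)],
    "PREFERENCES": [(("voice", "passive"), 1)],
}
match_strength = strength(task, passive_expert)
```

The exhaustive search over pairings is exponential in the worst case, which is consistent with the remark that the matcher works on cross products; it is meant to show the definitions, not an efficient implementation.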
Performance of the Matcher
The pattern matcher performs operations on cross products. Because of this, the performance of the pattern matcher is potentially quite bad. Many of the operations, however, can be formulated in such a way that a parallel processor could greatly increase the performance of the matcher.
On the other hand, during the generation of the paragraph of text that will be shown in the example that follows, the pattern matcher was invoked approximately 5,000 times and the underlying unifier approximately 300,000 times. The total time for the generation of the paragraph was only 15 CPU minutes on a DEC KL-10A.
Planning is done quite simply. We start with an initial situation and a goal situation. The initial situation and the goal situation are each represented by a description, and the pattern matcher is able to use these descriptions to determine the degree of progress towards the goal.
The planner tries to find a sequence of experts that will transform the initial situation into the goal situation. To do this, the planner finds an expert whose description indicates that the expert will transform the initial situation into a situation that is 'closer' to the goal situation; P1 is closer to D than P2 if Strength(P2,D) < Strength(P1,D). Yh uses the description of the chosen expert to transform the initial situation into a new situation, and the process is repeated with this new situation in place of the initial situation.
This transformation may add further goals, and it may also add entries that indicate preferences or influences over the remainder of the planning process. In this way, the current part of the plan can influence the later parts.
If the search does not appear to be proceeding towards the goal, backtracking occurs.
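The planning loop can be illustrated with a small Python sketch. It makes strong simplifying assumptions: a situation is a set of atomic facts, closeness to the goal is the number of goal facts already present (a stand-in for Strength), and an expert description is a dictionary with `preconditions`, `goals`, and `countergoals`. The expert names and facts are invented for the example.

```python
def closeness(situation, goal):
    # Stand-in for Strength(P, D): how many goal facts are established.
    return len(situation & goal)


def simulate(situation, expert):
    # Assume the expert does exactly what its description claims:
    # COUNTERGOALS are undone, GOALS are established.
    return (situation - expert["countergoals"]) | expert["goals"]


def plan(situation, goal, experts, depth=10):
    """Return a list of expert names reaching the goal, or None."""
    if goal <= situation:
        return []
    if depth == 0:
        return None
    candidates = []
    for name, expert in experts.items():
        if expert["preconditions"] <= situation:
            nxt = simulate(situation, expert)
            if closeness(nxt, goal) > closeness(situation, goal):
                candidates.append((closeness(nxt, goal), name, nxt))
    # Try the expert that gets closest first; fall through to the next
    # candidate (backtracking) when a choice turns out to be a dead end.
    for _, name, nxt in sorted(candidates, reverse=True):
        rest = plan(nxt, goal, experts, depth - 1)
        if rest is not None:
            return [name] + rest
    return None


EXPERTS = {
    "write-sentence": {
        "preconditions": frozenset({"fact-known"}),
        "goals": frozenset({"fact-stated"}),
        "countergoals": frozenset({"fact-known"}),
    },
    "choose-topic": {
        "preconditions": frozenset({"fact-known"}),
        "goals": frozenset({"flag-first"}),
        "countergoals": frozenset(),
    },
}
start = frozenset({"fact-known"})
goal = frozenset({"fact-stated", "flag-first"})
steps = plan(start, goal, EXPERTS)
```

Here the planner first tries `write-sentence`, finds that no expert can then establish `flag-first`, backtracks, and settles on choosing the topic before writing the sentence.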
The initial and goal situations may also be pairs of islands within a larger plan, and often this is the case. That is, when Yh is planning some paragraphs of text on some topic, it uses other, simple, planning heuristics to lay out the sequence of topics or facts to be expressed within each paragraph. Finer-grained planning then fleshes out this plan as best it can.
Once the plan is in place, the first expert in the list of experts found in the planning process is applied, and that expert actually updates the situation description accurately. The updating done by the planning process using the description of the expert may have been only an approximation to a detailed analysis that the expert needed to perform to decide the expert's exact effect.
At this point, if the situation that was expected to be established by the actions of this expert is, in fact, not established, Yh may apply other applicable experts to the situation until the desired state is reached.
Applicable experts are found by the pattern-matcher by searching and matching in the database of experts. An applicable expert is one whose strength of match is above a certain threshold. If no expert is above the threshold, the planner is called again to do finer-grained planning to solve the problem.
As an example from writing, suppose the initial situation is a certain state of the text and the goal state is to explain a simple additional fact. There may be some means of expressing this fact, but the means selected may have created a noun phrase that is ambiguous in its context, perhaps by using similar words to those used earlier to refer to a different object. When the situation description is updated, this ambiguity may become apparent, and some other actions might be taken to modify the text or to add further text to help disambiguate the wording.
A general-purpose function-calling mechanism is based on pattern matching descriptions - call by description. It is possible for an expert to ask another expert to perform some actions: The first expert puts together a situation description - which describes the current state of affairs along with the desired state of affairs - and asks the pattern-matcher and planner to find an expert or a sequence of experts to accomplish the desired state of affairs.
The Writing Process
With the above simple overview description of the operation of Yh, I will now explain in more detail how writing is done by Yh.
Given a set of facts to explain, Yh applies some simple heuristics to the facts to determine the order of presentation of those facts. For writing about programs the heuristics simply examine the program to determine the data structures and the flow of control.
These ordered facts are the initial islands in the planning process. A finer-grained plan is produced which partitions the facts into sentences. That is, the finer-grained plan is a sequence of sentence schemata (declarative, declarative with certain relative clauses, etc.) along with the facts that each expresses.
At this point the writing begins with text being produced from left-to-right, all the way down to words. As the actual writing proceeds, a simple observation mechanism is used to flag possible improvements in the text. For instance, if sentences with the same subject or verb phrase appear, this is noted. The mechanism for observation is to use the pattern-matcher to locate experts that are designed to react to specific situations, such as the same subject appearing in two different sentences.
The text is represented as a parse tree. Each node of the tree contains two annotations: One annotation states the syntactic category of the subtree rooted at that node; and the other annotation contains the situation description which caused that part of the tree to be created. In general, Yh is able to randomly access any part of the tree, using as indices the syntactic annotations, the situation description (semantic) annotations, or the contents of the nodes.
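One way to picture such an annotated tree is the following Python sketch. The `Node` class, its field names, and the fact labels are hypothetical, and `find` scans the whole tree rather than using true indices; it is meant only to show the three kinds of lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    category: str            # syntactic annotation, e.g. "S", "NP", "VP"
    situation: frozenset     # situation description that caused this subtree
    words: tuple = ()        # contents of a leaf; empty for inner nodes
    children: list = field(default_factory=list)

    def walk(self):
        yield self
        for child in self.children:
            yield from child.walk()

    def text(self):
        if not self.children:
            return " ".join(self.words)
        return " ".join(child.text() for child in self.children)

    def find(self, category=None, fact=None, word=None):
        """Nodes matching every index supplied (syntactic, semantic, contents)."""
        return [n for n in self.walk()
                if (category is None or n.category == category)
                and (fact is None or fact in n.situation)
                and (word is None or word in n.words)]


tree = Node("S", frozenset({"introduce-array"}), children=[
    Node("NP", frozenset({"refer-array", "unique"}), ("the", "array")),
    Node("VP", frozenset({"represent"}), children=[
        Node("V", frozenset({"represent"}), ("represents",)),
        Node("NP", frozenset({"refer-flag"}), ("the", "flag")),
    ]),
])
noun_phrases = tree.find(category="NP")
flag_phrase = tree.find(fact="refer-flag")[0].text()
```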
When the text is complete, the experts that were triggered by interesting events - such as the same verb phrase appearing in several places - are allowed to modify the text. While this is happening, further observations are made. The process continues until a threshold of improvement is reached - that is, until there is little discernible improvement to the text.
The effect of the observation experts can be to move facts between planning islands. The initial planning stage can be regarded as only a first approximation in a series of better approximations to a satisfactory plan for expressing a set of facts in English.
When Yh starts writing there are three agenda entries, which cause the above actions to happen: 1) a coarse planning entry; 2) a plan execution entry; and 3) an observation-expert activation entry. The first entry causes the coarse planning to happen, and the second entry causes the plan to be executed (the first draft to be written). The third entry is more complicated. While the first draft is being written, observers watch the process and make suggestions. These suggestions are simply entries in a database of such entries. The third agenda item causes these entries to be processed, and any actions that need to be taken based on the suggestions contained there will be initiated by this agenda entry.
Example of Writing
The next few pages will present an example of Yh writing about a simple Lisp program.
Dutch National Flag
The Dutch National Flag problem is as follows: Assume there is a sequence of colored objects in a row, where each of the objects can be either red, white, or blue; place all red objects to the left, all white objects in the middle, and all blue objects to the right.
Given the initial sequence: B R W B R W B
the result is: R R W W B B B
The problem is a sorting problem, and it can be done in linear time using an array and three markers into that array. The following is a simple MacLisp program that solves the problem where an array is used to store the elements in the sequence:

    ;;; Dutch National Flag
    (declare (array* (notype flag 1))  ;represents the Flag
                                       ;can be r,b, or w.
                                       ;r = red, w = white, b = blue
             (special n))              ;represents the length of the Array

    ;;;exchanges (flag x) and (flag y)
    (defmacro exchange (x y)
      `(let ((q (flag ,y)))
         (store (flag ,y) (flag ,x))
         (store (flag ,x) q)))

    ;;;tests if (flag x) is red
    (defmacro redp (x) `(eq (flag ,x) 'r))

    ;;;tests if (flag x) is blue
    (defmacro bluep (x) `(eq (flag ,x) 'b))

    ;;;tests if (flag x) is white
    (defmacro whitep (x) `(eq (flag ,x) 'w))

    ;;;increments x by 1
    (defmacro incr (x) `(setq ,x (1+ ,x)))

    ;;;decrements x by 1
    (defmacro decr (x) `(setq ,x (1- ,x)))

    (defun dnf ()
      (let ((l 0)(m 0)(r (1- n)))      ;initialize l,m, & r
        (while (not (> m r))
          (cond ((redp m)
                 (exchange l m)
                 (incr l)(incr m))
                ((bluep m)
                 (exchange m r)
                 (decr r))
                (t (incr m))))
        t))
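For readers unfamiliar with MacLisp, the same algorithm might be transliterated into Python roughly as follows. This is an illustrative rendering, not part of the program Yh was given; the function name and the use of a list argument in place of the global FLAG array are choices made for the sketch.

```python
def dutch_national_flag(flag):
    """Sort a list of 'r', 'w', and 'b' in place, mirroring DNF."""
    l, m, r = 0, 0, len(flag) - 1          # the three array markers
    while not m > r:
        if flag[m] == "r":
            flag[l], flag[m] = flag[m], flag[l]   # exchange (flag l), (flag m)
            l += 1
            m += 1
        elif flag[m] == "b":
            flag[m], flag[r] = flag[r], flag[m]   # exchange (flag m), (flag r)
            r -= 1
        else:
            m += 1
    return flag


# The initial sequence from the problem statement: B R W B R W B
sorted_flag = dutch_national_flag(list("brwbrwb"))
```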
The flag is represented by a 1-dimensional, 0-based array of n elements, FLAG. There are three array markers, L, M, and R, standing for Left, Middle, and Right, respectively. L and M are initialized to 0; R is initialized to n-1.
While M is not bigger than R the program does the following: If (flag m) is red, it exchanges (flag l) and (flag m), incrementing L and M by 1. If (flag m) is blue, it exchanges (flag r) and (flag m), decrementing R by 1. Otherwise, it increments M by 1.
In order to exchange (flag x) and (flag y), the program saves the value of (flag y), stores the value of (flag x) in (flag y) and then stores the value of the temporary in (flag x). An element of FLAG is red if it contains R, blue if it contains B, and white if it contains W.
The above three paragraphs are written by Yh from an internal representation that captures exactly what is in the above code plus the comments. The representation is similar to that which a compiler would use to represent the above computation. Rest assured, Yh is not capable of reasoning about programs - every deduction made about the program while writing the above text was trivial.
The rest of this section will explain the generation of the first paragraph.
Yh is started with the task of explaining the Dutch National Flag program. Yh will first produce a plan to accomplish that, which is as follows:
- discuss the data structures;
- discuss the main program; and
- discuss the macros.
The remainder of the paper will present a detailed discussion of how the paragraph which accomplishes step 1 of the plan is written. The only data structure is an array with some array markers; the plan for this portion of the text, after it has been fleshed out during the writing process, is:
- discuss what the array represents;
- discuss the dimensionality of the array;
- discuss the base of the array;
- discuss the size of the array;
- discuss the array markers; and
- discuss the initialization of the array markers.
At the end of this part of the writing process, the paragraph is:
The one-dimensional, zero-based array of n elements, FLAG, represents the flag. There are three array markers, L, M, and R, standing for left, middle and right, respectively. L is initialized to 0. M is initialized to 0. R is initialized to n-1.
While writing this first draft, experts notice that some changes should be made. After making those changes, the paragraph will be:
The flag is represented by a 1-dimensional, 0-based array of n elements, FLAG. There are three array markers, L, M, and R, standing for Left, Middle, and Right, respectively. L and M are initialized to 0; R is initialized to n-1.
Starting the Process
Yh is started on the task of writing about the program described above. It is given three pieces of advice in the form of influences. Recall that an influence is a descriptor which is used to bias the pattern matcher towards choosing an expert that has a description with that descriptor in it; if a negative influence is used, then the pattern matcher will be biased towards choosing an expert that has a description with the negation of that descriptor in it.
These three pieces of advice are: 1) do not use too many adjectives; 2) collapse all sentences that share either a common subject or a common predicate as soon as they are identified; and 3) do not allow very complex sentences. This last piece of advice uses a simple complexity measure which considers sentence length, adjectival phrase length, and relative clause depth.
Conducting an Inspection
I will refer to all of the functions and data structures in the above program as the program.
First the program is examined. Initial planning islands are established for all of the major parts of the program: the data structures and the functions. This is done by an expert that has a checklist of general features of a program that are important to discuss when explaining or describing programs. The items in the checklist have associated priorities, and the plan is to discuss the items in the checklist in priority order unless some other order is specified. That is, this checklist represents heuristics about how to write about programs.
In the case at hand, the data structures come first because they are given a higher priority by these heuristics. The functions come next with the main function, DNF, first and the macros after that. A paragraph break will be inserted between the discussion of the data structures and the discussion of the program code.
The initial plan is: 1) explain the data structures, in this case the array; 2) explain the main program, DNF; and 3) explain the macros - EXCHANGE, INCREMENT, DECREMENT, REDP, BLUEP, and WHITEP - in this order.
Yh now begins to execute that plan, and the first data structure is then examined. It is the array that represents the flag. A simple examination causes the array-describing expert to be directly invoked using call by description. This array expert knows about interesting things concerning arrays.
All relevant features of the array are retrieved. In addition, the functions that use this data structure are retrieved. All other arrays in the program are found.
The facts discovered about this array by the array expert are: 1) it represents an object to the user; 2) each of its elements is one of R, B, and W, which represent the colors of the flag; 3) three array markers are used to point to places in the array, and these markers are moved; and 4) there are no other arrays defined in the program.
Because there are no other arrays in this program, an influence is added to the initial situation description stating that the array is unique. This may result in a noun-phrase generator choosing to say 'the array' rather than some other descriptive phrase.
Now that the relevant facts about the array are known, the discussion of the array will be written.
The things that are important to talk about for an array are: what it represents, its name, its length, its base, its first element, its last element, its dimension, and the types of its elements. The array expert's overall strategy is to introduce the array as a topic of discussion and to follow up that introduction with facts about its features. The options for introducing the array are:
- The <array> represents <something>.
- There is <array description>.
- The <array> has n elements.
- The <array> is m dimensional.
- The first element of <array> is <first element>.
The notation, <form>, means that form is to be expressed as a phrase or a clause, and the text is substituted for <form> in the appropriate schema. In order to write any of these sentences, the array expert will invoke the simple declarative sentence expert using call by description.
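The substitution step can be pictured with a short Python sketch. In Yh each phrase would be produced by an expert found through call by description; here, purely as an assumption for illustration, the phrases are looked up in a dictionary, and the names `SCHEMATA` and `fill` are hypothetical.

```python
import re

SCHEMATA = [
    "The <array> represents <something>.",
    "There is <array description>.",
    "The <array> has n elements.",
    "The <array> is m dimensional.",
    "The first element of <array> is <first element>.",
]


def fill(schema, phrases):
    """Substitute the text generated for each <form> into the schema."""
    def expand(match):
        form = match.group(1)
        # Stand-in for invoking a phrase-writing expert on this form.
        return phrases[form]
    return re.sub(r"<([^<>]+)>", expand, schema)


sentence = fill(SCHEMATA[0], {"array": "array", "something": "the flag"})
```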
The simple declarative sentence expert knows that exophoric references - references to objects in the external world with which the reader is familiar - should be placed early in a passage. The array expert knows that the array represents the flag in the program, and this representation forms the basis of an exophoric reference. The array expert chooses to introduce the array using the first of the list of schemata above: The <form> represents ....
The array is an object in the Dutch National Flag program, and the flag is an object in the world. Because the flag is more concrete to the reader, the reference to the flag ought to be placed first in the sentence. To accomplish this, the simple declarative sentence expert posts a request to transform this sentence to the passive voice. This would move the noun phrase referring to the flag to the beginning of the sentence, making it the topic and formal subject.
Requests to perform passive transformations are sent to a special expert that keeps track of the various proposed passives and has the responsibility of deciding whether there would be too many passive sentences too close together.
The first sentence generated will be: The array represents the flag.
The simple declarative sentence expert generates the sentence left-to-right by generating the noun phrase for the subject, then the verb phrase, and finally the noun phrase for the direct object. These phrases are written by experts invoked by the simple declarative sentence expert using call by description.
Every sentence generated is annotated with the situation descriptions that were used to generate the sentence as well as the experts that were used in the process. In fact, each phrase and each word is also so annotated.
The situation descriptions used in finding the experts that generate each sentence and phrase are among the primary factors that determine wording and sentence structure. Thus, there is a correspondence between the situation descriptions and the words and sentences. While writing the first draft, these situation descriptions, along with the influences, are the sole determiners of the wording and sentence structure, and, while revisions are being made to the text, the situation descriptions are updated so that the correspondences between the situation descriptions and the words and sentences are maintained as well as possible.
If the annotations are examined in left-to-right order, a good idea of the structure and wording of the passage can be gained. This is an approximation to re-reading, which was mentioned earlier as an important aspect of how good writing is done by people.
Although this does not have the same effect as an actual re-reading under forgetfulness, the performance of Yh demonstrates that it is adequate for many writing tasks.
The noun phrase expert is fairly robust and generates interesting and appropriate noun phrases. It also generates all of the modifiers called for by the situation description. These modifiers include the determiner, adjectives, relative clauses, and post-noun modifiers such as prepositional phrases.
If the number of modifiers of a noun would make the noun phrase too long or too complex, the noun phrase generator can post further requests in the current situation description that possibly would cause other sentences to be generated in order to present the modifiers.
In the first sentence about the array, noun phrases for the array and the flag must be generated.
If the representation of an object to be generated has a unique name associated with it, the noun phrase generator will use that, unless it is necessary to add other descriptive material. As an example from fiction writing, in introducing a character to a story it is often not sufficient to use the character's name - usually the reader wants to know some simple facts about him.
In this first sentence, the name is not used because we are in the situation of introducing a new 'character,' the array. However, this name, which is FLAG, will be introduced later.
Because Yh is being used recursively to generate the subject noun phrase, there is no a priori reason to expect that the noun phrase will be a single word, or even a simple noun phrase. Had a phrase been generated, that entire phrase would be treated as a noun; if it were a verb phrase - as would be the case for an action - the result would be a gerund.
Uniqueness of the Noun Phrase
There are two considerations to be made when a noun phrase is being generated: 1) whether the noun phrase is being generated to refer to an object already referred to in the text written so far; 2) whether the noun phrase is being generated in such a way that the noun phrase itself is similar to one already used in the text to refer to an object not equal to the object being referred to now.
In order to locate the first type of reference, the annotations for all previous noun phrases are searched to find references to the same object. The second type of reference is more difficult to find. Simply stated, Yh searches all previous noun phrases and tries to match the description for the current phrase against those for all previous noun phrases. It is hoped that the internal descriptions are such that if two noun phrases have closely matching descriptions, then the noun phrases generated are similar, and hence ambiguous. This activity is the analogue of re-reading a passage.
In general, when Yh discovers that two parts of a text clash - due to ambiguities or coincidentally similar wording - Yh is capable of repairing the problem at either site, or it could choose to reformulate parts of the text to remove one or both of the offending parts.
If Yh decides that two references are the same (Case 1 above), Yh makes a request to consider combining the sentences in which they occur. In the case of ambiguous references (Case 2 above), Yh posts a request to find distinguishing descriptors. The descriptions will be scanned for the most important distinguishing descriptors, which will then be placed in prominence in the noun phrases. Other tactics such as increasing the distance between the two references might be tried also.
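The two searches can be sketched in Python as follows. This is a minimal illustration, not Yh's matcher: descriptions are reduced to sets of descriptor strings, similarity is measured by Jaccard overlap with an arbitrary threshold, and the second array TEMP is a made-up object added only to trigger Case 2.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NounPhraseNote:
    text: str               # the words that were written
    object_id: str          # the object the phrase refers to
    descriptors: frozenset  # stand-in for the situation description

def overlap(a, b):
    """Jaccard overlap of two descriptor sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def classify(new, previous, threshold=0.5):
    """Return (same_object, ambiguous) lists of earlier noun phrases."""
    same = [p for p in previous if p.object_id == new.object_id]
    ambiguous = [p for p in previous
                 if p.object_id != new.object_id
                 and overlap(p.descriptors, new.descriptors) >= threshold]
    return same, ambiguous

earlier = [
    NounPhraseNote("the array", "FLAG", frozenset({"array", "one-dimensional"})),
    NounPhraseNote("the flag", "flag", frozenset({"flag", "three-colored"})),
]
# TEMP is a hypothetical second array, not part of the program discussed here.
new = NounPhraseNote("the array", "TEMP",
                     frozenset({"array", "one-dimensional", "scratch"}))
same, ambiguous = classify(new, earlier)
print([p.object_id for p in same], [p.object_id for p in ambiguous])
```

A Case 1 hit would lead to a request to combine sentences; a Case 2 hit would lead to a request for distinguishing descriptors, here presumably "scratch."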
The next decision in writing a noun phrase is which determiner to use, the or a, or whether to use no determiner at all. If it is specified that there should be no determiner, then none is used. If there are no other noun phrases referring to the same thing, then the is used. If the noun phrase is plural then a cannot be used.
Let us recall where we are in the writing process: The array expert has invoked the simple declarative sentence expert, which has invoked the noun phrase expert. The noun phrase expert may invoke the adjective expert.
Any of the experts in this chain of control can specify that no adjectives should be used. This would be the case if one of these experts wanted to add adjectives in some order other than that which the adjective expert would choose; if one of these experts wished to add some adjectives, it would invoke the adjective expert directly.
In the sentence, the array represents the flag, there are no adjectives to insert.
Under the heading of 'modifiers' are also prepositional phrases that appear after the noun phrase, as in the dog in the yard. The placement of all adjectives and prepositional modifiers in a noun phrase is controlled by the distance to the nearest noun phrase to the left that refers to the same object. That is, the further to the left the nearest noun phrase referring to the same object lies, the weaker the negative influence against using these modifiers: The further away a reference to the same object, the more important it is to use a detailed noun phrase.
Verb phrases are handled very much the same way as noun phrases.
The Array Lives
The sentence produced thus far is:
The array represents the flag.
When the initial plan was made, no attention was paid to the details of the array, and now that the array expert has examined the array, it is about to add a number of new goals to be achieved at this planning island. In particular, the array expert wants to write about (in order of importance): the length, the name, the base, the dimension, the array markers, the element-type, the first element, and the last element. Moreover, the operations that are performed on the array and any array markers into the array are important to discuss. In the Dutch National Flag program, exchanges are performed only at the array-marker points. Fortunately, Yh knows specifically about these operations and can talk about them intelligently.
Remember that there is a negative influence against being too verbose with adjectives, which will cause some of these modifiers to be left out.
In the world there are objects, and there are qualities of those objects, which may be necessary qualities or accidental ones. In writing, mentioning an object is typically done with a noun phrase, and the qualities are expressed as modifiers. In Yh, there are two ways to influence the writing about objects and their qualities: 1) One can add influences which will increase or decrease the importance of discussing the objects or their qualities, and 2) one can add influences which increase or decrease the importance of using the means of expressing the objects or their qualities.
For example, if the influence which controls the means of expressing a quality is negative and strong enough, the quality probably cannot be mentioned - Yh is prevented from using the means to do it. If the influence which controls the importance of discussing a quality is strong enough, the quality probably will be mentioned.
If the importance of mentioning a quality is high, and the importance of not using adjectives is high, then Yh may express the quality in a separate sentence in which the quality is not expressed as an adjective. If the importance of mentioning the quality is not very high, and the importance of not using adjectives remains high, then the quality will not be mentioned. This behavior is a product of the mechanisms in the pattern matcher.
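The interaction of the two influences can be summarized in a small Python decision function. It is a deliberate simplification: in Yh the outcome emerges from the pattern matcher rather than from explicit thresholds, and the numeric scale, the cutoff value, and the behavior when adjectives are not discouraged are all assumptions.

```python
def express_quality(mention_importance, adjective_dispreference, high=0.5):
    """Pick how a quality is expressed; both scores lie in [0, 1] by assumption."""
    avoid_adjectives = adjective_dispreference >= high
    if not avoid_adjectives:
        # Assumption: the text does not spell out this case.
        return "adjective"
    if mention_importance >= high:
        return "separate sentence"
    return "omit"

print(express_quality(0.9, 0.9))  # important quality, adjectives discouraged
print(express_quality(0.2, 0.9))  # unimportant quality, adjectives discouraged
```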
In the case at hand, the importances of mentioning some of the qualities of the array are low.
The array expert goes through this ordered list and decides facts to mention, knowing that the sentence it just generated contains the noun phrase the array. The first thing considered is the length; because there are a number of ways to talk about the length of an array, the decisions about how to mention the length may interact with some of the other information to discuss.
For instance, if the first and last elements are mentioned, or the base and the last element, then the length can be skipped. Which of these alternatives is used can depend on whatever aspects of the array have been discussed or whether there is some advice about what to discuss.
Given that the length is to be said directly, there are several ways to accomplish this. One is to say, the..., length n, ... array; another is ...array...of n elements. The first alternative is simply to add another adjective to the list of adjectives in the noun phrase so far. If this would result in an overly complex noun phrase, this alternative would be rejected.
Because I have specified to Yh that it ought not use a lot of adjectives, ...array... of n elements is chosen, but a negative preference is added to this method, which decays with time: The other methods of introducing modifiers to a noun phrase will tend to be used if the same request is made later. Human writers do the same thing: Recently used words and sentence constructions are avoided because they distract from the mental dream by their repetition.
Yh chooses this method using call by description. The influence that I added simply is weighed along with all the other considerations by the pattern matcher. The effect of this influence is to reduce the strengths of all adjective-adding methods.
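A decaying negative preference of this kind might be modeled as below. The sketch is in Python, and the base strengths, penalty size, and decay rate are invented numbers; it also treats every method as applicable to every request, which is a simplification.

```python
class MethodChooser:
    """Chooses the strongest method, then penalizes it so it is avoided for a while."""

    def __init__(self, base_strengths, penalty=1.0, decay=0.5):
        self.base = dict(base_strengths)
        self.penalty = penalty   # negative preference added after each use
        self.decay = decay       # fraction of a penalty that survives each step
        self.penalties = {name: 0.0 for name in self.base}

    def choose(self):
        best = max(self.base, key=lambda m: self.base[m] - self.penalties[m])
        for name in self.penalties:          # older penalties fade with time
            self.penalties[name] *= self.decay
        self.penalties[best] += self.penalty
        return best

# Adjectives start weakest because the writer was told to avoid them.
chooser = MethodChooser({"of-phrase": 1.0, "appositive": 0.8, "adjective": 0.6})
print([chooser.choose() for _ in range(3)])
```

With these numbers the chooser moves from the of-phrase to the appositive and only then to an adjective, because each recently used method is temporarily handicapped.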
Recall that the current sentence is:
The array of n elements represents the flag.
The name of the array is next, and there are several methods that can be used: Yh could say ...,named <name>,... in the adjective list, Yh could add a second sentence, or Yh could use an appositive to the noun group itself. This last method is selected, again because Yh was told to avoid adjectives, yielding:
The array of n elements, FLAG, represents the flag.
Next comes the base; in this case FLAG is a zero-based array, which means that the first element has index 0. The possibilities are the same as for the name of the array. The appositive route is abandoned both because double appositives are not pleasing and because an appositive was just used, which resulted in a negative preference being attached to that technique.
Zero-based is considered a single word and is inserted by a specialist on adding adjectives to existing noun phrases. The list of adjectives is located, and the new one is added at the front of this list. Recall that the text is represented as a parse tree along with an annotation of what is at each node. To locate the adjectives, the tree is searched to find the noun phrase node for the array, and then the adjectives are located by looking at that part of the subtree.
Similarly, the modifier one-dimensional is added at the front of the sequence of current adjectives. Thus far the sentence is:
The one-dimensional, zero-based array of n elements, FLAG, represents the flag.
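The tree search and the insertion at the front of the adjective list can be sketched in Python. The NounPhrase class and its fields are hypothetical stand-ins for Yh's annotated parse tree, which is certainly richer than this flat list of nodes.

```python
from dataclasses import dataclass, field

@dataclass
class NounPhrase:
    determiner: str
    head: str
    refers_to: str                      # annotation: which object this node denotes
    adjectives: list = field(default_factory=list)
    post_modifiers: list = field(default_factory=list)
    appositive: str = ""

    def render(self):
        words = [self.determiner]
        if self.adjectives:
            words.append(", ".join(self.adjectives))
        words.append(self.head)
        words.extend(self.post_modifiers)
        text = " ".join(words)
        if self.appositive:
            text += ", " + self.appositive + ","
        return text

def find_noun_phrase(nodes, object_id):
    """Search the sentence nodes for the noun phrase annotated with object_id."""
    for node in nodes:
        if isinstance(node, NounPhrase) and node.refers_to == object_id:
            return node
    return None

def render_sentence(nodes):
    parts = [n.render() if isinstance(n, NounPhrase) else n for n in nodes]
    text = " ".join(parts) + "."
    return text[0].upper() + text[1:]

sentence = [
    NounPhrase("the", "array", "FLAG",
               post_modifiers=["of n elements"], appositive="FLAG"),
    "represents",
    NounPhrase("the", "flag", "flag"),
]
array_np = find_noun_phrase(sentence, "FLAG")
array_np.adjectives.insert(0, "zero-based")        # new adjectives go to the front
array_np.adjectives.insert(0, "one-dimensional")
print(render_sentence(sentence))
```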
Recall that there is a transformation pending for moving the direct object of this sentence to a more prominent position in the sentence.
Those Pesky Array Markers
At this point the array expert is still controlling the writing process. It is turning its attention to the array markers by invoking an array marker expert.
When an expert is controlling the writing process, it can do one of several things: It can examine the situation description and the current text and decide on specific actions, like adding a word directly to the text; it can decide to invoke Yh recursively on a situation description and place the resulting text somewhere; or it can decide to invoke a sequence of experts using call by description.
An array marker is simply an index into an array which is used to keep one's place during a computation. In the Dutch National Flag program there are three array markers: one to mark the place to put red objects, which moves to the right; one to mark the place to put blue objects, which moves to the left; and one to scan through the array examining the color of things it finds, placing them in the right place.
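For readers who do not know the program, the following Python sketch shows the usual three-marker formulation of the Dutch National Flag problem. It is not the program text that Yh reads, which is in Lisp and is not reproduced here; the color strings and the function name are assumptions, and white is assumed to be the third color.

```python
def dutch_national_flag(flag):
    """Sort a list of 'red', 'white', and 'blue' in place using three markers."""
    n = len(flag)
    l, m, r = 0, 0, n - 1      # left, middle, and right markers
    while m <= r:
        if flag[m] == "red":
            # Exchange at the left marker, which then moves to the right.
            flag[l], flag[m] = flag[m], flag[l]
            l += 1
            m += 1
        elif flag[m] == "blue":
            # Exchange at the right marker, which then moves to the left.
            flag[m], flag[r] = flag[r], flag[m]
            r -= 1
        else:
            m += 1             # white stays in the middle region
    return flag

print(dutch_national_flag(["blue", "white", "red", "white", "blue", "red"]))
```

Note that every exchange happens at a marker position, which is the property the array expert wants to talk about.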
Once it is decided to talk about one array marker, it is often wise to discuss all three in one place. Because these array markers are similar, it might be a good idea to talk about them similarly; perhaps the sentences can collapse to form one smooth, parallel sentence.
The array marker expert checks to see whether the array marker in question has already been discussed, which would have been posted as an influence. Then the array marker expert locates the array into which the array marker is an index; knowing this array, the expert locates all of its other array markers.
The expert sets up a dispreference for using the name of the array markers as the only referencing expression, and it calls Yh recursively to try to find a way to express the stereotyped phrase, there are n <objects>. The situation description that the expert uses is one which suggests a simple declarative sentence with subject (there), predicate noun (objects), and the modifier (n). Of course, representations of these components of the declarative sentence are used and not the words themselves. Writing this sentence is fairly straightforward: Yh adds the next sentence:
There are three array markers.
The array marker expert then decides to add the names to the right of this sentence as an ordered appositive, which will have a list of names and a parallel list of descriptions attached to the end. This is a standard way to introduce a list of objects with descriptions and names, attaching the names in a parallel construction, and, though it is very idiomatic, there is no good reason not to use this technique.
The array marker expert makes up an extended description of what it wants done: the stereotyped phrase description, the list of names and associated descriptions, and some other hints to the next writing expert to be called. Yh is then called recursively.
The stereotyped phrase expert adds the phrase,
, L, M, and R, standing for left, standing for middle, and standing for right, respectively.
Notice that the gerund form of the phrase, stands for x, must have been derived from the program text for the Dutch National Flag program. This derivation is performed by the parsing system in PSI, and the result is placed in Yh's dictionary.
Respectively is added to the end of the sentence. While inserting these nonfinite verb phrases, a verb phrase collapsing expert notices that they are the same and notes a possible collapse. Because there is an influence that states that it is better to collapse immediately than to wait, the collapsing is attempted right away.
Up to this point Yh has been lucky in that all of the things that it needed to do regarding special phrases or circumstances have been handled by an expert in that area. But luck can run out, and in the situation of collapsing these parallel phrases, there is none left. In this case the verb phrase collapsing expert can only notice that collapsings are possible, but it does not know how to actually collapse sentences with the same verb phrase! When this expert looks for another expert to actually do the collapsing, it finds only a general phrase collapsing expert. This general collapsing expert simply tries to eliminate all the common words from each phrase except the first. Thus, given the phrases, standing for left, standing for middle, and standing for right this expert will try to get rid of the phrase, standing for, from the second two.
The transformation, however, does not eliminate the extra words per se, but simply hides the words from the sentence printer, leaving the original wording available in case it is needed later: Perhaps a transformation will wish to recover that wording.
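The general collapsing expert's behavior, including the hiding of words rather than their deletion, can be sketched in Python. The Word class and the hidden flag are assumptions about representation; only the strategy of removing common words from every phrase except the first comes from the description above.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    hidden: bool = False   # hidden words stay in the tree but are not printed

def collapse(phrases):
    """Hide the words common to all phrases in every phrase after the first."""
    common = set.intersection(*({w.text for w in p} for p in phrases))
    for phrase in phrases[1:]:
        for word in phrase:
            if word.text in common:
                word.hidden = True

def show(phrase):
    """What the sentence printer would emit for this phrase."""
    return " ".join(w.text for w in phrase if not w.hidden)

phrases = [[Word(t) for t in s.split()] for s in
           ("standing for left", "standing for middle", "standing for right")]
collapse(phrases)
print([show(p) for p in phrases])
```

Because nothing is deleted, a later transformation could reset the hidden flags and recover the original wording.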
In the previous section I stated that one of Yh's experts 'notices' an event taking place, and earlier I stated that observing was an important part of the plan-execution part of Yh.
When any event takes place, the expert causing the event to occur formulates a description of the event and does a call by description on that description. The description states that an event is occurring, and the experts who observe events of that type are allowed to run.
For example, when noun phrases are added to the text, an announcement is made reporting that a noun phrase satisfying a certain situation description was inserted in the text by a certain expert and located at a particular place in the text. An expert that keeps track of all noun phrases is invoked and adds that information to its own database. A similar activity takes place for the verb and other sorts of phrases and words.
Recall that Yh is agenda-driven. One of the items on that agenda causes observation experts to perform activities based on what they have observed. If, for example, a noun phrase observer has noticed the same noun phrases being generated in different places, then an expert to consider merging the phrases will eventually be invoked. Additionally, observation experts are able to perform their actions right away, and this is what I asked Yh to do when I advised it to perform all collapsings immediately.
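The announce-and-observe pattern, with the choice between deferred and immediate action, might look like this in Python. The class EventBus and its method names are hypothetical, and matching on a plain event-type string is a much cruder mechanism than call by description.

```python
from collections import defaultdict, deque

class EventBus:
    """Routes event descriptions to the observers registered for their type."""

    def __init__(self, immediate=False):
        self.observers = defaultdict(list)  # event type -> observer callables
        self.agenda = deque()               # observer actions waiting to run
        self.immediate = immediate          # True: observers act right away

    def observe(self, event_type, observer):
        self.observers[event_type].append(observer)

    def announce(self, event_type, **description):
        for observer in self.observers[event_type]:
            if self.immediate:
                observer(description)
            else:
                self.agenda.append((observer, description))

    def run_agenda(self):
        while self.agenda:
            observer, description = self.agenda.popleft()
            observer(description)

noun_phrases = []                           # the observer's own database
bus = EventBus()
bus.observe("noun-phrase", lambda d: noun_phrases.append(d["text"]))
bus.announce("noun-phrase", text="the array",
             expert="noun phrase expert", position=0)
bus.run_agenda()
print(noun_phrases)
```

Constructing the bus with immediate=True corresponds to the advice to perform all collapsings immediately.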
Pondering the Issues
Sometimes a simple examination of the explicit properties of an object does not bring forth all of the interesting things that might prove useful in writing about it. For instance, one interesting thing about an array marker is the value to which it is initialized. In the representation above this fact is not mentioned outright, but is hidden in the program code.
The array marker specialist invokes an expert that reads the code and finds out to what the various markers are initialized. Because the annotated code states the purpose of the lambda binding, it is possible to specify which lambdas cause initialization rather than saving/restoring.
The line: (let ((l 0)(m 0)(r (1- n))) ;initialize l,m, & r
is annotated to state that the initialization values are 0, 0, and n-1.
Thus, three new sentences are proposed, one for each initialization:
L is initialized to 0. M is initialized to 0. R is initialized to n-1.
The names L, M, and R are used because the markers have already been introduced, so there is no requirement to describe them fully.
To express (1- n), a special routine is called that will convert the standard Lisp prefix notation to mathematical infix notation for external printing purposes.
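A conversion routine of this kind can be sketched in Python, with nested tuples standing in for Lisp lists. This is an illustration only, not Yh's routine; it handles the Lisp operators 1- and 1+ plus ordinary arithmetic operators, and it parenthesizes every nested subexpression rather than reasoning about precedence.

```python
def to_infix(expr, top=True):
    """Convert a nested-tuple prefix expression to an infix string."""
    if not isinstance(expr, tuple):
        return str(expr)                 # a variable name or a number
    op, *args = expr
    if op in ("1-", "1+"):
        # Lisp's (1- n) means n minus 1; (1+ n) means n plus 1.
        text = to_infix(args[0], top=False) + op[1] + "1"
    elif len(args) == 1:
        text = op + to_infix(args[0], top=False)    # unary operator
    else:
        text = op.join(to_infix(a, top=False) for a in args)
    return text if top else "(" + text + ")"

print(to_infix(("1-", "n")))
print(to_infix(("*", ("1-", "n"), 2)))
```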
As these last three sentences are generated, Yh notices that the first two have the same direct object, and all three have the same verb phrase. Additionally, the previous sentence about the array markers is noted to have used the names, L, M, and R.
The Fun Begins
Given that the initial paragraph looks like:
The one-dimensional, zero-based array of n elements, FLAG, represents the flag. There are three array markers, L, M, and R, standing for left, middle and right, respectively. L is initialized to 0. M is initialized to 0. R is initialized to n-1.
there are still some loose ends to tie up and some transformations to apply. At this stage, this paragraph is the best that Yh can do by making local decisions about paragraph structure, sentence structure, and word choice.
Let me number the sentences:
(1) The one-dimensional, zero-based array of n elements, FLAG, represents the flag. (2) There are three array markers, L, M, and R, standing for left, middle and right, respectively. (3) L is initialized to 0. (4) M is initialized to 0. (5) R is initialized to n-1.
First, it might be possible to collapse sentence (2) with some or all of sentences (3), (4), and (5) - L, M, and R are common noun phrases. But because in sentence (2) L, M, and R are used as direct objects and in sentences (3), (4), and (5) they are used as subjects, the only way to accomplish such a collapsing would be by making further relative clauses to the direct objects, which would result in a sentence like:
There are three array markers, L, which is initialized to 0, M, which is initialized to 0, and R, which is initialized to n-1, standing for left, middle, and right, respectively.
This would be a very complex sentence; this option is rejected on the grounds of complexity.
Sentences (3), (4), and (5) pose something of a problem because they are so closely related to each other. All three have the same verb phrase structure, and the first two have the same direct objects. The latter fact causes sentences (3) and (4) to be collapsed by merging the predicate parts. Therefore the subjects are conjoined with and, and the verb phrase is transformed into the plural. The direct object is left as it is. The situation description in the text annotations is patched to reflect the fact of the multiple noun phrase.
The new third sentence is then:
(3') L and M are initialized to 0.
Sentence (5) has the same verb phrase as sentence (3'), but sentence (3') is fairly complex already, so Yh chooses simply to bring them closer together with a punctuation change. The last sentence of the paragraph, hence, becomes:
L and M are initialized to 0; R is initialized to n-1.
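The merging of predicates and the fallback to a semicolon can be sketched in Python. The Clause class is a hypothetical simplification in which the verb is stored in both singular and plural form; Yh instead transforms the verb phrase and patches the situation description in the annotations.

```python
from dataclasses import dataclass

@dataclass
class Clause:
    subjects: list
    verb_singular: str
    verb_plural: str
    rest: str              # the remainder of the predicate

    def render(self):
        verb = self.verb_singular if len(self.subjects) == 1 else self.verb_plural
        return f"{join_names(self.subjects)} {verb} {self.rest}"

def join_names(names):
    if len(names) <= 2:
        return " and ".join(names)
    return ", ".join(names[:-1]) + ", and " + names[-1]

def merge_same_predicate(a, b):
    """Conjoin the subjects when the whole predicate matches; otherwise None."""
    if (a.verb_singular, a.rest) != (b.verb_singular, b.rest):
        return None
    return Clause(a.subjects + b.subjects, a.verb_singular, a.verb_plural, a.rest)

s3 = Clause(["L"], "is initialized", "are initialized", "to 0")
s4 = Clause(["M"], "is initialized", "are initialized", "to 0")
s5 = Clause(["R"], "is initialized", "are initialized", "to n-1")

merged = merge_same_predicate(s3, s4)            # same verb phrase and same object
assert merge_same_predicate(merged, s5) is None  # object differs: no merge
paragraph_end = merged.render() + "; " + s5.render() + "."
print(paragraph_end)
```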
Another option for these last three sentences is to use the parallel construction:
L, M, and R are initialized to 0, 0, and n-1, respectively.
This is not done because it produces a sentence with the same structure as the one before it. This is determined by producing the description of the sentence that would result from the collapse of sentences (3), (4), and (5) and comparing that description with the description of sentence (2).
Finally, Yh has to transform the first sentence to the passive voice in order to change the focus from the array to the flag. The first sentence becomes:
The flag is represented by the one-dimensional, zero-based array of n elements, FLAG.
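At the level of surface strings, the passive transformation amounts to exchanging the two noun phrases and changing the verb form. The Python sketch below makes that concrete under strong assumptions: the past participle is supplied by the caller, the new subject is singular, and capitalization is adjusted by simple string operations. Yh operates on the annotated parse tree, not on strings.

```python
def lower_first(text):
    return text[:1].lower() + text[1:]

def upper_first(text):
    return text[:1].upper() + text[1:]

def passivize(subject, past_participle, direct_object):
    """Move the direct object to subject position: 'X is <verb>ed by Y.'"""
    return (f"{upper_first(direct_object)} is {past_participle} "
            f"by {lower_first(subject)}.")

active_subject = "The one-dimensional, zero-based array of n elements, FLAG"
print(passivize(active_subject, "represented", "the flag"))
```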
Alternative First Paragraphs
By increasing the dispreference of adjectives and adjusting the influences on how things such as modifiers can be introduced, the following paragraphs were generated in place of the first one:
The flag is represented by an array of n elements, FLAG. It is a 1-dimensional array. There are three array markers, L, M, and R, standing for Left, Middle, and Right, respectively. L and M are initialized to 0, the first element; R is initialized to n-1, the last element.
The flag is represented by an array of n elements, FLAG. It is a 1-dimensional array with N elements. There are three array markers, L, M, and R, standing for Left, Middle, and Right, respectively. L and M are initialized to 0, the first element; R is initialized to n-1.
A Weird Alternative
Suppose that all of the structure-producing experts were removed from Yh, leaving only the programming knowledge experts and the lexicon. What would Yh say? It would say:
Array N elements Flag Represent One-dimensional Zero-based Array markers Three L M R Standing for Left Middle Right L Initialize to 0 M Initialize to 0 R Initialize to n-1
Yh does a fair job of writing about a small class of programs, but it is not a production quality program. It does not even perform very many of the things that we saw go into good writing.
Yh does not do explicit reasoning about shared information, nor does it reason about the implications of facts introduced in the text it writes. However, in writing about simple programs very little reasoning is required, and, therefore, this is not much of a problem. There are commonsense reasoning programs that could easily be adapted for use in Yh [Creary 1984][Gabriel 1983].
Yh does not explicitly consider whether its writing produces vivid and continuous images in the reader. Certainly there is no mechanism for Yh to experience those images itself. And Yh never actually re-reads any of its writing, although it reviews its writing using the description mechanism. The level of success of this review process is encouraging, and, combined with a commonsense reasoning expert which could reason about knowledge and belief, this technique could be sufficient for many writing tasks.
Yh takes some actions aimed at producing good writing: Yh plans its text carefully, it deliberates over word choice, and it is sensitive to potential ambiguities in its wording.
As I stated at the beginning, a writer has an intimate relationship with his human reader. Judgment, sensitivity, humor, the human facts and experiences - especially the literary experiences that help give a writer his voice - are things that I believe are difficult to give to a computer, but maybe not impossible.
- [Creary 1984] Creary, Lewis G., The Epistemic Structure of Commonsense Factual Reasoning, Stanford Computer Science Department Memo, to appear.
- [Gabriel 1981] Gabriel, R. P., An Organization for Programs in Fluid Domains, Stanford Artificial Intelligence Memo 342 (STAN-CS-81-856), 1981.
- [Gabriel 1983] Gabriel, R. P., Creary, Lewis G., a reasoning program written by Gabriel and Creary at Stanford University from 1982 - 1983, no documentation.
- [Gardner 1984] Gardner, John, The Art of Fiction, Alfred A. Knopf, New York, 1984.
- [Green 1977] Green, Cordell, A Summary of the PSI Program Synthesis System in the Fifth International Joint Conference on Artificial Intelligence - 1977, Cambridge, Mass, 1977.
- [Simon 1969] Simon, Herbert A., The Sciences of the Artificial, MIT Press, Cambridge, 1969.
- [Thomas 1974] Thomas, Lewis, The Lives of a Cell in The Lives of a Cell, Bantam Books, Inc., 1974.