Reading: a cognitive process

Reading involves a series of interlinked cognitive processes. These processes are not inbuilt for reading. They cannot be, because reading is a very recent innovation in the course of human history, and we could not have evolved so specific a process in so short a space of time. Reading out loud is a more complex process than normal, silent reading. However, it is not a distinct process. It involves all the same processes as reading silently, followed by further stages of processing which are more speech-related.

Perceptual span

The ability to read takes many years from childhood to develop into a skill that becomes largely automatic, requiring little conscious effort to perform. There are many processes going on in the brain, but first the words themselves need to be detected visually. When we read, our eyes make many short, quick movements across the text, called saccades. Our eyes do not just fixate on individual features of the text; we also make use of parafoveal vision. This could be seen as “reading ahead”. Studies using equipment to track eye movements have shown that we are typically looking about two or three words ahead when reading out loud. So we take in a span of text rather than one word, or even one letter, at a time. This is called the “perceptual span”. The size of this span varies depending on the text being read. If the font is bigger, for instance, the perceptual span will be physically bigger on the retina; however, its size is fairly constant when measured as the number of letters or words it contains.
There are three types of perceptual span. Total perceptual span encompasses the whole area over which information is recognised. Letter identification span is the area from which letters are identified, whereas word identification span is the area from which words can be identified. In readers of left-to-right scripts such as English, perceptual spans have a bias to the right, i.e., ahead of what is being read. Pollatsek et al. showed that readers of Hebrew, which is read from right to left, have a bias to the left in their perceptual span.

Neurophysiological basis of reading

Knowledge of basic neurophysiology can tell us which areas must be involved in reading out loud. Information is received at the retina and transmitted via the optic nerve and the lateral geniculate nucleus to the primary visual cortex. Processing occurs here, and information is then relayed to Wernicke’s area and Broca’s area, where more processing occurs. Finally, information is relayed to the motor cortex associated with producing speech sounds. But this does not really tell us much. All I have said is that “processing” occurs in these areas; knowing what these areas might do does not tell us how they do it. The study of neurophysiology alone does not allow us to infer mechanisms of processing with any degree of certainty. The best course of action is to observe people, both those who can read normally and those with reading disorders, and try to come up with a computational model based on these observations, while also taking into account what we know neuronal structures can do.

A model of reading

One of the first, and most significant, computational models is that created by Rumelhart and McClelland (1981). Theirs is an interactive activation model based on three levels of processing following from visual input: the feature level (which detects the individual features of letters), the letter level (which collates input from the feature level to detect letters), and finally the word level (which takes the outputs of the letter level to detect words). As described so far, this is just a hierarchical model. To be an interactive activation model, the levels need to loop back. The connections between the levels are both excitatory and inhibitory, and there are also inhibitory loops within each level. For instance, the input of a particular letter feature, say a vertical bar, excites the corresponding cells in the feature level. These cells then inhibit the cells in the same level which do not correspond to that feature. Excitatory outputs connect to the nodes in the letter level for letters which have a vertical bar (e.g., "T", "H", "L", etc.), while inhibitory connections suppress the nodes for letters which do not (e.g., "A", "O", "C", etc.). The excited letter nodes then inhibit other nodes in the same level. They also provide excitatory input to nodes in the word level for words which have that particular letter in a certain position. Again, inhibition is fed into the word level for words that do not meet these requirements.
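
The flow of excitation and inhibition described above can be sketched in a few lines of Python. This is only a toy illustration, not the published model: the feature names, the weights (EXCITE, INHIBIT, LATERAL), the five-letter alphabet and the function names are all assumptions made for the example, and it runs a single feed-forward pass with within-level inhibition rather than the full looping interaction between levels.

```python
# Toy, single-pass sketch of excitation and inhibition between levels.
# All weights and feature names are illustrative assumptions.
EXCITE = 1.0    # assumed weight of a consistent (excitatory) connection
INHIBIT = 1.0   # assumed weight of an inconsistent (inhibitory) connection
LATERAL = 0.2   # assumed strength of inhibition within a level

# Hypothetical feature sets for a handful of letters.
LETTER_FEATURES = {
    "T": {"vertical_bar", "top_bar"},
    "L": {"vertical_bar", "bottom_bar"},
    "H": {"vertical_bar", "middle_bar"},
    "O": {"closed_curve"},
    "C": {"open_curve"},
}


def lateral_inhibition(activations):
    """Suppress each node in proportion to the activity of its competitors."""
    result = {}
    for node, value in activations.items():
        rivals = sum(max(v, 0.0) for n, v in activations.items() if n != node)
        result[node] = value - LATERAL * rivals
    return result


def letter_level(detected_features):
    """Feature level to letter level: excite letters that have each detected
    feature and inhibit letters that lack it."""
    raw = {}
    for letter, features in LETTER_FEATURES.items():
        raw[letter] = sum(
            EXCITE if f in features else -INHIBIT for f in detected_features
        )
    return lateral_inhibition(raw)


def word_level(letters, lexicon):
    """Letter level to word level: excite words with the right letter in the
    right position and inhibit words with a different letter there."""
    raw = {}
    for word in lexicon:
        raw[word] = sum(
            EXCITE if word[i] == ch else -INHIBIT for i, ch in enumerate(letters)
        )
    return lateral_inhibition(raw)


def best(activations):
    """Return the most active node."""
    return max(activations, key=activations.get)


print(best(letter_level({"vertical_bar", "top_bar"})))
print(best(word_level("TOLL", ["TOLL", "TOOL", "HOLT", "COLT"])))
```

With a vertical bar and a top bar as input, "T" ends up the most active letter node, while "O" and "C" are pushed below zero. At the word level, "TOLL" wins over its near neighbour "TOOL" because the mismatching third letter inhibits the latter.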

Criticism of the Rumelhart and McClelland model

Rumelhart and McClelland’s model has faced criticism because it is only designed to deal with four-letter words, but there is no real reason why it could not be expanded to deal with more. Of course, millions of connections would be required to encompass the lexicon of the average adult, but this is not neuronally impossible. Another criticism is that the model does not predict the observed phenomenon of priming. In a priming experiment, a mask (something to block a visual stimulus) is presented, followed by a word, the prime, displayed for so short a period that it is not consciously recognised by the test subject. Immediately after this, another word, the target word, is displayed for a slightly longer period so that it is consciously detected. After that, another mask is displayed so that the subject has to report the word from memory and does not get any longer to recognise it. Findings have shown that when the subject is presented with two different words, the frequency of correct reports is significantly lower than when both words are the same. By the same, I mean that they are the same word, except that the first is in lower case letters while the second is in upper case letters, so that there is a difference in the visual presentation. The frequency of correct reports is also higher if any letter in the prime word is identical to, and in the same position as, a letter in the target word, and also if the prime word is semantically related to the target (such as “sharp” and “point”).

Development of the Rumelhart and McClelland model

The interactive activation model has no capacity for priming as it stands. However, if the idea of strengthening of connections were introduced, it might. One theory of the priming effect is based on the same principle as conditioning, namely that synaptic connections are strengthened in accordance with simultaneously occurring inputs (this is known as "Hebbian learning"). This causes long-term potentiation in the cells relevant to recognising a priming word, so there is still some residual activity when the target word is presented, and less activity is required to reactivate the specific node for that word. The effect is also seen when lists of words are presented for a test subject to read out loud and some words are repeated. Words that have previously been presented are read out with a shorter reaction time than before. There also seems to be evidence of a longer-term effect. It has been found that people recognise familiar words (such as “chair” and “apple”) more quickly than unfamiliar words (such as “microscope” and “sasquatch”), implying some kind of increase in synapse strength over a period of years rather than seconds or minutes.
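
The Hebbian idea can be illustrated with a minimal sketch. It assumes a single word node driven by one input connection, with made-up values for the learning rate and the recognition threshold; the class and function names are hypothetical. Only the strengthening of the connection is modelled, not the residual activity left behind by a prime.

```python
# Minimal sketch of Hebbian strengthening; all numbers are assumptions.
LEARNING_RATE = 0.5   # assumed size of each Hebbian weight change
THRESHOLD = 3.0       # assumed activation needed to recognise the word


def hebbian_update(weight, pre, post):
    """Strengthen a connection in proportion to simultaneous input (pre)
    and output (post) activity."""
    return weight + LEARNING_RATE * pre * post


class WordNode:
    def __init__(self, weight=1.0):
        self.weight = weight  # strength of the connection from the letter level

    def present(self, input_strength=1.0):
        """Present the word and return how many time steps recognition takes."""
        activation = 0.0
        steps = 0
        while activation < THRESHOLD:
            activation += self.weight * input_strength
            steps += 1
        # The input and the node were active together, so the connection
        # between them is strengthened for next time.
        self.weight = hebbian_update(self.weight, input_strength, 1.0)
        return steps


node = WordNode()
first = node.present()    # first presentation of the word
second = node.present()   # repeated presentation
print(first, second)
```

In this sketch the repeated presentation reaches threshold in fewer steps than the first, which mirrors the shorter reaction time for repeated words. The same mechanism, applied over many exposures, would also give familiar words a lasting advantage over unfamiliar ones.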

Cognitive routes in reading

There appear to be three possible routes of processing from reading words to saying them out loud, all three starting with a visual analysis system, which could be the interactive activation model mentioned above. Route one involves grapheme-phoneme conversion, which simply converts spellings into sounds and then into speech. This is how many children learn to read, and also how an adult reader would read out unknown words. Routes two and three both involve the direct recognition of familiar words. Route two goes on from there to employ a semantic system to associate the word with a meaning, which involves the auditory association cortex and then associations in the motor cortex to produce speech. The third route skips the semantic recognition and goes straight to auditory and motor associations.
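
As a rough sketch, the choice between the three routes can be written as a small lookup procedure. Everything here is a made-up illustration: the tiny lexicon, the "pronunciations" (plain respellings, not real phonetic transcription), the letter-to-sound table and the function name read_aloud are assumptions, and real grapheme-phoneme conversion works on letter groups rather than single letters.

```python
# Toy sketch of the three routes from print to speech.
# The data below is hypothetical and only serves the example.
LEXICON = {"cat": "kat", "yacht": "yot"}   # familiar words and their sounds
SEMANTICS = {"cat": "a small domestic animal", "yacht": "a sailing boat"}
LETTER_SOUNDS = {"a": "a", "c": "k", "p": "p", "t": "t", "z": "z"}


def read_aloud(word, use_semantics=True):
    """Return (route, meaning, pronunciation) for a written word."""
    if word not in LEXICON:
        # Route one: unfamiliar word, so convert spelling to sound piece by piece.
        sounds = "".join(LETTER_SOUNDS.get(ch, "?") for ch in word)
        return ("route 1", None, sounds)
    if use_semantics and word in SEMANTICS:
        # Route two: recognise the word, look up its meaning, then its sound.
        return ("route 2", SEMANTICS[word], LEXICON[word])
    # Route three: recognise the word and go straight to its sound.
    return ("route 3", None, LEXICON[word])


print(read_aloud("zat"))
print(read_aloud("yacht"))
print(read_aloud("yacht", use_semantics=False))
```

The irregular word "yacht" shows why the direct routes matter: letter-by-letter conversion would get it wrong, whereas a stored pronunciation does not. The invented word "zat" can only be read by route one.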

The processing areas involved in the articulation of read words

To read out loud, the visual input needs to be processed by areas such as Wernicke’s area, the primary area for the comprehension of language. The output of this area leads to Broca’s area, which coordinates neurons in the motor cortex to produce speech.

Summary

It takes more processing to read out loud than it does to read silently. The difference lies in additional processes rather than in significant changes to the reading processes themselves. Consistent with this, experiments show that the rate of words read per minute is greater when reading silently than when reading out loud.

Based on the lectures of Prof. Peter McLeod, Department of Experimental Psychology, University of Oxford.
