13 December 2009

Dielxysa?




Can you raed tihs? When a student first sent me this image in an e-mail my haed hrut. A class discussion and a simple test ensued to try and get to the bottom of this puzzle regarding redundancy in language. Let's start with the fact that the Hungarian meme is a translation of an English text that spread like wildfire over the net in 2003:

Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.

Can the text be traced back to its source at Cambridge University? Well, it seems a number of people have tried, but to no avail. This may lead to our first doubts concerning this word scrambling game.
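The rule the text describes (first and last letter stay put, everything in between is shuffled) can be sketched in a few lines of Python. This is only my own minimal sketch, not the script that produced the original meme; the function names `scramble_word` and `scramble_text` and the `seed` parameter are made up for illustration.

```python
import random
import re

def scramble_word(word, rng):
    """Shuffle the inner letters of a word, keeping first and last in place."""
    if len(word) <= 3:
        # One inner letter or none: nothing to shuffle.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def scramble_text(text, seed=None):
    """Scramble every run of letters; punctuation and spaces are untouched."""
    rng = random.Random(seed)
    # [^\W\d_]+ matches runs of Unicode letters, so accented
    # Hungarian letters such as "é" and "á" are handled too.
    return re.sub(r"[^\W\d_]+", lambda m: scramble_word(m.group(0), rng), text)

print(scramble_text("According to a researcher at Cambridge University"))
```

Note that a purely random shuffle like this will often produce much harder words than the ones in the meme, which is part of the point made below.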

Here is a text on the subject created by a skeptical blogger using the "scramble script":

I snlreiecy digarese wtih the perisems put frtoh aobut scbrialnmg wrods, so I'm itionltlnaney enrovdaenig to ulizite leetnghir cpocmtaeild wodrs, not nclesiesray uonommcn wrdos, taht can not be dceerihped as ieuntlivity as tohse in the oirginal prgpraaah. The frist of my dsiceorives is taht wrods endnig in sufefxis or bnegining in pierxfes bmecoe daggesiend form the frist/lsat ltteer rothlpisneias taht spupedsloy are the baiss of the pmseires, and bemcoe mcuh mroe clinaelnhgg, amsolt ieclenaipbhrde. See?

This shows that some scrambled words are more difficult to decipher than others. Note here that words ending in suffixes or beginning with prefixes complicate matters enormously. And this leads us back to our Hungarian text, and the question: what do we mean by the word "word"?

Discussing the text in class, I used the following test (a randomly selected sentence from the internet):

A talaszteny 16:9-se kpnaénayárl rizelnkeekd

Geddit?

A tesztalany 16:9-es képaránnyal rendelkezik

(The test model has a 16:9 aspect ratio)

Here we are confused by a number of factors. First of all, there are two compound nouns, "tesztalany" and "képarány". You might argue that we should treat these as separate words, but there are other problems. The word képarány ("picture ratio" = "aspect ratio") ends in the comitative suffix -VAL/VEL, and to make matters worse, the initial V consonant of that suffix phonologically assimilates to the preceding consonant and lengthens it. (This is reflected typographically by the "nny" consonant cluster towards the end of the word.) Here then, the last letter is part of an inflectional ending and not of the "base word": the addition of inflectional letters simply adds to the confusion.

Then there is the verb "rendelkezik". If we break this verb down into its constituent morphemes we come up with something like this:

REND-EL-KEZ-IK
("order" + denominal verbal suffix + reflexive suffix + S3 person marker)
"s/he/it possesses"
Here the suffixes create a verb from a noun, and give information about reflexivity and person.

Let us look at another example, and see how the meaning of the word "unfolds" as more suffixes are added:

ki [out]

ki- véd- [out-defend]

ki- véd- és- [out-defend-deverbal nominal suff.-]

ki- véd- és- é- [out-defend-deverbal nominal suff.-S3 poss. suff.-]

ki- véd- és- é- re [out-defend-deverbal nominal suff.-S3 poss. suff.-sublative suff.]

(suff. = suffix; the full form "kivédésére" is roughly translatable as "in its/his/her defense")

If we jumble up the letters in between, we lose too much information, and the resulting word simply becomes a mess.

So how can we understand the Hungarian (translated) text? The translator probably made a number of conscious decisions in "creating" his/her scrambled words. In other words, s/he manipulated the jumbling of the text to make it easy to read.

First, note that in the Hungarian text there are many letters - especially consonants - which seem to be in the same order as in the original word. This makes the processing of the words a lot easier (note that in the writing systems of languages such as Hebrew, Arabic, and Persian, most vowels are normally left out). If most of the consonants are in approximately the right order and/or position, words are easy to make out: "iprmoetnt" is easy because it retains the right order, approximate location, and correct sounds for p - r and t - nt. Transpositions of adjacent letters are easier to read than more distant transpositions. Also, there are a number of "short" words included in the text. There is less shuffling of letters in short words than in long words, and none in, say, a definite article such as "a" [the]. Short words = easier to process.
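One rough way to put a number on "letters mostly in the right order" is the longest common subsequence (LCS) between the original word and its scrambled form: the more letters that can be read off in the same order in both, the gentler the scramble. This is my own illustrative measure, not one taken from the reading research, and the names `lcs_length` and `order_preserved` are hypothetical. "porbelm" comes from the English text above; "pelborm" is a harsher scramble of the same word that I made up for comparison.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def order_preserved(original, scrambled):
    """Share of the original's letters that keep their relative order."""
    if not original:
        return 1.0
    return lcs_length(original, scrambled) / len(original)

# Only adjacent letters swapped: 5 of 7 letters stay in order.
print(order_preserved("problem", "porbelm"))
# Inner letters moved far apart: only 3 of 7 stay in order.
print(order_preserved("problem", "pelborm"))
```

Both scrambles obey the "first and last letter" rule, yet the second keeps far less of the original order, which fits the intuition that adjacent transpositions are the easy ones.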

We might hypothesize the following:

The meme represents a simplistic reduction of the reading process in both languages.

In both languages, a number of function words [such as "the, be, and, you" etc.] remain unchanged - mostly because they are short words. This helps the reader by preserving the grammatical structure of the original, helping him/her to work out what word is likely to come next. Context also helps: the text is reasonably predictable. After you have understood the first few words of the sentence, you can guess what words are coming next (even with very little information from the letters in the word). Context plays an important role in understanding speech that is distorted or "noisy", and the same is probably true for written texts that have been jumbled.
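How many words does the scramble leave untouched? Any word whose inner letters are all the same (which includes every word of three letters or fewer) cannot be changed by the "first and last letter" rule. The sketch below counts the share of such words in a text. The function names `is_immune` and `immune_share` are my own, and the sample is simply the opening of the scrambled English text quoted earlier, since scrambling does not change word lengths.

```python
import re

def is_immune(word):
    """True if shuffling the inner letters cannot change the word."""
    # Zero or one distinct inner letter: "a", "to", "the", "book".
    return len(set(word[1:-1])) <= 1

def immune_share(text):
    """Fraction of words in text that the scramble leaves unchanged."""
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        return 0.0
    return sum(1 for w in words if is_immune(w)) / len(words)

sample = ("Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, "
          "it deosn't mttaer in waht oredr the ltteers in a wrod are")
print(immune_share(sample))
```

Running the same count on a Hungarian text would be a simple way to check whether the shorter English function words really do more of the work.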

However, on comparing Hungarian with English in more detail, we may conclude that the process of reading comprehension in Hungarian is to some degree different to the process in English, as grammatical information unfolds within words to a greater degree in Hungarian. By contrast, in English much grammatical information is conveyed by word order, as English is an inflectionally poor language.


It stands to reason that there is a stronger reliance on analytic strategies when reading words in an agglutinative language such as Hungarian, as prefixes, suffixes and inflections may provide a lot of additional grammatical information. In English many of our "words" map directly to morphemes, so perhaps there is a greater tendency towards "direct lexical access", otherwise known as "whole-word" or even "logographic" reading, where words are treated like images.


In conclusion, the text shows some of the power of redundancy in both languages, but it's not as simple as saying that letter order does not matter at all. It is a very reductive, pseudo-scientific meme.

And while we're on the subject, somebody should tell the people at French Connection UK that most people think their t-shirts read "fuck". Surely that's inappropriate. Maybe they got the idea from "CFUK", or "Conservative Future UK".

1 comment:

  1. I think you should publish this, perhaps as a reaction to this one:

    "Our findings suggest that effects of letter transposition probably reflect the principles of defining lexical space and lexical organization, and do not emerge from the peripheral registering of letters in alphabetic orthographies. In a recent study, Frost et al. (2005) argued that lexical space in Hebrew is organized in a radically different manner than that of English and other Indo-European languages. Whereas in English, words in the mental lexicon are aligned according to some orthographic dimension that registers their constituent letters, in Hebrew, lexical space is structured according to the morphological roots, so that all words derived from a given root are clustered together. If lexical access in a language such as Hebrew indeed requires the correct identification of a specific root morpheme, and many roots share the same set of letters, the primary task of the lexical system is to determine the exact identity and order of letters constituting the root morpheme. Root-letter transpositions will therefore prevent the processing system from extracting the correct root identity necessary for the lexical search. This would produce genuine differences in sensitivity to letter transpositions in Hebrew, when compared with English. Thus, whereas readers of English seem to display some “blindness” to transpositions in RSVP, readers of Hebrew seem to display extreme difficulties in reading transposed text." http://pbr.psychonomic-journals.org/content/14/5/913.full.pdf
