Wednesday, July 3, 2013

English Is Weird, Basque Is Not

A natural language processing software company devised a "language weirdness index" that separates languages whose features in the WALS database are atypical of the database's languages on multiple dimensions from languages whose features are more common there.

Out of 239 languages for which there was enough data to make a meaningful evaluation, English ranked 33rd most weird, while Basque ranked 10th least weird. Hindi was the most "normal" language in the database, while Mixtec (Chalcatongo), which is spoken by about 6,000 people in Oaxaca, Mexico, was the most weird by this measure.

The lengthy analysis in the linked post discusses how the index was formulated and what strengths and weaknesses it has as a measure of language weirdness.
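To make the general idea concrete, here is a minimal sketch of one way such an index could be computed. It is not the formula from the linked post: the scoring rule (one minus the average share of languages that have the same value for each feature), the function name `weirdness_scores`, and the toy data are all assumptions made for illustration, and none of the values come from WALS.

```python
from collections import Counter

# Hypothetical toy data (NOT real WALS values): language -> {feature: value}.
TOY_FEATURES = {
    "A": {"word_order": "SOV", "has_tones": "no"},
    "B": {"word_order": "SOV", "has_tones": "no"},
    "C": {"word_order": "SVO", "has_tones": "no"},
    "D": {"word_order": "VSO", "has_tones": "yes"},
}

def weirdness_scores(features):
    """Score each language as 1 minus the mean share of languages that
    have the same value for each of its features (an assumed rule).
    Assumes every language has at least one recorded feature."""
    # Tally how often each value occurs for each feature.
    counts = {}
    for values in features.values():
        for feat, val in values.items():
            counts.setdefault(feat, Counter())[val] += 1

    scores = {}
    for lang, values in features.items():
        # Share of languages with data on this feature that have this value.
        shares = [counts[f][v] / sum(counts[f].values())
                  for f, v in values.items()]
        scores[lang] = 1 - sum(shares) / len(shares)
    return scores

scores = weirdness_scores(TOY_FEATURES)
# List languages from most to least "weird" under this assumed rule.
for lang in sorted(scores, key=scores.get, reverse=True):
    print(lang, round(scores[lang], 3))
```

In this toy set, language D scores highest because both of its feature values are rare, while A and B tie for lowest because they share the most common values.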

2 comments:

Maju said...

I'd say that even native English-speakers would agree that English is weird. What is less clear to me is why Spanish ranks even more weird, being relatively regular and with a rather simple phonetic structure.

That Basque is not 'weird' is not surprising to me either: it's a very "logical" and fairly regular language in spite of its verbal complexity (you study verbs as simple tables of prefix-root-suffix, for example). Its phonetic principles are not too different from those of Spanish (which is one of several Basque-influenced Romance languages), with very limited exceptions like the presence of -k at the end of some declensions (subject and DO essentially).

andrew said...

One fascinating aspect of the index is how unstable it is within language families.