How many letters are in an average word?

When processing text, people often wonder how many characters an average word contains. The answer is not straightforward: it depends on the complexity of the text, the language, and the counting criteria, such as whether contractions (like “don’t,” “can’t,” or “won’t”) count as one word or two.

What are the average word lengths in different languages?

Running simple counts on public domain texts (which, for many reasons, may not be representative of everyday usage) gives a rough idea of the average word length, in characters per word, in different languages:

  • Spanish: 4.776
  • English: 4.793
  • Portuguese: 4.794
  • French: 4.886
  • Italian: 5.327
  • German: 5.991

In short, if you ask how long the average word is, the answer for English is about 4.8 characters.
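The article does not say exactly how these figures were computed. As a minimal sketch of one way to measure average word length, the snippet below assumes that a word is a run of letters, optionally joined by apostrophes or hyphens, and that only letters count toward its length. The function name `average_word_length` and the regular expression are illustrative choices, not the method behind the numbers above.

```python
import re

# Assumption: a "word" is a run of letters, optionally joined by
# apostrophes or hyphens, so "don't" and "well-known" are one word each.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’\-][^\W\d_]+)*")


def average_word_length(text: str) -> float:
    """Return the mean number of letters per word (0.0 if there are no words)."""
    words = WORD_RE.findall(text)
    if not words:
        return 0.0
    # Count letters only, so apostrophes and hyphens do not inflate the length.
    total_letters = sum(sum(ch.isalpha() for ch in word) for word in words)
    return total_letters / len(words)


sample = "The quick brown fox jumps over the lazy dog"
print(round(average_word_length(sample), 3))  # 35 letters / 9 words -> 3.889
```

A single short sentence like this one lands well below the corpus averages listed above, which is a reminder that the result depends heavily on the text being measured.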

For the sake of simplicity, we can assume that the average word length in a random text is about 5 characters per word. That German words are longer surprises no one, but to be honest, the numbers made me wonder why Italian words are longer than Spanish or French ones.

Italian words can sometimes appear longer due to factors such as the inclusion of prefixes, suffixes, or additional syllables that convey nuances of meaning. However, this is not a consistent rule for all words in the language. In Spanish and French, you can also find words that vary in length based on their etymology and usage.

For example:

Italian:

  • Incredibile (incredible)
  • Immaginazione (imagination)

Spanish:

  • Increíble (incredible)
  • Imaginación (imagination)

French:

  • Incroyable (incredible)
  • Imagination (imagination)

The factors that affect word length the most are:

1. Contractions and Use of Abbreviations:

  • The handling of contractions such as “don’t,” “can’t,” or “won’t” can vary. Some may consider them to be a single word, while others may count them as two words.

2. Hyphenation:

  • Words connected by hyphens, such as “well-known” or “self-esteem,” may be counted as one or more words depending on the processing criteria.

3. Compound Words:

  • Languages like German often use compound words, which can significantly affect the average word length depending on whether each component is counted separately.
  • Example: English: “Airplane ticket” German: “Flugzeugticket” (Flugzeug + Ticket)
  • Extreme Example: English: “International Airport Security Check Procedure” German: “Internationalflughafensicherheitskontrollverfahren”

4. Agglutination:

  • Languages that agglutinate, like Finnish or Turkish, may form long words by combining multiple morphemes, influencing word length.

Turkish: Word: “Evlerden”

  • “Ev” means “house” in Turkish.
  • “Evler” means “houses.”
  • “Evlerden” means “from the houses.”

Finnish: Word: “Kirjoittamattomissakaan”

  • “Kirjoittaa” means “to write” in Finnish.
  • “Kirjoittamaton” means “unwritten.”
  • “Kirjoittamattomissa” means “in the unwritten.”
  • “Kirjoittamattomissakaan” means “in the unwritten (form) either.”

5. Inflections:

  • Inflections, such as verb conjugations or noun declensions, can contribute to longer or shorter word forms depending on the language.
  • Spanish (verb conjugation): base form “hablar” (to talk); inflected forms “hablo” (I talk), “hablas” (you talk), “habla” (he/she talks), “hablamos” (we talk), etc.
  • German (noun declension): base form “Hund” (dog); genitive “des Hundes” (of the dog); dative “dem Hund” (to the dog).

6. Reduplication:

  • Some languages use reduplication (repeating all or part of a word) for various purposes, affecting the perceived length of words. For example: “chit-chat” or “ping-pong.”
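To see how much the counting criteria for contractions and hyphenated words can shift the result, here is a minimal sketch that tokenizes the same sentence two ways. The `tokenize` and `average_length` helpers and their parameters are hypothetical names introduced for illustration, and splitting at the apostrophe is a deliberately naive rule (it yields “don” and “t”, not “do” and “not”). The type hints assume Python 3.9 or later.

```python
import re

LETTERS = r"[^\W\d_]+"  # one or more Unicode letters


def tokenize(text: str, split_contractions: bool = False,
             split_hyphens: bool = False) -> list[str]:
    """Split text into words under configurable (illustrative) criteria."""
    joiners = ""
    if not split_contractions:
        joiners += "'’"   # keep apostrophes inside a word
    if not split_hyphens:
        joiners += r"\-"  # keep hyphens inside a word
    if joiners:
        pattern = rf"{LETTERS}(?:[{joiners}]{LETTERS})*"
    else:
        pattern = LETTERS
    return re.findall(pattern, text)


def average_length(words: list[str]) -> float:
    """Mean letters per word, ignoring apostrophes and hyphens."""
    if not words:
        return 0.0
    return sum(sum(c.isalpha() for c in w) for w in words) / len(words)


text = "I don't like well-known clichés"
joined = tokenize(text)
split = tokenize(text, split_contractions=True, split_hyphens=True)

print(joined, average_length(joined))          # 5 words, average 5.0
print(split, round(average_length(split), 3))  # 7 words, average 3.571
```

The letter count is the same in both runs (25), so the whole difference in the average comes from how many words the tokenizer decides there are.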
