The English language is a lot more French than we thought, here’s why

DISCLAIMER: I am not a linguist and have no personal opinion on how English should be classified. This article only researches the statistics behind English vocabulary, as no such data currently seems to be available.

The English language and its origins have been a topic of fierce debate among linguists. English is classified as a (West) Germanic language, meaning that it is closely related to other Germanic languages such as Swedish, Dutch and German. The other dominant language family in Western Europe is the group of Romance languages: French, Italian, Spanish… all languages that sprouted from Latin at some point in history.

Unlike other Germanic languages, English shares a large portion of its vocabulary with French and Latin, which is often attributed to the period of Norman French dominance in England after 1066. The size of this Romance influence, along with other technical aspects such as pronunciation and syntax, has led some radical linguists to argue that English should not be seen as a Germanic language at all, but rather as a Romance-Germanic hybrid. However, the general consensus is that about a third of the overall English vocabulary is of Old English (so, Germanic) origin, while the core vocabulary is entirely Old English. The keyword here is core: most linguists claim that French and Latin influence accounts for only a handful of basic words but the vast majority of academic terms. For many, this is the most important criterion for classifying English as a Germanic language.

I personally don’t care much about these classifications, but I was very surprised to discover that no one has recently bothered to research the origins of English vocabulary, let alone the core! The latest research was done in 1975 by Joseph M. Williams, who examined the 10,000 most frequently used words in English, based on a rather small sample of corporate letters. Here are my issues with his research:

And core vocabulary is precisely what this whole debate is about, so I decided to do a little research of my own using Python to see what statistics I could provide behind these claims!

The Oxford Dictionary claims that there are roughly 250 000 distinct words in the English vocabulary. But what share of them represents the core vocabulary? What does that even mean? The Oxford Dictionary uses the following table, which relates the most common English words to how often they appear in English sources:

This table reveals a rather large problem: how often words occur in actual English usage does not reflect the (core) vocabulary, or even the language as a whole. Roughly 50% of any given English text consists of the same 100 linkers and pronouns, even though those 100 words make up only 0.04% of the distinct English vocabulary (100 out of roughly 250 000). The word “the” alone makes up about 6% of any given English source. This disproportionate use of extremely basic structural words deceives the reader into thinking that the English vocabulary has an entirely different etymological composition than it actually does.
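To make the difference between word occurrences and distinct words concrete, here is a minimal Python sketch. It is not the code used for this research: the `tokenize` and `coverage` helpers and the sample sentence are made up for illustration, and only the 100 and 250 000 figures come from the text above.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def coverage(tokens, top_n):
    """Return (token_share, type_share) for the top_n most frequent words.

    token_share: fraction of all word occurrences covered by those words.
    type_share:  fraction of the distinct vocabulary those words represent.
    """
    counts = Counter(tokens)
    top = counts.most_common(top_n)
    token_share = sum(count for _, count in top) / len(tokens)
    type_share = len(top) / len(counts)
    return token_share, type_share


# Share of the distinct vocabulary taken up by the 100 most common words,
# using the figures quoted above.
vocabulary_share = f"{100 / 250_000:.2%}"
print(vocabulary_share)

# Hypothetical toy text, only to show the two shares diverging.
sample = "the cat sat on the mat and the dog sat on the rug"
tokens = tokenize(sample)
token_share, type_share = coverage(tokens, top_n=2)
print(f"top 2 words: {token_share:.0%} of occurrences, {type_share:.0%} of distinct words")
```

Even in this tiny sample the two most frequent words cover a much larger share of the occurrences than of the distinct vocabulary, which is the same effect the table shows for real English text on a far larger scale.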
