Islamic Forum


There are 77,430 words in the Quran, but it is said that the Quran
has only around 2,000 (two thousand) unique (unrepeated) words. I am trying
to categorize those words in a way that is easy for non-native learners
of Arabic to comprehend. I think this is the quickest way to understand
the Quran, because you need not try to understand it word by word. What
you need is to group the basic vocabulary with simple examples.


:sl:

Please post your Quran related topics in the designated section. I've moved this topic for you.


 

How many unique words are there in the Quran?

I put this question to Kais Dukes, author of the Quranic Arabic Corpus. My email read as follows:

Dear Brother,

Peace be upon you.

I have gone through your website and found it essential for learners, researchers and curious Muslims.

I have a question for you: how many words are there in the Holy Quran without repetition? In other words, how many unique words are there in the Quran?

I hope you have the answer. If your answer is from a secondary source, please refer to the relevant sources.

With best wishes,

Md. Fazlul Haque

  In response to my question, he wrote:
Salamu Alaykum Fazlul Haque,

To the best of my knowledge, our project is the first accurate annotated morphological work for the Quran by computer, so I would be surprised at an accurate unique word count from another secondary source. Although of course, I could be wrong. The number of unique Arabic words in the Quran is not an easy question to answer. In Arabic the concept of a "word" can have multiple technical linguistic interpretations. Based on the existing annotation we have performed at the Quranic Arabic Corpus (http://corpus.quran.com), I can provide the following statistics:

Total number of space-separated words = 77,430
Number of *unique* surface forms (i.e. space-separated word-forms, including clitics) = 18,994
Number of unique words by *stem* = 12,183
Number of unique words by *root* = 1,685 (not necessarily a great metric for unique word counting, e.g. pronouns have no Semitic root)
Number of unique words by *lemma* = 3,382 (excluding verbs, and other words where lemma is not annotated).

This is a primary source (we annotated this ourselves). These figures are quite accurate, but are subject to minor revision as further checking occurs. The terms used above have technical linguistic meanings. Thus, the number of unique "words" is not only a problem of counting. We have computers, so counting annotated data is in theory very simple; I produced the above statistics after 10 minutes of work just now. The issue is what metric to use ... unique white-space separated word-forms, stems, roots, lemmas, or something else? Unlike English, Arabic is a highly inflected and morphologically rich language, with multiple segments often fused into a single word-form.

As an estimate, I would say that there are at most 7,000 unique "words" in the Quran in the sense of what you would need to have a lexicon with wide-ranging coverage for the Quran. Something also interesting to note is the Zipfian distribution. A handful of words (e.g. the top 100 words) will cover a very large percentage of the actual Quran, i.e. most verses (the 80/20 rule).

You might be interested in these web pages:

http://corpus.quran.com/lemmas.jsp - List of unique lemmas in the Quran organized by frequency
http://corpus.quran.com/verbs.jsp - List of unique verbs in the Quran organized by frequency

Sorry for giving you such a vague linguist's response, but in Arabic the concept of a unique word is itself vague, and Arabic linguists (or at least computational Arabic linguists) tend to prefer to work with better defined terms such as the white-space separated tokens, surface form, lemma, stem and root, but even then those terms also have problems :-)

I would suggest that the above two web pages, with lists of the most frequently occurring lemmas and verb roots, are probably more what you are looking for.

If you have any further questions, please ask, I would be happy to help.

-- Kais Dukes
Language Research Group
School of Computing
University of Leeds
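
To make the difference between these metrics concrete, here is a minimal Python sketch of how the same token list gives different "unique word" counts depending on whether you count surface forms, lemmas or roots, and how top-N coverage can be measured. The token list is a small hand-made toy sample in rough transliteration, not data from the Quranic Arabic Corpus, and the names `TOKENS`, `unique_counts` and `top_n_coverage` are my own illustrative choices.

```python
from collections import Counter

# Toy, hand-made annotations: (surface form, lemma, root).
# These are illustrative only and are NOT taken from the corpus.
TOKENS = [
    ("alkitab", "kitab", "ktb"),    # "the book" (article fused to the noun)
    ("wakitab", "kitab", "ktb"),    # "and a book" (conjunction fused)
    ("kutub", "kitab", "ktb"),      # "books" (plural of the same lemma)
    ("kataba", "kataba", "ktb"),    # "he wrote"
    ("yaktubu", "kataba", "ktb"),   # "he writes" (same verb lemma)
    ("huwa", "huwa", None),         # pronoun: no root
    ("alkitab", "kitab", "ktb"),
    ("huwa", "huwa", None),
]


def unique_counts(tokens):
    """Count total tokens and unique items under each metric."""
    surfaces = {surface for surface, _, _ in tokens}
    lemmas = {lemma for _, lemma, _ in tokens}
    # Words without a root (e.g. pronouns) are skipped in the root count.
    roots = {root for _, _, root in tokens if root is not None}
    return {
        "tokens": len(tokens),
        "surface_forms": len(surfaces),
        "lemmas": len(lemmas),
        "roots": len(roots),
    }


def top_n_coverage(tokens, n):
    """Fraction of all tokens covered by the n most frequent lemmas."""
    freq = Counter(lemma for _, lemma, _ in tokens)
    covered = sum(count for _, count in freq.most_common(n))
    return covered / len(tokens)


if __name__ == "__main__":
    print(unique_counts(TOKENS))
    print(top_n_coverage(TOKENS, 1))
```

On this toy sample, 8 tokens reduce to 6 surface forms, 3 lemmas and 1 root, and the single most frequent lemma covers half of the tokens. The real figures quoted above come from the corpus annotation, not from a script like this one.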
