Islamic Forum
mfhaq77

How To Analyze Quranic Arabic Corpus Morphological Data 0.4


Alhamdu Lillah, the Quranic Arabic Corpus is a great resource for learners and researchers of the Quran all over the world. The author of the project, Kais Dukes, distributes its morphology data free of charge. If you learn how to analyse it, you will be able to extract a great deal of information about the language of the Quran. The following discussion will guide you through analysing the data.

 

If you want to analyse the Quranic Corpus, download the morphology file from
corpus.quran.com/download/, import the txt file into MS Access
2007/2010, and use the Query option to get the result you need. Analysis
based on the FEATURES column is a little tricky, because several features are packed into a single cell.

 

Before analyzing Quranic Arabic Corpus morphological data 0.4, you have to learn some terms of corpus linguistics.

In linguistics, a morpheme is the smallest meaningful unit in
a language. The field of study dedicated to morphemes is called
morphology. Morphemes are of two types: free and bound. A
morpheme (or word element) that can stand alone as a word is called
free. It is sometimes called a stem, because other non-free elements are
added to it.

In morphology, a bound morpheme is a morpheme that appears only as part of a larger word. The most common bound morphemes are affixes.

Affixes are of three main types: prefix, infix and suffix. (A fourth type, the circumfix, wraps around the word.) All affixes are bound morphemes.

Prefixes are bound morphemes which occur before other morphemes. Examples: un- (uncover, undo)

Infixes are bound morphemes which are inserted inside other morphemes. English has no regular infixes; the closest analogy is an internal change such as food > feed.

Suffixes are bound morphemes which occur after other morphemes.

Examples:

-er (singer, performer)

-ist (typist, pianist)

-ly (manly, friendly)

Quranic Arabic Corpus morphological data 0.4 uses these and other related linguistic terms.

Let me explain the columns and a few rows.

LOCATION is the Surah:Ayah:word:morpheme reference in the Quran. FORM is the
transliteration of the surface Arabic word form in Latin characters, based
on Buckwalter transliteration. See the chart:

http://corpus.quran.com/java/buckwalter.jsp
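
To read the FORM column as Arabic script, the transliteration can be reversed character by character. The following is only a minimal sketch: the table covers the core Buckwalter letters and vowel marks, not the extra Quranic symbols shown in the chart above, and the function name buckwalter_to_arabic is my own invention, not part of the corpus.

```python
# Partial Buckwalter-to-Arabic table: core letters and marks only.
# The extended Quranic symbols from the chart are deliberately left out.
BUCKWALTER = {
    "'": "\u0621", "|": "\u0622", ">": "\u0623", "&": "\u0624", "<": "\u0625",
    "}": "\u0626", "A": "\u0627", "b": "\u0628", "p": "\u0629", "t": "\u062A",
    "v": "\u062B", "j": "\u062C", "H": "\u062D", "x": "\u062E", "d": "\u062F",
    "*": "\u0630", "r": "\u0631", "z": "\u0632", "s": "\u0633", "$": "\u0634",
    "S": "\u0635", "D": "\u0636", "T": "\u0637", "Z": "\u0638", "E": "\u0639",
    "g": "\u063A", "f": "\u0641", "q": "\u0642", "k": "\u0643", "l": "\u0644",
    "m": "\u0645", "n": "\u0646", "h": "\u0647", "w": "\u0648", "Y": "\u0649",
    "y": "\u064A", "F": "\u064B", "N": "\u064C", "K": "\u064D", "a": "\u064E",
    "u": "\u064F", "i": "\u0650", "~": "\u0651", "o": "\u0652", "`": "\u0670",
    "{": "\u0671",
}


def buckwalter_to_arabic(text: str) -> str:
    """Map each known Buckwalter character to Arabic; pass unknown ones through."""
    return "".join(BUCKWALTER.get(ch, ch) for ch in text)


# "bi" + "somi" are the two morphemes of the first word of the Quran.
print(buckwalter_to_arabic("bi" + "somi"))
print(buckwalter_to_arabic("naEobudu"))
```

Note that "o" maps to the sukun mark, not to a vowel.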

TAG is the lexical or grammatical category (part of speech) of the morpheme concerned. FEATURES
describes the detailed linguistic features of the morpheme, separated by the | character.
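
If you prefer a script to Access or Excel, the four columns can be read with a few lines of Python. This is a sketch under assumptions: I assume the file is tab-separated, that any licence or comment lines start with #, and that there is one header row beginning with LOCATION. The names Segment and read_segments are my own.

```python
import csv
import io
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    location: str
    form: str
    tag: str
    features: str


def read_segments(lines: Iterable[str]) -> list[Segment]:
    """Parse tab-separated corpus lines into Segment records."""
    segments = []
    # QUOTE_NONE: transliterated forms may contain quote characters.
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Skip blank lines, '#' comment lines (assumed) and the header row.
        if len(row) != 4 or row[0].startswith("#") or row[0] == "LOCATION":
            continue
        segments.append(Segment(*row))
    return segments


# Two rows quoted in this post, used here instead of the real file.
SAMPLE = (
    "# comment line (assumed format)\n"
    "LOCATION\tFORM\tTAG\tFEATURES\n"
    "(1:1:1:1)\tbi\tP\tPREFIX|bi+\n"
    "(1:1:1:2)\tsomi\tN\tSTEM|POS:N|LEM:{som|ROOT:smw|M|GEN\n"
)

segments = read_segments(io.StringIO(SAMPLE))
for seg in segments:
    print(seg.location, seg.form, seg.tag, seg.features)
```

For the real data, pass an open file instead of the sample, for example open("quranic-corpus-morphology-0.4.txt", encoding="utf-8", newline="").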

Description of FEATURES
In morphology and lexicography, a lemma (plural lemmas or lemmata) is the
canonical form, dictionary form, or citation form of a set of words
(headword). In English, for example, run, runs, ran and running are
forms of the same lexeme, with run as the lemma. Lexeme, in this
context, refers to the set of all the forms that have the same meaning,
and lemma refers to the particular form that is chosen by convention to
represent the lexeme.

Difference between stem and lemma
In computational linguistics, a stem is the part of the word that never
changes even when morphologically inflected, whilst a lemma is the base
form of the word. For example, from "produced", the lemma is "produce",
but the stem is "produc-", because there are words such as
"production". In linguistic analysis, the stem is defined more generally
as the analyzed base form from which all inflected forms can be formed.

For explanations of the other abbreviated terms, go to this page:

http://corpus.quran.com/documentation/tagset.jsp

For verb forms, refer to this page:

http://corpus.quran.com/documentation/verbforms.jsp

The First Word of Quran Bismi

The first word of the Quran, Bismi, consists of two morphemes: bi, which is used
as a prefix, and somi, which is a noun and the stem. (Don't think that the "o" in somi is like English
"o"; in Buckwalter transliteration it is the symbol for sukun.)
In the FEATURES column, POS means part of speech and N means noun. Its lemma is {som
(the initial hamzah of the lemma is dropped in Bismi because the phrase is so widely used), which is derived from the
triliteral ROOT smw, i.e. س م و. It is a masculine (M) noun, used here in the
genitive case (GEN), i.e. مجرور, because it follows the preposition bi.

LOCATION     FORM    TAG    FEATURES
(1:1:1:1)    bi      P      PREFIX|bi+
(1:1:1:2)    somi    N      STEM|POS:N|LEM:{som|ROOT:smw|M|GEN
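
The FEATURES cell can be broken into its parts by splitting on the | character. Here is a small sketch of that idea; the function name parse_features and the layout of the resulting dictionary are my own choices, not something defined by the corpus.

```python
def parse_features(features: str) -> dict:
    """Split a FEATURES string into key:value pairs and bare flags."""
    parts = features.split("|")
    # The first token is the segment type: PREFIX or STEM in the rows above.
    parsed = {"segment": parts[0], "flags": []}
    for part in parts[1:]:
        key, sep, value = part.partition(":")
        if sep:
            parsed[key] = value            # e.g. POS:N, LEM:{som, ROOT:smw
        else:
            parsed["flags"].append(part)   # e.g. M, GEN, bi+
    return parsed


prefix = parse_features("PREFIX|bi+")
stem = parse_features("STEM|POS:N|LEM:{som|ROOT:smw|M|GEN")
print(prefix)
print(stem)
```

After this, stem["ROOT"] gives smw and stem["flags"] gives the gender and case marks.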

The First Explicit Verb of the Quran

The first explicit verb of the Quran is the 2nd word of the fifth verse of the first chapter, Al-Fatihah:

(1:5:2:1) naEobudu V STEM|POS:V|IMPF|LEM:Eabada|ROOT:Ebd|1P

This is an IMPERFECT verb (present-future tense) used in the 1st person plural.

The Second Verb

(1:5:4:1) nasotaEiynu V STEM|POS:V|IMPF|(X)|LEM:{sotaEiynu|ROOT:Ewn|1P

This is also an IMPERFECT verb, used in Form X; the ROOT is Ewn, i.e. ع و ن.
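
The verb form can be picked out of the FEATURES string by looking for a Roman numeral in brackets. This is a rough sketch: I assume that a verb with no bracketed numeral, like naEobudu above, is Form I, and the function name verb_form is hypothetical.

```python
import re

# A whole token such as "(X)" or "(IV)": a Roman numeral in brackets.
FORM_MARKER = re.compile(r"^\(([IVX]+)\)$")


def verb_form(features: str) -> str:
    """Return the verb form numeral found in a FEATURES string."""
    for token in features.split("|"):
        match = FORM_MARKER.match(token)
        if match:
            return match.group(1)
    # Assumption: no marker means the basic Form I.
    return "I"


print(verb_form("STEM|POS:V|IMPF|LEM:Eabada|ROOT:Ebd|1P"))
print(verb_form("STEM|POS:V|IMPF|(X)|LEM:{sotaEiynu|ROOT:Ewn|1P"))
```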


How To Analyze:

Download the txt file, then copy and paste it into Excel 2007/2010 (Excel 2003 won't help, because a sheet there holds at most 65,536 rows and the corpus file has more than 128,000).

The rows and columns will be separated. Now the analysis depends on what you want out of the QAC.

If you want to know how many prepositions are used in the Quran, you can do so
by auto-filtering the TAG column: choose Data > Filter, and in the drop-down
deselect 'Select all' and check P. You will get all the prepositions used in
the Quran. How many?

OK, in the first blank cell below the data in column C, write
this formula: =COUNTIF(C1:C128215, "P") and press ENTER. You will get 13006.
Unfortunately, you will not get this figure from the site

http://corpus.quran.com/morphologicalsearch.jsp
There you will get only 7679, because only prepositions that are stems are counted, not the
prefixed and suffixed prepositions. There are 7679 stem prepositions, 5325
prefix prepositions and 2 suffix prepositions in the Quran, so the total is 7679 + 5325 + 2 =
13006.
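
The same breakdown can be produced in a script by grouping the P rows on the first token of FEATURES. This is a sketch only: the sample rows are the few quoted in this post, I assume the suffix rows start with SUFFIX in the same way as STEM and PREFIX, and the function name is my own.

```python
from collections import Counter


def count_tag_by_segment(rows, tag):
    """Count rows with the given TAG, grouped by the first FEATURES token."""
    counts = Counter()
    for _location, _form, row_tag, features in rows:
        if row_tag == tag:
            segment_type = features.split("|", 1)[0]  # STEM, PREFIX, ...
            counts[segment_type] += 1
    return counts


# Rows quoted earlier in this post; the full file has far more.
rows = [
    ("(1:1:1:1)", "bi", "P", "PREFIX|bi+"),
    ("(1:1:1:2)", "somi", "N", "STEM|POS:N|LEM:{som|ROOT:smw|M|GEN"),
    ("(1:5:2:1)", "naEobudu", "V", "STEM|POS:V|IMPF|LEM:Eabada|ROOT:Ebd|1P"),
]
sample_counts = count_tag_by_segment(rows, "P")
print(dict(sample_counts))

# Figures reported in this post for the full corpus.
reported = {"STEM": 7679, "PREFIX": 5325, "SUFFIX": 2}
print(sum(reported.values()))
```

Run on the full file, the three groups should add up to the COUNTIF result.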



Sometimes Quranic Arabic Corpus morphological data 0.4 helps you
find specific data very quickly. For example, if you want to
know the past passive verbs used in the Quran, you can find them within
seconds. Here is the list of past passive verbs used in the Quran. (Here
FORM is the passive form; go to the ayah and check it.)

LOCATION FORM TAG

(4:157:15:1) $ub~iha V

(6:118:3:1) *ukira V

(5:3:23:1) *ubiHa V

(5:13:15:1) *uk~iru V

(76:14:4:2) *ul~ilato V

(2:283:16:1) {&otumina V

(33:11:2:1) {botuliYa V

(2:173:14:1) {DoTur~a V

(14:26:6:1) {jotuv~ato V

(7:75:8:1) {sotuDoEifu V

(42:16:8:1) {sotujiyba V

(5:44:17:1) {sotuHofiZu V

(6:10:2:1) {sotuhozi}a V

(2:166:4:1) {t~ubiEu V

(11:110:5:2) {xotulifa V

(54:9:9:2) {zodujira V

(22:39:1:1) >u*ina V

(2:24:12:1) >uEid~ato V

(9:58:7:1) >uEoTu V

(10:22:28:1) >uHiyTa V

(2:187:1:1) >uHil~a V

(4:128:18:2) >uHoDirati V

(69:5:3:2) >uholiku V

(4:25:36:1) >uHoSi V

(77:12:3:1) >uj~ilato V

(7:120:1:2) >uloqiYa V

(4:60:21:1) >umiru V

(18:56:17:1) >un*iru V

(2:4:4:1) >unzila V

(72:10:5:1) >uriyda V

(4:91:13:1) >urokisu V

(7:6:3:1) >urosila V

(9:108:6:1) >us~isa V

(2:25:25:2) >utu V

(11:60:1:2) >utobiEu V

(6:19:11:2) >uwHiYa V

(7:43:33:1) >uwrivo V

(8:70:18:1) >uxi*a V

(2:246:40:1) >uxorijo V

(2:93:15:2) >u$oribu V

(6:70:34:1) >ubosilu V

(3:185:14:2) >udoxila V

(22:22:8:1) >uEiydu V

(51:9:4:1) >ufika V

(10:27:16:1) >ugo$iyato V

(71:25:3:1) >ugoriqu V

(2:173:9:1) >uhil~a V

(11:1:3:1) >uHokimato V

(2:196:6:1) >uHoSiro V

(5:109:7:1) >ujibo V

(16:106:9:1) >ukoriha V

(25:40:6:1) >umoTirato V

(77:11:3:1) >uq~itato V

(11:116:23:1) >utorifu V

(3:195:22:2) >uw*u V

(32:17:5:1) >uxofiYa V

(26:90:1:2) >uzolifati V

(2:101:14:1) >uwtu V

(27:8:5:1) buwrika V

(22:60:9:1) bugiYa V

(16:58:2:1) bu$~ira V

(82:4:3:1) buEovirato V

(2:258:36:2) buhita V

(26:91:1:2) bur~izati V

(56:5:1:2) bus~ati V

(2:282:77:1) duEu V

(2:61:37:2) Duribato V

(33:14:2:1) duxilato V

(69:14:4:2) duk~a V

(16:126:6:1) Euwqibo V

(2:178:16:1) EufiYa V

(6:91:31:2) Eul~imo V

(18:48:1:2) EuriDu V

(11:28:14:2) Eum~iyato V

(81:4:3:1) EuT~ilato V

(5:107:2:1) Euvira V

(16:71:10:1) fuD~ilu V

(34:54:7:1) fuEila V

(11:1:6:1) fuS~ilato V

(21:96:3:1) futiHato V

(16:110:9:1) futinu V

(82:3:3:1) fuj~irato V

(77:9:3:1) furijato V

(34:23:11:1) fuz~iEa V

(5:64:6:1) gul~ato V

(7:119:1:2) gulibu V

(11:44:7:2) giyDa V

(27:17:1:2) Hu$ira V

(34:54:1:2) Hiyla V

(3:101:14:1) hudiYa V

(69:14:1:2) Humilati V

(84:2:3:2) Huq~ato V

(3:50:11:1) Hur~ima V

(4:86:2:1) Huy~iy V

(22:40:18:2) hud~imato V

(76:21:6:2) Hul~u V

(20:87:7:1) Hum~ilo V

(100:10:1:2) HuS~ila V

(39:69:7:2) jiA[at]Y^'a V

(16:124:2:1) juEila V

(26:38:1:2) jumiEa V

(3:184:4:1) ku*~iba V

(12:110:8:1) ku*ibu V

(17:35:4:1) kilo V

(54:14:6:1) kufira V

(13:31:12:1) kul~ima V

(2:178:4:1) kutiba V

(11:55:3:2) kiydu V

(81:11:3:1) ku$iTato V

(27:90:4:2) kub~ato V

(58:5:6:1) kubitu V

(26:94:1:2) kubokibu V

(81:1:3:1) kuw~irato V

(5:64:8:2) luEinu V

(3:159:5:1) lin V

(23:35:4:1) mi V

(12:63:7:1) muniEa V

(84:3:3:1) mud~ato V

(18:18:20:3) muli}o V

(34:7:10:1) muz~iqo V

(7:43:29:2) nuwdu V

(68:49:7:2) nubi*a V

(18:99:7:2) nufixa V

(4:161:4:1) nuhu V

(12:110:11:2) nuj~iYa V

(6:37:3:1) nuz~ila V

(81:10:3:1) nu$irato V

(21:65:2:1) nukisu V

(74:8:2:1) nuqira V

(88:19:4:1) nuSibato V

(77:10:3:1) nusifato V

(59:11:23:1) quwtilo V

(2:11:2:1) qiyla V

(54:12:9:1) qudira V

(2:210:12:2) quDiYa V

(7:204:2:1) quri}a V

(13:31:8:1) quT~iEato V

(3:144:13:1) qutila V

(12:26:13:1) qud~a V

(33:61:5:2) qut~ilu V

(6:45:1:2) quTiEa V

(4:91:10:1) rud~u V

(88:18:4:1) rufiEato V

(41:50:17:1) r~ujiEo V

(2:25:14:1) ruziqu V

(56:4:2:1) ruj~ati V

(2:108:7:1) su}ila V

(11:77:5:1) siY^'a V

(40:37:15:2) Sud~a V

(47:15:40:2) suqu V

(7:47:2:1) Surifato V

(39:71:1:2) siyqa V

(13:33:30:2) Sud~u V

(81:12:3:1) suE~irato V

(11:108:3:1) suEidu V

(81:6:3:1) suj~irato V

(15:15:3:1) suk~irato V

(7:149:2:1) suqiTa V

(88:20:4:1) suTiHato V

(13:31:4:1) suy~irato V

(39:73:18:1) Tibo V

(9:87:6:2) TubiEa V

(8:2:10:1) tuliyato V

(5:27:10:2) tuqub~ila V

(77:8:3:1) Tumisato V

(3:112:6:1) vuqifu V

(83:36:2:1) vuw~iba V

(3:96:4:1) wuDiEa V

(13:35:4:1) wuEida V

(3:25:8:2) wuf~iyato V

(12:75:4:1) wujida V

(19:15:4:1) wulida V

(7:20:7:1) wu,riYa V

(32:11:6:1) wuk~ila V

(6:27:4:1) wuqifu V

(26:21:4:1) xifo V

(4:28:6:2) xuliqa V

(9:118:4:1) xul~ifu V

(16:88:7:1) zido V

(4:148:10:1) Zulima V

(2:212:1:1) zuy~ina V

(3:185:11:1) zuHoziHa V

(2:214:16:2) zulozilu V

(81:7:3:1) zuw~ijato V
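
A list like the one above can be reproduced with a simple filter on TAG and FEATURES. This sketch assumes that past (perfect) verbs carry the flag PERF and passive verbs the flag PASS in the FEATURES column; please check the tagset page before relying on it. The FEATURES string of the first sample row is made up for illustration and is not copied from the file.

```python
def is_past_passive(tag: str, features: str) -> bool:
    """True for verb rows whose FEATURES contain both PERF and PASS."""
    flags = set(features.split("|"))
    return tag == "V" and {"PERF", "PASS"} <= flags


rows = [
    # Illustrative FEATURES only (assumed flags), not taken from the file.
    ("(4:157:15:1)", "$ub~iha", "V", "STEM|POS:V|PERF|PASS|ROOT:$bh"),
    # Real row quoted earlier in this post: imperfect and active.
    ("(1:5:2:1)", "naEobudu", "V", "STEM|POS:V|IMPF|LEM:Eabada|ROOT:Ebd|1P"),
]

past_passive = [
    (location, form)
    for location, form, tag, features in rows
    if is_past_passive(tag, features)
]
print(past_passive)
```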

 Examples: (4:157:15:1)
وَقَوْلِهِمْ إِنَّا قَتَلْنَا الْمَسِيحَ عِيسَى ابْنَ مَرْيَمَ رَسُولَ اللَّهِ وَمَا قَتَلُوهُ وَمَا صَلَبُوهُ وَلَٰكِن شُبِّهَ لَهُمْ ۚ
That they said (in boast), "We killed Christ Jesus the son of Mary, the
Messenger of Allah";- but they killed him not, nor crucified him, but so
it was made to appear to them,

(6:118:3:1)
فَكُلُوا مِمَّا ذُكِرَ اسْمُ اللَّهِ عَلَيْهِ إِن كُنتُم بِآيَاتِهِ مُؤْمِنِينَ
So eat of (meats) on which Allah's name hath been pronounced, if ye have faith in His signs.


(5:3:23:1)
وَمَا ذُبِحَ عَلَى النُّصُبِ وَأَن تَسْتَقْسِمُوا بِالْأَزْلَامِ
and those which are sacrificed on stone altars, and [prohibited is] that you seek decision through divining arrows.

(81:7:3:1)
وَإِذَا النُّفُوسُ زُوِّجَتْ [٨١:٧]
When the souls are sorted out, (being joined, like with like);

In Salat every day we recite إِيَّاكَ نَعْبُدُ وَإِيَّاكَ نَسْتَعِينُ
[١:٥] You alone we worship. You alone we ask for help. Do you know how
many times the detached pronoun iyya (= alone) occurs in the Quran? It
occurs 24 times:
1. With 1st person singular: 5 times
2. With 1st person plural: 2 times
3. With 3rd person masculine singular: 8 times
4. With 3rd person masculine plural: 1 time
5. With 2nd person masculine singular: 2 times
6. With 2nd person masculine plural: 6 times

1. فَإِيَّايَ فَارْهَبُونِ [١٦:٥١] then fear Me (and Me alone).
2. وَقَالَ شُرَكَاؤُهُم مَّا كُنتُمْ إِيَّانَا تَعْبُدُونَ [١٠:٢٨] and their "Partners" shall say: "It was not us alone that ye worshipped!"
3. يَا أَيُّهَا الَّذِينَ آمَنُوا كُلُوا مِن طَيِّبَاتِ مَا رَزَقْنَاكُمْ وَاشْكُرُوا لِلَّهِ إِن كُنتُمْ إِيَّاهُ تَعْبُدُونَ [٢:١٧٢] O ye who believe! Eat of the good things that We have provided for you, and be grateful to Allah, if it is Him alone ye worship.
4. نَّحْنُ نَرْزُقُكُمْ وَإِيَّاهُمْ We provide sustenance for you and for them;-
5. إِيَّاكَ نَعْبُدُ وَإِيَّاكَ نَسْتَعِينُ [١:٥] You alone we worship. You alone we ask for help.
6. وَلَا تَقْتُلُوا أَوْلَادَكُمْ خَشْيَةَ إِمْلَاقٍ ۖ نَّحْنُ نَرْزُقُهُمْ وَإِيَّاكُمْ Kill not your children for fear of want: We shall provide sustenance for them as well as for you.

Alhamdu Lillah. This is rather easy with the Quranic Arabic Corpus.

Want to know how many times the word Rahman occurs in the Quran? It is
easy. Import the corpus data into Access 2007, select the text "LEM:r~aHoma`n" in the FEATURES column and
copy it. Then select the column, right-click, and from the drop-down menu go to Text
Filters, choose Contains, paste the text and click OK. You will get 57
occurrences.

Analyzing Quranic Arabic Corpus morphological data by word ROOT is
just as easy.
For example, if you want to filter all the words based on the
ROOT:wqy (whose derivatives are muttaqeen, taqwa, waq etc.), just copy the
text ROOT:wqy in MS Access 2007, right-click and select Text
Filters > Contains > paste > OK. You will get all the 258
occurrences of words with this root.
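
The Contains filter has a script equivalent too. The sketch below matches a whole FEATURES token (for example ROOT:Ewn) instead of a plain substring, so that a lemma or root which merely starts with the same letters is not counted by mistake. The function names are hypothetical and the sample rows are only the ones quoted in this post.

```python
def has_feature(features: str, wanted: str) -> bool:
    """True if `wanted` (e.g. 'ROOT:wqy') is one whole |-separated token."""
    return wanted in features.split("|")


def count_feature(rows, wanted: str) -> int:
    """Count rows whose FEATURES column contains the wanted token."""
    return sum(1 for *_rest, features in rows if has_feature(features, wanted))


rows = [
    ("(1:1:1:2)", "somi", "N", "STEM|POS:N|LEM:{som|ROOT:smw|M|GEN"),
    ("(1:5:2:1)", "naEobudu", "V", "STEM|POS:V|IMPF|LEM:Eabada|ROOT:Ebd|1P"),
    ("(1:5:4:1)", "nasotaEiynu", "V",
     "STEM|POS:V|IMPF|(X)|LEM:{sotaEiynu|ROOT:Ewn|1P"),
]

print(count_feature(rows, "ROOT:Ewn"))
print(count_feature(rows, "LEM:Eabada"))
print(count_feature(rows, "ROOT:wqy"))
```

With the full file loaded, count_feature(rows, "ROOT:wqy") should correspond to the filter result described above.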


I think it is better to import the Quranic morphology data txt file into Excel 2007 / 2010. The process is as follows:

1. Open a blank Excel sheet, go to the Data tab and click From Text.

2. The file selection dialogue box opens. Select the file "quranic-corpus-morphology-0.4" and click Open / Import.

3. The Text Import Wizard opens.

4. Select Delimited and click Next. Step 2 of the wizard appears.

5. Select Tab and click Next. Step 3 of the wizard appears.

6. Select General, then click Finish.

7. Click OK. You will see the data neatly arranged in rows and columns.

 

 

 
