Sorani Kurdish


Sorani Kurdish, also known as Central Kurdish, is a Northwestern Iranian language and the second most widely spoken Kurdish language (after Kurmanji), with at least 9 million speakers (Thackston, 2001) in Iran and Iraq. In Iraq, Sorani Kurdish has official status alongside Arabic, and is the dominant language of government, education, and media in the Iraqi Kurdistan region. Unlike Kurmanji, which is written in a Turkish-influenced version of the Latin alphabet, Sorani is written in a system based on the Arabic script, which is in use in both Iran and Iraq.

Different varieties of Kurdish reflect traces of influence by different neighboring languages. In the same way that the varieties of Kurmanji Kurdish spoken in Turkey have been under Turkish influence (especially in adopting loan words), Sorani Kurdish has been under the influence of Arabic and Persian in Iraq and Iran. An evident manifestation of the influence of Arabic, for example, is the presence of the pharyngeal sounds /ʕ/ and /ħ/ in the consonant inventory of many varieties of Sorani Kurdish.

As with other Kurdish languages, no predecessors of Sorani Kurdish are known from Old or Middle Iranian times. The extant Kurdish texts date back no earlier than the 16th century CE (Paul, 2008).


Writing system

The Sorani writing system is based on the Perso-Arabic script, in which letters are generally written joined to each other. However, unlike most other writing systems based on this script (including Arabic and Persian) in which short vowels are represented as optionally written diacritics, all vowels are represented by separate characters in the Sorani writing system. For example, the word /sær/ (meaning “head”) which is shared by Persian and Sorani Kurdish is written as “سر” (two letters) in Persian since the short vowel /æ/ is not an independent letter in the Persian script, but is written as “سه‌ر” in Sorani, where the letter “ـه” represents the vowel /æ/.

Moreover, unlike many writing systems based on the Perso-Arabic script, Sorani does not preserve the original spelling of Arabic loan words when that spelling represents sounds that do not exist in Sorani. For example, the Arabic word /sˤaħraːʔ/ (“صحراء”) meaning “field” is written as “صحرا” in Persian and Urdu. The initial consonant “صـ” represents a pharyngealized sound that does not exist in Persian or Urdu, but the character is retained in spelling even though it is read as a plain /s/ in these languages. In Sorani spelling, however, the word is written as “سه‌حرا”: an additional character is added to represent the short vowel /æ/, and the letter “صـ” is replaced with the plain “سـ”, which more accurately reflects how the word is actually pronounced in Sorani.




Sorani Kurdish has a total of eight vowels: four front vowels (/i ɪ e æ/) and four back vowels (/u ʊ o ɑ/). Vowel reduction is common in some Sorani varieties; in particular, the vowel that the writing system represents as /æ/ is pronounced like a schwa (/ə/) in many cases. The exact quality of the vowels may differ from dialect to dialect. The vowel transcribed as /ɑ/, for example, is pronounced considerably more fronted in many varieties.

The vowel /æ/ undergoes a number of important allophonic changes. It is realized as /ə/ before /w/ and before a coda /j/, and as /ɛ/ before an onset /j/.
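The conditioning of these allophones depends on whether the following glide belongs to the same syllable (coda) or opens the next one (onset). The rule can be sketched in Python as follows. This is only an illustration: the function name `realize_ae` and the input format (an already syllabified word, one character per segment) are assumptions made for the example, not part of the description above.

```python
def realize_ae(syllables):
    """Apply the allophonic rules for /æ/ to a syllabified word.

    Assumption: `syllables` is a list of non-empty strings, each holding
    one character per segment (e.g. ["dæw", "læt"]).
    """
    result = []
    for i, syl in enumerate(syllables):
        segments = list(syl)
        for k, seg in enumerate(segments):
            if seg != "æ":
                continue
            if k + 1 < len(syl):
                # The following segment is in the same syllable, i.e. in the coda.
                if syl[k + 1] in ("w", "j"):
                    segments[k] = "ə"
            elif i + 1 < len(syllables):
                # The following segment is the onset of the next syllable.
                following = syllables[i + 1][0]
                if following == "w":
                    segments[k] = "ə"
                elif following == "j":
                    segments[k] = "ɛ"
        result.append("".join(segments))
    return result


print(realize_ae(["dæw", "læt"]))  # first vowel precedes /w/, second is unaffected
```

The sketch leaves /æ/ untouched in every other context and does not model the dialectal vowel reduction mentioned earlier.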


The consonant inventory of Sorani Kurdish is presented in the table below.


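                    | bilabial | labiodental | alveolar | postalveolar | palatal | velar | uvular | pharyngeal | glottal
plosive             | p b      |             | t d      |              |         | k g   | q      |            |
nasal               | m        |             | n        |              |         |       |        |            |
trill               |          |             | r        |              |         |       |        |            |
tap/flap            |          |             | ɾ        |              |         |       |        |            |
affricate           |          |             |          | t͡ʃ d͡ʒ        |         |       |        |            |
fricative           |          | f v         | s z      | ʃ ʒ          |         |       | χ ɣ    | (ħ) (ʕ)    | h
approximant         | w        |             |          |              | j       |       |        |            |
lateral approximant |          |             | l ɫ      |              |         |       |        |            |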

The two pharyngeal consonants /ħ/ and /ʕ/ are the result of Arabic influence, and are found in most Sorani (and Kurmanji) dialects outside of Iran. Interestingly, however, their use is not limited to loan words and they are sometimes added to native Iranian words that obviously lacked these sounds in their original form, e.g. /ʕɑsmɑn/ (“عاسمان”) is sometimes heard instead of the more faithful Iranian form of the word /ɑsmɑn/ (“ئاسمان”). Similarly, the number seven is generally pronounced as /ħəwt/ (“حه‌وت”) in spite of its native Iranian origin (corresponding to Persian /hæft/ “هفت”).

One of the most conspicuous allophonic changes concerning consonants in some Sorani varieties is the palatalization of the velar consonants /k/ and /g/ before non-low front vowels (/i/ and /e/). Almost the same effect is found in neighboring languages (i.e. Persian and Turkish). In Persian, however, the palatal form seems to be the default form and the effect is best described as velarization of palatal consonants following back vowels.


Unlike many languages of the region, Sorani Kurdish allows complex onsets (e.g. /dreʒ/: “long”). On the other hand, complex codas, although allowed in Sorani Kurdish, are not as unrestrained as they are in Persian and Classical Arabic. In particular, in Sorani the Sonority Sequencing Principle rules out many complex coda combinations that are allowed in these languages. This effect is easily visible in Arabic loan words such as /qæb(ɪ)z/ (“receipt”), where the epenthetic vowel added before the last consonant is intended to prevent the ill-formed coda cluster /bz/ from occurring (Zahedi, 2012).
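The coda restriction can be illustrated with a small Python check. It is a sketch under stated assumptions: the six-step sonority scale below is a common textbook simplification (not the ranking used by Zahedi et al.), a well-formed coda is taken to require strictly falling sonority, and the names `SONORITY_CLASSES` and `coda_obeys_ssp` are invented for the example.

```python
# Sonority classes from least to most sonorous (assumed textbook scale).
SONORITY_CLASSES = [
    ("plosive", ["p", "b", "t", "d", "k", "g", "q"]),
    ("affricate", ["t͡ʃ", "d͡ʒ"]),
    ("fricative", ["f", "v", "s", "z", "ʃ", "ʒ", "χ", "ɣ", "ħ", "ʕ", "h"]),
    ("nasal", ["m", "n"]),
    ("liquid", ["r", "ɾ", "l", "ɫ"]),
    ("glide", ["w", "j"]),
]

# Map every consonant to the rank of its class.
SONORITY = {
    segment: rank
    for rank, (_, segments) in enumerate(SONORITY_CLASSES)
    for segment in segments
}


def coda_obeys_ssp(coda):
    """Return True if sonority falls strictly from left to right in the coda.

    `coda` is a list of consonant symbols, e.g. ["b", "z"].
    """
    ranks = [SONORITY[segment] for segment in coda]
    return all(left > right for left, right in zip(ranks, ranks[1:]))


print(coda_obeys_ssp(["b", "z"]))  # the cluster broken up in /qæb(ɪ)z/
print(coda_obeys_ssp(["s", "t"]))  # the cluster in /dæst/ "hand"
```

On this scale /bz/ rises in sonority (plosive to fricative) and is rejected, which is consistent with the epenthetic vowel in /qæb(ɪ)z/, while falling clusters such as /st/ are accepted.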

As in Persian and Turkish, stress falls on the last syllable of nouns and adjectives in Sorani. In verbs, however, stress placement depends on the presence of specific morphemes.



Like Kurmanji Kurdish and many other Iranian languages, Sorani Kurdish has split ergativity, with a nominative-accusative arrangement in the present tense and an ergative-absolutive arrangement in the past tense. Unlike Kurmanji, however, Sorani does not have an overt case marking system and does not have grammatical gender.


A Sorani noun in a sentence can be definite, indefinite, or in its absolute state. The definite state is marked with the definite marking suffix /ækæ/ (in some varieties /ægæ/) when singular and with the suffix /ækɑn/ when plural:

شاره‌که جوانه
ʃɑr-ækæ       d͡ʒwɑn-æ
city-DEF       beautiful-COP.3SG
The city is beautiful.

سێوه‌که ده‌خۆم
sew-ækæ       dæ-χo-m
apple-DEF    PROG-eat.PRS-1SG
I am eating the apple.

سێوه‌کان ده‌خۆم
sew-æk-ɑn          dæ-χo-m
apple-DEF-PL    PROG-eat.PRS-1SG
I am eating the apples.

The definite marker accompanies the noun even when it is attached to enclitic possessive pronouns:

سێوه‌که‌ی ده‌خۆم
sew-æke-j                           dæ-χo-m.
apple-DEF-3SG.POSS     PROG-eat.PRS-1SG
I am eating his/her/its apple.

Indefinite nouns are followed by the suffix /ek/:

سێوێک ده‌خۆم
sew-ek               dæ-χo-m.
apple-INDF     PROG-eat.PRS-1SG
I am eating an apple.
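The marking pattern described so far can be summarized in a short Python sketch. The function name `mark_noun` and its parameters are hypothetical, the hyphens only show morpheme boundaries as in the glossed examples, and the sketch assumes consonant-final stems such as /sew/ and /ʃɑr/, since the examples above do not show how vowel-final stems behave.

```python
def mark_noun(stem, state="absolute", plural=False):
    """Attach the definiteness suffix to a consonant-final noun stem."""
    if state == "absolute":
        return stem
    if state == "definite":
        # /ækæ/ in the singular, /ækɑn/ (segmented æk-ɑn) in the plural.
        return stem + ("-æk-ɑn" if plural else "-ækæ")
    if state == "indefinite":
        if plural:
            # Indefinite plurals are not covered by the description above.
            raise ValueError("indefinite plural is not modeled in this sketch")
        return stem + "-ek"
    raise ValueError(f"unknown state: {state}")


for form in (
    mark_noun("sew", "definite"),
    mark_noun("sew", "definite", plural=True),
    mark_noun("sew", "indefinite"),
):
    print(form)
```

The three printed forms correspond to the nouns in "the apple", "the apples", and "an apple" in the examples above.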


As in many other Iranian languages, the Ezafe is used in Sorani to connect nouns to other nouns and to adjectives. The most obvious cases where the Ezafe morpheme (/ɪ/) is used are possessive constructions and noun-adjective combinations.

کورێکی باش
kʊr-ek-ɪ                         bɑʃ
boy-INDF-EZAFE      good
a good boy

ده‌وله‌تی کوردستان
dəwlæt-ɪ                         kʊrdɪstɑn
government-EZAFE    Kurdistan
The government of Kurdistan

ده‌سته‌که‌ی مه‌حموود
dæst-æke-j                    mæħmud
hand-DEF-EZAFE     Mahmud
Mahmud’s hand

Personal Pronouns

As in many Iranian languages, there are two sets of personal pronouns in Sorani; independent personal pronouns and enclitic personal pronouns. These pronouns are shown in the table below.


            | Independent    | Enclitic
I           | mɪn  من        | ɪm  م
you (sg.)   | to  تۆ         | ɪt  ت
he/she/it   | əw  ئه‌و        | i  ی
we          | emæ  ئێمه      | mɑn  مان
you (pl.)   | ewæ  ئێوه      | tɑn  تان
they        | əwɑn  ئه‌وان    | jɑn  یان

Independent personal pronouns are treated as separate words. In the most natural case, the independent personal pronoun appears as the subject in a sentence. Sorani Kurdish is a pro-drop language, so including these subjects is generally optional (note that the person and number of the subject can still be determined by looking at the agreement marker on the verb).

من تووره بووم
mɪn   turæ    bu-m
1SG     angry  COP.PST-1SG
I was angry

ئه‌و هات
əw     hɑt-ø
3SG   come.PST-3SG
He/She/It came.

تۆ ده‌گری
to       dæ-gri-(t)
2SG   PROG-cry-2SG
You are crying.

من سێوه‌که ده‌خۆم
mɪn  sew-ækæ       dæ-χo-m
1SG  apple-DEF    PROG-eat.PRS-1SG
I am eating the apple.

The independent personal pronouns can also appear as possessors in possessive constructions, although this is not the default strategy for forming possessive constructions in Sorani.

چاوه‌کانی من
t͡ʃɑw-æk-ɑn-ɪ          mɪn
eye-DEF-PL-EZAFE    1SG
my eyes

The normal way for expressing possession is by using the enclitic personal pronouns as suffixes attached to the last word of the noun phrase.

my eyes

his/her/its hand

The enclitic personal pronouns can also serve as objects in non-ergative constructions (sentences in the present tense and intransitive sentences in the past tense). If the verb is a complex predicate (a compound verb), the enclitic pronoun attaches to the non-verbal element of the complex predicate:

هه‌لیان ده‌گرین
hæl-jɑn     dæ-gɪr-in
hæl-3PL   PROG-take.PRS-1PL
We are picking them up.

ئاگادارمان ده‌که‌یت
ɑgɑdɑr-mɑn   dæ-kæ-j(t)
aware-1PL       PROG-make-2SG
You (sg.) are informing us.

If there is no non-verbal element preceding the verb in the verb phrase, the enclitic pronoun is inserted in the middle of the verb, right after the first morpheme. The first morpheme may be the prefix /dæ/, which indicates the progressive and the indicative mood, the prefix /bɪ/, which indicates the subjunctive mood, or the negation prefix /næ/.

He/She/It knows me.

Do you see him/her/it?

I don’t know him/her/it.

Write it down!


As in most Iranian languages, each verb in Sorani has a past stem and a present stem, and there is no regular morphological relation between the two forms. The most common verb form built on the present stem is the present simple, which consists of the prefix /dæ/ (/æ/ in the Sulaymaniyah dialect), the present stem of the verb, and an agreement marker. As an example, the verb /nusen/ نوسێن (“to write”) is conjugated in the table below.


English            | Sorani (transcription) | Sorani (script)
I write.           | mɪn dænusɪm            | من ده‌نووسم
You (sg.) write.   | to dænusi(t)           | تۆ ده‌نووسی
She/He/It writes.  | əw dænuse(t)           | ئه‌و ده‌نووسێ
We write.          | emæ dænusin            | ئێمه ده‌نووسین
You (pl.) write.   | ewæ dænusɪn            | ئێوه ده‌نووسن
They write.        | əwɑn dænusɪn           | ئه‌وان ده‌نووسن

The present simple tense expresses actions taking place at the moment or in the future, as well as habitual states and actions. Some examples in this tense are presented below:

مه‌حموود له‌گه‌ڵ ئه‌و قسه ده‌کا
mæħmud lægæɫ   əw      qsæ           dæ-kɑ-(t).
Mahmud with      3SG    speech     PROG-do.PRS-3SG
Mahmud is talking (/talks/will talk) to him/her/it.

ده‌چی بۆ کوێ؟
dæ-t͡ʃ-i                  bo    kwe?
PROG-go.PRS-2SG    to     where
Where are you going?

We see(/are seeing/will see) you.

In the last example above, the object has been inserted inside the verb in the form of an enclitic pronoun (/t/).
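The present simple paradigm of /nusen/ given in the table can be generated mechanically. The following Python sketch is illustrative only: `PRESENT_ENDINGS` and `present_simple` are names invented here, the endings are copied from the table, and only consonant-final present stems such as /nus/ are assumed. Vowel-final stems (compare /dæ-χo-m/ and /dæ-kɑ-(t)/ in the examples above) combine with the endings differently and are not modeled.

```python
# Agreement endings as they appear in the conjugation table of "to write".
PRESENT_ENDINGS = {
    "1SG": "ɪm",
    "2SG": "i(t)",
    "3SG": "e(t)",
    "1PL": "in",
    "2PL": "ɪn",
    "3PL": "ɪn",
}


def present_simple(present_stem, person, prefix="dæ"):
    """Build prefix + present stem + agreement marker.

    Pass prefix="æ" for the Sulaymaniyah variant of the prefix.
    """
    return prefix + present_stem + PRESENT_ENDINGS[person]


for person in PRESENT_ENDINGS:
    print(person, present_simple("nus", person))
```

Note that the second and third person plural share one ending, so the two forms are identical without an overt subject pronoun.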

In order to describe the conjugation pattern of the past simple, transitive and intransitive sentences must be addressed separately. In intransitive sentences, the past simple is formed by attaching the past stem to an agreement marker suffix that agrees with the subject just like the one used in the present simple. The conjugation of the verb /hɑtɪn/ هاتن (“to come”) in the past simple tense is presented in the table below, followed by example sentences using intransitive past simple verbs.


English           | Sorani (transcription) | Sorani (script)
I came.           | mɪn hɑtɪm              | من هاتم
You (sg.) came.   | to hɑti(t)             | تۆ هاتی
She/He/It came.   | əw hɑt                 | ئه‌و هات
We came.          | emæ hɑtin              | ئێمه هاتین
You (pl.) came.   | ewæ hɑtɪn              | ئێوه هاتن
They came.        | əwɑn hɑtɪn             | ئه‌وان هاتن

ئه‌وان رۆیشتن
əwɑn rojʃt-ɪn
3PL    leave.PST-3PL
They left.

من و مه‌حموود هاتین
mɪn u       mæħmud hɑt-in
1SG  and   Mahmud  come.PST-1PL
Mahmud and I came.
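The intransitive past differs from the present simple in two ways: there is no prefix, and the third person singular takes a null ending. A minimal Python sketch, with the hypothetical names `PAST_ENDINGS` and `past_simple_intransitive` and the endings taken from the table above:

```python
# Subject agreement endings in the intransitive past; 3SG is the null suffix.
PAST_ENDINGS = {
    "1SG": "ɪm",
    "2SG": "i(t)",
    "3SG": "",
    "1PL": "in",
    "2PL": "ɪn",
    "3PL": "ɪn",
}


def past_simple_intransitive(past_stem, person):
    """Build past stem + subject agreement marker (consonant-final stems assumed)."""
    return past_stem + PAST_ENDINGS[person]


print(past_simple_intransitive("hɑt", "3SG"))    # bare stem, as in "He/She/It came."
print(past_simple_intransitive("rojʃt", "3PL"))  # as in "They left."
```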

In transitive past sentences (including those in the past simple tense), the sentence has an ergative-absolutive arrangement. First consider how the logical subject (which no longer appears as the subject of the sentence) behaves. The logical subject is expressed in the form of an enclitic pronoun (identical to the ones used as possessive pronouns and as objects in non-ergative sentences). This enclitic pronoun attaches to the first element in the verb phrase. If there is no phrase before the verb for the enclitic pronoun to attach to, it is inserted inside the verb after the first pre-stem morpheme (such as the negation morpheme). If there is no such morpheme, it follows the stem.

نانمان خوارد
nɑn-mɑn       χwɑrd
food-1PL   eat.PST
We ate food.

We didn’t eat.

We ate.
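The three placement options for the agent enclitic can be written as a simple decision procedure. The Python sketch below is an illustration under assumptions: the function `place_agent_clitic` and its parameters are hypothetical, hyphens mark morpheme boundaries, the enclitic forms are those of the pronoun table, and the verb is assumed to have a third person singular object (null agreement). Only the first output is attested in the examples above; the other two are simply what the stated rule predicts.

```python
ENCLITICS = {
    "1SG": "ɪm",
    "2SG": "ɪt",
    "3SG": "i",
    "1PL": "mɑn",
    "2PL": "tɑn",
    "3PL": "jɑn",
}


def place_agent_clitic(agent, past_stem, preverbal=None, prefix=None):
    """Place the agent enclitic in a past transitive clause.

    preverbal: an element preceding the verb (e.g. an object noun), if any.
    prefix:    a pre-stem verbal morpheme such as the negation /næ/, if any.
    """
    clitic = ENCLITICS[agent]
    if preverbal is not None:
        # 1. The clitic attaches to the first element of the verb phrase.
        verb = f"{prefix}-{past_stem}" if prefix else past_stem
        return f"{preverbal}-{clitic} {verb}"
    if prefix is not None:
        # 2. Otherwise it follows the first pre-stem morpheme of the verb.
        return f"{prefix}-{clitic}-{past_stem}"
    # 3. With no pre-stem morpheme, it follows the stem.
    return f"{past_stem}-{clitic}"


print(place_agent_clitic("1PL", "χwɑrd", preverbal="nɑn"))
print(place_agent_clitic("1PL", "χwɑrd", prefix="næ"))
print(place_agent_clitic("1PL", "χwɑrd"))
```

The sketch ignores any phonological adjustment of the enclitics after vowel-final hosts.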

Note that the morpheme that indicates the logical subject is not the typical verb agreement morpheme for the first person plural (/in/), but the enclitic pronoun. The verb does not agree with the subject in these sentences. Instead, it is “conjugated” based on the logical object (which no longer behaves like a “normal” object); in the examples above, this is a third person singular object, which requires the null suffix. This can be seen more clearly by considering similar cases where the object agreement morpheme is not the null suffix. In the following table, the conjugation of the verb “to inform” in the past simple tense for different “objects” is presented.


English                     | Sorani (transcription) | Sorani (script)
They informed me.           | ɑgɑdɑr-jɑn kɪrdɪm      | ئاگاداریان کردم
They informed you (sg.).    | ɑgɑdɑr-jɑn kɪrdi(t)    | ئاگاداریان کردی
They informed him/her/it.   | ɑgɑdɑr-jɑn kɪrd        | ئاگاداریان کرد
They informed us.           | ɑgɑdɑr-jɑn kɪrdin      | ئاگاداریان کردین
They informed you (pl.).    | ɑgɑdɑr-jɑn kɪrdɪn      | ئاگاداریان کردن
They informed them.         | ɑgɑdɑr-jɑn kɪrdɪn      | ئاگاداریان کردن
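The division of labor in this paradigm (agent as enclitic on the non-verbal element, object as agreement suffix on the verb) can be made explicit in a few lines of Python. This is a sketch rather than a full model: the names `AGENT_ENCLITICS`, `OBJECT_AGREEMENT`, and `past_transitive` are invented for the example, the forms are copied from the pronoun table and the conjugation tables, and only complex predicates with a consonant-final past stem are assumed.

```python
# The agent (logical subject) is expressed by an enclitic pronoun.
AGENT_ENCLITICS = {
    "1SG": "ɪm", "2SG": "ɪt", "3SG": "i",
    "1PL": "mɑn", "2PL": "tɑn", "3PL": "jɑn",
}

# The verb agrees with the logical object; 3SG is the null suffix.
OBJECT_AGREEMENT = {
    "1SG": "ɪm", "2SG": "i(t)", "3SG": "",
    "1PL": "in", "2PL": "ɪn", "3PL": "ɪn",
}


def past_transitive(nonverbal, past_stem, agent, obj):
    """Build a past transitive complex predicate.

    The agent enclitic attaches to the non-verbal element, and the past
    stem carries the agreement suffix that matches the object.
    """
    host = f"{nonverbal}-{AGENT_ENCLITICS[agent]}"
    verb = past_stem + OBJECT_AGREEMENT[obj]
    return f"{host} {verb}"


for obj in OBJECT_AGREEMENT:
    print(obj, past_transitive("ɑgɑdɑr", "kɪrd", agent="3PL", obj=obj))
```

Changing `agent` alters only the host word, while changing `obj` alters only the verb, which is the ergative-absolutive pattern described above.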

Sorani is often claimed not to exhibit true (or full) ergativity. This is partly because, even though the verb is conjugated as described above in ergative sentences, the logical subject can still appear as an independent pronoun in the sentence in a position that looks like that of a regular subject:

ئێمه نانمان خوارد
emæ nɑn-mɑn       χwɑrd
1PL   food-1PL       eat.PST
We ate food.

This can be compared with the fully ergative system of the Kurmanji Kurdish past tense, where independent pronouns are marked for case and the logical subject appears in the oblique case (rather than the “default” nominative case), making it distinct from the way it appears in a nominative-accusative sentence.


Bibliography & References

(Most of the data on which this article is based come from original research by the Iranian Languages Group at the Department of Linguistics of the University of Arizona and from the work of Wheeler Thackston (2001) on Sorani Kurdish.)

Blau, Joyce. “Le Kurde.” Compendium Linguarum Iranicarum (1989): 327-335.

Paul, Ludwig. “Kurdish Language i. History of the Kurdish Language” Encyclopædia Iranica. Online at: (2008)

Sheyholislami, Jaffer. “Identity, language, and new media: the Kurdish case.” Language policy 9.4 (2010): 289-312.

Thackston, W. M. “Sorani Kurdish: A Reference Grammar with Selected Readings.” Online at: http://www.fas.harvard.edu/~iranian/Sorani/sorani_1_grammar.pdf (retrieved Nov. 2016) (2001).

Zahedi, Muhamad Sediq, Batool Alinezhad, and Vali Rezai. “The sonority sequencing principle in Sanandaji/Erdelani Kurdish: An optimality theoretical perspective.” International Journal of English Linguistics 2.5 (2012): 72.