Pashto, sometimes also romanized as Pashtu or Pakhto, is the most widely spoken modern Eastern Iranian language, spoken primarily in the south of Afghanistan and the northwest of Pakistan. It is the only Iranian language besides Persian with nationwide official status in a country. Pashto is one of the two official languages of Afghanistan (along with Persian) and a recognized minority language in Pakistan.

As of 2017, Ethnologue estimates that approximately 38 million people speak Pashto. Nearly two thirds of the Pashto-speaking population resides in Pakistan, where Pashto is a minority language spoken by approximately 13% of the population (Rahman 1995) and has no significant use in government or media. In Afghanistan, however, Pashto has official status and is considered a national language and the mother tongue of the largest ethnic group of the country (approximately 52% according to Rahman 1995). Pashto has been a written language for almost four centuries and has a rich literary tradition.

Government, media, and education

Traditionally, Persian (locally known as Dari or Farsi in Afghanistan) has been the main written language and the language of prestige in Afghanistan, but the status of Pashto has been on the rise since at least the early twentieth century. During the rule of the Taliban in Afghanistan (1996-2001), Pashto became the de facto sole official language of the country (Schiffman, 2011). Since the fall of the Taliban in 2001, Pashto and Persian have held official status alongside each other, but Persian still has a higher status in media and education and is widely used as a lingua franca, even though the Pashtuns are the largest ethnic group in the country.

According to a 2003 estimate, only 30% of the newspapers published in Afghanistan since the fall of the Taliban were in Pashto, compared to 50% for Persian (the rest being officially bilingual). According to the same source, an estimated 80% of national broadcasting in Afghanistan is in Persian (Najibullah, 2003). The situation is similar in education. While the language of instruction in schools depends on the region (with Pashto being used in southern parts of the country), Persian is far more widely used as the language of instruction in universities.


Pashto dialects have high degrees of mutual intelligibility, and the existence of a standardized written tradition guarantees the linguistic unity of Pashto dialects. In MacKenzie’s words:

“The morphological differences between the most extreme north eastern and south western dialects are comparatively few and unimportant. The criteria of dialect differentiation in Pashto are primarily phonological. With the use of an alphabet which disguises these phonological differences the language has, therefore, been a literary vehicle, widely understood, for at least four centuries.” (MacKenzie, 1959)

Pashto dialects are commonly identified by the way the name of the language is pronounced in them (especially the second consonant, written in the Pashto alphabet with the letter “ښ”). As a very rough approximation, the northern dialects pronounce the name of the language as /paxto/ and the southern dialects pronounce it as /pəʃto/. The reality, however, is more nuanced. The table below introduces the four main dialects of Pashto.


Dialect                                       Pronunciation of "ښ"       Pronunciation of "ږ"       Pronunciation of "څ"   Pronunciation of "ځ"
SW Kandahar                                   ʂ (retroflex fricative)    ʐ (retroflex fricative)    t͡s                     d͡z
SE Quetta, Waziri                             ʃ                          ʒ                          t͡s                     d͡z
NW Kabul province, Central Ghilzay            ç (palatal fricative)      ʝ (palatal fricative)      s                      z
NE Peshawar, Yusufzay, Northeastern Ghilzay   x                          ɡ                          s                      z
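
The correspondences in the table can be held in a small lookup structure. The following Python sketch is only an illustration: the names `DIALECT_REFLEXES`, `reflex`, and `dialects_with` are hypothetical, the data simply mirrors the four rows of the table, and the Unicode code points chosen for the four letters are an assumption about how they are encoded.

```python
# Code points assumed for the four letters discussed in the table.
SHIN_DOTTED = "\u069A"  # ښ
REH_DOTTED = "\u0696"   # ږ
TSE = "\u0685"          # څ
DZE = "\u0681"          # ځ

# Dialect label -> letter -> IPA value, copied from the table.
DIALECT_REFLEXES = {
    "SW": {SHIN_DOTTED: "ʂ", REH_DOTTED: "ʐ", TSE: "t͡s", DZE: "d͡z"},
    "SE": {SHIN_DOTTED: "ʃ", REH_DOTTED: "ʒ", TSE: "t͡s", DZE: "d͡z"},
    "NW": {SHIN_DOTTED: "ç", REH_DOTTED: "ʝ", TSE: "s", DZE: "z"},
    "NE": {SHIN_DOTTED: "x", REH_DOTTED: "ɡ", TSE: "s", DZE: "z"},
}


def reflex(letter: str, dialect: str) -> str:
    """Return the IPA value of `letter` in `dialect` according to the table."""
    return DIALECT_REFLEXES[dialect][letter]


def dialects_with(letter: str, ipa: str) -> list:
    """List the dialect labels in which `letter` is pronounced as `ipa`."""
    return [d for d, row in DIALECT_REFLEXES.items() if row[letter] == ipa]


print(reflex(SHIN_DOTTED, "NE"))
print(dialects_with(TSE, "s"))
```

A lookup of this kind makes the north/south split visible at a glance: the two northern dialects share the plain fricatives for “څ” and “ځ”, while the two southern dialects keep the affricates.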



Pashto has four so-called “short” vowels (/a/, /i/, /u/, /ə/), which are usually omitted in writing, and five long vowels (/ɑː/, /iː/, /uː/, /eː/, and /oː/). The quality of the vowels may vary across dialects, and some of the vowels undergo mergers in some of the dialects.


The main distinctive feature of the Pashto consonant inventory is the presence of the retroflex phonemes (/ʈ/, /ɖ/, /ɳ/, and /ɻ/, as well as /ʂ/ and /ʐ/ in the case of the SW dialect), probably as a result of contact with Indo-Aryan languages. Pashto and Balochi are the only Iranian languages with retroflex consonants. Pashto’s consonant inventory also resembles that of Balochi in lacking /f/ as a native consonant.

In general, the consonant inventory varies considerably across dialects. The consonant chart for the Kandahar dialect is presented below.


                     bilabial  alveolar  postalveolar  retroflex  palatal  velar  glottal
plosive              p b       t d       -             ʈ ɖ        -        k ɡ    ʔ
nasal                m         n         -             ɳ          -        -      -
trill                -         r         -             ɻ          -        -      -
affricate            -         t͡s d͡z     t͡ʃ d͡ʒ         -          -        -      -
fricative            -         s z       ʃ ʒ           ʂ ʐ        -        x ɣ    h
approximant          w         -         -             -          j        -      -
lateral approximant  -         l         -             -          -        -      -

In addition to these consonants, the consonants /f/, /q/, and /ħ/ are also sometimes heard in more careful speech. In most cases, however, they are replaced by /p/, /k/, and /h/ respectively.
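
The three replacements just described amount to a simple character substitution on a transcription. The sketch below is a minimal illustration under that assumption; the function name `colloquial` is hypothetical, and the input strings in the usage lines are made-up placeholders rather than attested Pashto forms.

```python
# Replacements described in the text: /f/ -> /p/, /q/ -> /k/, /ħ/ -> /h/.
LOAN_TO_NATIVE = str.maketrans({"f": "p", "q": "k", "ħ": "h"})


def colloquial(transcription: str) -> str:
    """Replace the three non-native consonants with their usual substitutes."""
    return transcription.translate(LOAN_TO_NATIVE)


# Placeholder strings, used only to show the substitution at work.
print(colloquial("qaf"))
print(colloquial("ħaq"))
```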


Writing system

The Pashto writing system is similar to that of Persian and Arabic in all its general aspects. It has a number of additional symbols, however, to represent sounds that do not exist in those languages. The symbols for the four retroflex consonants that are found across different dialects of Pashto (/ʈ/, /ɖ/, /ɳ/, and /ɻ/) are formed by adding a small “ring” to their non-retroflex counterparts:

/ʈ/:    “ټ”
/ɖ/:   “ډ”
/ɳ/:   “ڼ”
/ɻ/:    “ړ”
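
The “ring” is also reflected in the official Unicode names of these letters. The sketch below uses Python's standard `unicodedata` module to print them; the dictionary name `RETROFLEX_LETTERS` and the helper `describe` are hypothetical, and the code points are an assumption about how the four letters listed above are encoded.

```python
import unicodedata

# IPA value -> assumed code point of the corresponding Pashto letter.
RETROFLEX_LETTERS = {
    "ʈ": "\u067C",  # ټ
    "ɖ": "\u0689",  # ډ
    "ɳ": "\u06BC",  # ڼ
    "ɻ": "\u0693",  # ړ
}


def describe(letter: str) -> str:
    """Return the code point and Unicode character name of a single letter."""
    return f"U+{ord(letter):04X} {unicodedata.name(letter)}"


for ipa, letter in RETROFLEX_LETTERS.items():
    print(ipa, describe(letter))
```

Each name printed by this loop should contain the words WITH RING, matching the description of how the retroflex letters are formed.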

For the other two consonants that are pronounced as retroflexes only in the SW (Kandahar) dialect, characters with dots above and below them are used:

/ʂ/ (in other dialects: /ʃ/, /ç/, /x/):   “ښ”
/ʐ/ (in other dialects: /ʒ/, /ʝ/, /ɡ/):   “ږ”

For the voiced velar stop, Pashto uses “ګ” instead of the more common “گ”. Pashto also has unique symbols for its alveolar affricates:

/t͡s/:    “څ”
/d͡z/:   “ځ”

The vowel /ə/ is represented by “ۀ”, and four different variants of the base character “ی” are used to represent different vowels and diphthongs (“ی”, “ې”, “ۍ”, and “ي”), which do not have fixed pronunciations across dialects. A full list of the letters of the Pashto alphabet is given below. The pronunciations in the table are based on the Kandahar dialect, and alternative pronunciations in other dialects are given in parentheses.


Letter IPA
ا ɑ, ʔ
ب b
پ p
ت t
ټ ʈ
ث s
ج d͡ʒ
چ t͡ʃ
ح h
خ x
څ t͡s (in some dialects: s)
ځ d͡z (in some dialects: z)
د d
ډ ɖ
ذ z
ر r
ړ ɻ
ز z
ژ ʒ
ږ ʐ (in other dialects: ʒ, ʝ, ɡ)
س s
ش ʃ
ښ ʂ (in other dialects: ʃ, ç, x)
ص s
ض z
ط t
ظ z
ع ʔ
غ ɣ
ف f
ق q
ک k
ګ ɡ
ل l
م m
ن n
ڼ ɳ
و w, u, o
ه h, a
ۀ ə
ي j, i
ې e
ی ai, j
ۍ əi
ئ əi, j


Pashto has one of the most complicated morphological systems among Iranian languages. It has retained many aspects of the complex morphology of Old Iranian: both nouns and adjectives are marked for gender, number, and case, and are divided into different noun classes (declensions). Animacy also plays a role in the choice of plural endings.

Besides the vocative, Pashto nouns have two main cases: direct and oblique. Some analyses also acknowledge the existence of a second type of oblique case (Skjærvø 1989), or an ablative case (David 2014). Pashto morphosyntax is further complicated by the fact that Pashto, like many other Iranian languages, retains a split-ergative arrangement, with logical subjects getting the oblique case and logical objects getting the direct case with past transitive verbs.


Pashto has two genders, masculine and feminine, and the gender of most nouns is predictable from their ending. As a general rule, nouns ending in a consonant or the diphthong /aj/ (the letter “ی”) are masculine. The following noun endings generally make a noun feminine:

/ə/ and /a/ (the letter “ه”)

/əy/ (the letter “ۍ”)

/e/ (the letter “ې”)

There are many exceptions to these rules.
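
The ending-based rules above can be written as a simple heuristic. The Python sketch below is an assumption-laden illustration, not a reliable classifier: the function name `guess_gender` and the set `VOWEL_SYMBOLS` are hypothetical, the input is expected in the transcription used in this article, and, as just noted, real nouns often break these rules.

```python
# Symbols that count as (part of) a vowel in the transcription used here.
VOWEL_SYMBOLS = set("aiuəɑeoː")


def guess_gender(noun: str) -> str:
    """Guess grammatical gender from the ending of a transcribed noun."""
    if not noun:
        return "unknown"
    # /əy/ is checked first: it ends in a glide that would otherwise be
    # treated as a consonant and wrongly classified as masculine.
    if noun.endswith("əy"):
        return "feminine"
    # The remaining feminine endings from the list: /ə/, /a/, and /e/.
    if noun.endswith(("ə", "a", "e")):
        return "feminine"
    # Any consonant-final noun, including those ending in /aj/, is masculine.
    if noun[-1] not in VOWEL_SYMBOLS:
        return "masculine"
    # Other vowel endings are not covered by the rules in the text.
    return "unknown"


for word in ["kor", "zuj", "hangɑːma"]:
    print(word, guess_gender(word))
```

The three words in the usage loop are taken from the glossed examples elsewhere in this article, where /kor/ “house” is glossed as masculine and /hangɑːma/ “turmoil” is described as feminine.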

Personal Pronouns

Personal pronouns come in three main cases: direct, oblique, and genitive. In addition, there is a set of enclitics that appear in the form of second position clitics, as will be described shortly. The table below presents different forms of personal pronouns in Pashto.


               Direct   Oblique  Genitive  Enclitic
I              zə       mɑː      zmɑː      me
you (sg.)      tə       tɑː      stɑː      de
he/it (masc.)  daj      də       da, də    ye
she/it (fem.)  dɑː      de       da, de    ye
we             muʐ      muʐ      zmuʐ      mu
you (pl.)      tɑːsoː   tɑːsoː   stɑːsoː   mu
they           duːj     duːj     da, dio   ye

Note that the third person pronouns shown here refer to persons or things that are close to the speaker (in sight). For persons or things that are out of sight, the pronoun /haɣa/ and its variants are used.

Examples for the oblique and direct cases will be presented in the next section. For genitive (possessive) pronouns, see the following examples. The pronoun precedes the possessee, as expected.

ستا کور چېرته دئ
stɑː                  kor-ø                             t͡ʃerta         dəj
2SG.POSS     house-MASC.DIR       where        COP.PRS.3SG
Where is your house?

دا زما کتاب دئ
dɑː        zmɑː             ktɑːb        dəj
this       1SG.POSS   book         COP.PRS.3SG
This is my book.

The enclitic pronouns are not limited to a specific case. They come as second position clitics after the first phrase in the sentence, and can serve as objects (in present and future sentences), logical subjects (in past transitive sentences, which have ergative-absolutive arrangement), or possessors depending on the context. The enclitics are sometimes called the “weak” pronouns, as opposed to the other pronouns, which are known as the “strong” pronouns. A few examples of the weak pronouns with different roles are given below.

As logical object (only possible in non-past tenses):
ولي مې مچوی؟
wali        me        mat͡ʃaw-i
why        1SG       kiss-3SG
Why is he kissing me?

As logical subject (only possible in the past tense):
مچولم یې
mat͡ʃaw-əl-əm       ye
kiss-PST-1SG        3SG
He was kissing me.

As possessor:
زوی مي ګډېږي
zuj       mi        ɡəɖeʐ-i
son      1SG      dance.PRS-3SG
My son is dancing.

Note that when the weak pronoun is used as the possessor (as in the last example above), its position depends only on where the first phrase in the sentence ends, and it does not necessarily precede the possessee (unlike the “strong” possessive pronouns). In this particular example, one would expect the pronoun to precede the word /zuj/ if it were a regular possessive construction.
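
The second-position behavior of the weak pronouns can be sketched as a small function. This is an illustration only: the name `place_clitic` is hypothetical, and it assumes the sentence has already been divided into phrases by some other means, since finding the end of the first phrase is the hard part in practice.

```python
def place_clitic(phrases: list, clitic: str) -> str:
    """Insert `clitic` directly after the first phrase and join the result."""
    if not phrases:
        raise ValueError("at least one phrase is required")
    return " ".join([phrases[0], clitic, *phrases[1:]])


# The three example sentences from this section, without morpheme hyphens.
print(place_clitic(["wali", "mat͡ʃawi"], "me"))
print(place_clitic(["mat͡ʃawələm"], "ye"))
print(place_clitic(["zuj", "ɡəɖeʐi"], "mi"))
```

The function never looks at the role of the clitic (object, logical subject, or possessor), which mirrors the point made above: placement depends on the first phrase alone.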


As is typical in Iranian languages, verbs have a present and a past stem. The past stem is formed from the present stem either in the “regular” way by adding the morpheme /əl/ to the end of it, or by changing the present stem in irregular ways. Final /t/’s are, as expected, common in past stems. For certain verbs (called the “doubly irregular” verbs), the perfective and imperfective past stems differ as well.
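
Regular and irregular past stem formation can be sketched as a default rule with a table of exceptions. In the Python sketch below, the names `PAST_MARKER`, `IRREGULAR_PAST_STEMS`, and `past_stem` are hypothetical; the two stems used are taken from the glossed examples in this article (/mat͡ʃaw/ “kiss” and /win/ “see”, with past stem /lid/), and the exception table is nowhere near complete.

```python
# The regular past morpheme described in the text.
PAST_MARKER = "əl"

# Irregular past stems must be listed one by one; this single entry for
# "to see" comes from the glossed examples and is only a placeholder table.
IRREGULAR_PAST_STEMS = {"win": "lid"}


def past_stem(present_stem: str) -> str:
    """Return the past stem: a listed irregular form, else stem + /əl/."""
    return IRREGULAR_PAST_STEMS.get(present_stem, present_stem + PAST_MARKER)


print(past_stem("mat͡ʃaw"))
print(past_stem("win"))
```

This sketch does not model the “doubly irregular” verbs, whose perfective and imperfective past stems would need separate entries.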

The present simple verb is formed by adding personal endings to the present stem. The conjugation table for the verb “to see” in the present simple tense is shown below.


English          Pashto
I see.           winəm.
You (sg.) see.   wine.
She/He/It sees.  wini.
We see.          winu.
You (pl.) see.   winəj.
They see.        wini.
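
The present simple paradigm is a stem plus a fixed set of endings, which can be sketched as follows. The names `PRESENT_ENDINGS` and `conjugate_present` are hypothetical. The endings are read off the table above, except that the second person singular ending /e/ is an assumption here, chosen to match the second person singular forms in the past tense examples of this article.

```python
# Person/number label -> personal ending in the present simple.
PRESENT_ENDINGS = {
    "1sg": "əm",
    "2sg": "e",   # assumed, to match the past tense forms in this article
    "3sg": "i",
    "1pl": "u",
    "2pl": "əj",
    "3pl": "i",
}


def conjugate_present(present_stem: str) -> dict:
    """Attach each personal ending to the present stem."""
    return {person: present_stem + ending
            for person, ending in PRESENT_ENDINGS.items()}


for person, form in conjugate_present("win").items():
    print(person, form)
```

Note that the third person singular and plural come out identical, as they do in the table.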

Adding personal endings to the past stem gives the past continuous (not the perfective/past simple). The verb endings for the past tense are similar, the only difference being in the third person verbs, as expected. As an example, the use of these suffixes in the past continuous tense for the verb “to dance” is shown below.


English Pashto
I was dancing. gaɖedəm.
You (sg.) were dancing. gaɖede.
He was dancing.
She was dancing.
We were dancing. gaɖedu.
You (pl.) were dancing. gaɖedaj.
They (masc.) were dancing.
They (fem.) were dancing.

As a comparison of the two tables above shows, gender distinctions in verbs exist only in the past tense, and only in the third person.

To make the past simple, a perfective prefix is added to the verbs. In past transitive verbs, logical subjects get the oblique case and the logical object gets the direct case. The verb, as expected in an ergative arrangement, agrees with the logical object. Compare the two sentences below.

زه تا وینم
zə                 tɑː                  win-əm.
1SG.DIR     2SG.OBL      see.PRS-1SG
I see you.

ما ته ولیده
mɑː                tə                   wə-lid-e.
1SG.OBL      2SG.DIR      PRF-see.PST-2SG
I saw you.
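
The contrast between the two sentences above can be summarized as a small decision rule. The sketch below is a simplification under stated assumptions: the function name `alignment` is hypothetical, the oblique object in non-past clauses is based only on the pronoun example above (other kinds of objects may behave differently), and the treatment of past intransitive clauses as subject-agreeing with a direct subject is an assumption not spelled out in the text.

```python
from typing import Dict, Optional


def alignment(tense: str, transitive: bool) -> Dict[str, Optional[str]]:
    """Return case marking and agreement target for a simple clause."""
    if tense == "past" and transitive:
        # Ergative pattern: oblique subject, direct object, object agreement.
        return {"subject": "oblique", "object": "direct",
                "agrees_with": "object"}
    # Everywhere else the verb agrees with a direct-case subject. The oblique
    # object is attested here only for the pronoun in "I see you".
    return {"subject": "direct",
            "object": "oblique" if transitive else None,
            "agrees_with": "subject"}


print(alignment("present", True))
print(alignment("past", True))
```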

Sample Text

The following sample text and most of the commentary are taken from Oranskij (1963).

دې مقالې د پښتنو ادیبانو په محفل کی یوه هنگامه جوره کړه
de             makɑːle    də     paʂtano                adiːbɑːno                 pə          mahfal     ki     yawa      hangɑːma    d͡ʒoɻ-a                  kɻa
this.OBL   article    EZF   Pashto.OBL.PL    littérateur.OBL.PL   ADP     gathering  ADP  one.FEM  turmoil      done-FEM.DIR.SG do.3SG.PST.IRR

This article has caused a stir in Pashto literary circles (lit. the gatherings of Pashto literature scholars/authors/poets)

de: Oblique demonstrative pronoun. Direct form: /dɑː/

makɑːle: “article” (from Arabic “مقالة”). Feminine noun, subject of a past transitive sentence (ergative construction).

də: Morpheme used in noun-noun and noun-adjective constructions, corresponding to the Ezafe in other Iranian languages.

paʂtano: Noun, oblique form, plural (with syncretic morphological marking), from /paʂtun/

adiːbɑːno: Noun, oblique form, plural of /adiːb/ (“ادیب”). The word is originally Arabic.

pə … ki: Circumposition indicating location.

mahfal: gathering, circle (from Arabic “محفل”)

yawa: Direct feminine form of “one” (from Old Iranian aiwa). It serves as the adjective of the following noun and agrees with it in case, gender, and number.

hangɑːma: “turmoil”. Also common in Persian (from Old Iranian han-gām, “to gather”). Direct, feminine, singular word serving as logical object.

d͡ʒoɻ-a kɻa: Third person singular past irrealis form. From the complex predicate /d͡ʒoɻ-a kawəl/ (to give rise to, to create, to set up). The non-verbal element /d͡ʒoɻ/ means “complete, done, healthy” in isolation, and the verb /kawəl/ means “to do” in isolation (compare with Persian /kærdæn/). The sentence has an ergative arrangement, and the verb agrees with the logical object (/hangɑːma/) in person, gender, and number.




