Languages of the Indian subcontinent
The Indian subcontinent is home to two native language families, Dravidian and Indo-Aryan, together colloquially known as Indic languages.[1][2] These languages are spoken across the South Asian countries of Bangladesh, India, Maldives, Nepal, Pakistan[n 1] and Sri Lanka. The subcontinent is home to the third most spoken language in the world, Hindi-Urdu; the sixth most spoken, Bengali; and the thirteenth most spoken, Punjabi; all three are transnational languages.[3]
In the context of Indo-European studies, the term Indic languages refers specifically to the Indic branch of the Aryan languages.[4] However, in modern and digital contexts, the Indic family refers to both native language families of the subcontinent.[5] By extension of script, some Tibeto-Burman languages such as Manipuri (written in Eastern Nagari) and Bodo (written in Devanagari), and Austroasiatic languages such as Santali, are sometimes erroneously classified as Indic languages.
Writing systems
The Indic languages of India, Bangladesh, Nepal and Sri Lanka are written in Indic scripts, which descend from the Brahmi script, whereas those of Pakistan are written in extended Perso-Arabic scripts. The former are abugidas and the latter are abjads. The divide in scripts is predominantly[n 2] due to religious affiliation: followers of Indic religions prefer the native (Indic) scripts, and followers of Islam prefer Arabic-based scripts.
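In digital text, the split between Brahmi-derived and Perso-Arabic scripts shows up as distinct Unicode blocks, so the script of a passage can be guessed from its code points. The following is a minimal Python sketch of that idea; the names `SCRIPT_RANGES` and `dominant_script` are illustrative assumptions, and only the basic block of each script is covered (extended and supplementary blocks are ignored).

```python
from collections import Counter

# Basic Unicode block ranges (inclusive) for scripts named in this article.
# Extended/supplement blocks are deliberately left out of this sketch.
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Bengali": (0x0980, 0x09FF),      # Eastern Nagari
    "Gurmukhi": (0x0A00, 0x0A7F),
    "Gujarati": (0x0A80, 0x0AFF),
    "Odia": (0x0B00, 0x0B7F),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
    "Sinhala": (0x0D80, 0x0DFF),
    "Arabic": (0x0600, 0x06FF),       # base block used by Perso-Arabic scripts
    "Thaana": (0x0780, 0x07BF),
}


def dominant_script(text):
    """Return the script with the most characters in `text`, or None."""
    counts = Counter()
    for ch in text:
        cp = ord(ch)
        for name, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                counts[name] += 1
                break
    return counts.most_common(1)[0][0] if counts else None


print(dominant_script("हिन्दी"))  # the word "Hindi" in Devanagari
print(dominant_script("اردو"))    # the word "Urdu" in Perso-Arabic script
```

A block lookup of this kind identifies the script only, not the language: Devanagari text could equally be Hindi, Marathi, Nepali or Maithili.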
Digraphia
Digraphia is very common in the northern subcontinent, largely because of the Hindu-Muslim divide. The Hindustani language, with an Urdu literary standard written in Arabic script and a High Hindi standard written in Devanagari, is one of the 'textbook examples'[6] of synchronic digraphia, in which more than one writing system is used for the same language at the same time.
In addition to Hindi–Urdu, there are other Indo-Pakistani digraphic languages like Sindhi (written in extended Perso-Arabic in Sindh of Pakistan and in Devanagari by Sindhis in partitioned India); Punjabi (written in Gurmukhi in East Punjab and Shahmukhi in West Punjab); Saraiki (written in extended-Shahmukhi script in Saraikistan and unofficially in Sindhi-Devanagari script in India); and Kashmiri (written in extended Perso-Arabic by Kashmiri Muslims and extended-Devanagari by Kashmiri Hindus).[7][8][9]
The script of the Maldives, known as Thaana, derives from both Indic and Perso-Arabic scripts. Specifically, the main consonants are derived from Indic and Farsi numerals, whereas the vowels (diacritics) are modelled directly on those of the impure abjad. The Dhivehi language of the Maldives is written in the Thaana script, and the Mahl dialect of Minicoy is written in Devanagari.
During the pre-colonial era, the Dravidian languages Tamil and Malayalam were written by Muslims in the Arabu-Tamil and Arabi Malayalam scripts respectively.[10][11] On the Konkan coast, the Konkani language is written in the Latin script by Catholics and in Devanagari by Hindus, although some in Karnataka also use the Kannada script.[12]
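The consonant-plus-diacritic structure of Thaana is visible in how Unicode encodes it: consonants are letters, while vowel signs are combining marks attached to them. The short Python sketch below, using only the standard `unicodedata` module, separates the two for the word "Dhivehi" written in Thaana; the helper name `split_letters_and_marks` and the choice of sample word are assumptions made for illustration.

```python
import unicodedata


def split_letters_and_marks(text):
    """Separate base letters (category L*) from combining marks (category M*)."""
    letters = [ch for ch in text if unicodedata.category(ch).startswith("L")]
    marks = [ch for ch in text if unicodedata.category(ch).startswith("M")]
    return letters, marks


# "Dhivehi" in Thaana, written with escapes so the code points are explicit:
# dhaalu + ibifili, vaavu + ebefili, haa + ibifili
dhivehi = "\u078b\u07a8\u0788\u07ac\u0780\u07a8"

letters, marks = split_letters_and_marks(dhivehi)

# List each code point with its general category and Unicode name.
for ch in dhivehi:
    print(f"U+{ord(ch):04X}  {unicodedata.category(ch)}  {unicodedata.name(ch)}")

print(len(letters), "consonant letters,", len(marks), "vowel marks")
```

Each consonant here carries exactly one vowel mark, which is why the letter and mark counts match.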
Official languages
The following table lists the spoken Indic languages.
Spoken language | Total speakers[n 4][3] | Ranking[n 5][3] | Script(s) | ISO 639 code | Recognition |
---|---|---|---|---|---|
Hindustani | 830M | 3 | Devanagari for Hindi; Perso-Arabic for Urdu | hi, ur | India, Pakistan |
Bengali | 268M | 6 | Eastern Nagari | bn | Bangladesh, India |
Punjabi | 117M | 13 | Gurmukhi for East Punjabi; Shahmukhi for West Punjabi | pa, pnb | India, Pakistan |
Marathi | 99M | 14 | Devanagari | mr | India |
Telugu | 96M | 15 | Telugu-Kannada script | te | India |
Tamil | 85M | 17 | Tamil script | ta | India, Sri Lanka, Singapore |
Gujarati | 62M | 28 | Gujarati script | gu | India |
Kannada | 59M | 30 | Kannada-Telugu script | kn | India |
Odia | 40M | 42 | Odia script | or | India |
Malayalam | 38M | 43 | Malayalam script | ml | India |
Maithili | 34M | 46 | Devanagari | mai | India, Nepal |
Sindhi | 33M | 49 | Perso-Arabic in Pakistan; Devanagari in India | sd | Pakistan, India |
Saraiki | 26M | 55 | Shahmukhi | skr | Pakistan |
Nepali | 25M | 58 | Devanagari | ne | Nepal, India |
Sinhala | 17M | 72 | Sinhala script | si | Sri Lanka |
Assamese | 15M | 78 | Eastern Nagari | as | India |
Kashmiri | 7M | 143 | Perso-Arabic | ks | India, Pakistan |
Dogri | 5M | | Devanagari | doi | India |
Konkani | 3M | | Devanagari | gom | India |
Dhivehi | 0.4M | | Thaana | dv | Maldives |
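For software that handles these languages, the table's ISO 639 codes and scripts are the usual lookup keys. The following Python sketch transcribes a few rows of the table into a small data structure and indexes them by code; the `Language` class and the `BY_CODE` and `digraphic` names are hypothetical, and only a subset of rows is included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    name: str
    codes: tuple    # ISO 639 codes as listed in the table
    scripts: tuple  # scripts as listed in the table


# A subset of the table rows, transcribed by hand.
LANGUAGES = [
    Language("Bengali", ("bn",), ("Eastern Nagari",)),
    Language("Punjabi", ("pa", "pnb"), ("Gurmukhi", "Shahmukhi")),
    Language("Tamil", ("ta",), ("Tamil script",)),
    Language("Sindhi", ("sd",), ("Perso-Arabic", "Devanagari")),
    Language("Dhivehi", ("dv",), ("Thaana",)),
]

# Index every code to its language; a language with two codes appears twice.
BY_CODE = {code: lang for lang in LANGUAGES for code in lang.codes}


def digraphic(languages):
    """Names of languages that the table lists with more than one script."""
    return [lang.name for lang in languages if len(lang.scripts) > 1]


print(BY_CODE["pnb"].name)
print(digraphic(LANGUAGES))
```

Note that a single code does not always imply a single script: Sindhi has one code (`sd`) but two scripts, so script information has to be stored separately from the language code.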
Unclassified languages
The following languages are currently not classified under any language family:
Computational Resources
- Microsoft Indic Language Input Tool - Currently supports only languages of India
- AksharaMukha Transliteration - To convert between various scripts
- IndicNLP Catalog - List of Natural Language Processing resources
- Indic-NLP Library - Python library for text processing
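As a rough illustration of why conversion between Brahmi-derived scripts is tractable, the Python sketch below relies on the fact that Unicode lays out the main Indic blocks (Devanagari through Malayalam) in parallel, following the ISCII layout, so that a letter usually sits at the same offset in each block. This is not how the tools listed above are implemented; `BLOCK_START` and `convert` are hypothetical names, and the approach is only approximate, since same-offset characters do not always correspond, especially outside the shared core of each block.

```python
import unicodedata

# Start of the basic Unicode block for each script; every block is 0x80 wide.
BLOCK_START = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Odia": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
    "Malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80


def convert(text, source, target):
    """Naively map characters from `source` to `target` by block offset."""
    src, dst = BLOCK_START[source], BLOCK_START[target]
    out = []
    for ch in text:
        offset = ord(ch) - src
        if 0 <= offset < BLOCK_SIZE:
            candidate = chr(dst + offset)
            # Keep the original character if the target slot is unassigned.
            out.append(candidate if unicodedata.name(candidate, "") else ch)
        else:
            # Characters outside the source block (spaces, digits) pass through.
            out.append(ch)
    return "".join(out)


print(convert("कमल", "Devanagari", "Gurmukhi"))
print(convert("कमल", "Devanagari", "Bengali"))
```

Offset mapping cannot work between an abugida and an abjad, so conversion between, for example, Gurmukhi and Shahmukhi needs language-specific rules or a trained model rather than a table of offsets.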
See also
- Languages with official status in India
- Languages of Pakistan
- Languages of Bangladesh
- Languages of Nepal
- Languages of Sri Lanka
References
- Kak, Subhash. "Indic Language Families and Indo-European". Yavanika. Quote: "The Indic family has the sub-families of North Indian and Dravidian".
- Kak, Subhash. "On The Classification Of Indic Languages" (PDF). Louisiana State University.
- "What are the top 200 most spoken languages in 2021?". Ethnologue. 2018-10-03. Retrieved 2021-10-27.
- "Overview of Indo-Aryan languages". Encyclopædia Britannica. Retrieved 8 July 2018.
- Reynolds, Mike; Verma, Mahendra (2007), Britain, David, ed., "Indic languages", Language in the British Isles, Cambridge: Cambridge University Press, pp. 293–307, ISBN 978-0-521-79488-6, retrieved 2021-10-04
- Ahmad, Rizwan (June 2011). "Urdu in Devanagari: Shifting orthographic practices and Muslim identity in Delhi". Language in Society. 40 (3): 259–284. doi:10.1017/S0047404511000182. hdl:10576/10736. ISSN 0047-4045.
- "Perso-Arabic To Indic Script Transliteration". sangam.learnpunjabi.org. Retrieved 2021-04-07.
- "Saraiki - Devanagari Machine Transliteration System - SDMTS". www.sanlp.org. Retrieved 2021-08-09.
- Lawaye, Aadil; Kak, Aadil; Mehdi, Nali (January 2010). "Building a Cross Script Kashmiri Converter: Issues and Solutions". Proceedings of Oriental COCOSDA.
- Torsten Tschacher (2001). Islam in Tamilnadu: Varia. (Südasienwissenschaftliche Arbeitsblätter 2.) Halle: Martin-Luther-Universität Halle-Wittenberg. ISBN 3-86010-627-9. (Online versions available on the websites of the university libraries at Heidelberg and Halle: http://archiv.ub.uni-heidelberg.de/savifadok/volltexte/2009/1087/pdf/Tschacher.pdf and http://www.suedasien.uni-halle.de/SAWA/Tschacher.pdf).
- Kunnath, Ammad (15 September 2015). "The rise and growth of Ponnani from 1498 AD To 1792 AD". Department of History. hdl:10603/49524.
- Mother Tongue blues – Madhavi Sardesai
Notes
- Mostly the eastern side of Pakistan.
- Except Bangladesh: although it is a Muslim-majority nation, it uses the Eastern Nagari script.
- The Indian subcontinent is a geographical region within South Asia spanning the Indian Plate, which is currently home predominantly to Indo-Aryan and Dravidian-speaking peoples.
- Including both L1 and L2 speakers worldwide.
- Ranked by number of speakers worldwide.
This article "Languages of the Indian subcontinent" is from Wikipedia. The list of its authors can be seen in its history and/or on the page Edithistory:Languages of the Indian subcontinent. Articles copied from the Draft Namespace on Wikipedia can be found in Wikipedia's Draft Namespace rather than its main one.