「雲華」CloudSino

Chinese Translation | IRC

Translation Notes#

Original article: 『Language Recognition Chart - Wikipedia』.
This translation is based on the latest revision as of 8 PM (UTC+8) on February 15, 2025.
It is shared under the license given for the original work: "CC BY 4.0 International".

The text is largely an AI translation; formatting, proofreading, and additions were made where possible, and translated names follow mainland China conventions.

Why not contribute this to Chinese Wikipedia? Good question; your reward is an IP block exemption ()


The language recognition chart lists clues that can be used to identify the language in which a text is written.

Characters Alphabet#

The language of foreign text can often be determined by identifying specific characters.

  • ABCDEFGHIJKLMNOPQRSTUVWXYZ (Latin letters)

    • And no other characters — English, Indonesian, Latin, Malay, Swahili, Zulu

    • àäèéëïijöü — Dutch (these letters appear very rarely in Dutch, except for the ligature ij/IJ. Even longer Dutch texts often lack diacritics)

    • áêéèëïíîôóúû — Afrikaans

    • êôúû — West Frisian

    • ÆØÅæøå — Danish, Norwegian

    • Single diacritic, mainly umlauts

      • ÄÖäö — Finnish (BCDFGQWXZÅbcfgqwxzå are only found in proper nouns and loanwords, occasionally ŠšŽž)
      • ÅÄÖåäö — Swedish (é is occasionally seen)
      • ÄÖÕÜäöõü — Estonian (BCDFGQWXYZcfqwxyz are only found in proper nouns and loanwords, occasionally ŠšŽž)
      • ÄÖÜẞäöüß — German
    • Circumflexes

      • ÇÊÎŞÛçêîşû — Kurdish
      • ĂÂÎȘȚăâîșț — Romanian
      • ÂÊÎÔÛŴŶÁÉÍÏâêîôûŵŷáéíï — Welsh (ÓÚẂÝÀÈÌÒÙẀỲÄËÖÜẄŸóúẃýàèìòùẁỳäëöüẅÿ also exist but are rare)
      • ĈĜĤĴŜŬĉĝĥĵŝŭ — Esperanto
    • Three or more types of diacritics

      • ÇĞİÖŞÜçğıöşü — Turkish
      • ÁÐÉÍÓÚÝÞÆÖáðéíóúýþæö — Icelandic
      • ÁÐÍÓÚÝÆØáðíóúýæø — Faroese
      • ÁÉÍÓÖŐÚÜŰáéíóöőúüű — Hungarian
      • ÀÇÉÈÍÓÒÚÜÏàçéèíóòúüï· — Catalan
      • ÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸàâæçéèêëîïôœùûüÿ — French (Ÿ and ÿ are only found in specific proper nouns)
      • ÁÀÇÉÈÍÓÒÚËÜÏáàçéèíóòúëüï (· is only used in the Gascon dialect) — Occitan
      • ÁÉÍÓÚÂÊÔÀãõçáéíóúâêôà (ü is used in Brazilian Portuguese, k, w, y are not used in native words) — Portuguese
    • ÁÉÍÑÓÚÜáéíñóúü ¡¿ — Spanish

    • ÀÉÈÌÒÙàéèìòù — Italian

    • ÁÉÍÓÚÝÃẼĨÕŨỸÑG̃áéíóúýãẽĩõũỹñg̃ — Guarani (the only language that uses g̃)

    • ÁĄĄ́ÉĘĘ́ÍĮĮ́ŁŃ áąą́éęę́íįį́łń (FQRVfqrv are not used in native words) — Southern Athabaskan languages

      • ’ÓǪǪ́ āą̄ēę̄īį̄óōǫǫ́ǭúū — Western Apache
      • 'ÓǪǪ́ óǫǫ́ — Navajo
      • ’ÚŲŲ́ úųų́ — Chiricahua/Mescalero
    • ąłńóż — Lechitic languages

      • ąćęłńóśźż — Polish
      • ćśůź — Silesian
      • ãéëòôù — Kashubian
    • A, Ą, Ã, B, C, D, E, É, Ë, F, G, H, I, J, K, L, Ł, M, N, Ń, O, Ò, Ó, Ô, P, R, S, T, U, Ù, W, Y, Z, Ż — Kashubian letters

    • ČŠŽ

      • And no other characters — Slovenian
      • ĆĐ — Bosnian, Croatian, Serbian (Latin script)
      • ÁĎÉĚÍŇÓŘŤÚŮÝáďéěíňóřťúůý — Czech
      • ÁÄĎÉÍĽĹŇÓÔŔŤÚÝáäďéíľĺňóôŕťúý — Slovak
      • ĀĒĢĪĶĻŅŌŖŪāēģīķļņōŗū — Latvian (ŌŖ and ōŗ have been discontinued in modern Latvian)
      • ĄĘĖĮŲŪąęėįųū — Lithuanian
    • ĐÀẢÃÁẠĂẰẲẴẮẶÂẦẨẪẤẬÈẺẼÉẸÊỀỂỄẾỆÌỈĨÍỊÒỎÕÓỌÔỒỔỖỐỘƠỜỞỠỚỢÙỦŨÚỤƯỪỬỮỨỰỲỶỸÝỴ đàảãáạăằẳẵắặâầẩẫấậèẻẽéẹêềểễếệìỉĩíịòỏõóọồổỗốơờởỡớợùủũúụưừửữứựỳỷỹýỵ — Vietnamese

      • ꞗĕŏŭo᷄ơ᷄u᷄ — Middle Vietnamese
    • ā ē ī ō ū — May appear in romanized Japanese and other transliterated text, or in Hawaiian and Māori text

    • é — Sundanese

    • ñ — Basque/Spanish

    • āáǎàaēéěèeōóǒòoīíǐìiūúǔùuüǖǘǚǜ êẑĉŝŋ — Chinese Pinyin letters (the last five are actually very rare)

  • أ ا ب ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع غ ف ق ك ل م ن ه ؤ و ئ ى ي ء Arabic letters

    • Arabic, Malay (Jawi), Kurdish (Sorani dialect), Punjabi, Pashto, Sindhi, Urdu, etc.
    • پ چ ژ گ — Persian (Farsi)
  • Brahmic scripts

    • Bengali
      • অ আ কা কি কী উ কু ঊ কূ ঋ কৃ এ কে ঐ কৈ ও কো ঔ কৌ ক্ কত্‍ কং কः কঁ ক খ গ ঘ ঙ চ ছ জ ঝ ঞ ট ঠ ড ঢ ণ ত থ দ ধ ন প ফ ব ভ ম য র ৰ ল ৱ শ ষ স হ য় ড় ঢ় ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯
      • Used for Bengali, Assamese
    • Devanagari
      • अ आ इ ई उ ऊ ऋ ॠ ऌ ॡ ऍ ऎ ए ऐ ऑ ऒ ओ ओ क ख ग घ ङ च छ ज झ ञ ट ठ ड ढ ण त थ द ध न प फ ब भ म य र ल ळ व श ष स ह ० १ २ ३ ४ ५ ६ ७ ८ ९ प् पँ पं पः प़ पऽ
      • Used for Sanskrit, Hindi, Maithili, Magahi, Marathi, Kashmiri, Sindhi, Bihari, Konkani, and Nepali in Nepal
    • Gurmukhi
      • ਅਆਇਈਉਊਏਐਓਔਕਖਗਘਙਚਛਜਝਞਟਠਡਢਣਤਥਦਧਨਪਫਬਭਮਯਰਲਲ਼ਵਸ਼ਸਹ
      • Mainly used for Punjabi, as well as Braj, Khariboli (and other Hindustani dialects), Sanskrit, Sindhi
    • Gujarati
      • અ આ ઇ ઈ ઉ ઊ ઋ ઌ ઍ એ ઐ ઑ ઓ ઔ ક ખ ગ ઘ ઙ ચ છ જ ઝ ઞ ટ ઠ ડ ઢ ણ ત થ દ ધ ન પ ફ બ ભ મ ય ર લ ળ વ શ ષ સ હ ૠ ૡૢૣ
      • Used for Gujarati, Kachchi
    • Tibetan
      • ཀ ཁ ག ང ཅ ཆ ཇ ཉ ཏ ཐ ད ན པ ཕ བ མ ཙ ཚ ཛ ཝ ཞ ཟ འ ཡ ར ལ ཤ ས ཧ ཨ
      • Used for Standard Tibetan, Dzongkha (Bhutanese), Sikkimese
  • АБВГДЕЖЗИКЛМНОПРСТУФХЦЧШ (Cyrillic)

    • ЙЩЬЮЯ
      • Ъ — Bulgarian
      • ЁЫЭ
        • Ў, no Щ, И replaced by І (some variants use Ґ) — Belarusian
        • Rare Ъ — Russian
      • ҐЄІЇ — Ukrainian
    • ЉЊЏ, Ј replaces Й (Vuk Karadžić reform)
      • ЃЌЅ — Macedonian
      • ЋЂ — Serbian
    • ЄꙂꙀЗІЇꙈОуꙊѠЩЪꙐЬѢЮꙖѤѦѨѪѬѮѰѲѴҀ — Old Church Slavonic, Church Slavonic
    • Ӂ — Romanian as written in Transnistria (elsewhere Romanian uses the Latin script)
  • ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ αβγδεζηθικλμνξοπρσςτυφχψω (Greek) — Greek

  • אבגדהוזחטיכלמנסעפצקרשת (Hebrew)

    • May have dots or lines above or inside the letters — Hebrew
    • פֿ; only has dots under א, י, ו — Yiddish
    • No dots, many words ending with א (i.e., with א at the leftmost position) — Aramaic
    • Ladino
  • Hanzi cultural sphere — some East Asian languages

    • Only Hanzi — Chinese
    • With Hiragana (あいうえおの) or Katakana (アイウエオノ) — Japanese
  • 위키백과에 (note the frequent ovals and circles) — Korean

  • ㄅㄆㄇㄈㄉㄊㄋㄌㄍㄎㄏ etc. — Bopomofo

    • ㄪㄫㄬ — Not used for Mandarin; dialect use only

  • កខគឃងចឆជឈញដឋឌឍណតថទធនបផពភមសហយរលឡអវអ្កអ្ខអ្គអ្ឃអ្ងអ្ចអ្ឆអ្ឈអ្ញអ្ឌអ្ឋអ្ឌអ្ឃអ្ណអ្តអ្ថអ្ទអ្ធអ្នអ្បអ្ផអ្ពអ្ភអ្មអ្សអ្ហអ្យអ្រអ្យអ្លអ្អអ្វ អក្សរខ្មែរ (Khmer letters) — Khmer

  • Ա Բ Գ Դ Ե Զ Է Ը Թ Ժ Ի Լ Խ Ծ Կ Հ Ձ Ղ Ճ Մ Յ Ն Շ Ո Չ Պ Ջ Ռ Ս Վ Տ Ր Ց Ւ Փ Ք Օ Ֆ (Armenian letters) — Armenian

  • ა ბ გ დ ე ვ ზ ჱ თ ი კ ლ მ ნ ჲ ო პ ჟ რ ს ტ ჳ უ ფ ქ ღ ყ შ ჩ ც ძ წ ჭ ხ ჴ ჯ ჰ ჵ ჶ ჷ ჸ (Georgian letters) — Georgian

  • АБВГДЕЖЗИКЛМНОПРСТУФХЦЧШ (Cyrillic)

    • Bold indicates unique letters for that language
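
The script-level clues in the character chart can be approximated in code. The following is a minimal Python sketch, not a definitive detector. It assumes that the first word of each character's Unicode name (as returned by the standard `unicodedata` module) is a usable stand-in for its script. The function names `script_counts`, `dominant_script`, and `guess_east_asian` are hypothetical helpers introduced here for illustration.

```python
import unicodedata
from collections import Counter


def script_counts(text):
    """Count letters per script, using the first word of the Unicode name."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, spaces, combining marks
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1  # e.g. "CYRILLIC", "GREEK", "CJK"
    return counts


def dominant_script(text):
    """Return the most frequent script label, or None if there are no letters."""
    counts = script_counts(text)
    return counts.most_common(1)[0][0] if counts else None


def guess_east_asian(text):
    """Apply the chart's rule: kana means Japanese, Hanzi alone means Chinese."""
    counts = script_counts(text)
    if counts["HIRAGANA"] or counts["KATAKANA"]:
        return "Japanese"
    if counts["HANGUL"]:
        return "Korean"
    if counts["CJK"]:
        return "Chinese"
    return None
```

A script label only narrows the search: for scripts shared by many languages (Latin, Cyrillic, Arabic), the distinctive letters listed in the chart are still needed.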

Slavic languages#

Belarusian беларуская#

  • Uses: ё, і, й, ў, ы, э, ’
  • Characteristic: шч replaces щ
  • The only Slavic language written in Cyrillic that does not use и.

Bulgarian български#

  • Uses: ъ, щ, я, ю, й
  • The only Slavic language that uses ъ as a vowel; thus, it often appears between consonants
  • Words: със, в
  • Characteristic: Many words end with the definite article –ът, –ят, –та, –то, –те

Macedonian македонски#

  • Uses: ј, љ, њ, џ, ѓ, ќ, ѕ
  • Words: во, со
  • Characteristic: р often appears between consonants, such as првин

Montenegrin#

  • Uses: З́, С́

Russian русский#

  • Uses: ё (optional), й, ъ (rare), ы, э, щ
  • Does not use: ґ, є, і, ї, љ, њ
  • Before 1918, Russian orthography used і, ѣ, ѳ (rare), ѵ (extremely rare); ъ frequently appears, mainly at the end of words

Serbian српски#

  • Uses: ј, љ, њ, џ, ђ, ћ
  • Does not use: ё, й, щ, ъ, ы, ь, э, ю, я
  • Words: је, у
  • Characteristic: Many consonant clusters, such as српски

Ukrainian українська#

  • Uses: є, и, і, ї, й, ґ, щ, ’
  • Does not use: ъ, ё, ы, э

Mongolian#

  • Uses: ө, ү
  • Only for names or loanwords: к, ф, щ

Ossetian#

  • Uses: ӕ
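
The "Uses" and "Does not use" lists for the Cyrillic-script languages above can be turned into an ordered series of checks. The sketch below is a rough heuristic under stated assumptions: the function name `guess_cyrillic_language` and the order of the rules are choices made here, not part of the chart. Short texts may contain none of the distinctive letters, and languages not listed above can share some of them. Montenegrin З́ and С́ are left out because they rely on a combining accent.

```python
import unicodedata


def guess_cyrillic_language(text):
    """Guess a language from distinctive Cyrillic letters (heuristic only)."""
    # NFC composes letters such as "й" or "ў" typed with combining marks.
    letters = set(unicodedata.normalize("NFC", text).lower())

    def has(chars):
        return bool(letters & set(chars))

    if has("\u04e9\u04af"):              # ө, ү
        return "Mongolian"
    if has("\u04d5"):                    # ӕ
        return "Ossetian"
    if has("\u045e"):                    # ў
        return "Belarusian"
    if has("\u0454\u0457\u0491"):        # є, ї, ґ
        return "Ukrainian"
    if has("\u0453\u045c\u0455"):        # ѓ, ќ, ѕ
        return "Macedonian"
    if has("\u0452\u045b"):              # ђ, ћ
        return "Serbian"
    if has("\u0458\u0459\u045a\u045f"):  # ј, љ, њ, џ (shared by both)
        return "Serbian or Macedonian"
    if has("\u0456"):                    # і: Belarusian keeps ы/э, Ukrainian does not
        return "Belarusian" if has("ыэ") else "Ukrainian"
    if has("ыэё"):
        return "Russian"
    if has("ъ"):
        return "Bulgarian"
    return None
```

The checks run from the most distinctive letters to the least, so a text that reaches the last rules has already been ruled out for the earlier languages.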

Arabic letters#

  • All languages using Arabic script are written from right to left.
  • Some languages that previously used Arabic script now more commonly use Latin script; for example, Turkish, Somali, and Swahili.

Arabic العربية#

  • Mirrored question mark: ؟
  • Short vowels are not written, so many words lack vowels
  • Common prefix: الـ
  • Common suffix: ـة (ة)
  • Words: إلی، من، علی

Persian فارسی#

With extremely rare exceptions, the verb comes at the end of the clause.

  • Common verbs: کرد، بود، شد، است، می‌شود
  • Uses: پ، چ، ژ، گ
  • Words: که، به

Urdu اردو#

  • Uses: ٹ‎، ڈ‎، ڑ‎، ں، ے
  • Many words end with ے
  • Words: اور، ہے
  • Distinguishing from Arabic: Urdu is often written in the slanted Nastaliq style, in which each word slopes from upper right to lower left, unlike the linear (Naskh) style usual for Arabic, Persian, etc.
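
The letter lists for Persian and Urdu suggest a simple check for Arabic-script text. This is a sketch under assumptions: the function name `guess_arabic_script_language` is hypothetical, the Urdu test runs first because Urdu also uses the four Persian letters, and a result of "Arabic" only means that none of the distinctive letters appeared (many other languages use this script too).

```python
# Code points for the letters listed under Urdu:
# TTEH, DDAL, RREH, NOON GHUNNA, YEH BARREE
URDU_LETTERS = set("\u0679\u0688\u0691\u06ba\u06d2")
# Code points for the letters listed under Persian: PEH, TCHEH, JEH, GAF
PERSIAN_LETTERS = set("\u067e\u0686\u0698\u06af")


def guess_arabic_script_language(text):
    """Heuristic guess among Urdu, Persian, and Arabic."""
    chars = set(text)
    if chars & URDU_LETTERS:
        return "Urdu"
    if chars & PERSIAN_LETTERS:
        return "Persian"
    # Basic Arabic block, with none of the distinctive letters above.
    if any("\u0600" <= ch <= "\u06ff" for ch in chars):
        return "Arabic"
    return None
```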

Syriac letters#

Syriac ܐܬܘܪܝܐ#

  • Short vowels are usually not written, so many words lack vowels
  • Three writing styles (estrangela, serto, madnhaya) and two vowel representation methods
  • Basic letters in estrangela style: ܐ ܒ ܓ ܕ ܗ ܘ ܙ ܚ ܛ ܝ ܟ ܠ ܡ ܢ ܣ ܥ ܦ ܨ ܩ ܪ ܫ ܬ
  • Basic letters in serto style: ܬ‎, ܫ‎, ܪ‎, ܩ‎, ܨ‎, ܦ‎, ܥ‎, ܣ‎, ܢ‎, ܡ‎, ܠ‎, ܟ‎, ܝ‎, ܛ‎, ܚ‎, ܙ‎, ܘ‎, ܗ‎, ܕ‎, ܓ‎, ܒ‎, ܐ‎
  • Basic letters in madnhaya style: ܬ‎,ܫ‎,ܪ‎,ܩ‎,ܨ‎,ܦ‎,ܥ‎,ܣ‎,ܢ‎,ܡ‎,ܠ‎,ܟ‎,ܝ‎,ܛ‎,ܚ‎,ܙ‎,ܘ‎,ܗ‎, ܕ‎,ܓ‎,ܒ‎,ܐ‎

Dravidian languages#

  • All Dravidian languages are written from left to right.
  • Each Dravidian language has its own script, although their orthographies share similarities.

Kannada#

  • Kannada has a 49-letter alphabet.

Tamil#

  • Common suffixes: ள்ளது, கிறது, கின்றன, ம்
  • Common words: தமிழ், அவர், உள்ள, சில
  • Tamil has a unique 30-letter alphabet. Combined with diacritics, these yield a total of 247 letters.

அ ஆ இ ஈ உ ஊ எ ஏ ஐ ஒ ஓ ஔ க ங ச ஞ ட ண த ந ப ம ய ர ல வ ழ ள ற ன
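
The figure of 247 can be checked with a short calculation. The sketch below assumes the usual breakdown, which is not spelled out above: the first 12 letters of the list are vowels, the remaining 18 are consonants, every consonant combines with every vowel, and one further special letter (āytam, ஃ) is counted separately.

```python
# The 30 base letters as listed above: 12 vowels followed by 18 consonants.
letters = "அ ஆ இ ஈ உ ஊ எ ஏ ஐ ஒ ஓ ஔ க ங ச ஞ ட ண த ந ப ம ய ர ல வ ழ ள ற ன".split()
vowels, consonants = letters[:12], letters[12:]

# Each consonant combines with each vowel into one compound letter.
compound = len(vowels) * len(consonants)  # 12 * 18 = 216

# Assumption: one extra letter (aytam) is added to reach the total.
total = len(vowels) + len(consonants) + compound + 1
print(total)  # 247
```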

Telugu#

Telugu has 56 characters (Aksharamulu), including vowels (Achchulu) and consonants (Hallulu). Telugu uses eighteen vowels, each with an independent form and a diacritic form that combines with consonants to form syllables. The language distinguishes between short and long vowels.

అ ఆ ఇ ఈ ఉ ఊ ఋ ౠ ఌ ౡ ఎ ఏ ఐ ఒ ఓ ఔ అం అః క ఖ గ ఘ ఙ చ ఛ జ ఝ ఞ ట ఠ డ ఢ ణ త థ ద ధ న ప ఫ బ భ మ య ర ఱ ల ళ వ శ ష స హ

౦ ౧ ౨ ౩ ౪ ౫ ౬ ౭ ౮ ౯

Bengali script#

The Bengali script, also called the Bengali alphabet (Bengali: বাংলা বর্ণমালা, bangla bôrnômala) or Bengali writing (Bengali: বাংলা লিপি, bangla lipi), is a writing system originating from the Indian subcontinent. It is used for Bengali and is the fifth most widely used writing system in the world. The script is also used for Assamese, Maithili, Meitei, and Bishnupriya Manipuri, and was historically used for writing Sanskrit in the Bengal region.

Bengali#

Bengali has a unique 50-letter alphabet.

  • Bengali script has 11 vowel letters, each called a স্বরবর্ণ swôrôbôrnô "vowel letter". They represent six of the seven main vowel sounds of Bengali, plus two diphthongs. All of them are used in both Bengali and Assamese.

অ আ ই ঈ উ ঊ ঋ এ ঐ ও ঔ

  • Bengali script has 39 consonant letters, called ব্যঞ্জনবর্ণ bænjônbôrnô "consonant letters". A letter's name usually consists of its consonant sound plus the inherent vowel অ ô. Because the inherent vowel is implied rather than written, most letter names look identical to the letter itself (the name of ঘ is ghô, not gh).

ক খ গ ঘ ঙ চ ছ জ ঝ ঞ ট ঠ ড ঢ ণ ত থ দ ধ ন প ফ ব ভ ম য র ৰ ল ৱ শ ষ স হ ড় ঢ় য় ৎ ঃ ং ঁ

  • There are 10 vowel diacritics, which attach to consonant letters:

া ি ী ু ূ ৃ ে ৈ ো ৌ

Assamese#

  • Assamese script has 11 vowel letters, each called a স্বরবর্ণ swôrôbôrnô "vowel letter".

অ আ ই ঈ উ ঊ ঋ এ ঐ ও ঔ

  • There are 39 consonant letters. Consonant letters in Assamese are called ব্যঞ্জনবর্ণ bænjônbôrnô "consonant letters".

ক খ গ ঘ ঙ চ ছ জ ঝ ঞ ট ঠ ড ঢ ণ ত থ দ ধ ন প ফ ব ভ ম য ৰ ল শ ষ স হ ড় ঢ় য় ৎ ঃ ং ঁ

  • There are 10 vowel diacritics, which attach to consonant letters:

া ি ী ু ূ ৃ ে ৈ ো ৌ

Canadian Aboriginal syllabics#

In modern writing, Canadian Aboriginal syllabics indicate Cree, Inuktitut, or Ojibwe, although the latter two are also written in other scripts. The basic glyphs are ᐁ ᐱ ᑌ ᑫ ᒉ ᒣ ᓀ ᓭ ᔦ; each can appear in any of four orientations, in bold, as a superscript, and with diacritics, including ᑊ ᐟ ᐠ ᐨ ᒼ ᐣ ᐢ ᐧ ᐤ ᐦ ᕽ ᓫ ᕑ. This abugida has also been used for Blackfoot.

Other North American syllabics#

Cherokee#

Cherokee writing uses a unique syllabary, which includes the following characters:

ᎡᎢᎣᎤᎥᎦᎧᎨᎩᎪᎫᎬᎭᎮᎯᎰᎱᎲᎳᎴᎵᎶᎷᎸᎹᎺᎻᎼᎽᎾᎿᏀᏁᏂᏃᏄᏅᏆᏇᏈᏉᏊᏋᏌᏍᏎᏏᏐᏑᏒᏓᏔᏕᏖᏗᏘᏙᏚᏛᏜᏝᏞᏟᏠᏡᏢᏣᏤᏥᏦᏧᏨᏩᏪᏫᏬᏭᏮᏯᏰᏱᏲᏳᏴ

Constructed languages Conlang#

Esperanto#

  • Words: de, la, al, kaj
  • Six letters with diacritics: ĉ Ĉ ĝ Ĝ ĥ Ĥ ĵ Ĵ ŝ Ŝ ŭ Ŭ. In the H-system of orthography they are written ch Ch gh Gh hh Hh jh Jh sh Sh u U; in the X-system, cx Cx gx Gx hx Hx jx Jx sx Sx ux Ux
  • Word endings: o, a, oj, aj, on, an, ojn, ajn, as, os, is, us, u, i
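
Because the X-system pairs are unambiguous (x is not otherwise an Esperanto letter), they can be converted mechanically to the accented letters. The following is a minimal sketch; the name `x_to_unicode` is a hypothetical helper, and the H-system is deliberately not handled because it writes ŭ as a plain u and therefore cannot be reversed reliably.

```python
import re

# X-system digraphs and the accented letters they stand for.
X_SYSTEM = {
    "cx": "\u0109",  # ĉ
    "gx": "\u011d",  # ĝ
    "hx": "\u0125",  # ĥ
    "jx": "\u0135",  # ĵ
    "sx": "\u015d",  # ŝ
    "ux": "\u016d",  # ŭ
}


def x_to_unicode(text):
    """Replace X-system digraphs with accented letters, keeping capitals."""
    def repl(match):
        pair = match.group(0)
        letter = X_SYSTEM[pair.lower()]
        return letter.upper() if pair[0].isupper() else letter

    return re.sub(r"[cghjsu]x", repl, text, flags=re.IGNORECASE)
```

A text that contains many such digraphs before conversion is itself a strong hint that it is Esperanto written without the special letters.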

Klingon tlhIngan Hol#

  • When written in Latin letters, Klingon is case-sensitive: q and Q are different letters, while the other letters are either always uppercase (like D, I, S) or always lowercase (like ch, tlh, v). As a result, many words look strange to the unaccustomed eye because of their mixed case, for example yIDoghQo' and tlhIngan Hol.
  • Apostrophes are frequent, especially at the end of words or syllables.
  • Common suffixes: -be', -'a'
  • Common words: 'oH, Qapla'
  • A word may contain more than one apostrophe, even two in a row: SuvwI''a'

Lojban#

  • (Almost) all lowercase;
  • Common words: lo, mi, cu, la, nu, do, na, se;
  • Paragraphs are separated by ni'o, sentences by .i (or i);
  • Many five-letter words with the consonant-vowel pattern CCVCV or CVCCV;
  • A large number of short words with apostrophes, such as ko'a pi'o, etc.;
  • Usually no punctuation, except for periods;
  • Commas may be used in words (usually for proper nouns).
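
The five-letter CCVCV / CVCCV shape mentioned above is easy to test with a regular expression. The sketch below assumes the Lojban letter inventory (vowels a e i o u; consonants b c d f g j k l m n p r s t v x z), which is not listed in the chart, and it checks only the shape, not which consonant pairs the language actually allows. The name `has_root_word_shape` is a hypothetical helper.

```python
import re

C = "[bcdfgjklmnprstvxz]"  # assumed Lojban consonants
V = "[aeiou]"              # assumed Lojban vowels (y is excluded)

# Matches exactly five letters in the shape CCVCV or CVCCV.
SHAPE = re.compile(f"^(?:{C}{C}{V}{C}{V}|{C}{V}{C}{C}{V})$")


def has_root_word_shape(word):
    """Return True if the word has the CCVCV or CVCCV pattern."""
    return bool(SHAPE.match(word))
```

Counting how many words of a text pass this test, together with the frequency of short words such as lo, mi, and cu, gives a rough signal for Lojban.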

Toki Pona#

  • All letters are lowercase, except for names/loanwords
  • No diacritics
  • The only stop consonants are the voiceless p, t, k

Complete alphabet: p, t, k, s, m, n, l, j, w, a, e, i, o, u

  • Common words: li, mi, e, sina, ona, jan
  • Often resembles simplified and phonetic English or Swedish
  • A large number of disyllabic words
  • The syllables ji, ti, wu, wo do not occur
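
The 14-letter alphabet and the four excluded syllables give a quick word-level filter. This is a sketch only: `could_be_toki_pona` is a hypothetical name, names and loanwords are handled simply by lowercasing, and passing the filter does not prove a word is Toki Pona, only that nothing in it rules the language out.

```python
ALPHABET = set("ptksmnljwaeiou")       # the complete alphabet listed above
FORBIDDEN = ("ji", "ti", "wu", "wo")   # syllables that do not occur


def could_be_toki_pona(word):
    """Return True if the word uses only allowed letters and syllables."""
    w = word.lower()
    if not w or not set(w) <= ALPHABET:
        return False
    return not any(syllable in w for syllable in FORBIDDEN)
```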