This post continues the earlier article "Writing of Chinese Pinyin - From Mastery to Not Being Able to Type It Out".
Chinese Pinyin converters are fairly mature, but some obscure rules are still poorly supported (so I just made one myself).
Currently, Google Translate offers the best user experience: it capitalizes the first letter of each sentence, segments words, and is free with no ads.
The tool in this article performs a second conversion pass on its output.
All spellings below follow the current "Chinese Pinyin Scheme" (PDF).
Special characters used in short pinyin:
- zh → ẑ / Zh → Ẑ
- ch → ĉ / Ch → Ĉ
- sh → ŝ / Sh → Ŝ
- ng → ŋ / (NG → Ŋ) [1]
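The digraph part of this mapping can be sketched as a lookup table. This is only an illustration: the names `DIGRAPHS` and `shortenDigraphs` are hypothetical and are not part of the tool shown later in this post.

```javascript
// Hypothetical lookup table for the zh/ch/sh shorthand listed above.
const DIGRAPHS = new Map([
  ['Zh', 'Ẑ'], ['zh', 'ẑ'],
  ['Ch', 'Ĉ'], ['ch', 'ĉ'],
  ['Sh', 'Ŝ'], ['sh', 'ŝ'],
]);

// Replace every zh/ch/sh (and its capitalized form) in a single pass.
function shortenDigraphs(text) {
  return text.replace(/[ZzCcSs]h/g, (match) => DIGRAPHS.get(match));
}
```

A table keeps the six pairs in one place, at the cost of a callback per match. "ng" is deliberately left out here because it needs a context check.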
The digraph replacements "ẑ, ŝ, ĉ" are easy to handle, but "ŋ" is a bit more troublesome, for example:
- 相安 "xiang'an"
- 线杆 "xiangan"
This is because the syllable-dividing apostrophe is added only before a syllable that begins with the finals "a/o/e", to prevent confusion; it is never added before an initial.
When a syllable beginning with "i/u/ü" has no initial, "y/w" is written as its first letter instead.
So the task is to tell a syllable ending in "ng" apart from a syllable ending in "n" that is followed by a syllable starting with "g".
"ng" is converted to "ŋ" only when it is not followed by a vowel letter [āáǎàaēéěèeōóǒòoīíǐìiūúǔùuüǖǘǚǜ].
Alternatively, it could be converted only when it is followed by an initial [bpmfdtnlgkhjqxzcsr], though that variant would also have to cover "ng" at the end of a word.
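The first rule can be sketched with a negative lookahead, checked against the two example words. The names `VOWELS`, `NG_FINAL`, and `shortenNg` are hypothetical, chosen only for this illustration.

```javascript
// Vowel letters, with and without tone marks, that can follow "n" + "g"
// when the "g" starts a new syllable.
const VOWELS = 'āáǎàaēéěèeōóǒòoīíǐìiūúǔùuüǖǘǚǜ';

// Match "ng" only when the next character is NOT one of those vowels.
// The lookahead also succeeds at the end of the string.
const NG_FINAL = new RegExp(`ng(?![${VOWELS}])`, 'g');

function shortenNg(text) {
  return text.replace(NG_FINAL, 'ŋ');
}

console.log(shortenNg("xiang'an")); // xiaŋ'an
console.log(shortenNg('xiangan'));  // xiangan
```

In "xiang'an" the apostrophe follows "ng", so the pair is a final and gets shortened. In "xiangan" a vowel follows, so "n" and "g" belong to different syllables and are left alone.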
Erhua
The "er" of erhua is generally displayed as a neutral tone, that is, without tone marks.
I checked Han Dian myself and did not find a separate character for the neutral tone "er".
So let's simply replace every occurrence:
- " er" (with a leading space) → "r"
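As a rough sketch, the replacement could look like the function below. The name `mergeErhua` is hypothetical, and the lookahead is an added assumption that goes beyond the plain "replace all" rule: it skips words that merely begin with "er".

```javascript
// Merge a standalone neutral-tone "er" into the preceding syllable.
// Assumption: an "er" followed by another Latin letter (plain or
// accented) is the start of a longer word and must not be touched.
function mergeErhua(text) {
  return text.replace(/ er(?![A-Za-z\u00C0-\u024F])/g, 'r');
}

console.log(mergeErhua('yīdiǎn er')); // yīdiǎnr
```

Without the lookahead the pattern is just `/ er/g`, which is simpler but would also clip non-pinyin text such as "an error".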
Code:
As you can see, this is not hard to implement: a few regular expressions and global replacements are enough.
HTML
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Real-time Preview</title>
<style>
body {
margin: 0;
padding: 0;
font-family: 'SimHei', sans-serif;
display: flex;
height: 100vh;
background-color: #f5f5f5;
}
.container {
display: flex;
width: 100%;
}
.textarea-container {
width: 50%;
height: 100%;
}
textarea {
width: 100%;
height: 100%;
border: none;
padding: 10px;
font-size: 16px;
resize: none;
box-sizing: border-box;
outline: none;
font-family: inherit;
}
</style>
</head>
<body>
<div class="container">
<div class="textarea-container">
<textarea id="input" placeholder="Please enter pinyin here..."></textarea>
</div>
<div class="textarea-container">
<textarea id="output" placeholder="Display converted content..." readonly></textarea>
</div>
</div>
<script>
const input = document.getElementById('input');
const output = document.getElementById('output');
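// Convert Google Translate style pinyin into the short form.
function transformText(text) {
// Digraphs zh/ch/sh, capitalized forms first.
text = text.replace(/Zh/g, 'Ẑ')
.replace(/zh/g, 'ẑ')
.replace(/Ch/g, 'Ĉ')
.replace(/ch/g, 'ĉ')
.replace(/Sh/g, 'Ŝ')
.replace(/sh/g, 'ŝ')
// Erhua: attach a space-separated "er" to the previous syllable as "r".
.replace(/ er/g, 'r');
// "ng" becomes "ŋ" unless a vowel follows, in which case "g" starts the next syllable.
text = text.replace(/ng(?![āáǎàaēéěèeōóǒòoīíǐìiūúǔùuüǖǘǚǜ])/g, 'ŋ');
return text;
}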
input.addEventListener('input', () => {
const transformed = transformText(input.value);
output.value = transformed;
});
</script>
</body>
</html>
The single-page tool is hosted here:
https://wikidot.eu.org/tool.html
Not covered:
"ê" and the nasal letters appear mostly in spoken language and rarely in writing, so please allow the author to skip them.
Footnotes
1. Google Translate automatically capitalizes the first letter of a sentence, but "ng" is generally used as a final and so does not begin a sentence; the uppercase rule is therefore not applied.