Customizing Input Content

This article explains how to customize the input method's content: lexicons, Custom Text, and multi-device sync.

Writing a Lexicon

Because of how Rime is designed, English words and very short abbreviations do not belong in a Pinyin lexicon. Entries like the following should be avoided:

yaml
hello	hello
世界	s j
蒙奇·D·路飞	meng qi d lu fei

As you can see, 世界 is coded as s and j, which causes the input method to fail to suggest other words or phrases starting with s or j. The same applies to the d in 蒙奇·D·路飞, which breaks suggestions for words or phrases starting with d.

If every lexicon were written this way, typing s would make the input method try to suggest every word or phrase starting with s, leading to lag, memory leaks, and even crashes.

Therefore:

  • It is recommended to use full Pinyin for all lexicon entries.
  • English lexicons should be placed in the English dictionary. In the Mint input method, the English lexicon is used as the secondary input source, and automatic prediction and sentence generation for English input should be disabled.
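The first recommendation can be checked mechanically. The following is a minimal Python sketch, not part of the Mint input method: the name `single_letter_syllables` is hypothetical, and it assumes the text and code columns are separated by a tab, as in the example above. It treats a, o, and e as legitimate one-letter syllables, which is an assumption of this sketch.

```python
# Full Pinyin syllables that are only one letter long (assumption of this sketch).
ONE_LETTER_SYLLABLES = {"a", "o", "e"}


def single_letter_syllables(entry: str) -> list[str]:
    """Return the abbreviated (single-letter) syllables in one lexicon entry."""
    columns = entry.rstrip("\n").split("\t")
    if len(columns) < 2:
        return []  # no code column, nothing to check
    syllables = columns[1].split(" ")
    return [s for s in syllables if len(s) == 1 and s not in ONE_LETTER_SYLLABLES]


if __name__ == "__main__":
    for line in ["世界\ts j", "蒙奇·D·路飞\tmeng qi d lu fei", "你好\tni hao\t123"]:
        print(line.split("\t")[0], single_letter_syllables(line))
```

An entry that returns a non-empty list is a candidate for rewriting in full Pinyin.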

When you examine the files in the Mint input method, you will find that the directory structure is as follows:

text
dicts
├── custom_simple.dict.yaml        # Custom dictionary, where you can add your favorite words
├── luna_pinyin.biaoqing.dict.yaml # Emoticon dictionary
├── luna_pinyin.emoji.dict.yaml    # Emoji dictionary (may be removed in the future)
├── luna_pinyin.extended.dict.yaml # Extended dictionary for Luna Pinyin (may be removed in the future)
├── rime_ice.41448.dict.yaml       # Extended single-character dictionary for Rime Ice
├── rime_ice.8105.dict.yaml        # Basic single-character dictionary for Rime Ice
├── rime_ice.base.dict.yaml        # Core basic lexicon for Rime Ice
├── rime_ice.en.dict.yaml          # English dictionary for Rime Ice
├── se_words.dict.yaml             # Commonly used lexicon for software industry
├── terra_rime_ice.base.dict.yaml  # Terra Pinyin dictionary (generated by Python)
└── wubi98_base.dict.yaml          # Wubi 98 dictionary

The content of the lexicon files is written as follows:

yaml
---
name: Lexicon Name
version: "Version Number"
sort: by_weight  # by_weight (sort by weight) or original (keep the order of the character table)
columns:    # When the columns attribute is not specified, the default order is:
  - text    # Vocabulary
  - code    # Code
  - weight  # Weight
  - stem    # Word-building code (not used by Pinyin schemes)
...
你好	ni hao	123

For lexicon files without phonetic annotation but with weight specified, modify the columns accordingly:

---

For example, note that the format inside the lexicon is 'Word'<Tab>'Pinyin'<Space>'Pinyin'<Space>'Pinyin'<Tab>'Weight':

yaml
# Rime dictionary
# encoding: utf-8
#
# Personalized terms - by @Mintimate
# It is recommended to add custom phrases or words here
---
name: custom_simple
version: "2023.11.30"
sort: by_weight
...

# Personal names
# Common phrases
哈哈	ha ha	99
macOS	mac	99
可以	ke yi	99
# (。>ㅅ<。)
Mintimate	mintimate	1
https://www.mintimate.cn	mintimate	2
Mintimate's Blog	mintimate	3
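Since a stray space in place of a tab is easy to miss, a small check can help before redeploying. This is a minimal Python sketch under stated assumptions: the function name `check_entry` is hypothetical, it only covers the three-column layout described above, and it accepts only plain integer weights like those in the example. Apply it to the entry lines after the `...` marker that ends the YAML header.

```python
def check_entry(line: str) -> list[str]:
    """Return a list of problems found in one lexicon entry line."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return []  # blank lines and comments carry no entry
    columns = line.split("\t")
    if len(columns) < 2 or len(columns) > 3:
        return ["expected text<Tab>code[<Tab>weight]"]
    problems = []
    code = columns[1]
    if "  " in code or code != code.strip():
        problems.append("syllables must be separated by single spaces")
    if len(columns) == 3 and not columns[2].isdigit():
        problems.append("weight must be a non-negative integer")
    return problems


if __name__ == "__main__":
    for entry in ["哈哈\tha ha\t99", "哈哈 ha ha 99", "可以\tke  yi\t99"]:
        print(repr(entry), check_entry(entry))
```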

These lexicons are referenced by the lexicon driver configuration in the root directory:

text
├── custom_dict_en.all.dict.yaml        # English lexicon for Mint input method
├── custom_dict_terra.all.dict.yaml     # Terra Pinyin Mint custom lexicon
├── custom_dict.all.dict.yaml           # Mint Pinyin lexicon
└── custom_dict.wubi.dict.yaml          # Wubi 98 Mint custom lexicon

Let's see how internal references are used:

yaml
---
name: custom_dict.all ## Note that the name should match the filename
version: "2020.6.7"
sort: by_weight
# This is where you specify the dictionaries used by the input method, to supplement the extended dictionary.
import_tables:
  - dicts/rime_ice.8105 # Rime Ice common character collection
  - dicts/rime_ice.41448 # Rime Ice complete character collection
  - dicts/custom_simple # Custom
  - dicts/rime_ice.base # Rime Ice https://github.com/iDvel/rime-ice
  - dicts/se_words # Internet network vocabulary
  - dicts/luna_pinyin.biaoqing # Emoticons
  - dicts/luna_pinyin.emoji # Emoji Ext
...

Important:

  • name: The name is the filename without the .dict.yaml suffix, so the filename must end with .dict.yaml (for example, custom_dict.all.dict.yaml has the name custom_dict.all).
  • import_tables: Enumerate the lexicons that need to be imported.
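Both rules can be verified with a short script. The following Python sketch is illustrative only: `imported_tables` and `check_dict_file` are hypothetical names, the header is read with regular expressions instead of a YAML parser, and imported paths are assumed to be relative to the directory that holds the driver file.

```python
import re
from pathlib import Path

SUFFIX = ".dict.yaml"


def imported_tables(text: str) -> list[str]:
    """Collect the list items that follow the import_tables: key."""
    tables, inside = [], False
    for line in text.splitlines():
        if line.startswith("import_tables:"):
            inside = True
            continue
        if inside:
            item = re.match(r"\s*-\s+(\S+)", line)
            if item:
                tables.append(item.group(1))
            elif line.strip() and not line.lstrip().startswith("#"):
                inside = False  # first non-list line ends the section
    return tables


def check_dict_file(path: Path) -> list[str]:
    """Check a lexicon file against the name and import_tables rules."""
    if not path.name.endswith(SUFFIX):
        return [f"filename should end with {SUFFIX}"]
    expected_name = path.name[: -len(SUFFIX)]
    text = path.read_text(encoding="utf-8")
    problems = []

    # name: must equal the filename without the suffix.
    match = re.search(r"^name:\s*([^\s#]+)", text, flags=re.MULTILINE)
    if match is None or match.group(1).strip("\"'") != expected_name:
        problems.append(f"name should be {expected_name}")

    # Every imported table should exist as <table>.dict.yaml.
    for table in imported_tables(text):
        if not (path.parent / f"{table}{SUFFIX}").is_file():
            problems.append(f"imported table not found: {table}")
    return problems
```

Calling `check_dict_file(Path("custom_dict.all.dict.yaml"))` from the configuration root returns an empty list when both rules hold.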

After modifying the lexicon, remember to redeploy the input method.

The above information can help you customize the lexicons.

Custom Text

"Custom Text" refers to the custom_phrase.txt file within the input method. You may not see it in the Mint input method...

In my understanding, "Custom Text" refers to lexicons with particularly high weights (this is the default behavior, but the weights of each translator can be adjusted using initial_quality). Therefore, I have removed the configuration for "Custom Text." If needed, you can configure it yourself. The format is the same as the lexicon:

yaml
# Rime table
# encoding: utf-8
# Custom Text
# Do not write any comments after this line
噷	hm
哼	hng


去	q	2
千	q	1

我	w	3
万	w	2
往	w	1

等等	dd
的地得	ddd
等等等等	dddd
刚刚	gg
才刚刚	cgg
知道	zd
不知道	bzd

Also, the input method configuration needs to include:

yaml
translators:
    - table_translator@custom_phrase      # Custom Phrase custom_phrase.txt

Custom Text does not take part in word-building with the other translators. If an entry uses a complete code, that character or word can no longer participate in word-building, so self-built words are not remembered.

Therefore, it is recommended to pin only characters or words with incomplete (abbreviated) codes. For example, '的de' should be '的d', '是shi' should be '是s', and '仙剑xianjian' should be '仙剑xj'.

Note that the full Pinyin 'a o e' is also a complete spelling, so single characters of 'a o e' should not be included in the Custom Text. Otherwise, words like '啊 哦 呃' cannot be used for word-building.

Multi-device Sync

Most input methods based on the Rime input method framework do not have online synchronization capabilities. So how can you synchronize input on multiple devices?

You can use the synchronization feature of the Rime input method framework.

Sync Config

The installation.yaml file in the configuration directory will be automatically generated after the first deployment. You can edit the ID and sync directory of the current device here, for example:

yaml
distribution_code_name: Squirrel
distribution_name: "鼠鬚管"
distribution_version: 0.16.2
install_time: "Tue Aug  1 00:28:37 2023"
# ID of the current device, default is a UUID
# You can customize the name for more elegant backup files
installation_id: "c5f45f7e-3c1c-4257-8ff7-bce78e9b5fb5"
rime_version: 1.8.5

You can add a sync_dir configuration:

yaml
distribution_code_name: Squirrel
distribution_name: "鼠鬚管"
distribution_version: 0.16.2
install_time: "Tue Aug  1 00:28:37 2023"
# ID of the current device, default is a UUID
# You can customize the name for more elegant backup files
installation_id: "Macbook-M2Max"
# If not set, the default is the `sync/` directory under the current configuration directory
sync_dir: "/Users/mintimate/Documents/rimeSync"
rime_version: 1.8.5

After setting up synchronization, click "Sync" to generate the lexicon and user configuration backups.

After completing the synchronization, the *.userdb.txt files generated in the sync directory will contain the input content.

The sync directory also contains other files that you can ignore. Rime backs up the YAML and TXT files from the configuration directory as well, but only those at its root level. Lexicons in the dicts folder of the Mint input method and Lua scripts are therefore not synchronized.

⚠️ Windows users should pay attention to YAML syntax. Backslashes need to be escaped in double quotes but not in single quotes:

yaml
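# Use one of the two forms, not both.
sync_dir: "c:\\file\\path\\sync"  # double quotes: every backslash is escaped
sync_dir: 'c:\file\path\sync'     # single quotes: backslashes are literal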
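The two quoting rules can be expressed as small helpers. This is a minimal Python sketch; the function names `yaml_double_quoted` and `yaml_single_quoted` are hypothetical, and the sketch only handles backslashes and quote characters, which is what matters for typical Windows paths.

```python
def yaml_double_quoted(path: str) -> str:
    """Quote a path for YAML double quotes: backslashes and quotes are escaped."""
    escaped = path.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def yaml_single_quoted(path: str) -> str:
    """Quote a path for YAML single quotes: only ' is doubled, backslashes stay."""
    return "'" + path.replace("'", "''") + "'"


if __name__ == "__main__":
    windows_path = r"c:\file\path\sync"
    print("sync_dir:", yaml_double_quoted(windows_path))
    print("sync_dir:", yaml_single_quoted(windows_path))
```

Either printed line can be pasted into installation.yaml.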

Multi-device Sync

Set the sync_dir on all platforms to the same directory, such as the directories of iCloud, Dropbox, or OneDrive.

Multiple devices will generate parallel folders in this directory, which contain user lexicons.

On PC-1, click "Sync" and synchronize it to PC-2 via cloud storage. Then click "Sync" on PC-2 to obtain the input content from PC-1.

Note: Set different installation_id for each device. You can set it as "PC-1" on PC-1 and "PC-2" on PC-2.
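To confirm that every device has written its own folder, you can list the shared directory. The following Python sketch is an illustration, not a Rime tool: `user_dictionaries_by_device` is a hypothetical name, and it assumes each device folder sits directly under sync_dir and holds that device's *.userdb.txt files.

```python
from pathlib import Path


def user_dictionaries_by_device(sync_dir: Path) -> dict[str, list[str]]:
    """Map each device folder in sync_dir to the *.userdb.txt files it holds."""
    result = {}
    for device_dir in sorted(p for p in sync_dir.iterdir() if p.is_dir()):
        result[device_dir.name] = sorted(
            f.name for f in device_dir.glob("*.userdb.txt")
        )
    return result


if __name__ == "__main__":
    # Path taken from the sync_dir example above; adjust it to your own setup.
    sync_dir = Path("/Users/mintimate/Documents/rimeSync")
    if sync_dir.is_dir():
        for device, files in user_dictionaries_by_device(sync_dir).items():
            print(device, files)
```

A device that shows an empty list has not completed a sync yet.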

User Dictionary Data Migration

If you previously used another scheme, such as pinyin_simp or luna_pinyin, you can migrate its user dictionary with these steps:

  • Place the previous pinyin_simp.userdb.txt or luna_pinyin[_simp].userdb.txt in the sync directory.
  • Rename it to custom_dict.userdb.txt.
  • Modify the #@/db_name in the file to custom_dict.
  • Click "Sync" afterwards.
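The rename and the header change can be scripted. This Python sketch is one possible approach under stated assumptions: `migrate_userdb` is a hypothetical name, the `#@/db_name` key and its value are assumed to be separated by a tab, and the sketch writes a new file instead of renaming so that the original stays available as a backup.

```python
from pathlib import Path


def migrate_userdb(source: Path, target_name: str = "custom_dict") -> Path:
    """Write <target_name>.userdb.txt next to source with an updated #@/db_name."""
    target = source.with_name(f"{target_name}.userdb.txt")
    lines = source.read_text(encoding="utf-8").split("\n")
    migrated = []
    for line in lines:
        if line.startswith("#@/db_name"):
            # Replace only the value; all other lines, tabs included, stay as-is.
            line = f"#@/db_name\t{target_name}"
        migrated.append(line)
    target.write_text("\n".join(migrated), encoding="utf-8")
    return target


if __name__ == "__main__":
    old_file = Path("luna_pinyin.userdb.txt")  # file already placed in the sync directory
    if old_file.is_file():
        print("written:", migrate_userdb(old_file))
```

After running it, remove the old file if you no longer need it, then click "Sync".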

If you were using a Traditional Chinese lexicon, convert it to Simplified Chinese first. Make sure the conversion does not turn the tabs into spaces.

One simple method is to use VSCode: Open the file ➡️ Select all ➡️ Click "Code" in the upper left corner ➡️ Services ➡️ Convert Text to Simplified Chinese.

Or you can use OpenCC:

shell
opencc -c t2s -i in.txt -o out.txt
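Whichever method you use, it is worth confirming that the tabs survived the conversion. A minimal Python sketch, with the hypothetical name `tabs_preserved`, compares the number of tabs on each line before and after:

```python
from pathlib import Path


def tabs_preserved(original: str, converted: str) -> bool:
    """True if both texts have the same number of lines and tabs per line."""
    before = original.split("\n")
    after = converted.split("\n")
    if len(before) != len(after):
        return False
    return all(a.count("\t") == b.count("\t") for a, b in zip(before, after))


if __name__ == "__main__":
    # in.txt and out.txt are the file names used in the opencc command above.
    if Path("in.txt").is_file() and Path("out.txt").is_file():
        same = tabs_preserved(
            Path("in.txt").read_text(encoding="utf-8"),
            Path("out.txt").read_text(encoding="utf-8"),
        )
        print("tabs preserved:", same)
```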