Pytech Resources

Building org-drill files for learning Chinese

May 24, 2024

Posted in:

  • Emacs
  • Python

Python Script to Convert Text File to org-drill file

Emacs org-drill mode is great for learning Chinese (or any other subject that requires a lot of memorization). However, there do not seem to be many org-drill files available for learning Chinese. In this article I will show you how to build your own using the Python script shown below.

Download Script: Python Script to Build Chinese Org-Drill File

""" This module is used for creating an org-drill file from a 
text file containing lines with 
Chinese characters\tpinyin\tenglish\n.

Example input file :
这个	zhège	This
那些	nàxiē	Those
几本	jǐ běn	Several/How many items (e.g. books)
我不是学生。	Wǒ bú shì xuésheng.	I am not a student.

Run python make_chinese_org_drill_file.py -h for help on
how to use the script and the command-line options.

By default hide1_firstmore cards are created.
To make twosided cards pass the option -c twosided.
"""
import argparse

parser = argparse.ArgumentParser(
    description = 'Convert text file to org drill file for learning Chinese')

parser.add_argument('input_file',
                    help="text file with chinese, pinyin and english")
parser.add_argument('output_file',
                    help="name of output file, which should have a .org extension")
parser.add_argument('-c', '--card_type', choices=['twosided','hide1_firstmore'],
                    default='hide1_firstmore', help="Set DRILL_CARD_TYPE property")
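# Both headings are written into the output file, so they must be supplied;
# without them the script would write "* None" or fail on None.capitalize().
parser.add_argument('-f', '--first_level_heading', required=True,
                    help="Top level org heading, e.g. HSK2 Sentences")
parser.add_argument('-s', '--second_level_heading', choices=['word', 'sentence'],
                    required=True, help="Flash card for words or sentences?")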

args = parser.parse_args()

# Open both files with context managers so they are closed even if a line fails to parse.
with open(args.input_file, encoding='utf-8') as f, \
        open(args.output_file, "w", encoding='utf-8') as of:
    of.write("# -*- mode: org; coding: utf-8 -*-\n")
    of.write("#+STARTUP: showall\n\n")
    of.write(f"* {args.first_level_heading}\n")

    for line in f:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines, which have no fields to split
        # Each line must contain exactly three tab-separated fields.
        ch, py, en = line.split('\t')
        of.write(f"** {args.second_level_heading.capitalize()}\t\t\t:drill:\n")
        of.write(f"\t:PROPERTIES:\n\t:DRILL_CARD_TYPE: {args.card_type}\n")
        of.write('\t:END:\n\n')
        if args.card_type == 'hide1_firstmore':
            of.write(f"Cn: [{ch}]\n")
            of.write(f"En: [{en}]\n\n")
            of.write('*** Pinyin\n')
            of.write(f"{py}\n\n")
        else:
            of.write('Translate this sentence.\n\n')
            of.write('*** Chinese\n')
            of.write(f"{ch}\n")
            of.write('*** English\n')
            of.write(f"{en}\n")
            of.write('*** Pinyin\n')
            of.write(f"{py}\n\n")
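
Since the script unpacks each line into exactly three tab-separated fields, a line with a missing tab stops the conversion with a ValueError. The following is a minimal sketch of a pre-check you could run on an input file first. The helper name find_bad_lines and the sample data are my own illustration and are not part of the script above.

```python
def find_bad_lines(lines):
    """Return (line_number, text) pairs for lines without three tab-separated fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip()  # same trailing-whitespace handling as the script
        # Blank lines are reported too, since they also lack three fields.
        if len(text.split("\t")) != 3:
            bad.append((number, text))
    return bad


# Hypothetical sample: the second line is missing the tab before the English text.
sample = [
    "这个\tzhège\tThis\n",
    "我不是学生。\tWǒ bú shì xuésheng.I am not a student.\n",
]

for number, text in find_bad_lines(sample):
    print(f"line {number}: expected 3 tab-separated fields: {text!r}")
```

To check a real file, pass an open file object instead of the sample list, for example find_bad_lines(open("hsk1_examples.txt", encoding="utf-8")).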

As a practical example of how to use this script, let's look at the raw text file we can download from this GitHub page: https://github.com/glxxyz/hskhsk.com/blob/main/data/lists/HSK%20Examples.txt

The file is mostly in the format required by the conversion script, except for the two comment lines //HSK 1 Examples and //HSK 2 Examples. I removed the comment lines and copied the lines under HSK 1 Examples into a new file named hsk1_examples.txt. You can also download this file from HSK 1 Examples Text File.
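
If you would rather not do the copy and paste by hand, the same extraction can be sketched in a few lines of Python. This is only an illustration: the function name extract_section is hypothetical, the sample lines are made up from the docstring examples, and the exact header text (here "//HSK 1 Examples") is an assumption you should check against the downloaded file.

```python
def extract_section(lines, header):
    """Collect the lines that follow `header` up to the next // comment line."""
    section = []
    inside = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("//"):
            # A comment line either opens the wanted section or closes it.
            inside = (stripped == header)
            continue
        if inside and stripped:
            section.append(stripped)
    return section


# Made-up sample in the same layout as the downloaded file (assumed layout).
sample = [
    "//HSK 1 Examples",
    "这个\tzhège\tThis",
    "那些\tnàxiē\tThose",
    "//HSK 2 Examples",
    "几本\tjǐ běn\tSeveral/How many items (e.g. books)",
]

print("\n".join(extract_section(sample, "//HSK 1 Examples")))
```

Writing the returned lines to hsk1_examples.txt, one per line and encoded as UTF-8, gives a file the conversion script can read.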

Example of Source for Chinese Text


Copy this file into the same folder as the Python conversion script. To convert it, open a command prompt (or terminal) and run the following command:

python make_chinese_org_drill_file.py -f "HSK 1 Examples" -s sentence hsk1_examples.txt hsk1_examples.org

The hsk1_examples.org file should look like this:

HSK 1 Examples Org-Drill File
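
To see the text that the script writes for a single entry, here is a small sketch that mirrors its write calls in one function. The function render_card is my own helper for illustration, not part of the script, and the entry is the sample sentence from the script's docstring rather than a line confirmed to be in the HSK 1 file.

```python
def render_card(ch, py, en, heading="Sentence", card_type="hide1_firstmore"):
    """Build the org-drill text for one entry, mirroring the script's write calls."""
    out = [
        f"** {heading}\t\t\t:drill:\n",
        f"\t:PROPERTIES:\n\t:DRILL_CARD_TYPE: {card_type}\n",
        "\t:END:\n\n",
    ]
    if card_type == "hide1_firstmore":
        out += [f"Cn: [{ch}]\n", f"En: [{en}]\n\n", "*** Pinyin\n", f"{py}\n\n"]
    else:
        out += [
            "Translate this sentence.\n\n",
            "*** Chinese\n", f"{ch}\n",
            "*** English\n", f"{en}\n",
            "*** Pinyin\n", f"{py}\n\n",
        ]
    return "".join(out)


# Sample entry taken from the script's docstring.
print(render_card("我不是学生。", "Wǒ bú shì xuésheng.", "I am not a student."))
```

Passing card_type="twosided" shows the other layout, with the Chinese, English and Pinyin each under its own third-level heading.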

Great! Now you are ready to use this file to learn Chinese as described in my earlier blog post Learning Chinese with Emacs.

You can also download the HSK 1 Examples Org-Drill File.

You can create your own org-drill files, and edit the ones you have, right from Emacs itself.