This is the basic idea behind Gridwrite and the reason it is a stroke-based (or glyph-based) method. Implementing this idea requires a different approach for Chinese characters:
- What is a stroke? Writing a stroke with a pen on a flat surface, such as a sheet of paper, involves three steps:
- The pen is down.
- The pen moves and draws a path on the flat surface.
- The pen is up.
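The three steps above map naturally onto a small data structure. The following Python sketch is only an illustration: the `Stroke` class and the `StrokeRecorder` helper with its `pen_down`, `pen_move`, and `pen_up` methods are hypothetical names, not part of Gridwrite.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Stroke:
    """One stroke: the path drawn between a pen-down and the next pen-up."""
    path: List[Point] = field(default_factory=list)


class StrokeRecorder:
    """Collects pen events into strokes (hypothetical helper)."""

    def __init__(self) -> None:
        self.strokes: List[Stroke] = []
        self._current: Optional[Stroke] = None

    def pen_down(self, x: float, y: float) -> None:
        # Step 1: the pen touches the surface and a new stroke begins.
        self._current = Stroke(path=[(x, y)])

    def pen_move(self, x: float, y: float) -> None:
        # Step 2: while the pen is down, each movement extends the path.
        if self._current is not None:
            self._current.path.append((x, y))

    def pen_up(self) -> None:
        # Step 3: the pen leaves the surface and the stroke is complete.
        if self._current is not None:
            self.strokes.append(self._current)
            self._current = None


recorder = StrokeRecorder()
recorder.pen_down(0.0, 0.0)
recorder.pen_move(1.0, 0.0)
recorder.pen_up()
recorder.pen_move(5.0, 5.0)  # ignored: the pen is up, so nothing is drawn
```

In this sketch, movement while the pen is up is simply dropped, so a character is just the ordered list in `recorder.strokes`.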
- The code number is not unique: two different Chinese characters can have the same code number. The code number 1422 of the example is also the code number of another Chinese character, "曰", which means 'say'. The two characters have nearly the same shape; only their width-to-height ratios differ. The code number is not unique because it records only the stroke order and therefore provides only 'partial' information about the shape of the character.
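Because several characters can share one code number, a lookup by code has to return a list of candidates rather than a single character. The following is a minimal sketch, assuming the mapping is held in a dictionary (the text does not say how Gridwrite stores it). `ENTRIES`, `CODE_INDEX`, `lookup`, and the placeholder label are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Placeholder label for the example character discussed earlier;
# the glyph itself is deliberately not filled in here.
EXAMPLE_CHAR = "<example character>"

# (code number, character) pairs; only these two come from the text.
ENTRIES = [
    (1422, EXAMPLE_CHAR),
    (1422, "曰"),
]

# Index from code number to every character that shares it.
CODE_INDEX: Dict[int, List[str]] = defaultdict(list)
for code, char in ENTRIES:
    CODE_INDEX[code].append(char)


def lookup(code: int) -> List[str]:
    """Return every character sharing this code number (possibly none)."""
    return list(CODE_INDEX.get(code, []))


candidates = lookup(1422)  # two candidates: the user still has to choose one
```

A real input method would need a second step, such as a candidate list shown to the user, to resolve the remaining ambiguity.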
- What is stroke order? One explanation is that it is the order of writing a character that minimizes our 'mental load'.
Take writing an English word as an example. When you write the English word 'is', you normally write 'i' first and 's' second. English sentences are read from left to right, so they are also written from left to right, and so is every word in them, including 'is'. If you wrote 's' first and then 'i', you would need to reserve a space for the 'i', which means writing the 'i' in your mind first. Writing 'i' first requires no such reservation, so it is quite 'natural' to write 'i' first and then 's'.
Therefore, we can say that English word writing has a stroke order: from left to right.
We think a similar conclusion also applies to writing Chinese characters. Historical practice has indeed established at least three common rules for writing a Chinese character:
- From left to right
- From top to bottom
- From outer to inner
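As a rough illustration, the three rules can be expressed as a comparison between the bounding boxes of two components. This sketch is an assumption throughout: the `Box` type, the `written_first` function, and above all the priority among the rules (outer to inner, then left to right, then top to bottom) are not given in the text, and real stroke order has exceptions that this ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a component; y grows downward."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, other: "Box") -> bool:
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


def written_first(a: Box, b: Box) -> bool:
    """Return True if component a would be written before component b.

    The rule priority used here is an assumption, not Gridwrite's definition.
    """
    if a != b:
        if a.contains(b):      # from outer to inner
            return True
        if b.contains(a):
            return False
    if a.right <= b.left:      # from left to right
        return True
    if b.right <= a.left:
        return False
    return a.top < b.top       # from top to bottom


outer, inner = Box(0, 0, 10, 10), Box(3, 3, 7, 7)
left, right = Box(0, 0, 4, 10), Box(5, 0, 10, 10)
top, bottom = Box(0, 0, 10, 4), Box(0, 5, 10, 10)
```

With these sample boxes, `written_first(outer, inner)`, `written_first(left, right)`, and `written_first(top, bottom)` all return `True`.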
- What is a character? Today, a character is defined by a unique number. Standards such as Unicode, Big-5, and GB assign such a number to each character. These numbers serve a categorical purpose and are sorted in phonetic or radical-based order; the number itself contains no direct information about the glyph of the character. This means that, in theory, you can assign any bitmap image to a character. For example, you could draw a cat to represent 'c' under the same number, a fish to represent 'f', and so on. (For Unicode and its symbols, see our 'Unicode' page.)
In contrast, Gridwrite uses a different concept to identify a character. Gridwrite's code number reflects 'partial' information about the glyph of a character by forming a sequence of strokes and linking it to a number. In the extreme case, a character can be thought of as a mathematical object that contains the 'information' of drawing paths on a normalized 2D flat surface.
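To make the idea of "drawing paths on a normalized 2D flat surface" concrete, the following Python sketch rescales a set of paths into the unit square. It is an illustration under stated assumptions, not Gridwrite's actual normalization: the `normalize` function is hypothetical, the input is assumed to contain at least one point, and a single scale factor is used for both axes so that the width-to-height ratio of the character is preserved.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Path = List[Point]


def normalize(paths: List[Path]) -> List[Path]:
    """Translate and uniformly scale paths so they fit in the unit square.

    One scale factor is applied to both axes, so the width-to-height
    ratio of the drawing is kept (an assumption of this sketch).
    """
    xs = [x for path in paths for x, _ in path]
    ys = [y for path in paths for _, y in path]
    min_x, min_y = min(xs), min(ys)
    # Use the larger extent as the scale; fall back to 1.0 for a single point.
    size = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [[((x - min_x) / size, (y - min_y) / size) for x, y in path]
            for path in paths]


# Two strokes in arbitrary device coordinates: one horizontal, one vertical.
raw = [[(10.0, 20.0), (30.0, 20.0)], [(10.0, 20.0), (10.0, 60.0)]]
unit = normalize(raw)
# unit == [[(0.0, 0.0), (0.5, 0.0)], [(0.0, 0.0), (0.0, 1.0)]]
```

Keeping the aspect ratio matters here because, as noted above, two characters with the same stroke sequence may differ only in their width-to-height ratio.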
- The glyphs and stroke orders of the Chinese characters in Gridwrite are taken from the following references:
Ref 1: The dictionary "初階中文字典", version 2, published by Pearson. It is our reference for traditional characters. The dictionary includes stroke order information and is mainly intended for readers of traditional characters. It also gives the corresponding simplified character for each traditional character.
Ref 2: The dictionary "通用汉字正刑字典", version 1, published by "语文出版社". It is our reference for simplified characters. The dictionary includes stroke order information and is mainly intended for readers of simplified characters. It also gives the corresponding traditional character for each simplified character.
- What is a font? A font is 'character + display style'. Any character we see in a dictionary must be printed in some font. So, to obtain the stroke types that compose a Chinese character, we need to extract them from the printed font of the dictionary.
- Using two reference dictionaries causes a problem: Ref 1 and Ref 2 are printed in different fonts. Ref 1 uses "楷體", the most commonly used font for traditional Chinese, while Ref 2 uses "宋体", the most commonly used font for simplified Chinese. The two fonts do not use the same stroke types for some Chinese characters. For example, the character "小", which appears as both a traditional and a simplified character, is coded as 533 according to Ref 1 but as 563 according to Ref 2. Since we use Ref 1 for traditional characters and Ref 2 for simplified characters, the traditional character is coded as 533 while the simplified character is coded as 563. Therefore, in Gridwrite, a change of code number between traditional and simplified involves two factors: the glyph change from traditional to simplified, and the font change.
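The "小" example can be checked with a few lines of Python. This sketch assumes that each digit of a code number stands for one stroke, which matches the examples in the text but is not stated outright; the `CODES` table and the `differing_strokes` function are hypothetical names.

```python
from typing import Dict, List, Tuple

# Code numbers for "小" as given in the text, keyed by (character, script).
CODES: Dict[Tuple[str, str], int] = {
    ("小", "traditional"): 533,  # coded from Ref 1
    ("小", "simplified"): 563,   # coded from Ref 2
}


def differing_strokes(code_a: int, code_b: int) -> List[Tuple[int, str, str]]:
    """List (1-based stroke position, digit in a, digit in b) where codes differ.

    Assumes one digit per stroke and the same number of strokes in both codes.
    """
    a, b = str(code_a), str(code_b)
    if len(a) != len(b):
        raise ValueError("codes have different stroke counts")
    return [(i, da, db)
            for i, (da, db) in enumerate(zip(a, b), start=1)
            if da != db]


diff = differing_strokes(CODES[("小", "traditional")],
                         CODES[("小", "simplified")])
# diff == [(2, '3', '6')]: only the second stroke is typed differently.

# The Unicode number, by contrast, is the same in both scripts.
same_code_point = hex(ord("小"))  # '0x5c0f'
```

Under that assumption, the two references disagree only on the type of the second stroke, while the character's Unicode number stays the same.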
- The font "楷體" used in Ref 1 is actually based on another reference dictionary, "常用字字形表" (year 2000 version), published by The Hong Kong Institute of Education (香港教育學院). Interestingly, the font used in this glyph dictionary is what could be called a 'ballpoint-pen' style font: the characters are printed as if handwritten with a ballpoint pen. We guess that a ballpoint pen was used to minimize the effect of display style and to show the drawing paths of each Chinese character clearly. We think the idea behind this dictionary comes close to our concept of a character as a mathematical object.