Hi, Taco:
Last week I stayed at my grandma's home (located in a beautiful
countryside), which has limited internet access, so I planned to study
the LuaTeX source code during the stay. I read a small subset of the
LuaTeX source and most of the MetaPost source code, and I have a few
newbie questions.
I will refer to mp.w as an example, since it is finished and I am
more familiar with the C programming language.
- Wouldn't it be great if we removed the packed data dependency?
Knuth used packed data because memory was very valuable in the
years when he wrote the TeX and MetaFont programs. By packing, he could
split a memory word (the space needed for a 32-bit integer) into two,
three or four fields and so save space. In pdfTeX and MetaPost,
almost all the important data structures (like all kinds of nodes,
boxes, lists) are built on top of the packed data structure. For
example, in TeX's char_node representation, instead of using one word
for the font and one word for the character, he can use a single word
for both by defining two web macros, font==type and
character==subtype. However, memory consumption is not that important
these days, and in LuaTeX other representations use much more memory
(like parsing a complicated OpenType font and dumping it into a Lua
table using the fontforge library). So wouldn't it be better if we
removed the packed data? It would make the code more readable, since
we could get rid of the messy web or C macros, at the cost of a little
more memory (the TeX web side is very memory efficient, so the
footprint would stay small even if we doubled its memory consumption).
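
To make the packing concrete, here is a minimal C sketch of the two
styles. It is not the actual mp.w or tex.web code: the names
memory_word, font_of, character_of, packed_set and char_node are
hypothetical, and the layout is simplified to two 16-bit halves of one
32-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Packed style: one 32-bit memory word viewed either as a whole
   integer or as two 16-bit halves (simplified, hypothetical layout). */
typedef union {
    int32_t whole;
    struct { uint16_t lh, rh; } halves;
} memory_word;

#define MEM_SIZE 1000
static memory_word mem[MEM_SIZE];   /* the single big array of words */

/* Accessor macros hide which half of the word holds which field. */
#define font_of(p)      (mem[(p)].halves.lh)
#define character_of(p) (mem[(p)].halves.rh)

/* Store a font/character pair in the word at index p. */
void packed_set(int p, uint16_t f, uint16_t c) {
    assert(p >= 0 && p < MEM_SIZE);
    font_of(p) = f;
    character_of(p) = c;
}

/* Plain style: one named field per value, no macros needed.
   This usually takes two words instead of one. */
typedef struct {
    int font;
    int character;
} char_node;
```

With the plain struct, a reader sees n.font and n.character directly,
while the packed version needs the macro definitions to be understood
first.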
For example, in Part 21 of mp.w, we define more than a dozen macros to
access the fields of the edge structure that a pointer points to. Such
code may confuse modern programmers who learned to program in C after
the mid-1990s. They would prefer to write something like this:
typedef struct fill_node {
    path *path_p;
    pen *pen_p;
    .....
    int miterlim_var;
} fill_node;
If we want to create a new fill node, then instead of creating the
node and setting its path_p to a given pointer p using
mp_path_p(t)=p;
it would be more understandable if we could do this:
fill_node *mp_new_fill_node (path *p) {
    fill_node *fn = malloc (sizeof (fill_node)); /* a fill_node */
    if (fn == NULL)
        return NULL; /* out of memory */
    fn->path_p = p;
    fn->pen_p = NULL;
    fn->red_val = 0;
    ....
    return fn;
}
- Why can't we use the IEEE floating-point specification?
TeX/MetaPost has its own number representation built in. But it is a
very complicated and strange number system which represents
each number as an integer. Isn't the IEEE floating-point specification
good enough for implementing TeX? Today almost every operating
system supports float/double, and the precision is great. If we did
this, Part 7 of mp.w and luatex.web could be removed entirely, and we
could also clean up the conversion macros in the rest of the code. If
we remove the dependency on Part 7 and Part 9, maybe we can also make
our code more portable across different machines.
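
As a rough illustration of the difference, the sketch below stores a
number as an integer count of 1/65536 units, which is how I understand
the scaled numbers to work. The helper names UNITY, to_scaled,
from_scaled and scaled_mul are my own for this example, not the ones in
mp.w, and the rounding is simplified.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t scaled;   /* value represented = integer / 65536 */
#define UNITY 65536       /* 2^16 stands for 1.0 */

/* Convert a double to the nearest scaled integer. */
scaled to_scaled(double x) {
    return (scaled)(x * UNITY + (x >= 0 ? 0.5 : -0.5));
}

/* Convert a scaled integer back to a double. */
double from_scaled(scaled s) {
    return (double)s / UNITY;
}

/* Multiply two scaled numbers: widen to 64 bits, then divide the
   extra factor of 2^16 back out (truncating toward zero). */
scaled scaled_mul(scaled a, scaled b) {
    int64_t wide = (int64_t)a * (int64_t)b / UNITY;
    assert(wide <= INT32_MAX && wide >= INT32_MIN);  /* overflow check */
    return (scaled)wide;
}
```

With doubles the same product is simply 1.5 * 2.0, with no conversion
helpers and no explicit overflow check.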
- Why should we do memory management ourselves?
I can see that in mp.w we maintain a node pool (a fixed contiguous
block of memory). When we want to allocate a node, we have to
call mp_get_node, which does several things, like finding an available
place for the allocation and reporting an error if the memory is
exhausted, and everything, whatever its structure, is put into the same
pool. Wouldn't it be great if we just asked the operating system's C
library to handle these tasks (like using malloc and free, as the
example code of mp_new_fill_node shows)? A modern operating system's
library is more efficient, and it would also make the LuaTeX code look
better.
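
To show what I mean, here is a heavily simplified sketch of the two
approaches. It assumes fixed-size cells and a free list threaded through
a static array, whereas the real mp_get_node, as far as I can tell,
handles nodes of different sizes. All names (cell, pool_get, pool_free,
heap_node, heap_get) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Pool style: a fixed array; free cells are chained by index. */
#define POOL_SIZE 4
typedef struct { int value; int next_free; } cell;
static cell pool[POOL_SIZE];
static int free_head = -1;
static int pool_ready = 0;

/* Chain every cell into the free list: 0 -> 1 -> ... -> -1. */
static void pool_init(void) {
    for (int i = 0; i < POOL_SIZE; i++)
        pool[i].next_free = (i + 1 < POOL_SIZE) ? i + 1 : -1;
    free_head = 0;
    pool_ready = 1;
}

/* Return the index of a free cell, or -1 if the pool is exhausted. */
int pool_get(void) {
    if (!pool_ready) pool_init();
    if (free_head < 0) return -1;
    int p = free_head;
    free_head = pool[p].next_free;
    return p;
}

/* Put cell p back at the head of the free list. */
void pool_free(int p) {
    assert(p >= 0 && p < POOL_SIZE);
    pool[p].next_free = free_head;
    free_head = p;
}

/* C library style: the allocator does the bookkeeping for us. */
typedef struct { int value; } heap_node;

/* Return a zero-initialized node, or NULL if allocation fails. */
heap_node *heap_get(void) {
    heap_node *n = calloc(1, sizeof(heap_node));
    return n;
}
```

In the second style the caller only pairs heap_get with free, and the
pool size limit disappears.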
- From the part of the code I read, if I understand correctly, the
only reason for incorporating fontforge is to get the font metrics
data?
If so, I think FreeType2 would be sufficient for the task. [I
think FreeType2 is much smaller and more efficient, and fontforge
itself uses FreeType2.]
Yue Wang