Jonathan Fine
I'm looking for what might be called a Unicode-savvy Device Independent binary format. And I'm looking for XeTeX and LuaTeX to share code and ideas, when possible.
Glyph indexes plus a ToUnicode map. That's how Unicode-savviness is done in a PDF: each glyph index is mapped to a Unicode string.

For my own hobby work I have a slightly hacked xdvipdfmx that reads in pre-computed ToUnicode maps (made with a FontForge script and placed alongside the font files) in the cases where xdvipdfmx wasn't generating them. Really, xdvipdfmx should be doing that on the fly. I think all the necessary code is there: it already reads my pre-computed maps and then, magically, generates a more efficient version. The code just needs to be called at the right times, but I wasn't interested in working on it. (I'm disabled and have to ration my keyboard time.)

However, LuaTeX should be perfectly capable of generating its own ToUnicode maps. Otherwise, AFAIK, the advantages of xdvipdfmx are that it's already written, that it supports various specials, and that it can use the list of fonts known to fontconfig (which isn't a big deal, and which LuaTeX could just as easily do).
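
To make the "glyph index mapped to a Unicode string" idea concrete, here is a rough Python sketch of writing a ToUnicode CMap from such a table. It is only an illustration, not what xdvipdfmx or my FontForge script actually does: the function names are made up, it assumes two-byte glyph codes (as with an Identity-H encoded font), and it emits only plain `bfchar` entries, in blocks of at most 100, with no attempt at the more compact `bfrange` form.

```python
def utf16be_hex(text: str) -> str:
    """Hex-encode a Unicode string as UTF-16BE, the form used in ToUnicode CMaps."""
    return text.encode("utf-16-be").hex().upper()


def to_unicode_cmap(glyph_to_text: dict[int, str]) -> str:
    """Build a ToUnicode CMap body from a {glyph index: Unicode string} table.

    Hypothetical helper: assumes every glyph index fits in two bytes.
    """
    for gid in glyph_to_text:
        if not 0 <= gid <= 0xFFFF:
            raise ValueError(f"glyph index out of two-byte range: {gid}")

    # One entry per glyph: <glyph code> <UTF-16BE text>, sorted by glyph index.
    entries = [
        f"<{gid:04X}> <{utf16be_hex(text)}>"
        for gid, text in sorted(glyph_to_text.items())
    ]

    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        "/CMapName /Adobe-Identity-UCS def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<0000> <FFFF>",
        "endcodespacerange",
    ]

    # Emit the entries in blocks of at most 100.
    for start in range(0, len(entries), 100):
        block = entries[start:start + 100]
        lines.append(f"{len(block)} beginbfchar")
        lines.extend(block)
        lines.append("endbfchar")

    lines += [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Invented glyph indexes: a space, an "ffi" ligature, and a non-BMP character.
    print(to_unicode_cmap({3: " ", 0x01A2: "ffi", 0x0200: "\U0001D400"}))
```

The ligature entry shows why the target is a string rather than a single code point: one glyph can stand for several characters, and a character outside the BMP comes out as a surrogate pair.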