[NTG-context] Ligature suppression word list

Arthur Rosendahl arthur.reutenauer at normalesup.org
Thu Apr 8 21:37:22 CEST 2021


On Sat, Apr 03, 2021 at 06:02:10PM +0200, Hans Hagen wrote:
> german is just an example, dutch has some specific things, and i bet other
> languages have their demands so my aim is some general mechanism

  I appreciate that, but if you want to have data of sufficiently good
quality to use this mechanism for individual languages, you need to
invest a *lot* of time for each one of them.  German is one of the very
few languages I know of that has an active group of people working to
produce that data, the “Trennmuster people”, as Mojca calls them ;-)
Their word list supports many fine points of typography, even those that
few programs can use, for example weighted hyphenation.  Ligature
prevention came in as a side project.

  Dutch, by contrast, does not seem so well served: the OpenTaal group
is dormant and no longer offers the hyphenated word list that was once
available (that was already the case five years ago).  The most relevant
page I can find, https://www.opentaal.org/projecten/woordafbreking, is from
2009.  There have apparently been recent updates by a single person (who
incidentally sometimes contributes to the German hyphenation working
group), but they’re rather generic.

	Best,

		Arthur

