On 23/11/19 13:02, Henning Hraban Ramm wrote:
On 2019-11-23 at 08:12, Mojca Miklavec wrote:
Then you can use one of the online JS editors like CKEditor.
Only if you spend an enormous amount of effort making sure that the code is properly cleaned up rather than containing a gazillion random html style tags which you can never reconstruct back into some structured form.
(And yes, my impression is that Massi spent a huge amount of effort in configuring the editor and cleaning up the mess. My company didn't and ended up with sometimes literally every word in a sentence using a different font size or style. They gave up on html + cke pretty soon, but couldn't be convinced that this was a bad idea upfront.)
Indeed, an ongoing effort. Markup should be mostly semantic, leaving most of the styling to ConTeXt or CSS or whatever. But then you must consider the features and limitations of the tools you use.

CKEditor lets you define rules to specify what can enter your sources; it's great, but it's essentially an HTML editor, not a semantic editor. In CKE, the HTML *is* the document, not a representation of it inside a browser.

Prosemirror is the best editor I know if you care about complete control of what goes into your sources. It's document-agnostic: HTML is used only to represent the document in a browser, and your document could be JSON or Markdown or whatever. Prosemirror is actually an editor kit, not an editor:

- Tiptap combines Prosemirror with Vue.js
- wax-prosemirror instead combines Prosemirror with React.js

I'm developing on top of wax-prosemirror, which should become the next version of the editor inside Editoria by the Coko Foundation (Luigi posted a couple of links in another reply in this thread). Jure Triglav of the Coko Foundation wrote a good post on open source collaborative editors: https://juretriglav.si/open-source-collaborative-text-editors/
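To make the "document-agnostic" idea concrete: the source of truth is structured data, and HTML is just one rendering of it. Here is a minimal conceptual sketch in Python (this is not Prosemirror's actual API; the node names and shape are invented for illustration):

```python
import json

# A tiny document model: the document itself is structured data (JSON-like),
# and HTML is only one possible representation of it.
doc = {
    "type": "doc",
    "content": [
        {"type": "heading", "level": 1, "text": "Semantic editing"},
        {"type": "paragraph", "text": "The document is data; HTML is a view."},
    ],
}

def to_html(node):
    """Render the document model to HTML -- one view among many."""
    if node["type"] == "doc":
        return "".join(to_html(child) for child in node["content"])
    if node["type"] == "heading":
        return f"<h{node['level']}>{node['text']}</h{node['level']}>"
    if node["type"] == "paragraph":
        return f"<p>{node['text']}</p>"
    raise ValueError(f"unknown node type: {node['type']}")

print(to_html(doc))       # the HTML view
print(json.dumps(doc))    # the same document, serialized as JSON instead
```

The point is that the HTML never has to be parsed back into structure: the structure is primary, and a Markdown or ConTeXt renderer could be added alongside `to_html` without touching the document.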
Don’t exaggerate. Or maybe your company didn’t think about which tags are really necessary. A proper configuration that doesn’t allow nonsense, even if users paste text from Word documents, is not such a big effort.
Even though we started with a semantic-tagging mindset, we always find alien tags or wrong combinations of allowed tags in our sources. It's not only pasting from Word or a web page; sometimes it's browser plugins, or different behaviors among browsers. And bugs in my code, of course. I feel it's hard to overstate the attention this needs: there's always something unwanted sneaking in.
I can’t remember which JS editor I used >10 years ago for the editorial system of a city magazine, but I remember I only allowed a few tags (authors weren’t allowed to use font and color settings) and also ran an HTML cleaner before saving. It was an effort until it worked, but not that much.
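A tag-allowlist cleaner of the kind described can be sketched in a few lines of Python with the standard library's `html.parser` (the particular set of allowed tags here is just an example, not what any of the systems in this thread actually used):

```python
from html.parser import HTMLParser

class TagFilter(HTMLParser):
    """Strip every tag not in the allowlist, keeping the text content.
    Attributes are dropped, so no font/color/style settings survive."""
    ALLOWED = {"p", "em", "strong", "ul", "ol", "li", "a", "h2", "h3"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.ALLOWED:
            if tag == "a":
                # keep only href on links, discard onclick/style/etc.
                href = dict(attrs).get("href", "")
                self.out.append(f'<a href="{href}">')
            else:
                self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

def clean(html):
    f = TagFilter()
    f.feed(html)
    return "".join(f.out)

print(clean('<p style="color:red"><font size="7">Hello</font> <em>world</em></p>'))
# -> <p>Hello <em>world</em></p>
```

A real cleaner would also have to re-balance tags and handle nesting rules, which is where most of the effort goes; but even this much already kills the "every word in a different font size" problem.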
We have filters to clean up a source before editing, rules inside a CKE-based editor, and filters to do other cleanups while saving. We have quite good control over our sources, yet it's not complete. A lot depends on the complexity of your documents: you start simple and make some assumptions; later you want to increase complexity and add new features that must combine with the legacy of every assumption you made in the past. The more complex the documents, the more room for unwanted markup to enter them.

I admire Pandoc's document model: it's simple enough and well specified, with generic tags (Div for blocks and Span for inlines) that carry information as classes and key-value data, and RawInline and RawBlock to inject low-level tagging for specific formats. The generic and Raw objects let you represent and convert many elements that are not built in.

Massi
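P.S. For anyone curious, Pandoc's JSON serialization makes that model concrete: a Div carries an (identifier, classes, key-value pairs) attribute triple plus its child blocks, and a RawBlock injects format-specific markup verbatim. A small sketch, with the structure built by hand here rather than produced by Pandoc itself:

```python
import json

# A Pandoc-style Div: its content is an attribute triple
# (identifier, classes, key-value pairs) followed by the child blocks.
# A RawBlock carries format-specific markup untouched -- here some ConTeXt.
div = {
    "t": "Div",
    "c": [
        ["warning-1", ["warning"], [["severity", "high"]]],
        [
            {"t": "Para", "c": [{"t": "Str", "c": "Careful!"}]},
            {"t": "RawBlock", "c": ["context", "\\blank[big]"]},
        ],
    ],
}

identifier, classes, keyvals = div["c"][0]
print(identifier, classes, dict(keyvals))
print(json.dumps(div))  # what a Pandoc JSON filter would read and write
```

Because everything not built in can be expressed as a Div or Span with classes and key-values, filters can round-trip elements they don't understand, which is exactly what makes the model so durable.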