KOSH Documentation plan

Part I:
The documentation server architecture
Date: 21 Dec 1998
Rudi Chiarito
Formal Development Environment WG

Part I of the documentation plan addresses the overall client/server architecture of KIS (KOSH Information Server). Part II, which is currently being revised, addresses the processing and the authoring procedures at a closer level of detail.

The repository

The KOSH Information Server (KIS) is implemented on top of a CVS server. CVS stands for Concurrent Versions System and is a popular, robust system which allows many contributors to work on the same group(s) of files. It provides access control, versioning, branching and archiving.

The server will be hosted on a Unix machine. There are no other sensible choices to accomplish the task. Clients exist for most operating systems: Unix variants, AmigaOS, Windows, even MacOS. There's also a GUI frontend for Java.

The data

The server will provide a broad range of information: articles, manuals, workgroup reports, tutorials, news, archives, pointers to other resources. Technical documents will be authored in SGML using the DocBook DTD. They will be delivered to users in different distribution formats:

  • HTML for online browsing;
  • RTF, PS and PDF for printing.

HTML, RTF and PS can also be distributed in compressed archives, to allow offline browsing and/or reduced bandwidth usage. Formatting will be done by a DSSSL engine, Jade, using customized versions of the Modular DocBook Stylesheets by Norman Walsh. Jade is freely available for Unix, Amiga and Windows.
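As a rough illustration of the formatting step, the sketch below builds Jade command lines for the different outputs. The stylesheet and document file names are hypothetical, and it is an assumption (not stated in this plan) that the PS and PDF files would be derived from Jade's TeX backend in a further step. Only Jade's -t (output type) and -d (DSSSL specification) options are used.

```python
# Hypothetical mapping from distribution format to a Jade backend.
# Jade's -t option selects the output type; -d names the DSSSL stylesheet.
JADE_BACKENDS = {
    "html": "sgml",  # HTML stylesheets emit markup through the SGML backend
    "rtf": "rtf",
    "tex": "tex",    # assumed intermediate form for PS/PDF; TeX step not shown
}


def jade_command(fmt, stylesheet, document):
    """Return the argument list for one Jade run (nothing is executed)."""
    backend = JADE_BACKENDS[fmt]
    return ["jade", "-t", backend, "-d", stylesheet, document]


if __name__ == "__main__":
    # File names below are placeholders, not names used by the project.
    print(" ".join(jade_command("html", "kosh-html.dsl", "manual.sgml")))
```

Keeping the command construction separate from its execution makes it easy to run the same step on a member's machine or on the server.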

Non-technical documents do not require the complexity of the DocBook DTD; a few custom document type definitions (DTDs) will therefore be created to cover those needs.

The ultimate aim of this project is a system which generates information and maintains the web site automatically. During the early stages some steps will be handled manually by users and administrators, but as the system grows, tools will be created to accomplish those tasks without requiring human interaction.

The authoring process

Every workgroup (or every project, if a workgroup is involved in two or more of them) will be assigned a directory in the CVS tree. Members will be able to obtain a copy of the directory's contents, update documents and commit changes to the repository. The CVS server will keep track of all the revisions. Upon committing, the documents will be formatted and a notification will be emailed to workgroup members and administrators; this notification will include the change log and any warnings or errors that occurred while processing the data.
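The notification described above could be assembled along the following lines. This is only a sketch: the function name, the subject format and the addresses are assumptions made for illustration, and actually sending the message is left out.

```python
from email.message import EmailMessage


def build_notification(project, recipients, log_text, diagnostics):
    """Compose the post-commit notification (hypothetical layout)."""
    msg = EmailMessage()
    msg["Subject"] = f"[KIS] commit processed: {project}"
    msg["To"] = ", ".join(recipients)
    status = "with warnings/errors" if diagnostics else "cleanly"
    body = [f"Project {project} was formatted {status}.", "", "Log:", log_text]
    if diagnostics:
        # Append whatever the formatter reported, one message per line.
        body += ["", "Diagnostics:"] + list(diagnostics)
    msg.set_content("\n".join(body))
    return msg


if __name__ == "__main__":
    note = build_notification(
        "docs", ["member@example.org"], "Fixed a typo.", ["warning: sample"]
    )
    print(note)
```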

The formatted data will not be made immediately available online to the public; rather, it will be available under a password-protected subtree of the web site for members to discuss. Only when consensus is reached will the data be exported to the web server.

Note that the web site is actually just another member of the workgroup, one which gets updates from the CVS repository only when told to do so. Not only is this an easy and elegant solution, but it also lets the repository and the web site reside transparently on the same machine or on different ones.

Only workgroup managers can trigger the web site update. This can be accomplished in either of two ways:

  • by logging in to the server using telnet or, preferably, ssh and launching the export script;
  • by simply replying to the notification email.
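The plan does not say how a reply is recognised, so the following is only a sketch of one possible check. It assumes a hypothetical subject tag of the form [KIS:project] and a known set of manager addresses. A From header can be forged, so a real deployment would need a stronger proof of identity than this.

```python
from email import message_from_string
from email.utils import parseaddr


def export_requested(raw_reply, managers, project):
    """Return True if the reply comes from a listed manager of the project.

    `managers` is a set of lowercase addresses; the [KIS:project] subject
    tag is an assumption made for this sketch.
    """
    msg = message_from_string(raw_reply)
    _, sender = parseaddr(msg.get("From", ""))
    subject = msg.get("Subject", "")
    return sender.lower() in managers and f"[KIS:{project}]" in subject


if __name__ == "__main__":
    reply = (
        "From: Jane <jane@example.org>\n"
        "Subject: Re: [KIS:docs] commit processed\n"
        "\n"
        "Please publish.\n"
    )
    print(export_requested(reply, {"jane@example.org"}, "docs"))
```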

The second method is easier and faster. It is also more convenient when managers are using a system which doesn't allow telnet or ssh, e.g. a machine behind a firewall, or when they are travelling.

If a manager is unreachable, or if problems such as infringements or violations occur, the web administrators can trigger the update or revert to an older revision of the project.

On formatted data

Members are encouraged to install on their machines as much of the processing subsystem as possible. This makes publishing faster because experiments are done locally and require no exchange of data between the server and the client, a benefit for people who have slow access or who are charged by the minute. Distribution archives of tools will be made available for Amiga, Unix and Windows platforms. They will contain binaries and documentation, ready to be installed.

Members should at least install the SGML parser nsgmls and the necessary DTDs, in order to validate documents before committing them. In this basic setup a member can commit documents and then retrieve the output files.
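A local validation step before committing might look like the sketch below. It uses the -s option of nsgmls (suppress normal output, so only errors are reported) and -c (name a catalog file). The file names are hypothetical, and the sketch assumes that nsgmls exits with a non-zero status when the document has errors.

```python
import subprocess


def nsgmls_command(document, catalog=None):
    """Build the nsgmls argument list for a validation-only run."""
    cmd = ["nsgmls", "-s"]  # -s: no normal output, errors only
    if catalog:
        cmd += ["-c", catalog]  # -c: catalog mapping public identifiers
    cmd.append(document)
    return cmd


def validate(document, catalog=None):
    """Run nsgmls and return (ok, error_lines). Requires nsgmls on PATH."""
    result = subprocess.run(
        nsgmls_command(document, catalog), capture_output=True, text=True
    )
    return result.returncode == 0, result.stderr.splitlines()


if __name__ == "__main__":
    # Placeholder file names; nothing is executed here.
    print(" ".join(nsgmls_command("manual.sgml", "docbook.cat")))
```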

An Amiga system with less than 32 MB of RAM, for example, simply can't process DocBook documents with Jade; in such cases the server will do the formatting for the member, although the job will take longer.

Another example is the conversion of pictures from vector to bitmap and vice versa. Technical illustrations like diagrams or flow charts are to be stored in Encapsulated PostScript (EPS) format. The EPS file will be included as is in the final PostScript and PDF output files, in order to get the best results on paper. HTML and RTF, on the other hand, will use GIF/PNG files for graphics. These will be generated using a tool that relies on Ghostscript and pnmtogif/pnmtopng. Users who can install Ghostscript and the PNM tools (or who already have them installed) will convert the illustrations themselves; otherwise they will commit the EPS pictures and then retrieve the GIF/PNG files generated by the server.
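The conversion tool itself is not specified in this plan. As a minimal sketch, an EPS file can be rendered by Ghostscript to a raw PPM stream and piped into pnmtopng. The function names, the default resolution and the choice of the ppmraw device are assumptions made for illustration.

```python
import subprocess


def eps_to_png_commands(eps_path, dpi=72):
    """Return the Ghostscript and pnmtopng argument lists for one picture."""
    gs = [
        "gs", "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=ppmraw",   # render to a raw PPM bitmap
        f"-r{dpi}",          # resolution in dots per inch
        "-sOutputFile=-",    # write the bitmap to standard output
        eps_path,
    ]
    return gs, ["pnmtopng"]


def convert(eps_path, png_path, dpi=72):
    """Pipe Ghostscript into pnmtopng; both tools must be installed."""
    gs_cmd, png_cmd = eps_to_png_commands(eps_path, dpi)
    with open(png_path, "wb") as out:
        gs = subprocess.Popen(gs_cmd, stdout=subprocess.PIPE)
        png = subprocess.Popen(png_cmd, stdin=gs.stdout, stdout=out)
        gs.stdout.close()  # let gs see a broken pipe if pnmtopng exits early
        png.wait()
        gs.wait()
    return gs.returncode == 0 and png.returncode == 0


if __name__ == "__main__":
    gs_cmd, png_cmd = eps_to_png_commands("diagram.eps", dpi=100)
    print(" ".join(gs_cmd), "|", " ".join(png_cmd))
```

Replacing pnmtopng with a GIF converter in the second command would give the GIF variant of the same pipeline.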

Converting pictures the other way round, i.e. from bitmap to vector, is accomplished in a similar fashion using the giftopnm/pngtopnm and pnmtoeps tools. (Note: this actually only generates a PostScript bitmap. It does not attempt to reconstruct vector entities like lines or curves from the bitmap. Most of the time that isn't wanted anyway.)