The GELLMU project has many facets. For mathematicians the most
interesting point is that
*regular GELLMU* provides a way to use LaTeX-Like Markup
to write in an author-level XML document type that admits reliable
automatic translation both to PDF (via regular LaTeX) and to the
modern form of
HTML extended with MathML.

More generally, the GELLMU project provides a way to use LaTeX-Like
markup to write directly for most author-level XML document types
with the availability of *newcommand*-like macros taking arguments.
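
To make the phrase "macros taking arguments" concrete, here is a minimal sketch in Python of *newcommand*-style expansion with positional arguments. It is an illustration only and is not the GELLMU implementation; the macro names `pair` and `norm`, the table `MACROS`, and the functions `read_group` and `expand` are all hypothetical.

```python
import re

# Hypothetical macro table (not GELLMU's): name -> (argument count, body).
# "#1", "#2", ... in the body mark where the arguments are substituted.
MACROS = {
    "pair": (2, r"\langle #1, #2\rangle"),
    "norm": (1, r"\lVert #1\rVert"),
}

NAME = re.compile(r"\\([A-Za-z]+)")


def read_group(text, pos):
    """Return (content, next_pos) for the brace group opening at text[pos]."""
    if pos >= len(text) or text[pos] != "{":
        raise ValueError(f"expected '{{' at position {pos}")
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[pos + 1:i], i + 1
    raise ValueError("unbalanced braces")


def expand(text, macros=MACROS):
    """Expand known macros; unknown commands are copied through unchanged."""
    out = []
    pos = 0
    while pos < len(text):
        m = NAME.match(text, pos)
        if m and m.group(1) in macros:
            nargs, body = macros[m.group(1)]
            pos = m.end()
            args = []
            for _ in range(nargs):
                arg, pos = read_group(text, pos)
                args.append(expand(arg, macros))  # expand nested macro calls
            # Substitute all parameters in one pass over the body.
            out.append(re.sub(r"#([1-9])",
                              lambda p: args[int(p.group(1)) - 1], body))
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)


print(expand(r"\norm{\pair{x}{y}}"))  # \lVert \langle x, y\rangle\rVert
```

The sketch expands arguments before substitution and does not re-expand macro bodies; a real macro processor has to settle such questions of evaluation order explicitly.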

**Q.** Why use a LaTeX-Like interface to XML rather than use a LaTeX translator?

A. LaTeX translates easily only to the DVI and PDF formats. Translating LaTeX to formats such as HTML and other SGML or XML document types requires Herculean efforts.

GELLMU is not *LaTeX* nor is it
an *HTML* or *XML* generator (but see below);
rather it is a general-purpose *SGML* authoring language that
is based on traditional *LaTeX* syntax (to the extent possible).

When used beyond
*regular GELLMU*,
it is like XML in that there is no fixed set of tags (i.e., LaTeX-like
commands). That feature is both good and bad. It is good because
it creates flexibility. It is bad because it puts upon the author
the responsibilities of

- having a coherent set of tags.
- writing or obtaining codelet packages (or style processors) to provide translation to standard formats.

In exchange for the extra effort the author gets to pick the target formats.

A well-tuned GELLMU authoring system might admit translation not only to LaTeX and HTML/MathML but also to other formats.

Some have said that SGML is not "rich enough" to encompass the needs
of real mathematicians. That is certainly a correct statement about
the language in the SGML family that most of us know best, the one
that is called HTML. One needs to understand that SGML is about the
*organization* of automatic processing. A single dialect may
not be rich enough for everything. We want to think about the
*category* of markup languages (modulo a somewhat elusive
notion of equivalence) and the clever use of *morphisms* in
that category.

Of course, the class of markup languages will not give rise to a
*category* unless one knows what a *morphism* is.
A morphism is any translation from one markup language to another (or to itself).
That is, a morphism takes a document in one as input and produces
a document in the other as output. There is certainly the identity
translation for each markup language, and there is a null translation
in any case where the target language contains the empty document.
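
As a rough illustration of this notion, one can model a document as a string and a morphism as a function from documents to documents. The sketch below is only a toy under that assumption; the names `Document`, `Translation`, `identity`, `null_translation`, and `emph_to_html` are hypothetical, and the one non-trivial translation handles a single LaTeX-like command without nested braces.

```python
import re
from typing import Callable

# Assumption for illustration: a document is a string, and a morphism
# (translation) is any function from documents to documents.
Document = str
Translation = Callable[[Document], Document]


def identity(doc: Document) -> Document:
    """The identity translation of a markup language to itself."""
    return doc


def null_translation(doc: Document) -> Document:
    """Send every document to the empty document of the target language."""
    return ""


def emph_to_html(doc: Document) -> Document:
    """Toy translation: rewrite \\emph{...} (no nested braces) as <em>...</em>."""
    return re.sub(r"\\emph\{([^{}]*)\}", r"<em>\1</em>", doc)


print(emph_to_html(r"a \emph{b} c"))  # a <em>b</em> c
```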

Strictly speaking an *isomorphism* in this category would then
be given by a pair of mutually inverting morphisms.

With this notion of *isomorphism* there could easily be
infinitely many documents in a markup language that were reasonably
equivalent to the empty document.

So one wants the objects in the category to be classes of markup language for some notion of equivalence. Even then it is not clear that one would arrive at a small category.

Ultimately the nature of restrictions imposed in SGML systems is
that the morphisms in the category need to work **fast**. That
understood, one can get where one wants by

- trying to find a language that enjoys suitable *initiality* properties relative to those markups in which one is interested.
- composing morphisms.
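
The two points above can be sketched in Python under toy assumptions: one source markup is translated once into an intermediate form, and each target format is then reached by composing with a further translation. Every name here (`compose`, `markup_to_xml`, `xml_to_html`, `xml_to_latex`, and the `<emph>` element) is hypothetical and chosen only for illustration.

```python
import re
from functools import reduce


def compose(*steps):
    """Compose translations so that they are applied from left to right."""
    def composed(doc):
        return reduce(lambda d, step: step(d), steps, doc)
    return composed


def markup_to_xml(doc):
    """Toy source-to-intermediate step: \\emph{x} becomes <emph>x</emph>."""
    return re.sub(r"\\emph\{([^{}]*)\}", r"<emph>\1</emph>", doc)


def xml_to_html(doc):
    """Toy intermediate-to-HTML step: rename the emph element to em."""
    return doc.replace("<emph>", "<em>").replace("</emph>", "</em>")


def xml_to_latex(doc):
    """Toy intermediate-to-LaTeX step: <emph>x</emph> becomes \\emph{x}."""
    return re.sub(r"<emph>(.*?)</emph>",
                  lambda m: "\\emph{" + m.group(1) + "}", doc)


# Two targets reached from one source by composition.
to_html = compose(markup_to_xml, xml_to_html)
to_latex = compose(markup_to_xml, xml_to_latex)

print(to_html(r"see \emph{this}"))   # see <em>this</em>
print(to_latex(r"see \emph{this}"))  # see \emph{this}
```

Writing each step as a small function keeps every translation cheap to run and lets a new target format reuse the shared first step.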

I offer an example of a document in HTML, a brief introduction (not the GELLMU entrance document), that was translated automatically from a GELLMU input document (plain text with tags). There is also a LaTeX document (plain text with tags) that was produced automatically from the same document.

Computing Goal: one (virtual) operating system, one editor, one mailer, and one authoring language, regardless of where I sit.

Much of what we need today for mathematical research is available electronically at our desks. The New York Journal of Mathematics is just one example. Many mathematics journals are now available online, although often not freely, and various mathematical preprint archives are also available.

The online appearance of *Mathematical Reviews* in the guise
of MathSciNet (American
Mathematical Society) enables a mathematician to accomplish in perhaps
half an hour what formerly would have consumed **days** of work
in the library gathering references.

The JSTOR project even brings
crisp images of the journal pages, for selected journals, of **bygone
years** to our screens.

Unfortunately, the mathematical community has lacked a format for
presenting electronic articles that is (1) **robust** for
**notation-based searching**, (2) satisfactory for **efficient
network delivery**, and (3) **easily renderable** in various
presentation formats by widely available inexpensive tools. Publishers
still might wish to withhold high-quality typeset forms from free
network delivery while providing free (slightly ugly) forms
with all content intact.

What may **not** work well is for publishers to provide only
"search hits" and "indexing information" to the network for free
without providing free (slightly ugly) forms with content intact.
In both cases cataloging and indexing sites might fear that their
customers will think that they are not being served well when all
they get are pointers that cannot be followed with some means of
verification of the soundness, for the customers' interest, of that
which was retrieved. This reasoning might lead cataloging and indexing
sites to ignore such publications.

That would represent a fragmentation of the mathematical community in
which there will be at most suboptimal international network support
for digging out the state of information about a specific topic that
goes beyond what is now possible by going to *MathSciNet* (which
is enormously better than going to a paper library for *Math
Reviews*).

**Much, much more is possible soon** if there is this level
of cooperation of publishers.

Beyond that, free (slightly ugly) forms of research articles and
research monographs, with content intact, give publishers the hope
of churning interest and thereby increasing receipts. There will be
a relation between what can be charged for a high quality print copy
and the dollar cost, not to mention the labor and the "mess", of
printing from the web. For *good* articles and monographs
the number of international sales should be *enormous* once the
price falls below 1.5 times the dollar cost of
printing from the web, compared to the mere 2500 subscriptions that
a *good* contemporary mathematics print journal hopes to have today.

One might even imagine an increase in the quality of the average research article as publishers, as well as editors, become sensitive to the quality of the individual article as opposed to the running average quality of several years' worth of the articles in a journal.

The widespread distribution, beginning around 1995, of a free viewer
for Adobe's *Portable Document Format (PDF)* made it possible for the majority
of PC users, not just those with special mathematically oriented tools
(principally "TeX"), to view typeset mathematics on their screens.

From the standpoint of a mathematician wishing to serve typeset mathematics on the network the "PDF" format is not the final answer. Indeed, while it greatly expands the viewing audience, it still, in the typical situation, requires both a web browser and a PDF viewer, and it does not push the realm of possibilities much beyond that which had been accomplished with Knuth's TeX-related "DVI" format. "DVI" is desirable because it is a public standard. (Indeed, the complete definition of "basic DVI" is relatively short.) Finally, the use of PDF requires the author to acquire at cost a tool for the generation of PDF.

An early draft for version 3 of HTML made provision for
mathematical text in Web pages. This was adequate for casual
mathematical needs, but this simple math markup was later excluded
from HTML. Today most observers regard HTML as a "closed" language.
Nonetheless, the addition of a **single tag**,
the "`<lg>`" tag ("lg" for "logical group"), could make
a very big difference if *MathML* does not take hold in the way
that has been promised.

After the mathematical provisions in early HTML-3 were discarded from HTML, the World Wide Web Consortium (W3C) revisited the question of provision for mathematics on the web. This led to the development of Mathematical Markup Language (MathML), which acquired the status of a W3C recommendation in the spring of 1998.

Beyond the issues of electronic presentation the mathematical
community needs formats that are (1) **efficient** for human
authoring, (2) **robust** for **notation-based searching** and
(3) **satisfactory** for **perpetual archiving**. Such formats
must be powerful enough to admit subsequent automatic processing in
many different directions.

Since 1991 I have been working, in part, as a provider to the network of information related to mathematical research. I have been involved in maintaining a public electronic archive since 1992. This archive became free-standing as a gopher in 1993, and began serving HTML hypertext and links to HTTP (HyperText Transfer Protocol) servers in early 1994. It began to function in parallel with an HTTP server (working with the same database) in 1995, functioning as part of the mathematics web, when it became apparent that it would be a while longer before most WWW browsing programs would be brought up to date (as of late 1993) on gopher protocol. Indeed by 1996 the use of gopher in the mathematical community had declined steeply.

I encourage those with an interest in the Internet as a medium of information exchange to become acquainted with the UNIX (tm) philosophy. The principles therein apply as well to the design of information exchange protocols and mechanisms on the Internet. Mathematicians tend to gravitate toward the UNIX philosophy because they understand how a magnificent structure can be made from the assembly of many carefully crafted nuggets.
