Gopher in the World-Wide Web


This is a local copy of the item found on 27 Oct 95 at the URL-of-origin

gopher://gopher.ocf.berkeley.edu/hh/gopher/gopher-www

written 8/94 by Alan Coopersmith - alanc@ocf.berkeley.edu.


1. General Gopher/WWW Issues

Almost all World Wide Web client programs can connect to gopher servers using the gopher0 protocol. Very few support the extended gopher+ protocol yet, though support will hopefully become more widespread in the future.

1.1 Gopher URL's

The "Uniform Resource Locator" (URL) was first used by WWW software to describe the servers it could connect to in a uniform way. It has since become popular outside WWW software as a simple, standard way to locate resources.

The traditional style gopher link of:

Type=<type>
Host=<host>
Path=<path>
Port=<port>
Name=<name>

can be represented in URL form as
gopher://<host>:<port>/<type><path>

Notice the "Name" assigned to the link is not contained in the URL. (People have always been free to use whatever Names they want on links, and the name is not used when retrieving documents, so it's not needed in the URL.)

If the port is the gopher default of 70, the URL can optionally be shortened to:
gopher://<host>/<type><path>

The "root level" of a gopher server has always been defined as a menu of type 1 with no path required, and it can be represented as:
gopher://<host>:<port>/
(or without the :<port> if the port is 70)

The popular Unix gopher server software from the University of Minnesota uses path strings with the type as the first character, which results in a URL that looks like the type character has been doubled.

A gopher+ link is often shown with a + after the type character -- the + is not actually part of the type code and is not used when building URL's. (The RFC 1630 definition of URL's specifies adding the "gopher plus syntax" to the end of the URL, but I haven't seen this implemented in any software yet.)

To make URL's more transportable, spaces and other special characters in paths are "escaped" into the form "%XX", where XX is the hexadecimal ASCII/ISO-8859-1 character code. (For more details, see the URL definitions on the Web or in Internet RFC 1630.)

Some examples:

	Type=1+
	Host=gopher.rodent.net
	Path=1/golf-courses
	Port=70
	URL=gopher://gopher.rodent.net/11/golf-courses
	#
	Type=0
	Host=gopher.turnip.com
	Path=Turnip Recipes
	Port=1070
	URL=gopher://gopher.turnip.com:1070/0Turnip%20Recipes
	#
	Type=1
	Host=gopher.tc.umn.edu
	Path=
	Port=70
	URL=gopher://gopher.tc.umn.edu/
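
The conversion rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described in this section, not code from any gopher client; the function name gopher_link_to_url and its parameters are made up for the example.

```python
from urllib.parse import quote

DEFAULT_GOPHER_PORT = 70


def gopher_link_to_url(link_type, host, path="", port=DEFAULT_GOPHER_PORT):
    """Build a gopher URL from the fields of a traditional gopher link.

    The Name field is deliberately absent: it is not part of the URL.
    """
    # A trailing "+" marks a gopher+ item; it is not part of the type code.
    type_code = link_type.rstrip("+")
    # Port 70 is the gopher default, so it may be left out of the URL.
    authority = host if int(port) == DEFAULT_GOPHER_PORT else f"{host}:{port}"
    # The root menu (type 1, empty path) needs neither a type nor a path.
    if type_code == "1" and path == "":
        return f"gopher://{authority}/"
    # Escape spaces and other special characters as %XX using ISO-8859-1
    # character codes; "/" is left unescaped.
    escaped = quote(path, safe="/", encoding="latin-1")
    return f"gopher://{authority}/{type_code}{escaped}"


print(gopher_link_to_url("0", "gopher.turnip.com", "Turnip Recipes", 1070))
print(gopher_link_to_url("1", "gopher.tc.umn.edu"))
```

The two calls reproduce the turnip and root-menu URL's from the examples. Note that the path is passed through unchanged apart from escaping, so a UMinn-style path that already begins with its type character yields the "doubled" type described earlier.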

1.2 Using "GET /" to fake http

There is currently no officially defined way to create a gopher or gopher+ link pointing at an http server. However, by what seems to be a fortunate accident, the http0 and gopher0 protocols are very similar. In fact, most gopher clients can be fooled into thinking they're speaking gopher0 when they connect to an http0 server. The primary difference is that a gopher0 request consists of just the path, while http0 requests are of the form "GET /path". People have used this similarity to create gopher links that look like
	Type=0
	Path=GET /file.txt
	Host=www.foo.net
	Port=80
which fool clients into retrieving the file normally known by the URL
	http://www.foo.net/file.txt
More commonly used is the gopher type 'h' which is the HTML hypertext format used on the Web.

One problem with trying to fool clients like this is that http0 and gopher0 have both been upgraded, and the resultant HTTP/1.0 and gopher+ protocols are not similar enough for this trick to work. Fortunately, most http servers still support the old protocol, so this method still has some life left in it. (And in fact, there are those who believe this method should become the officially defined way to link gopher to http.)
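
To see why the trick works, it may help to compare the bytes each protocol puts on the wire. The sketch below assumes only that both requests are a single line terminated by CR LF; the helper names gopher0_request and http0_request are invented for this illustration and no network connection is made.

```python
CRLF = b"\r\n"


def gopher0_request(path):
    """A gopher0 request is just the link's path followed by CR LF."""
    return path.encode("latin-1") + CRLF


def http0_request(url_path):
    """An http0 request is the word GET, a space, the path, then CR LF."""
    return b"GET " + url_path.encode("latin-1") + CRLF


# A gopher link with "Path=GET /file.txt" sends exactly what an http0
# client would send when asking for /file.txt.
same = gopher0_request("GET /file.txt") == http0_request("/file.txt")
print(same)
```

An HTTP/1.0 request adds a version string and header lines after the request line, which is why the same comparison no longer holds for the upgraded protocol.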

2. Using the Univ. of Minnesota Unix/VMS Gopher Client 2.0x on the Web

The client software provided by the University of Minnesota Gopher Team for Unix and VMS computers has some features to help use http servers and html documents. These features have changed over time, and may not be available in older versions.

2.1 Connecting to http servers and viewing HTML documents

The gopher client knows that it doesn't speak the HTTP/1.0 protocol very well or display HTML documents in a very useful form. Fortunately, it's willing to ask for help from software that can. If you have defined a viewer for the type "text/html" (in the conf.h at compile time, or in the global gopher.rc or the user's ~/.gopherrc), the gopher client will ask it to deal with any html files it comes across. The viewer needs to be able to accept either a URL or a local HTML file name on the command line -- lynx (with the -force_html flag) meets both these requirements and is a good choice for a standalone WWW browser as well.

The client knows about the "GET /" trick described earlier, and when it comes across a link of type 'h' with a path starting "GET /" it will translate the path into an http URL and pass that to the text/html viewer, letting the viewer use HTTP/1.0 to retrieve the document if it can. Links to type 'h' items in gopher+ menus will use an http URL passed from the gopher+ server if present. Links of type 'h' without a "GET /" path or a URL passed via gopher+ are retrieved by the gopher client and placed into a temporary file, which the text/html viewer is then asked to display.
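
The three cases can be summarized in a short Python sketch. This is not the client's actual code: the function name, the dictionary layout of the link, and the choice to check the gopher+ URL before the "GET /" path are all assumptions made for the illustration.

```python
def html_viewer_argument(link, gopher_plus_url=None):
    """Decide what a client would hand to its text/html viewer.

    Returns ("url", <http URL>) when the viewer should fetch the document
    itself, or ("tempfile", None) when the gopher client must fetch the
    item and pass the viewer a temporary file.
    """
    if link["Type"] != "h":
        raise ValueError("only type 'h' links go to the text/html viewer")
    # Case 1: the gopher+ server supplied an http URL for this item.
    # (Checking this first is an assumption, not taken from the source.)
    if gopher_plus_url:
        return ("url", gopher_plus_url)
    # Case 2: the "GET /" trick; rebuild the http URL from the link fields.
    path = link["Path"]
    if path.startswith("GET /"):
        port = int(link["Port"])
        authority = link["Host"] if port == 80 else f"{link['Host']}:{port}"
        return ("url", f"http://{authority}{path[len('GET '):]}")
    # Case 3: an ordinary gopher item that happens to be HTML.
    return ("tempfile", None)


link = {"Type": "h", "Host": "www.foo.net", "Port": 80, "Path": "GET /file.txt"}
print(html_viewer_argument(link))
```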

2.2 Using URL's

URL's are becoming quite popular as a standard method of describing resources in places like Usenet postings and magazine articles, so the gopher client software is being extended in two ways. Users will be able to specify a starting place by giving its URL on the command line with the '-u' option and, while inside the client, to open a new connection or retrieve a document specified by a URL with the 'w' command key. The current UMinn distribution of gopher 2.016 doesn't have these features, but the next release from the University of Minnesota is expected to. They are already available in the VMSGopher 2.016 client derived from the UMinn software, and in an unofficial modified version of the Unix 2.016 client.

3. Using the Univ. of Minnesota Unix Gopher Server 2.0x on the Web

The server software provided by the University of Minnesota Gopher Team for Unix systems has features that help server admins build links to the Web and make their gopher server nicer for WWW client users. These features have changed over time, and may not be available in older versions.

3.1 Making links to http servers and other URL's

Older versions of the gopherd 2.0x server can use the "GET /" trick described earlier to make links to http servers from their menus. With newer versions, however, the following link format is preferred:
	#
	Type=h
	Name=Turnip Lovers of America (TLA) Home Page
	URL=http://www.turnip.com/~tla/index.html
	#
The current software will generate the "GET /" gopher0 link from this, and it will also send the unmodified URL in gopher+ menus to gopher+ clients.

The "URL=" link line can also be used with gopher, ftp, telnet, and tn3270 URL's. The software will take what information it can from the URL to fill in empty spots in the gopher link information. You can explicitly specify any part of the link tuple to override the URL information. The exact conversions are in the gopher source code, but the current URL handling code translates as follows:

	URL=http://<host>[:<port>]/<path>
	Type=h
	Host=<host>
	Port=<port> 	[or 80 if unspecified]
	Path=GET /<path>
	#
	URL=gopher://<host>[:<port>]/[<type><path>]
	Type=<type>		[1 if type & path are missing]
	Host=<host>
	Port=<port>		[70 if not given]
	Path=<path>		[blank (for "root menu") if not given]
	#
	URL=telnet://[<user>@]<host>[:<port>]
	Type=8
	Host=<host>
	Port=<port>		[23 if not given]
	Path=<user>
	#
	URL=tn3270://[<user>@]<host>[:<port>]
	Type=T
	Host=<host>
	Port=<port>		[23 if not given]
	Path=<user>
	#
	URL=ftp://<host>/<path>
	Type=[1 if path ends in '/', 0 otherwise]
	Host=+
	Port=+
	Path=ftp:<host>@<path>

Remember, you can override any of these guesses, like this:

	#
	Type=g
	Name=Miss Turnip Festival 1994
	URL=ftp://ftp.turnip.com/pub/miss-turnip.gif
	#
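
As a rough illustration, the translation table and the override rule can be written out in Python. This is a sketch of the table as printed here, not the gopherd source: the function name url_to_link is invented, unescaping %XX sequences in gopher paths is an assumption, and so is treating an ftp URL with an empty path as a directory.

```python
from urllib.parse import unquote

DEFAULT_PORTS = {"http": 80, "gopher": 70, "telnet": 23, "tn3270": 23}


def url_to_link(url, **overrides):
    """Fill in gopher link fields from a URL; explicit fields win."""
    scheme, sep, rest = url.partition("://")
    scheme = scheme.lower()
    if not sep or scheme not in ("http", "gopher", "telnet", "tn3270", "ftp"):
        raise ValueError(f"unsupported URL: {url!r}")
    authority, _, path = rest.partition("/")
    user, _, hostport = authority.rpartition("@")  # user is "" if absent
    host, _, port_text = hostport.partition(":")

    if scheme == "ftp":
        # Served through the local server's own ftp gateway (Host=+, Port=+).
        # An empty path is treated as a directory (assumption).
        is_dir = path == "" or path.endswith("/")
        link = {"Type": "1" if is_dir else "0", "Host": "+", "Port": "+",
                "Path": f"ftp:{host}@{path}"}
    else:
        if scheme == "http":
            link = {"Type": "h", "Path": "GET /" + path}
        elif scheme == "gopher":
            # First character is the type; the rest is the path.
            selector = unquote(path, encoding="latin-1")
            link = {"Type": selector[:1] or "1", "Path": selector[1:]}
        else:
            link = {"Type": "8" if scheme == "telnet" else "T", "Path": user}
        link["Host"] = host
        link["Port"] = int(port_text) if port_text else DEFAULT_PORTS[scheme]

    # Anything given explicitly in the link overrides the guess from the URL.
    link.update(overrides)
    return link


print(url_to_link("ftp://ftp.turnip.com/pub/miss-turnip.gif", Type="g"))
```

The final call mirrors the Miss Turnip example: the URL alone would give type 0, and the explicit Type=g replaces that guess.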

3.2 Serving HTML documents

The gopher+ protocol allows gopher+ clients to choose which format a document is returned in. One of the formats you can create documents in is "text/html". However, since most WWW clients don't speak gopher+, they can't choose the text/html document. To help work around this, the Unix gopher server checks the type code on the front of the path a client requests. If the code is 'h', the server will return the 'text/html' form if available.

Note: this is not part of the gopher or gopher+ protocols. It is a feature of the UMinn Unix gopherd, like the "ftp:" and "waissrc:" special-meaning paths. Other servers may or may not work this way.
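
A minimal sketch of that check follows, assuming the server already knows which views exist for the requested item. The function name, the list of view names, and the fallback to a default view are hypothetical and only illustrate the behaviour described above.

```python
def choose_view(requested_path, available_views, default_view):
    """Pick the view to return for a plain gopher0 request.

    If the path starts with the type code 'h' and a text/html view of the
    item exists, return that view; otherwise fall back to the default.
    """
    if requested_path.startswith("h") and "text/html" in available_views:
        return "text/html"
    return default_view


views = ["text/plain", "text/html"]
print(choose_view("h/gopher/gopher-www", views, "text/plain"))
print(choose_view("0/gopher/gopher-www", views, "text/plain"))
```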

3.3 Connecting from WWW clients

Most WWW clients can connect to gopher servers and display gopher menus. The URL's used when retrieving a gopher menu from a UMinn Unix gopherd look like:
	gopher://<host>:<port>/11/<path>

Recent versions of the gopherd can generate HTML documents from their menus. These often look nicer, allow WWW clients that don't speak gopher+ to see the abstract information for items in the menu, and pass http links in the http: URL form all WWW clients understand instead of the gopher "GET /" form, which very few WWW clients recognize as an http link. Since these menus are just text/html views of a gopher+ object, the above method can be used to access these menus by merely changing the "11" in the URL to "hh". (The first h tells the client what to expect, the second h is part of the request sent to the server.) For instance, some places use links like this:

	#
	Type=h
	Name=WWW Hypertext version of this gopher server
	Path=h/
	Host=+
	Port=+
	#
which allows users to switch to the html forms of menus when they select that item from the menu.
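
The "11" to "hh" substitution is simple enough to sketch in Python. The function name html_menu_url is invented for this example, it applies only to servers that behave like the UMinn Unix gopherd, and mapping the root URL to "/hh/" is inferred from the "Path=h/" link shown above rather than stated by the server documentation.

```python
def html_menu_url(menu_url):
    """Rewrite a UMinn-style gopher menu URL to request its HTML view."""
    scheme, sep, rest = menu_url.partition("://")
    if not sep or scheme.lower() != "gopher":
        raise ValueError(f"not a gopher URL: {menu_url!r}")
    hostport, _, selector = rest.partition("/")
    # Root menu: no type or path in the URL, so ask for type h, path "h/".
    if selector == "":
        return f"gopher://{hostport}/hh/"
    # A menu URL starts with "11": the URL type plus the type in the path.
    if not selector.startswith("11"):
        raise ValueError(f"not a UMinn-style menu URL: {menu_url!r}")
    # Swap both characters: the first tells the client to expect HTML,
    # the second is sent to the server as part of the path.
    return f"gopher://{hostport}/hh{selector[2:]}"


print(html_menu_url("gopher://gopher.ocf.berkeley.edu:70/11/gopher/alanc-patches"))
```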

WARNING: The gopherd 2.016 distributed by the University of Minnesota has a bug that causes generation of the html menu views to fail. The fix is available as part of the unofficial enhanced gopher 2.016 software that can be found at
gopher://gopher.ocf.berkeley.edu:70/11/gopher/alanc-patches
or, since this server is running a UMinn Unix gopherd
gopher://gopher.ocf.berkeley.edu:70/hh/gopher/alanc-patches