I was wrong about the query string in a Gopher URI
by Christopher Williams
2026-04-01

A couple of weeks ago I was discussing Gopher URIs with Sean
Conner[1], and one thing I realized is that a question mark
in a selector is _not_ safe to leave unencoded in a Gopher
URI. As Sean pointed out, if a URI contains both a selector
with a non-percent-encoded question mark and a search string
(separated from the selector with a percent-encoded tab
character, i.e., `%09`), a generic URI parser (following
RFC 3986[2]) would include the tab and the search string as
part of the query string. Here’s an example:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
gopher://example.com/7/foo?query%09search
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

In that example, the parser would set the query string to
`query%09search`, which is clearly not intended; according
to the URI generic syntax defined in RFC 3986, the query
string would have to come after the search string, as in:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
gopher://example.com/7/foo%09search?query
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

However, the Gopher URI scheme (RFC 4266[3]) does not define
how a query string should be handled by a client. Even after
reading the various URI RFCs, I’m not quite clear on what a
client is expected to do when a Gopher URI contains a query
string. Many Gopher clients seem to use the query string as
the search string (so that `gopher://example.com/7/foo?query`
effectively becomes `gopher://example.com/7/foo%09query`),
but while this is a common practice, there’s no standard or
spec defining this behavior, and it’s not portable across
all Gopher clients.

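
To see the ambiguity concretely, here is a minimal Python
sketch. It uses the standard library's `urllib.parse` as a
stand-in for "a generic URI parser"; other parsers may
differ in detail. The `gopher_uri` helper is hypothetical
(it is not part of any Gopher client or server) and simply
percent-encodes a question mark in the selector so that it
stays in the path.

```python
from urllib.parse import quote, urlsplit


def gopher_uri(host, item_type, selector, search=None):
    """Build a Gopher URI (hypothetical helper, for illustration).

    The selector is percent-encoded with only '/' left as-is, so a
    literal '?' becomes %3F and cannot start a query component.
    """
    path = item_type + quote(selector, safe="/")
    if search is not None:
        # The search string follows a percent-encoded tab (%09).
        path += "%09" + quote(search, safe="")
    return f"gopher://{host}/{path}"


# Unencoded '?' in the selector: the generic parser treats
# everything after it, including %09 and the search string,
# as the query component.
ambiguous = urlsplit("gopher://example.com/7/foo?query%09search")
print(ambiguous.path)   # /7/foo
print(ambiguous.query)  # query%09search

# Encoded '?': the whole thing stays in the path.
encoded = gopher_uri("example.com", "7", "/foo?query", "search")
print(encoded)                  # gopher://example.com/7/foo%3Fquery%09search
print(urlsplit(encoded).query)  # (empty string)
```

With the question mark encoded, the parser reports an empty
query and the selector and search string both remain in the
path, where a Gopher client expects them.
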
This is like trying to fit a round peg into a square
hole.[4]

    ----------------------------------------------------
    Hindsight being what it is, I would have defined a
    Gopher URI as something like

    - - - - - - - - - - - - - - - - - - - - - - - - - -
    gopher://<host>:<port>/<gopher-path>
    - - - - - - - - - - - - - - - - - - - - - - - - - -

    where <gopher-path> is one of:

    - - - - - - - - - - - - - - - - - - - - - - - - - -
    <gophertype><selector>
    <gophertype><selector>?<search>
    <gophertype><selector>?<search>&<gopher+_string>
    - - - - - - - - - - - - - - - - - - - - - - - - - -

    (Where `?` and `&` are literal, but characters in
    each component are percent-encoded as needed; if the
    user’s search input contains `&`, it’ll be
    percent-encoded as `%26` in the URI.)

    This would treat the Gopher search input as a proper
    query string in the URI generic syntax (with the
    Gopher+ string separated from the search string by
    `&`), and we wouldn’t have to work around the
    current round peg/square hole situation.
    ----------------------------------------------------

So my current stance on this is to always percent-encode a
question mark in a Gopher URI. It’s the only way to be sure.

------------------------------------------------------------
Query string in a CGI
------------------------------------------------------------

The other thing is that, since the Gopher protocol itself
has no way to send the query string from a URI to a Gopher
server, the `QUERY_STRING` CGI metavariable must be empty
according to RFC 3875[5]. In my past writings on the
subject, I mentioned that most Gopher servers set this
variable to either the text following a question mark in the
selector or to the Gopher search string. As I noted to Sean,
both of these options are “wrong”. Semantically, however, a
query string implies that it’s user input, so using the
search string as the value of `QUERY_STRING` is perhaps the
less “wrong” of the two options.

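
To make the candidate values concrete, here is a rough
Python sketch of how a server might derive them from a
single request line. The function names and dictionary keys
are made up for illustration and are not taken from any
real Gopher server; the sketch assumes a plain (non-Gopher+)
request of the form selector, tab, search string, CRLF.

```python
def split_request(line):
    """Split a Gopher request line into (selector, search)."""
    line = line.rstrip("\r\n")
    selector, _, rest = line.partition("\t")
    # Anything after a second tab would be a Gopher+ string;
    # this sketch ignores it.
    search = rest.split("\t", 1)[0]
    return selector, search


def query_string_candidates(line):
    """Return the possible QUERY_STRING values discussed above."""
    selector, search = split_request(line)
    _, _, after_qmark = selector.partition("?")
    return {
        # Nothing from a URI query component reaches the server,
        # so under the RFC 3875 reading above this stays empty.
        "empty": "",
        # Option 1: text following a question mark in the selector.
        "from_selector": after_qmark,
        # Option 2: the Gopher search string (the user's input).
        "from_search": search,
    }


print(query_string_candidates("/foo?x=1\thello world\r\n"))
```

For the request shown, the selector-based option yields
`x=1` and the search-based option yields `hello world`; a
request with neither a question mark nor a search string
yields an empty string for all three.
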
On the other hand, I have found value in including both a
parameter string in a selector and user input in the search
string; so in my Gopher server (Thirteen) the strategy I
took is to treat text after a question mark in a selector as
the value of `QUERY_STRING` and assign the search string to
a separate, non-standard metavariable.

I am open to changing this, for example by using a character
other than a question mark (to make it clear that it’s not
really a query string) and assigning the parameter string to
another non-standard metavariable. An `&` should be safe:
it’s allowed to appear literally in the path part of a URI,
so a URI might look like `gopher://asciz.com/1/foo&x=1&y=2`,
which isn’t too ugly (less ugly than `%3F`) and is still
reminiscent of a query string. I would avoid using `;` as
the parameter string delimiter since that is often “used to
delimit parameters and parameter values applicable to” a
segment (rather than to the path as a whole), per RFC 3986
section 3.3.

On the gripping hand, one of my goals in developing Thirteen
is to make it compatible with other Gopher servers as much
as possible; parameter strings, on top of query or search
strings, are guaranteed to be incompatible with all other
servers.

Thoughts?

------------------------------------------------------------
References and Footnotes
------------------------------------------------------------

[1] gopher://gopher.conman.org
[2] gopher://asciz.com/0/rfc/rfc3986.txt
[3] gopher://asciz.com/0/rfc/rfc4266.txt
[4] Square Hole Girl has no problem with this.
[5] gopher://asciz.com/0/rfc/rfc3875.txt