Newsgroups: comp.lang.icon
Path: utzoo!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!sdd.hp.com!news.cs.indiana.edu!ux1.cso.uiuc.edu!midway!ellis.uchicago.edu!goer
From: goer@ellis.uchicago.edu (Richard L. Goerwitz)
Subject: Re: Comment on longstr.icn (will this thread die?)
Message-ID: <1991Mar20.171035.14579@midway.uchicago.edu>
Sender: goer@midway.uchicago.edu (Richard L. Goerwitz)
Organization: University of Chicago
References: <9103200028.AA02704@laguna.Metaphor.COM>
Distribution: inet
Date: Wed, 20 Mar 91 17:10:35 GMT

alex@LAGUNA.METAPHOR.COM (Bob Alexander) writes:

>Gee, this is fun.

Sure is.

>So here's my entry in the series of suggestions.  Somehow I can't help
>but wonder if I've missed something obvious -- but if not, do I get my
>name in the growing list of credits? :-)

The credits don't fit onto one line anymore :-).

Jerry Nowlin gave this procedure its name, and started the whole
discussion of whether backtracking through a list was not better than
my original (terrible) solution.  Steve Wampler made the code somewhat
terser and more elegant.  Ken Walker suggested turning while into
every, to make it possible to go through the list only once.  Bob
Alexander pointed out that explicit handling of offsets i/j was not
necessary, as long as we were using match().

In this most recent post, I've created a static table to hold a
reverse-sorted (longest-first) copy of arg 1 (l).  Under normal
circumstances, successive invocations will not each be given a
completely different first argument; in most cases, the same l will be
used at least two or three times.  On my machine, the break-even
number of re-uses for a five- or six-element l, with strings of 3-8
characters, is about 3.  With three or more invocations of
longstr(l, s, i, j) with the same first arg, it becomes worthwhile to
sort l in reverse order, and store this list for later use.

The worst-case scenario for this version would be if it were called
repeatedly, with different first arguments, each of which was a list
with elements of the same length.  Even in this case, however, the
performance is close enough to Bob Alexander's version that this one
is still worth using.

-Richard


############################################################################
#
#	Name:	 longstr.icn
#
#	Title:	 match longest string in a list or set of strings
#
#	Authors: Jerry Nowlin, Steve Wampler, Kenneth Walker, Bob
#                Alexander, and Richard Goerwitz
#
#	Version: 1.8
#
############################################################################
#
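#  longstr(l,s,i,j) works like any(), except that instead of taking a
#  cset as its first argument, it takes a list, set, or table of
#  strings (l).  For a table, the values (not the keys) are matched.
#  Returns i + *x, where x is the longest string in l for which
#  match(x,s,i,j) succeeds.  Fails if no match occurs.
#
#  Calling longstr() with both l and s null discards the cached
#  sorted lists, forces a garbage collection, and returns &null.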
#
#  Defaults:
#      s     &subject
#      i     &pos if s is defaulted, otherwise 1
#      j     0
#
#  Errors:
#      The only manual error-checking that is done is to test l to
#      be sure it is, in fact, a list, set, or table.  Errors such as
#      non-string members in l, and non-integer i/j parameters, are
#      caught by the normal Icon built-in string processing and
#      subscripting mechanisms.
#
############################################################################
#
#  Links: none
#
############################################################################


procedure longstr(l,s,i,j)

    local elem, tmp_table
    static l_table
    initial l_table := table()

    #
    # Invocation with l and s both null (e.g. longstr()) wipes out all
    # static structures, and forces an immediate garbage collection.
    #
    if (/l, /s) then {
	l_table := table()
	collect()		# do it NOW
	return			# return &null
    }

    #
    # Is l a list, set, or table?
    #
    type(l) == ("list"|"set"|"table") |
	stop("longstr:  list, table, or set expected (arg 1)")

    #
    # Sort l longest-to-shortest, and keep a copy of the resulting
    # structure in l_table[l] for later use.
    #
    if /l_table[l] := [] then {

	tmp_table := table()
	# keys = lengths of elements, values = elements
	every elem := !l do {
	    /tmp_table[*elem] := []
	    put(tmp_table[*elem], elem)
	}
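	# sort(tmp_table,3) returns [key1,val1,key2,val2,...] ordered by key;
	# walk the even (value) slots from last to first and append each
	# value list's elements, so l_table[l] ends up longest-first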
	every put(l_table[l], !sort(tmp_table,3)[*tmp_table*2 to 2 by -2])

    }

    #
    # First element in l_table[l] to match is the longest match (it's
    # sorted longest-to-shortest, remember?).
    #
    return match(!l_table[l],s,i,j)

end
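

############################################################################
#
#  Usage sketch (an illustration only, not part of longstr itself).
#  It shows longstr() used inside string scanning, where s and i
#  default to &subject and &pos, and with an explicit subject.  The
#  operator list "ops" and the sample strings are made up for this
#  example.  If longstr.icn is linked as a library, keep this main()
#  in a separate file.
#
############################################################################

procedure main()

    local ops, tok

    # hypothetical operator list; any list, set, or table of strings
    # would do as well
    ops := ["<", "<=", "<<", "<<="]

    "<<= 3" ? {
        # s and i default to &subject and &pos, so longstr() can be
        # handed to tab() just like any() or match()
        if tok := tab(longstr(ops)) then
            write("longest operator: ", tok)    # writes "<<="
        else
            write("no operator at position ", &pos)
    }

    # explicit-subject form: yields the position just past the longest
    # match ("<=" here, so 3), or fails if nothing in ops matches
    write(longstr(ops, "<=>", 1, 0) | "no match")

    # second and later calls with the same ops reuse the cached sorted
    # list; a call with no arguments throws the cache away
    longstr()

end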
