Subj : Re: Polymorphism sucks [Was: Paradigms which way to go?]
To   : comp.programming,comp.object
From : topmind
Date : Fri Jul 29 2005 05:30 pm

Chris Sonnack wrote:
> topmind writes:
>
> >> That might be the taxonomy, but it's a piss poor OOD.
> >
> > It's not necessarily about OOD. [...] it was talking about trees in
> > general, not necessarily related to polymorphism or OO. The context
> > was about the apparent popularity of trees among computer users
> > IIRC, not just developers.
>
> In fact--as your own task break down showed--it's a natural form for
> any intelligent person when breaking down something comprised of many
> parts.  Or more properly, of parts and sub-parts...as all tasks are.

Like I keep saying, small trees are fine in some cases. When routines
get bigger, one has to form sub-routines to avoid duplicating
code (i.e., duplicate nodes) in most cases. I rarely encounter routines
that grow more than about 200 lines before candidates for
duplication-factoring start to appear.

>
> If all we're talking about is the taxonomy, then one can use as many
> trees as needed to express different views of the information depending
> on what one is looking for.

Yes, but IMO sets are superior to multiple trees in
most cases.
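
A rough sketch of what I mean by sets, assuming some made-up file
names and attributes (none of these are from a real system). Each
item carries a set of attributes, and any combination of them can
serve as a "view" without building a separate tree for each one:

```python
# Hypothetical items, each tagged with a set of attributes.
items = {
    "q3_report.doc": {"sales", "2005", "bob"},
    "slides.shw": {"sales", "2005", "presentation"},
    "budget.xls": {"finance", "2005"},
}


def find(*wanted):
    """Return the items whose attribute set contains every wanted attribute."""
    need = set(wanted)
    return sorted(name for name, tags in items.items() if need <= tags)


print(find("sales", "2005"))
print(find("2005"))
```

With trees one would need a by-department tree, a by-year tree, and
so on. Here the same data answers each question with a different
combination of attributes.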

>
> If we're talking about programming, then there are elegant ways--as
> shown in previous posts--to do it.
>
> >> What this buys you is the ability to pass around "Drink" objects
> >> that know what they really are and can behave according to their
> >> qualities.
> >
> > Just like a record from a "Drink" entity.
>
> Nope.  A record is dumb.  It requires code to process it.

I find that the behavior "taxonomy" often is not one-to-one
with attribute taxonomies. One could encode the behavior
"taxonomy" as 1-to-1 attribute references, but I rarely see
benefits in this.

>
> >> Imagine I have a collection of Drink objects and I want to list
> >> them in order by the amount of caffeine.  Simple, since they all
> >> have a common interface that lets me ask for the caffeine content.
> >
> > Just like a record from a "Drink" entity.
>
> Nope.  Records are dumb.  They have no interface.

They have no "interface"? Queries are the interface.
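
A minimal sketch of a query serving as the interface, using Python's
sqlite3 module. The "drink" table, its columns, and the caffeine
figures are made up for illustration only:

```python
import sqlite3

# Hypothetical "Drink" entity; table, columns, and values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drink (name TEXT PRIMARY KEY, caffeine_mg INTEGER)")
conn.executemany(
    "INSERT INTO drink (name, caffeine_mg) VALUES (?, ?)",
    [("coffee", 95), ("herbal tea", 0), ("cola", 34)],
)

# Ask for the attribute and let the query do the ordering.
by_caffeine = [
    name for (name,) in conn.execute("SELECT name FROM drink ORDER BY caffeine_mg")
]
print(by_caffeine)
```

No per-drink method is needed to list them by caffeine content; the
ORDER BY clause does the job for any record that has the attribute.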

>
>
> >>>> Here we seem to be talking about tree-shaped *data*,...
> >>
> >> ...the larger the scale gets, the more hierarchical **data** is
> >> a win, simply because you don't have to deal with all the data at
> >> once. You can pick the scale and branch of interest.
> >
> > You are kidding, right? Can you provide that taxonomy of people I
> > keep asking for?
>
> We're talking about **data** here, not taxonomies.  (What taxonomy
> have you 'kept' asking for?)
>
> > To not "have to deal with all the data at once", use queries.
> > We've been over this already.
>
> Yes, and it's stupid to query a huge dataset when you can query a
> much smaller one.

That's why we have indexes.

>
> >> You KNOW this is true, since you've already agreed that DB indexes
> >> must use tree structure to speed access to a selected data subset.
> >
> > I don't know about "must". Maybe a good non-tree indexing algorithm
> > has yet to be discovered.
>
> Database design is a pretty well-explored domain.  There are some very
> fundamental limitations here.  You either need to look at each record
> to determine if it matches your query, OR you need to--da da--look at
> a subset....a child of the full dataset.

Perhaps, but there are potentially other indexing schemes that
are non-tree. But it is a moot point anyhow. Just because the
underlying machine uses 1's and 0's does not necessarily mean
that programmers should also.

>
> > But that is kind of an internal "mechanical" issue. What works
> > underneath may not work on the domain level also.
>
> When it comes to dealing with large datasets, partitioning is a win.
> Database designers know this, hence the indexing technology.

Again, you are guilty of faulty extrapolation. Philosophers
have long known there is rarely One Right Taxonomy/Partitioning
for a given thing.

>
> >> The simple
> >> fact is, you broke a common task into levels.  Just about any set of
> >> tasks naturally breaks down that way.  (In fact, I've spent this week
> >> working with a project leader for a coming project doing just that.  The
> >> project is very large, and without a hierarchical breakdown, it'd be
> >> beyond the capacity of any human to deal with.)
> >
> > I bet you start to have cycles (or node duplication) past a certain
> > point.
>
> Nope.  There are *no* duplicate tasks.  Each is distinct.

I don't believe you. I would have to see the code. I think
there is probably a misunderstanding somewhere here.

>
>
> >> And what do you mean by "relational techniques" for file systems?
> >> Give me an example.
> >
> > One could use SQL, another relational query language, and/or
> > Query-By-Example to find stuff.
>
> Not seeing anything "relational" about this.  You do understand the
> term, right?   (SQL, for example, works just fine in non-relational
> databases.)

Hmmmm. I don't think I agree with this, but will have
to think about that one.

>
> > Take a look at this link for some suggestions:
> >
> > http://www.geocities.com/tablizer/sets1.htm
>
>
> Numerical identifiers for folders?  Very dumb idea!
> The example shows:
>
> }  westsrvr:4251/slides.shw
>
> "4251" has no connection to anything real.  How can anyone think
> users can remember "4251"?  I have thousands of files in hundreds
> of directories.  No human could remember distinct numbers for all
> those folders.  I can't believe anyone sane could think this was
> a viable option.

You are obviously not wearing your relational cap. There
may be other ways to label folders: each could also have
a description attribute, and one could even make that the
primary key. However, experience has shown many times
that "dumb" keys make the best unique keys in many cases.
For example, if we call a folder "BobsSalesProject", but
then Bob leaves the company and is replaced, we may want
to rename the folder. This can bust existing references
outside of our database and/or file-server.

For example, when companies move articles around in
web URL paths, often it busts existing browser bookmarks.
The same thing can happen with "meaningful" names.
A "dumb" key is safer from such because it carries no
external meaning.

In other words, we *can* have unique identifiers
with real-world meaning (text), but on a large scale
it usually turns into a headache. Usually a primary
auto-generated unique key used along with a description
attribute (non-unique) is the best way to handle
such things in practice. One can still search on
the description, I would note. One can even index
it. However, it may return multiple answers.

This is a matter of "best practices" of RDBMS,
not ability versus non-ability.
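
Here is a small sketch of the "dumb key plus description" approach,
again with Python's sqlite3 module. The "folder" table and its column
names are my own invention for the example, not from any real file
system:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# folder_id is an auto-generated "dumb" key; description is free to change.
conn.execute(
    "CREATE TABLE folder ("
    "folder_id INTEGER PRIMARY KEY AUTOINCREMENT, description TEXT)"
)
conn.execute("CREATE INDEX folder_desc ON folder (description)")

cur = conn.execute("INSERT INTO folder (description) VALUES ('BobsSalesProject')")
saved_ref = cur.lastrowid  # what an outside bookmark or link would store

# Bob leaves; rename the description. The key stays the same.
conn.execute(
    "UPDATE folder SET description = 'SalesProject' WHERE folder_id = ?",
    (saved_ref,),
)
row = conn.execute(
    "SELECT description FROM folder WHERE folder_id = ?", (saved_ref,)
).fetchone()

# Searching on the (non-unique) description still works,
# but it may return more than one answer.
conn.execute("INSERT INTO folder (description) VALUES ('SalesProject')")
matches = conn.execute(
    "SELECT folder_id FROM folder WHERE description = 'SalesProject'"
).fetchall()
print(row, len(matches))
```

The saved reference survives the rename because it never depended on
the name in the first place.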

>
> Further, all it's done is create an "abbreviation" for a location
> (but a very difficult abbreviation to remember).  The same issue
> of changes applies.  Change the location, and everyone's links
> are wrong.

Whaaaaaat?

>
> > I can't speak for everybody, but I would rather have a relational
> > file system.
>
> I suspect you speak only for yourself.

I've heard some others desire the same thing.

>
> Later on the page is the idea of associating properties to a file and
> later searching for it by properties.  Which is fine until you forget
> what properties you used, make up too many to handle, or change them.
> How do you browse through the data to find the lost file??

How is this worse than forgetting a giant path???

>
> It also requires a big database in which to store all this.

So? Higher abstractions require more horse-power. I don't
think 30,000 desktop files require that much horse-power. I've
seen RDBMSs that big on 286's (well, actually they were
semi-relational).

>
> Having worked with virtual file systems that used technology somewhat
> like this......[shudder] it's a nightmare.
>

Perhaps you were just not familiar enough with the concept.
Non-tree org systems generally take longer to learn
to use effectively. I don't dispute that.
It is a longer-term investment. That is why they
are not wide-spread.

>
> >> Why?  Show me the structure of a set.
> >
> > One generally looks at one *aspect* of a set at a time.
>
> In other words, there is no structure.
>
> >> It's simple and undeniable: sets have less structure than trees.  EOS.
> >
> > Prove it!
>
> I don't have to, you just admitted it.

It depends entirely on how one defines "structure".

>
> >> In fact, your thesis depends on this, for one of the main things you
> >> rail against IS that structure and your perception that it is artificial
> >> and may not match reality and may be hard to change.  Sorry, I don't
> >> think you can have it both ways.
> >
> > Please clarify. Your statement is not clicking in.
>
> Your thesis is "trees are bad, because their structure is hard to change.
> Sets are better, BECAUSE THEY LACK THAT STRUCTURE."
>
> Therefore, by your own thesis, sets lack structure (which is true).

Again, your use of "structure" is word-play.

>
> > Being structured and being easy-to-change may be generally orthogonal.
>
> Then why are you so set against trees?  They have structure and if ease
> of change is orthogonal to it.... what's the problem?

I said "may be", not always. "Rigidity" is probably a better
word choice for trees' limits.

>
>
> >>>> A tree is really any structure with a "root" and child nodes.
> >>>
> >>> Cross links can cause ambiguous roots and children.
> >>
> >> Nonsense.  Show me.
> >
> > Having 2 bosses in a really small company. Who is the "root"?
>
> The CEO.

I worked at a company that had 3 owners and I had 3 bosses.

> As for two bosses, in the cases I know there is a primary
> boss and a "dotted line" boss--so called because that's typically
> exactly how it's shown on the org chart: as a dotted line signifying
> the secondary relationship.

The dotted line busts a "pure" tree, period.

>
> In other cases--and I can speak to this personally, because I was IN
> this situation for a couple years--each boss had "half" a person.
> That is, I was 50% allocated to one group and 50% to another.  Within
> each org, I appeared on their org TREE in my proper place with a note
> that they only got 50% of me.
>
> To the point: the situation is *entirely* hierarchical.

No, that is not a pure tree. You either represent it with
duplicate nodes, or draw lines across branches.
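
To make the point concrete, here is a sketch with a made-up org
chart. In a pure tree every node except the root has exactly one
parent, so it is enough to look for anybody with more than one boss:

```python
# Hypothetical reporting lines as (boss, report) pairs.
reports_to = [
    ("owner", "boss_a"),
    ("owner", "boss_b"),
    ("boss_a", "me"),
    ("boss_b", "me"),  # the "dotted line"
]


def nodes_with_multiple_parents(edges):
    """Return the nodes that have more than one parent, i.e. that break a pure tree."""
    parents = {}
    for boss, report in edges:
        parents.setdefault(report, set()).add(boss)
    return sorted(node for node, bosses in parents.items() if len(bosses) > 1)


offenders = nodes_with_multiple_parents(reports_to)
print(offenders)
```

Any name that shows up in the result has to be drawn either twice
(a duplicate node) or once with a line crossing branches.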

>
>
> >>> Interactive software usually has a lot of indirect recursion in my
> >>> experience.
> >>
> >> Are you sure about that?  *Recursion* in I/A software?  Cite a case.
> >
> > A GUI page A where you launch page B, but click a link which opens
> > another instance of page A. This is common during web-browsing, I would
> > note.
>
> ?? What does browser history have to do with anything?  That doesn't in
> any way speak to recursion in the software.

If A calls B, B calls C, and C calls A, it *is* recursion.
Maybe we should make a distinction between direct recursion
and indirect recursion.
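
A toy sketch of indirect recursion, with hypothetical page handlers
standing in for real GUI code. None of the three calls itself
directly, yet page_a ends up active more than once on the same call
stack:

```python
def page_a(depth, trail):
    """Record a visit to A, then launch B unless the depth budget is used up."""
    trail.append("A")
    if depth > 0:
        page_b(depth, trail)


def page_b(depth, trail):
    """Record a visit to B, then follow the link to C."""
    trail.append("B")
    page_c(depth, trail)


def page_c(depth, trail):
    """Record a visit to C, then open another instance of A."""
    trail.append("C")
    page_a(depth - 1, trail)


trail = []
page_a(2, trail)
print("".join(trail))
```

The depth argument is only there so that the example terminates.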

>
>
> >>> But a pure tree has no duplicate nodes.
> >>
> >> That's BS.  Show me one authority that agrees.
> >
> > Connect the subroutine calls on the paper. Don't take my word for it,
> > get your pen out.
>
> So,... when you can't respond to a point, you just throw in something
> totally irrelevant?  Okay.  We'll just assume capitulation.

No, I just cannot easily describe it without a visual.
I thought it was obvious, but apparently not, and
I cannot readily solve that without images at this point.
If I were in the mood, I might try some ASCII art.

>
>
> >>> The tree becomes far bigger than the actual thing it is representing
> >>> because of the duplication.
> >>
> >> Totally False.  The call tree represents reality.
> >
> > And lots of duplications of parts of reality.
>
> If a routine is entered multiple times, that **IS** the reality.

And it is duplication. Multiple nodes are referencing the
same thing. Even in the compiled code, the subroutine
calls are reduced to addresses, and each call to the
same routine uses the same address. If you draw lines
instead of addresses to represent this, you will see
it is not a tree-shape.
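
A small sketch to show the size difference, assuming a made-up call
graph (the routine names are hypothetical). It counts how many nodes
are needed if every call gets its own copy, as a pure tree requires,
versus how many distinct routines actually exist:

```python
# Hypothetical call graph: routine -> routines it calls.
calls = {
    "main": ["load", "save"],
    "load": ["open_table", "log"],
    "save": ["open_table", "log"],
    "open_table": ["log"],
    "log": [],
}


def expanded_tree_size(routine):
    """Count nodes when each call is drawn as its own copy (a pure tree)."""
    return 1 + sum(expanded_tree_size(callee) for callee in calls[routine])


tree_nodes = expanded_tree_size("main")
graph_nodes = len(calls)
print(tree_nodes, graph_nodes)
```

The more widely a routine is shared, the faster the expanded tree
grows relative to the actual code.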

>
>
> >> More handwaving.  Show me an actual example. I DO event-driven programming,
> >> and my programs definitely have high level routines and low level routines
> >> (and many medium level routines).
> >
> > Show me the hierarchy here in out-line form then.
>
> Let's consider the most recent project I've worked on:
>
> Main
> 	Initialize_Common_Globals()
> 	Initialize_Program()
> 	Load_Program()
> 		LoadProperties()
> 	Get_Special_Target_List()
> 		LoadList(SpecialTargetList)
> 			OpenTable("[SpecialTargets]")
> 			<load loop>
> 			Close
> 	Load(ApplicationForm)
>
> Et cetera.  Get the picture?

That looks like an implementation of the event-handling
engine, not the actual events themselves. With many E.D.
tools one does not have to know or care how the event
engine was implemented.

Perhaps one defines which is a "main" screen or "start-up"
screen, but beyond that the "call tree" is mostly dependent
on what the user does.


>
>
> >> My point is your assessment seems unreasonably skewed.  You don't
> >> seem to have the ability to see the situation clearly due to your
> >> extreme bias and your apparent lack of experience and training.
> >
> > Lack of experience? I am a middle-aged developer. Started out on VAX's
> > and PRIME minicomputers.
>
> Time in the saddle doesn't necessarily translate to knowledge.  You just
> don't talk like someone who really understands data structures or OOD or
> how polymorphism is used.

I similarly feel you don't have experience with relational
theory and databases, judging by your complaint about
auto-generated folder keys above. You would have known better
if you were educated in that area. (Some relational fans don't
agree with auto-gen keys, but they know without a second
thought that named primary keys are also a possibility.
I had to remind you.)

>
> > I used to be more of a tree-fan when I was younger, I would note.
> > Studying reality changed my viewpoint.
>
> My father, an educated man, has come to have some very peculiar ideas
> in his old age.  It's possible your viewpoint is skewed.
>

So I got senile and turned into a set fan, eh?

>
> >> Sets are raw, unstructured data.  If you claim otherwise, SHOW ME.
> >
> > I don't have a mathematically precise definition of "raw" nor
> > "unstructured". Do you?
>
> I think the definitions of "raw data" and "unstructured data" are pretty
> clear to anyone who knows what they're talking about.  Only someone with
> no viable response resorts to silly word play.

It is YOU with the word-play. However, finding a clear
definition of "structure" that applies to the cyber-world
is sticky philosophical territory, I will agree.

>
> >> Higher: above, superior.
> >> Order: degree.
> >> Structure: a complex construction or entity.
> >
> > Define "superior". Define "complex". Sounds more like vague sales-talk
> > or Dilbertian management-speak.
>
> If you really don't know the definitions of these basic concepts, how can
> any opinion you have be worth anything?  Alternatively, you do know and
> have recognized you're in a losing position.
>
> My guess is you know what "superior" and "complex" mean.

Those terms tend to be relative and vague. They may be fine
for general "street talk", but they lack the technical or
mathematical precision needed to settle an argument.

For example, Bill Gates may be "superior" from a money tally
standpoint, but if he ends up in hell when he dies (as a
hypothetical example), then he is not superior from a
religious standpoint. Or if World-War III starts and it
turns into a physical dog-eat-dog world, Gates would
be next-morning's breakfast. (A "superior" breakfast
perhaps?)

>
> --
> |_ CJSonnack <Chris@Sonnack.com> _____________| How's my programming? |

-T-
