Subj : Re: Polymorphism sucks [Was: Paradigms which way to go?]
To   : comp.programming,comp.object
From : topmind
Date : Thu Jul 14 2005 12:10 am

Chris Sonnack wrote:
> topmind writes:
>
> > Here is an example borrowed from c2.com. We taxonomize drinks by
> > making each orthogonal factor a level:
>
> That might be the taxonomy, but it's a piss poor OOD.

It is not necessarily about OOD. Here is the context:

[quote]
> > I suspect in part it is because trees can be well-represented visually
> > such that the primary pattern is easy to discern. I say "primary"
> > because trees often visually over-emphasize one or a few factors at the
> > expense of others.

> You make a lot of statements you don't support with examples.  Can you
> provide an example of this happening?
[end quote]

As you can see, it was talking about trees in general, not necessarily
related to polymorphism or OO. The context was about the apparent
popularity of trees among computer users IIRC, not just developers.

>
> > DRINKS
> > 	Coke
> > 	--Diet
> > 	----Caffeinated
> > 	----No Caffeine
> > 	--Regular
> > 	----Caffeinated
> > 	----No Caffeine
> > 	Lemon-Lime soda
> > 	--Diet
> > 	----Caffeinated
> > 	----No Caffeine
> > 	--Regular
> > 	----Caffeinated
> > 	----No Caffeine
> > 	Iced Tea
> > 	--Diet
> > 	----Caffeinated
> > 	----No Caffeine
> > 	--Regular
> > 	----Caffeinated
> > 	----No Caffeine
>
> The rather obvious observation is that state of caffeination is a
> single quality--perhaps a boolean value, but better yet a quantity
> (amount of caffeine, from 0 to OHMYGODIMWIRED).  Secondly, since
> this quality (or "factor" if you prefer) appears in every drink,
> it belongs at the top to be inherited by all children.
>
> > Any one of the three factors (flavor, dietness, caffeiness) could be
> > put at the top level, but we cannot put them all there.
>
> Of course you can.

I meant at the same time.

> The reality is that a drink--in the abstract--has
> each of those factors.  Therefore they arguably *belong* at the top.
>
> +Drink
>   {caffeination-level: 0-OHMYGODIMWIRED}
>   {sweetener-type: none|sugar|Nutrasweet|...}
>   {flavor: none|cola|lemon-lime|tea|coffee|...}
>
> The taxonomy just expresses the combinations.  So far, there's really
> no need to even bother with sub-types.  Whether it makes sense to create
> them depends on your mission statement--what problem are you trying to
> solve.  (The reality is they're ALL just different sorts of water.)
>
> If you wanted to, you could use sub-typing in which each of the three
> qualities had a fixed value depending on type.  So:
>
> +Drink
>   +Water {caff: 0}{sweetener: none}{flavor: none}
>   +Cola
>     +Classic Coke {caff: 21}{sweetener: sugar}{flavor: cola}
>     +Lemon Coke {caff: 21}{sweetener: sugar}{flavor: cola+lemon}
>     +Diet Coke {caff: 21}{sweetener: Nutrasweet}{flavor: cola}
>     +Diet Lemon Coke {caff: 21}{sweetener: Nutrasweet}{flavor: cola+lemon}
>   +(etc)
>
> What this buys you is the ability to pass around "Drink" objects that know
> what they really are and can behave according to their qualities.  So when
> you instance a [Classic Coke] object, it automatically has the qualities
> of Classic Coke, but can be passed around like any Drink.

Just like a record from a "Drink" entity.

>
> Imagine I have a collection of Drink objects and I want to list them in
> order by the amount of caffeine.  Simple, since they all have a common
> interface that lets me ask for the caffeine content.
>

Just like a record from a "Drink" entity.
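
A minimal Python sketch of the "record" view, assuming made-up sample
rows (the names and caffeine numbers below are hypothetical, not data
from the thread). Each drink is one flat record with the three factors
as fields, so it can be sorted by caffeine or grouped by any factor
without choosing one factor as the top level:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drink:
    name: str
    flavor: str
    diet: bool
    caffeine_mg: int


# Hypothetical sample rows; the caffeine figures are illustrative only.
DRINKS = [
    Drink("Classic Coke", "cola", False, 21),
    Drink("Diet Coke", "cola", True, 21),
    Drink("Lemon-Lime soda", "lemon-lime", False, 0),
    Drink("Diet Iced Tea", "tea", True, 15),
    Drink("Water", "none", False, 0),
]


def by_caffeine(drinks):
    """Return the drinks ordered by caffeine content, lowest first."""
    return sorted(drinks, key=lambda d: d.caffeine_mg)


def group_by(drinks, factor):
    """Group drink names by any single field of the record."""
    groups = {}
    for d in drinks:
        groups.setdefault(getattr(d, factor), []).append(d.name)
    return groups


if __name__ == "__main__":
    print([d.name for d in by_caffeine(DRINKS)])
    print(group_by(DRINKS, "flavor"))
    print(group_by(DRINKS, "diet"))
```

The same list answers "by flavor" and "by dietness" equally well; the
grouping is chosen at query time rather than baked into the structure.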

>
> >> I think the reason is fairly simple.  But I think we need to draw a
> >> distinction between representing DATA in a tree and representing some
> >> taxonomy as a tree.  I (and others) have agreed that large taxonomies
> >> (unless the environment is very stable and clearcut) can be a problem.
> >> (I'm not sure that tables necessarily fix those problems, however, the
> >> problems come more from the difficulty of classifying things, IMO.)
> >
> > I agree that classification is difficult. However, sets are more
> > forgiving of getting it "wrong" up front. Trees make one sweat to "get
> > it right" up front because the penalty for design mistakes is big and
> > hard to undo.
>
> I think that's probably true.  I think it's also true that very often
> the difficult things to do are the better things to do.  And I suspect
> that certain types of changes are easier with sets and harder with trees,
> but also that the reverse is true.  That is usually the case in life--
> there's always tradeoffs and balances.

Well, my general observation of my domain is that there is no
consistent herding factor that keeps things classified in or changed in
a tree-shape. There are groupings and similarities, but they are too
unpredictable to shoehorn safely into a tree.

>
> >> Here we seem to be talking about tree-shaped *data*, each datum being
> >> a step in a larger task.  It was entirely natural for you to break it
> >> down as an outline (aka tree) because that's exactly the shape of the
> >> data.
> >
> > Yes, but that is only on a small scale.
>
> Actually, the larger the scale gets, the more hierarchical **data** is
> a win, simply because you don't have to deal with all the data at once.
> You can pick the scale and branch of interest.


You are kidding, right? Can you provide that taxonomy of people I keep
asking for?

To not "have to deal with all the data at once", use queries. We've
been over this already.


>
> You KNOW this is true, since you've already agreed that DB indexes must
> use tree structure to speed access to a selected data subset.

I don't know about "must". Maybe a good non-tree indexing algorithm has
yet to be discovered. But that is kind of an internal "mechanical"
issue. What works underneath may not work on the domain level also. I
have also tentatively agreed that polymorphism may work better in
systems software than biz apps.

>
> > The flaws of trees are less of a problem at a small scale.
>
> (Part of what makes your thesis necessarily wrong is the pre-supposition
> that trees are "flawed".  They are no more flawed than sets or any other
> tool.  Until you grasp that, you're missing the point entirely.)

Sets can represent more graphs without node duplication than trees.
This is an accepted fact.

>
> > If you need to refactor code in trees, it is often fairly easy because
> > it can fit on one page/screen.
>
> Huh?  What has that to do with anything?  And again, right here, we're
> talking about hierarchical **data**, not code.
>

I don't see why it makes much of a difference. Software engineering is
mostly about the "interface" to developers so that they are efficient
and can change the code easier.

>
> >> People find this useful and natural for the simple reason that it allows
> >> you to focus on the level of detail desired at the moment.  That's the
> >> whole point of an outline.
> >
> > Well, not everything divides nicely into "levels".
>
> Meaningless hand waving again.  Cite some cases, if you can.

I have already cited about 6, including the above drink example.

> The simple
> fact is, you broke a common task into levels.  Just about any set of
> tasks naturally breaks down that way.  (In fact, I've spent this week
> working with a project leader for a coming project doing just that.  The
> project is very large, and without a hierarchical breakdown, it'd be
> beyond the capacity of any human to deal with.)

I bet you start to have cycles (or node duplication) past a certain
point. At least it happens at the code-level of task breakdown. But I
cannot inspect it to keep you honest.

>
> > Plus, one can put levels into sets if they want, but usually one finds
> > less of a need to.
>
> Yes, if one needs hierarchical data one CAN simulate it poorly with sets.
> But why?  Why not use a tool appropriately fitted to the need?
>

Because sets handle non-trees far better than trees. They are "good
enough" with trees and "good enough" with more or less random graphs.
Trees shine at trees, but get a "D" at handling random graphs. Thus,
sets are a better hedge.

Rather than have something that shines under scenario 1 but sucks eggs
at scenario 2, I would rather have the predictability and consistency
of sets.

>
> >> File systems are hierarchical for the same reason.  It allows you to
> >> partition your data (files) into useful categories.  It also allows you
> >> to perform operations on a sub-set easily without needing to access or
> >> filter the rest.
> >
> > I already railed against hierarchical file systems somewhere around
> > here. I would like to see them replaced or augmented with relational
> > techniques.
>
> So every operating system designer is wrong and you're right?


Microsoft is looking into the concept and other researchers have
considered it also. Part of the problem is that the hardware still has
not quite caught up. Higher abstraction requires more horsepower.

Further, sets require more training. I don't dispute that. They are
less natural for many humans, but the tradeoff is that they bend better
to real change.


> Bet not.
> Bet you'd find a "flat file" system a total nightmare.

Flat?

> What are you
> going to do, write a query every time you want to locate a file?

Better than playing clickety click down a long dark path and not being
able to change the tree structure without busting a jillion existing
path references. Plus, nice shortcuts can be built. SQL is only the tip
of the iceberg.

>
> And what do you mean by "relational techniques" for file systems?  Give
> me an example.

One could use SQL, another relational query language, and/or
Query-By-Example to find stuff. Take a look at this link for some
suggestions:

http://www.geocities.com/tablizer/sets1.htm

I can't speak for everybody, but I would rather have a relational
file system.
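
As a rough illustration of what "finding stuff" by query could look
like, here is a small Python sketch using the standard sqlite3 module.
The table name, columns, and sample rows are assumptions invented for
the example; this is not a description of any real relational file
system:

```python
import sqlite3

# Hypothetical catalog: one row per file, with attributes instead of a path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, "
    "project TEXT, kind TEXT, owner TEXT)"
)
conn.executemany(
    "INSERT INTO files (name, project, kind, owner) VALUES (?, ?, ?, ?)",
    [
        ("budget.xls", "finance", "spreadsheet", "ann"),
        ("budget_notes.txt", "finance", "notes", "bob"),
        ("schema.sql", "billing", "code", "ann"),
        ("roadmap.txt", "billing", "notes", "ann"),
    ],
)


def find(where, params=()):
    """Return file names matching a WHERE clause, sorted by name.

    The clause is assumed to come from trusted code; values go in params.
    """
    sql = f"SELECT name FROM files WHERE {where} ORDER BY name"
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    print(find("owner = ?", ("ann",)))
    print(find("kind = ? AND project = ?", ("notes", "billing")))
```

A file is located by whichever attributes are handy at the moment, and
re-classifying it means updating a column, not moving it to a new path.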

>
>
> >> Companies and the military are hierarchical for a similar reason: the
> >> "higher ups" deal with the big picture, the "low downs" deal with the
> >> details.
> >
> > I also gave an actual situation around here where what initially looked
> > like an org taxonomy for budgeting was not.
>
> Non-responsive to the point, and I'm tired of your one lonely example of
> your miscalculation.  If that's all you've got, you got squat.

Do you have a survey of org structures to present as an alternative?
Stop complaining about my evidence if yours is no better.

>
> > I also know people who have two bosses.
>
> Nevertheless, there are bosses and employees--hierarchical structure.

2 bosses is *not* a tree.

> And bosses have bigger bosses and they have managers and they have VPs
> and so on up the line.  H*i*e*r*a*r*c*h*i*c*a*l!

semi

>
> >> Sets are raw, unstructured data.
> >
> > I disagree with those labels,...
>
> Why?  Show me the structure of a set.

One generally looks at one *aspect* of a set at a time. You have not
seemed to grasp this yet. There is no One Right View of sets. But, this
reflects the real world.

Sets can't compete with trees in showing a *single view* that captures
most of its essence or pattern. I surrender in that category. Here is
your sub-trophy. It is the "relativity power" that makes sets shine.
You view just about what you want to view limited only by your
imagination. The structure is not the limit, but your mind. If there is
a tree hidden in there, you can view that too.


>
> > Trees are often a "pseudo-structure": they lead you to believe
> > information is structured, but in reality things may still be
> > a mess.
>
> Wrong.  The trees have structure--that's why they're trees.  Whether the
> tree matches some reality is a whole different question.

And the most important one.

>
> It's simple and undeniable: sets have less structure than trees.  EOS.

Prove it!

>
> In fact, your thesis depends on this, for one of the main things you
> rail against IS that structure and your perception that it is artificial
> and may not match reality and may be hard to change.  Sorry, I don't
> think you can have it both ways.

Please clarify. Your statement is not clicking in. Being structured and
being easy-to-change may be generally orthogonal.

>
> > Many file hierarchies at big companies are like this: they are a
> > disaster with loooooong tangled directory paths that are nearly
> > unfixable because of all the paths used by other apps to reference stuff.
>
> Whew, there's a lot of things wrong with that sentence.....
>
> 1. Big companies don't have "file hierarchies".  I know, I've worked for
>    a Fortune 50 company for 25 years.  (They have 10s of thousands of
>    file structures--a set of them, if you will, and none the better for
>    being in a set.)

Tens of thousands? What? I mean like LANs and WANs.

>
> 2. Any machine with a file system can have long paths that are hard to
>    change due to references to files in that path (I deal with this all
>    the time).
>
> 3. Tangled is an emotional self-serving word with no reality here.
>
> 4. Regardless of the mess, IT'S STILL STRUCTURE--more than any set has.

Stop with the nonsense.

>
> 5. What's the alternative?  You have to put these files *somewhere*.
>    And things company-wide will refer to them.  Internet webpages and
>    people's bookmarks suffer the same problem, and they are closer to
>    being set-like than tree-like.

My bookmarks are not relational. I cannot readily do relational and
set-math on them.

>
>
> > Trees are for boys, sets are for men!
>
> That sort of nonsense suggests to me you're a kook with nothing of
> value to contribute.  Considering that sets are a fundamental construct,
> one can equally claim sets are for babies, trees are for adults who've
> gone to higher levels.
>
> See?  Bunch of worthless hot air.  (And only one of us is sexist.)
>
>
> >>> Do you mean natural to the human mind or natural to the universe such
> >>> that all intelligent space aliens will also dig them?
> >>
> >> Tree structures are a (universally) natural data type, so I do believe
> >> all intelligent minds will discover and use them.
> >
> > But an intelligent mind also knows when they have reached their limit.
>
> Non Sequitur.  Does this mean you agree trees are universal and will be
> discovered by any intelligence?
>
> > Trees are a poor relativism tool: they generally can't change to
> > emphasize different viewpoints. You get one view and that is it.
>
> One view per tree.  If you need multiple views, use multiple trees.  My
> C++ environment can show me at least four tree views of the same thing:
> call tree, caller tree, derived classes tree and base classes tree. And
> several tabular views.  All very useful.

And could be implemented in a RDBMS.
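
A minimal sketch of the idea, with a Python set of pairs standing in
for a database table (the function names in the sample relation are
hypothetical). Both the "call" view and the "caller" view are derived
from the same stored relation:

```python
# Hypothetical call relation: each pair means "caller calls callee".
CALLS = {
    ("main", "load"),
    ("main", "report"),
    ("load", "parse"),
    ("report", "parse"),
    ("report", "format_row"),
}


def callees(func):
    """View 1: the routines that func calls."""
    return sorted(callee for caller, callee in CALLS if caller == func)


def callers(func):
    """View 2: the routines that call func, from the same relation."""
    return sorted(caller for caller, callee in CALLS if callee == func)


if __name__ == "__main__":
    print(callees("main"))
    print(callers("parse"))
```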

>
>
> >> There are binary trees and n-ary trees.  There are trees that are
> >> balanced and unbalanced.  There are trees that require all nodes to
> >> be unique and trees that allow nodes to be repeated.  There are
> >> trees that allow nodes to reference each other (consider links in
> >> a Unix filesystem).  There are trees where the branches are allowed
> >> to reconnect.
> >
> > Those are not technically trees.
>
> They most certainly are.  I'd say they are just not certain types of
> tree, but that's what started this bit!  Suffice to say that you have
> an overly narrow view of what a tree really is:

Rather than get into a definition battle, the more you deviate from a
"pure" tree, the better off sets will be as an alternative.

>
> >> A tree is really any structure with a "root" and child nodes.
> >
> > Cross links can cause ambiguous roots and children.
>
> Nonsense.  Show me.

Having 2 bosses in a really small company. Who is the "root"?
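
To make the "who is the root?" question concrete, here is a small
Python sketch that checks whether a set of (boss, employee) pairs forms
a single rooted tree. The strict definition used (exactly one root, one
parent per node, everything reachable from the root) and the sample
names are assumptions for illustration:

```python
def is_tree(edges):
    """True if the (parent, child) pairs form exactly one rooted tree."""
    nodes = {n for edge in edges for n in edge}
    parent_of = {}
    children = {}
    for parent, child in edges:
        if child in parent_of:
            return False  # a second parent (a second boss) breaks tree-ness
        parent_of[child] = parent
        children.setdefault(parent, []).append(child)
    roots = nodes - set(parent_of)
    if len(roots) != 1:
        return False  # no root (a cycle) or several competing roots
    # Every node must be reachable from the single root.
    seen = set()
    stack = [next(iter(roots))]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return seen == nodes


# Hypothetical org charts.
ONE_BOSS = [("ann", "carl"), ("ann", "dee")]
TWO_BOSSES = [("ann", "carl"), ("bob", "carl")]

if __name__ == "__main__":
    print(is_tree(ONE_BOSS))
    print(is_tree(TWO_BOSSES))
```

Under this definition the two-boss chart fails twice over: carl has two
parents, and ann and bob are both candidates for the root.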

>
> >> In most programs of my experience (30 years, dozens of languages),
> >> there is a fairly strong tree shape to the relationship between
> >> functions.
> >
> > Only at the highest level. When you get into libraries and frameworks,
> > the distinction gets fuzzy.
>
> Nonsense.  Show me.

You have duplication at the nodes. If you draw the lines between
multiple subroutine calls, you will see that you no longer have a tree.
Try it.

>
> > Plus, event-driven UI frameworks are generally not like that. One is
> > dealing with a pretty flat view of event modules.
>
> Not when viewed from outside the system, but within the system you do
> have a "forest" of "bushes".  Each bush is a mini-tree.

Please clarify. I agree that each small snippet of event code is a
mini-tree. Trees are often usable at a small scale. But each event is
not necessarily "higher" than another.

>
> >> You have your root--the code entry point--and a set of nodes that, in
> >> a well-written program--tend to proceed from high level nodes to low
> >> level nodes.
> >>
> >> A fellow I used to work with wrote a program that, given the source
> >> files for any C program CREATED the tree of routines and calls. The
> >> only thing you really have to be aware of is recursion, and that pretty
> >> much just requires not repeating an existing node (or in some call
> >> graphs I've seen, repeating it only once as a leaf).
> >
> > Interactive software usually has a lot of indirect recursion in my
> > experience.
>
> Are you sure about that?  *Recursion* in I/A software?  Cite a case.

A GUI page A where you launch page B, but click a link which opens
another instance of page A. This is common during web-browsing, I would
note.
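
A small Python sketch of that situation, assuming a hypothetical page
graph in which A links to B and B links back to A. A depth-first search
reports whether navigation can return to a page that is still on the
current path:

```python
# Hypothetical navigation graph: page -> pages it can open.
LINKS = {"A": ["B"], "B": ["A", "C"], "C": []}


def has_cycle(graph):
    """True if some path leads back to a node that is still on that path."""
    done = set()
    on_path = set()

    def visit(node):
        if node in on_path:
            return True  # reached a page already on the current path
        if node in done:
            return False  # fully explored earlier, no cycle through it
        on_path.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, []))
        on_path.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


if __name__ == "__main__":
    print(has_cycle(LINKS))
    print(has_cycle({"A": ["B"], "B": []}))
```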

>
> >> Or (as this fellow did) as branches that refer back to existing nodes.
> >
> > Well, it is not a true tree then.
>
> It certainly is.  The fact that his program could *draw* the tree rather
> shows that.

Like I keep saying, any graph *can* be represented as a tree if we
allow duplication in the nodes. But "can" and "should" are two
different things.
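
A minimal Python sketch of the duplication being described, assuming a
small hypothetical acyclic call graph. Unfolding the graph into a tree
repeats every shared routine once per path that reaches it:

```python
# Hypothetical acyclic call graph: routine -> routines it calls.
CALL_GRAPH = {
    "main": ["load", "report"],
    "load": ["parse", "log"],
    "report": ["parse", "log"],
    "parse": ["log"],
    "log": [],
}


def unfold(graph, root):
    """Expand an acyclic graph into a nested (name, children) tree."""
    return (root, [unfold(graph, child) for child in graph[root]])


def count_nodes(tree):
    """Count every node in the unfolded tree, duplicates included."""
    _name, children = tree
    return 1 + sum(count_nodes(child) for child in children)


if __name__ == "__main__":
    tree = unfold(CALL_GRAPH, "main")
    print(len(CALL_GRAPH), "routines in the graph")
    print(count_nodes(tree), "nodes in the unfolded tree")
```

Here five routines become nine tree nodes, and the gap widens as more
routines share helpers. The sketch does not handle recursion; a cyclic
graph would need a visited check before unfolding.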

>
> > If you don't give a flying sh8t about duplication, yes, you can get a
> > bigass tree out of it. But it is often of limited use.
>
> Regardless, it IS a tree.  (Why did I just flash on Copernicus? :-)

We are going in circles here. This is about developer usability, not
what is technically possible. It is technically possible to write a
Windows OS clone in pure machine language only, but not something I
would want to do.

>
> >> And consider this: the call path of a program executing (in a single
> >> thread environment) is 100% tree-shaped (with recursion caveat).  That
> >> is, if you graphed it from start to finish, you'd end up with a huge,
> >> perfect tree.
> >
> > No! Repeat subroutine calls bust pure tree-ness.
>
> No!  The call graph is a perfect tree.  Repeat: a perfect tree.  That
> nodes repeat does not detract from this *at* *all*.

Not

>
> > It is all a matter of viewpoint. It *can* be represented as a big tree
> > with dup nodes. We both agree on that. But a pure tree has no duplicate
> > nodes.
>
> That's BS.  Show me one authority that agrees.

Connect the subroutine calls on the paper. Don't take my word for it,
get your pen out.

>
> > The tree becomes far bigger than the actual thing it is representing
> > because of the duplication.
>
> Totally False.  The call tree represents reality.

And lots of duplications of parts of reality.

>  The duplicate nodes
> are there because the function was entered multiple times.  The call
> tree is (1) a perfect tree that (2) represents the reality perfectly.
>

dup

> > That is probably why your friend's C-to-Tree tool did not catch on..
>
> You're really reaching.  First, he never sold it. Second, everyone he
> shared it with **loved** it.  This predated a lot of modern tools--he
> wrote it out of love and because there was nothing like it for us.
>
> (Actually, had he chosen to commercialize it he might have done well, but
> it's hard to sell small software tools effectively.  Had he posted it to
> a (at the time) BBS, other programmers would have loved it as much as the
> rest of us did.)

Whatever. The grand tree tool is hanging out with Elvis and Jimmy Hoffa
now.

>
> > I have (semi) parsed code into databases before and
> > could have used that info to also print a giant tree...
>
> Which, in a very real sense, is exactly what his did.  He parsed the C
> code, developed tables (aka a DB) and generated a tree from that data.
>
> > A better use of such info is to query for routines, get a list of calls
> > to other routines, and click on those links as needed.
>
> (This predated clickable links by about a decade...predated Windows for
> that matter.)

Well now we grew up and have relativity-friendly set-oriented tools.

>
> Such lists were a trivial part of what his program did. Having parsed the
> code into tables, that part was simple (and was indeed part of the output).
>
> The very real win was **seeing** the tree.  That was a huge win from other
> tools that did the simple table stuff.
>

I am sure Jimmy Hoffa thinks so.

>
> >> Well-written programs ARE usually tree-ish in my experience.
> >
> > Not in my experience. But it does depend on the nature of the app.
>
> More on the nature of the programmer, I'd think.  The hierarchical structure
> of programs is very well established.

It was overhyped in the late 70's.

>
> > Batch processes tend to be more tree-ish than good interactive software
> > in my experience.
>
> To be blunt, your experience has low credit from where I sit.  I write far
> more interactive software than I do batch, and you're just plain wrong.

I didn't dispute that one *can* write hierarchical interactive
software. I only suggested that it does not have to be. The best GUIs
tend to be event-driven in my experience and have no or few clear
hierarchies. If this differs from your experience, so be it.

>
> >> High level routines call low level routines.  I'm sure you've heard the
> >> terms "top down" and "bottom up".  These refer to the tree-ish-ness of
> >> program analysis and design.
> >
> > Yes, and their over-use has been attacked by various gurus, including
> > Bertrand Meyer. (Meyer suggests abandoning procedural as the solution,
> > but toning down the tree-ness is an alternative he ignores.)
>
> (OOP proponent, Meyer? :-)  Gee, I wonder why!
>
> >> But it's NOT random.  Low level routines don't call high level routines.
> >
> > Well, often routines don't fit into such a clean classification of low
> > and high level. Event-driven programming is an example.
>
> More handwaving.  Show me an actual example. I DO event-driven programming,
> and my programs definitely have high level routines and low level routines
> (and many medium level routines).

Show me the hierarchy here in outline form then.

>
>
> >> Hmmm, the "everyone else is dumb but me" line is so often the mark of a
> >> kook that I recommend you be careful about it.  And, for my money, you're
> >> just plain wrong.
> >
> > Well, that is just my frank assessment. I call it as I see it.
>
> My point is your assessment seems unreasonably skewed.  You don't seem to
> have the ability to see the situation clearly due to your extreme bias
> and your apparent lack of experience and training.
>

Lack of experience? I am a middle-aged developer. Started out on VAX's
and PRIME minicomputers. I used to be more of a tree-fan when I was
younger, I would note. Studying reality changed my viewpoint.

>
> > If I am wrong about [trees], it is because I have not been given objective
> > evidence of their value.
>
> I'm not sure you are capable of hearing the evidence, seeing as how quite
> a bit has been presented in these threads.  I increasingly have this image
> of you as the child sticking his fingers in his ears so he can't hear what
> he doesn't want to hear.

Your case for a tree-shaped world is underwhelming.

>
> >> Sets are raw, unstructured data.
> >
> > No, as described above. They are "unstructured" to you because you are
> > uncomfortable with them.
>
> Oh, please, that's just idiotic.  Comfort--assuming for a second that it
> were even true I could be uncomfortable with something as *basic* as a
> set or table (I've been using them for 30 years, guy)--has nothing to do
> with perceptions of structure.
>
> Sets are raw, unstructured data.  If you claim otherwise, SHOW ME.

I don't have a mathematically precise definition of "raw" nor
"unstructured". Do you?

>
>
> > I remember an ex-Disney cartoonist saying to me about 7 years ago:
> > "Screw these newfangled animation computers. I can carry pencils and
> > paints in a box and use them whenever I want without big expensive
> > heavy computers and without electricity. I don't have to reboot my
> > pencil or put anti-virus software on it."
>
> Which has what to do with what?

You snipped the context.

>
> And let's see him do SHREK, MONSTERS, INC, FINAL FANTASY, TOY STORY,
> FINDING NEMO or THE INCREDIBLES.
>
> Yeah, didn't think so.

Huh? YOU were the one against query technology, not me.

>
>
> >> That's exactly what hierarchies do.  One could just as easily claim that
> >> enlightenment involves embracing all forms of tools that allow you to
> >> mine datasets for their value.  One might also argue that the naivete is
> >> in failing to recognize the value of higher data structures.
> >
> > Please define "higher order structure".
>
> Higher: above, superior.
> Order: degree.
> Structure: a complex construction or entity.

Define "superior". Define "complex". Sounds more like vague sales-talk
or Dilbertian management-speak.

>
> To wit: a superior degree of complexity.
>
> --
> |_ CJSonnack <Chris@Sonnack.com> _____________| How's my programming? |

-T-
