Subj : Re: Polymorphism sucks [Was: Paradigms which way to go?]
To   : comp.programming,comp.object
From : Chris Sonnack
Date : Wed Jul 13 2005 12:05 am

topmind writes:

> Here is an example borrowed from c2.com. We taxonomize drinks by
> making each orthogonal factor a level:

That might be the taxonomy, but it's a piss poor OOD.

> DRINKS
> 	Coke
> 	--Diet
> 	----Caffeinated
> 	----No Caffeine
> 	--Regular
> 	----Caffeinated
> 	----No Caffeine
> 	Lemon-Lime soda
> 	--Diet
> 	----Caffeinated
> 	----No Caffeine
> 	--Regular
> 	----Caffeinated
> 	----No Caffeine
> 	Iced Tea
> 	--Diet
> 	----Caffeinated
> 	----No Caffeine
> 	--Regular
> 	----Caffeinated
> 	----No Caffeine

The rather obvious observation is that state of caffeination is a
single quality--perhaps a boolean value, but better yet a quantity
(amount of caffeine, from 0 to OHMYGODIMWIRED).  Secondly, since
this quality (or "factor" if you prefer) appears in every drink,
it belongs at the top to be inherited by all children.

> Any one of the three factors (flavor, dietness, caffeiness) could be
> put at the top level, but we cannot put them all there.

Of course you can.  The reality is that a drink--in the abstract--has
each of those factors.  Therefore they arguably *belong* at the top.

+Drink
  {caffeination-level: 0-OHMYGODIMWIRED}
  {sweetener-type: none|sugar|Nutrasweet|...}
  {flavor: none|cola|lemon-lime|tea|coffee|...}

The taxonomy just expresses the combinations.  So far, there's really
no need to even bother with sub-types.  Whether it makes sense to create
them depends on your mission statement--what problem are you trying to
solve.  (The reality is they're ALL just different sorts of water.)

If you wanted to, you could use sub-typing in which each of the three
qualities had a fixed value depending on type.  So:

+Drink
  +Water {caff: 0}{sweetener: none}{flavor: none}
  +Cola
    +Classic Coke {caff: 21}{sweetener: sugar}{flavor: cola}
    +Lemon Coke {caff: 21}{sweetener: sugar}{flavor: cola+lemon}
    +Diet Coke {caff: 21}{sweetener: Nutrasweet}{flavor: cola}
    +Diet Lemon Coke {caff: 21}{sweetener: Nutrasweet}{flavor: cola+lemon}
  +(etc)

What this buys you is the ability to pass around "Drink" objects that know
what they really are and can behave according to their qualities.  So when
you instance a [Classic Coke] object, it automatically has the qualities
of Classic Coke, but can be passed around like any Drink.

Imagine I have a collection of Drink objects and I want to list them in
order by the amount of caffeine.  Simple, since they all have a common
interface that lets me ask for the caffeine content.
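
Here is a minimal sketch of that idea in Python.  The class names, the
field names and the use of dataclasses are my own assumptions for
illustration; the only things taken from the outline above are the three
qualities and the values for the Coke variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drink:
    """Every drink carries all three qualities at the top level."""
    name: str = "Drink"
    caffeine: int = 0        # amount of caffeine (units are an assumption)
    sweetener: str = "none"
    flavor: str = "none"


# Sub-types add no new fields; they only pin the inherited ones to values.
@dataclass(frozen=True)
class Water(Drink):
    name: str = "Water"


@dataclass(frozen=True)
class Cola(Drink):
    name: str = "Cola"
    flavor: str = "cola"


@dataclass(frozen=True)
class ClassicCoke(Cola):
    name: str = "Classic Coke"
    caffeine: int = 21
    sweetener: str = "sugar"


@dataclass(frozen=True)
class DietCoke(Cola):
    name: str = "Diet Coke"
    caffeine: int = 21
    sweetener: str = "Nutrasweet"


def by_caffeine(drinks):
    """Order any mix of Drink objects through the common interface."""
    return sorted(drinks, key=lambda d: d.caffeine)


cooler = [ClassicCoke(), Water(), DietCoke()]
print([d.name for d in by_caffeine(cooler)])
```

Note that by_caffeine() never asks what kind of drink it has; it only
relies on the quality that lives at the top of the hierarchy.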


>> I think the reason is fairly simple.  But I think we need to draw a
>> distinction between representing DATA in a tree and representing some
>> taxonomy as a tree.  I (and others) have agreed that large taxonomies
>> (unless the environment is very stable and clearcut) can be a problem.
>> (I'm not sure that tables necessarily fix those problems, however, the
>> problems come more from the difficulty of classifying things, IMO.)
> 
> I agree that classification is difficult. However, sets are more
> forgiving of getting it "wrong" up front. Trees make one sweat to "get
> it right" up front because the penalty for design mistakes is big and
> hard to undo.

I think that's probably true.  I think it's also true that very often
the difficult things to do are the better things to do.  And I suspect
that certain types of changes are easier with sets and harder with trees,
but also that the reverse is true.  That is usually the case in life--
there are always tradeoffs and balances.

>> Here we seem to be talking about tree-shaped *data*, each datum being
>> a step in a larger task.  It was entirely natural for you to break it
>> down as an outline (aka tree) because that's exactly the shape of the
>> data.
> 
> Yes, but that is only on a small scale.

Actually, the larger the scale gets, the more hierarchical **data** is
a win, simply because you don't have to deal with all the data at once.
You can pick the scale and branch of interest.

You KNOW this is true, since you've already agreed that DB indexes must
use tree structure to speed access to a selected data subset.

> The flaws of trees are less of a problem at a small scale.

(Part of what makes your thesis necessarily wrong is the pre-supposition
that trees are "flawed".  They are no more flawed than sets or any other
tool.  Until you grasp that, you're missing the point entirely.)

> If you need to refactor code in trees, it is often fairly easy because
> it can fit on one page/screen.

Huh?  What has that to do with anything?  And again, right here, we're
talking about hierarchical **data**, not code.


>> People find this useful and natural for the simple reason that it allows
>> you to focus on the level of detail desired at the moment.  That's the
>> whole point of an outline.
> 
> Well, not everything divides nicely into "levels".

Meaningless hand waving again.  Cite some cases, if you can.  The simple
fact is, you broke a common task into levels.  Just about any set of
tasks naturally breaks down that way.  (In fact, I've spent this week
working with a project leader for a coming project doing just that.  The
project is very large, and without a hierarchical breakdown, it'd be
beyond the capacity of any human to deal with.)

> Plus, one can put levels into sets if they want, but usually one finds
> less of a need to.

Yes, if one needs hierarchical data one CAN simulate it poorly with sets.
But why?  Why not use a tool appropriately fitted to the need?


>> File systems are hierarchical for the same reason.  It allows you to
>> partition your data (files) into useful categories.  It also allows you
>> to perform operations on a sub-set easily without needing to access or
>> filter the rest.
> 
> I already railed against hierarchical file systems somewhere around
> here. I would like to see them replaced or augmented with relational
> techniques.

So every operating system designer is wrong and you're right?  Bet not.
Bet you'd find a "flat file" system a total nightmare.  What are you
going to do, write a query every time you want to locate a file?

And what do you mean by "relational techniques" for file systems?  Give
me an example.


>> Companies and the military are hierarchical for a similar reason: the
>> "higher ups" deal with the big picture, the "low downs" deal with the
>> details.
> 
> I also gave an actual situation around here where what initially looked
> like a org taxonomy for budgeting was not.

Non-responsive to the point, and I'm tired of your one lonely example of
your miscalculation.  If that's all you've got, you got squat.

> I also know people who have two bosses.

Nevertheless, there are bosses and employees--hierarchical structure.
And bosses have bigger bosses and they have managers and they have VPs
and so on up the line.  H*i*e*r*a*r*c*h*i*c*a*l!

>> Sets are raw, unstructured data.
> 
> I disagree with those labels,...

Why?  Show me the structure of a set.

> Trees are often a "pseudo-structure": they lead you to believe
> information is structured, but in reality things may still be
> a mess.

Wrong.  The trees have structure--that's why they're trees.  Whether the
tree matches some reality is a whole different question.

It's simple and undeniable: sets have less structure than trees.  EOS.

In fact, your thesis depends on this, for one of the main things you
rail against IS that structure and your perception that it is artificial
and may not match reality and may be hard to change.  Sorry, I don't
think you can have it both ways.

> Many file hierarchies at big companies are like this: they are a
> disaster with loooooong tangled directory paths that are nearly
> unfixable because of all the paths used by other apps to reference stuff.

Whew, there's a lot of things wrong with that sentence.....

1. Big companies don't have "file hierarchies".  I know, I've worked for
   a Fortune 50 company for 25 years.  (They have 10s of thousands of
   file structures--a set of them, if you will, and none the better for
   being in a set.)

2. Any machine with a file system can have long paths that are hard to
   change due to references to files in that path (I deal with this all
   the time).

3. "Tangled" is an emotional, self-serving word with no reality here.

4. Regardless of the mess, IT'S STILL STRUCTURE--more than any set has.

5. What's the alternative?  You have to put these files *somewhere*.
   And things company-wide will refer to them.  Internet webpages and
   people's bookmarks suffer the same problem, and they are closer to
   being set-like than tree-like.


> Trees are for boys, sets are for men!

That sort of nonsense suggests to me you're a kook with nothing of
value to contribute.  Considering that sets are a fundamental construct,
one can equally claim sets are for babies, trees are for adults who've
gone to higher levels.

See?  Bunch of worthless hot air.  (And only one of us is sexist.)


>>> Do you mean natural to the human mind or natural to the universe such
>>> that all intelligent space aliens will also dig them?
>>
>> Tree structures are a (universally) natural data type, so I do believe
>> all intelligent minds will discover and use them.
> 
> But an intelligent mind also knows when they have reached their limit.

Non Sequitur.  Does this mean you agree trees are universal and will be
discovered by any intelligence?

> Trees are a poor relativism tool: they generally can't change to
> emphasize different viewpoints. You get one view and that is it.

One view per tree.  If you need multiple views, use multiple trees.  My
C++ environment can show me at least four tree views of the same thing:
call tree, caller tree, derived classes tree and base classes tree. And
several tabular views.  All very useful.
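
To make "one view per tree" concrete, here is a rough Python sketch that
derives two different trees from one and the same set of call
relationships.  The function names in CALLS are invented for the example,
and this is not how any particular C++ environment does it; it is just
the general idea.

```python
from collections import defaultdict

# Hypothetical call relationships, stored flat as (caller, callee) pairs.
CALLS = [
    ("main", "load"),
    ("main", "report"),
    ("load", "parse"),
    ("report", "format"),
    ("report", "parse"),
]


def index(pairs):
    """Map each first element to the list of its second elements."""
    table = defaultdict(list)
    for a, b in pairs:
        table[a].append(b)
    return table


def tree(root, children, path=()):
    """Return (name, [subtrees]); a name already on the path becomes a leaf."""
    if root in path:
        return (root, [])
    kids = children.get(root, [])
    return (root, [tree(k, children, path + (root,)) for k in kids])


callees = index(CALLS)                       # view 1: whom does X call?
callers = index((b, a) for a, b in CALLS)    # view 2: who calls X?

call_tree = tree("main", callees)
caller_tree = tree("parse", callers)
print(call_tree)
print(caller_tree)
```

Same data, two trees, each answering a different question.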


>> There are binary trees and n-ary trees.  There are trees that are
>> balanced and unbalanced.  There are trees that require all nodes to
>> be unique and trees that allow nodes to be repeated.  There are
>> trees that allow nodes to reference each other (consider links in
>> a Unix filesystem).  There are trees where the branches are allowed
>> to reconnect.
> 
> Those are not technically trees.

They most certainly are.  I'd say they are just not certain types of
tree, but that's what started this bit!  Suffice to say that you have
an overly narrow view of what a tree really is:

>> A tree is really any structure with a "root" and child nodes.
> 
> Cross links can cause ambiguous roots and children.

Nonsense.  Show me.

>> In most programs of my experience (30 years, dozens of languages),
>> there is a fairly strong tree shape to the relationship between
>> functions.
> 
> Only at the highest level. When you get into libraries and frameworks,
> the distinction gets fuzzy.

Nonsense.  Show me.

> Plus, event-driven UI frameworks are generally not like that. One is
> dealing with a pretty flat view of event modules.

Not when viewed from outside the system, but within the system you do
have a "forest" of "bushes".  Each bush is a mini-tree.

>> You have your root--the code entry point--and a set of nodes that, in
>> a well-written program--tend to proceed from high level nodes to low
>> level nodes.
>>
>> A fellow I used to work with wrote a program that, given the source
>> files for any C program CREATED the tree of routines and calls. The
>> only thing you really have to be aware of is recursion, and that pretty
>> much just requires not repeating an existing node (or in some call
>> graphs I've seen, repeating it only once as a leaf).
> 
> Interactive software usually has a lot of indirect recursion in my
> experience.

Are you sure about that?  *Recursion* in I/A software?  Cite a case.

>> Or (as this fellow did) as branches that refer back to existing nodes.
> 
> Well, it is not a true tree then.

It certainly is.  The fact that his program could *draw* the tree rather
shows that.

> If you don't give a flying sh8t about duplication, yes, you can get a
> bigass tree out of it. But it is often of limited use.

Regardless, it IS a tree.  (Why did I just flash on Copernicus? :-)

>> And consider this: the call path of a program executing (in a single
>> thread environment) is 100% tree-shaped (with recursion caveat).  That
>> is, if you graphed it from start to finish, you'd end up with a huge,
>> perfect tree.
> 
> No! Repeat subroutine calls bust pure tree-ness.

No!  The call graph is a perfect tree.  Repeat: a perfect tree.  That
nodes repeat does not detract from this *at* *all*.

> It is all a matter of viewpoint. It *can* be represented as a big tree
> with dup nodes. We both agree on that. But a pure tree has no duplicate
> nodes.

That's BS.  Show me one authority that agrees.

> The tree becomes far bigger than the actual thing it is representing
> because of the duplication.

Totally False.  The call tree represents reality.  The duplicate nodes
are there because the function was entered multiple times.  The call
tree is (1) a perfect tree that (2) represents the reality perfectly.

> That is probably why your friend's C-to-Tree tool did not catch on..

You're really reaching.  First, he never sold it. Second, everyone he
shared it with **loved** it.  This predated a lot of modern tools--he
wrote it out of love and because there was nothing like it for us.

(Actually, had he chosen to commercialize it he might have done well, but
it's hard to sell small software tools effectively.  Had he posted it to
a (at the time) BBS, other programmers would have loved it as much as the
rest of us did.)

> I have (semi) parsed code into databases before and
> could have used that info to also print a giant tree...

Which, in a very real sense, is exactly what his program did.  He parsed
the C code, developed tables (aka a DB) and generated a tree from that data.
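
I don't have his source, so the following is only a sketch of the general
technique, in Python with an in-memory SQLite table.  The table layout
and the routine names are assumptions.  Recursion is handled the way I
described earlier: a routine that is already on the current path is
repeated only once, as a leaf.

```python
import sqlite3

# The "tables" step: one row per (caller, callee) relationship.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
db.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [("main", "init"), ("main", "walk"), ("walk", "visit"), ("visit", "walk")],
)


def callees(name):
    """The simple list query: which routines does `name` call?"""
    rows = db.execute(
        "SELECT callee FROM calls WHERE caller = ? ORDER BY rowid", (name,)
    )
    return [row[0] for row in rows]


def outline(name, path=(), depth=0):
    """The tree step: expand callees, showing a recursive call as a leaf."""
    indent = "  " * depth
    if name in path:
        return [indent + name + " (recursive)"]
    lines = [indent + name]
    for child in callees(name):
        lines.extend(outline(child, path + (name,), depth + 1))
    return lines


print("\n".join(outline("main")))
```

The list query is one function; the tree is built on top of it.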

> A better use of such info is to query for routines, get a list of calls
> to other routines, and click on those links as needed.

(This predated clickable links by about a decade...predated Windows for
that matter.)

Such lists were a trivial part of what his program did. Having parsed the
code into tables, that part was simple (and was indeed part of the output).

The very real win was **seeing** the tree.  That was a huge win over other
tools that did the simple table stuff.


>> Well-written programs ARE usually tree-ish in my experience.
> 
> Not in my experience. But it does depend on the nature of the app.

More on the nature of the programmer, I'd think.  The hierarchical structure
of programs is very well established.

> Batch processes tend to be more tree-ish than good interactive software
> in my experience.

To be blunt, your experience has low credit from where I sit.  I write far
more interactive software than I do batch, and you're just plain wrong.

>> High level routines call low level routines.  I'm sure you've heard the
>> terms "top down" and "bottom up".  These refer to the tree-ish-ness of
>> program analysis and design.
> 
> Yes, and their over-use has been attacked by various gurus, including
> Bertrand Meyer. (Meyer suggests abandoning procedural as the solution,
> but toning down the tree-ness is an alternative he ignores.)

(OOP proponent, Meyer? :-)  Gee, I wonder why!

>> But it's NOT random.  Low level routines don't call high level routines.
> 
> Well, often routines don't fit into such a clean classification of low
> and high level. Event-driven programming is an example.

More handwaving.  Show me an actual example. I DO event-driven programming,
and my programs definitely have high level routines and low level routines
(and many medium level routines).


>> Hmmm, the "everyone else is dumb but me" line is so often the mark of a
>> kook that I recommend you be careful about it.  And, for my money, you're
>> just plain wrong. 
> 
> Well, that is just my frank assessment. I call it as I see it.

My point is your assessment seems unreasonably skewed.  You don't seem to
have the ability to see the situation clearly due to your extreme bias
and your apparent lack of experience and training.


> I am wrong about [trees], it is because I have not been given objective
> evidence of their value.

I'm not sure you are capable of hearing the evidence, seeing as how quite
a bit has been presented in these threads.  I increasingly have this image
of you as the child sticking his fingers in his ears so he can't hear what
he doesn't want to hear.

>> Sets are raw, unstructured data.
> 
> No, as described above. They are "unstructured" to you because you are
> uncomfortable with them.

Oh, please, that's just idiotic.  Comfort--assuming for a second that it
were even true I could be uncomfortable with something as *basic* as a
set or table (I've been using them for 30 years, guy)--has nothing to do
with perceptions of structure.

Sets are raw, unstructured data.  If you claim otherwise, SHOW ME.


> I remember an ex-Disney cartoonist saying to me about 7 years ago:
> "Screw these newfangled animation computers. I can carry pencils and
> paints in a box and use them whenever I want without big expensive
> heavy computers and without electricity. I don't have to reboot my
> pencil or put anti-virus software on it."

Which has what to do with what?

And let's see him do SHREK, MONSTERS, INC, FINAL FANTASY, TOY STORY,
FINDING NEMO or THE INCREDIBLES.

Yeah, didn't think so.


>> That's exactly what hierarchies do.  One could just as easily claim that
>> enlightenment involves embracing all forms of tools that allow you to
>> mine datasets for their value.  One might also argue that the naivete is
>> in failing to recognize the value of higher data structures.
> 
> Please define "higher order structure".

Higher: above, superior.
Order: degree.
Structure: a complex construction or entity.

To wit: a superior degree of complexity.

-- 
|_ CJSonnack <Chris@Sonnack.com> _____________| How's my programming? |
|_ http://www.Sonnack.com/ ___________________| Call: 1-800-DEV-NULL  |
|_____________________________________________|_______________________|
