Subj : Re: Change Patterns (was: Polymorphism sucks)
To : comp.programming,comp.object
From : topmind
Date : Fri Jul 01 2005 12:22 am

> >>>> Regardless it IS still an external system,
> >>>
> >>> External to what?
> >>
> >> The running program, of course.
> >
> > Back in the 286 days, small semi-relational DB's fit in about 200K.
>
> It has nothing to do with disk footprint. It has to do with the
> comparable overhead of *managing* and *using* an external system.

If it is compiled into the EXE, how is it more "external" than the
string library, math library, file library, etc.? I see ZILCH
difference. You are being evasive.

> >> With GUI and File libraries, the actual answers are usually no, no and no.
> >> With DBSes, the actual answers are usually probably, yes and maybe.
> >
> > You are obsessed with hardware and installation issues...
>
> Nope. It's a simple equation. External DBS solutions don't buy anything
> in this case and cost more than simple "in-language" solutions.

DB does not necessarily mean an external-language. However, it is poor
reuse to rewrite a DB library for every new language. Fortunately, a
good many languages can link to C DB libraries. (I haven't personally
inspected such C libraries, but know they exist.)

> >> Consider a common software component: a "blackbox" function that takes
> >> a data stream of some sort, does something with/about that data, and
> >> emits an output datastream. You like business examples, consider a case
> >> where the component reads an XML stream of POS (Point Of Sale) info and
> >> emits a report summarizing that data (this is not an imaginary example).
> >
> > Reports quite often cannot easily be produced in the *same order* that
> > they receive input. It is good decoupling to make the report production
> > independent of the data source ordering.
>
> Absolutely. And if there was *sufficient* data, storing it in a DB might
> make sense.
> So might temporary files, mapped memory or any number of other
> solutions.
>
> I would generally use a DB if the ability to search were important, or if
> it were important to query for a subset of the total dataset.

And that is not an uncommon need. Often one cannot know future uses of
a given set of information at the start of a project. Thus, DB's are a
good future-change hedge.

> >> From an engineering and maintenance point of view, not to mention from
> >> an elegance and simplicity point of view, you'd like to completely
> >> decouple the component from the data provider as well as from the data
> >> consumer.
> >
> > What about decoupling from input order?
>
> Implicit in what I wrote above.

Well, I don't see it. If you truly want to 100% decouple, then you
would decouple from specific input/output ordering.

> >> Now here's the crucial point: In OOPLs, this is trivially easy to implement
> >> and is native to the language. No libraries, no IPC, no databases. Very
> >> low overhead, easy to implement, native to the language, fast running....
> >> that's a hard combo to beat.
> >
> > I will give you speed for the sake of argument.
>
> What about trivially easy to implement, native to the language, no
> libraries, no IPC, no database, and low overhead?

If you want to forever live in the 80's and avoid powerful tools, be my
guest.

> > I don't have a need to swap the output format that often anyhow.
>
> It was YOUR example in the first place.

I don't think that is the case. IIRC it stems from one of R. Martin's
scenarios where his wiki could "store" messages in RAM, DB, or files. I
pointed out that only once out of 100 or so projects I worked on did I
have to back out of using a DB and switch to flat files. Using poly to
prevent a 1% occurrence is not a very good selling point. If your
experience is different, then go with poly. Personally, I am not going
to ignore my life experiences. Maybe I am a freak sample set, but it
could be vice versa also.

> >>>>> (Not all polymorphism is related to sub-typing, but most is in
> >>>>> practice, at least at the domain level.)
> >>>>
> >>>> I disagree. I do a lot of VB programming. VB supports interface
> >>>> "inheritance" but has no concept of implementation inheritance.
> >>>> My polymorphic VB objects are not sub-types.
> >>>
> >>> Maybe not from the language's perspective, but conceptually
> >>> they are usually subtypes.
> >>
> >> The irony is that, to the extent that's true, it reflects either the
> >> innate hierarchical nature of the problem being modeled or a
> >> hierarchical way of breaking it down.
> >
> > Innate, eh?
>
> Yep. You demonstrated that when you naturally produced a tree structure
> in detailing a simple task. Anyone who's ever written an outline has
> demonstrated that.

Like I keep saying, trees are fine on a small scale. The problem is
that they *don't scale*. If the taxonomy/category-system is likely to
grow fairly large in either size or the number of factors, then trees
start to falter. But remember that for a small scale, the alternatives
often work well also.

> > Quote: "[Even though...] it has been known since 1847 that
> > classifications are dependent on the purpose of the classification,
> > people continue to believe that it is possible to create a
> > classification system that is context-independent. (Haim Kilov on
> > comp.object, 6/01. Note that I consider sub-types to be
> > "classifications".)"
>
> Quoting some guy on USENET is a losing game.
>
> But, classification does depend on the purpose. If I care about what
> night tv shows are on, I classify them by day of the week. If I care
> about what type of show they are, I classify them by categories such
> as comedy, drama, idiot-reality, etc. I can also classify them by
> my opinion of them, what studio produces them, what they cost, or how
> long they've been on the air.
>
> All of these classification schemes would be correct.
>
> When you analyse a problem in order to automate it (write a program
> that solves the problem), YOU HAVE A CONTEXT.

And, you can have multiple contexts. One feature of the reporting
systems (that you deride as a category) is that one is often presenting
the same info from multiple perspectives. An executive in charge of TV
programming might want reports or QBE screens that allow him/her to
search, report, and graph info by time of day, content category, cost
to produce, revenue, number of commercials, etc. And, often these are
compared with each other to produce ratios or percentages. I am
currently working on reporting systems that allow users to define their
*own* categories based on combinations of existing factors.

> >> So now, every time you dispatch a method on an object, you need
> >> to go do a DBLookup() function of some sort?
> >>
> >> Eewwww.
> >
> > Better than traversing arrays of lists of arrays of....
>
> Another one of your mantras. The reality is much simpler.

Multiple times when I have been deprived of decent table tools, for
whatever reason, I have had to use such multi-structure contraptions.
Look at the kind of crap people do in OOP apps to get many-to-many
relationships.

> >>>> Compare this to a system using a generic Account object. If you add a
> >>>> new sub-type, that's it, you're basically done.
> >>>
> >>> Yes, that is the most common change scenario promoted by OO books.
> >>> But it is relatively improbable in my experience.
> >>
> >> IT'S YOUR OWN "REAL LIFE" EXAMPLE!!!! In the other thread, you've been
> >> trying to get them to accept it as a reality. Now you're disowning it?
> >
> > I am not sure what "other thread" you are referring to. Unfortunately,
> > I wrote a lot of messages.
>
> But you're only participating in two here. Can't you keep them straight?

I meant multiple messages and examples, not multiple topics.

> >> To quote your own words about a dozen lines later:
> >>
> >>> I've heard it with my own ears.
> >>> No matter what taxonomy I present,
> >>> you can always claim it may be fake.
> >>
> >> No, see, I took your ball and ran with it. Now you've turned face.
> >
> > Well, it's anecdotes versus anecdotes. At least a reader can see that a
> > bank account can potentially be both checking and savings.
>
> Once more: I agree and I've shown you (as did Robert) how it can be done
> in an elegant, clean fashion. Now that we've done that, you want to
> disavow your own example.

And I have shown how that technique does not scale. Yes, if that was
the ONLY change, then a hybrid account "type" may do it. One cannot
know up front whether that will be the only change. OO'ers often brag
about how OO makes software easy to change, but one can objectively
show how introducing more factors and removing mutual exclusiveness on
the "lists" of sub-categories would be more code rework in OO. (It
should be obvious, so hopefully we don't have to go thru such an
exercise.)

I am unclear whether you are suggesting that things will stay small,
and thus hybridization will be "sufficient", or whether you are
rejecting the idea that hybridization does not scale. If the first
(stay simple), then competitors, such as case statements, will not be a
problem either. Either way, you are between a rock and a hard place as
far as scale and change. Large complex trees with *only* tree-wise
changes are almost non-existent.

> >> The hierarchy of positions in my company hasn't changed in decades
> >> (CEO, Senior VPs, VPs, etc on down the line). Nor has the hierarchy
> >> of business units changed significantly.
> >
> > I already gave an anecdote of such breaking down during a budgeting
> > summary app.
>
> Which, from your description, you misunderstood the natural design from
> the beginning and had to re-do it.

IIRC, it was the user, an accountant, who decided to move away from a
tree when she or her boss was not happy with the first round of output.
It was not careless analysis on my part.
People change their mind, and those changes are not always tree-shaped.
For the sake of argument, even if it was due to careless analysis, sets
are a better hedge against such mistakes. Trees cope with degenerating
into non-trees more poorly than sets cope with tree-ness if they have
to. Sets are simply more forgiving.

>
> Also, is this your only example?

IIRC, I gave about 5 examples.

> I've given you an awful lot of
> counter-examples upon which you seem mute.

I don't remember any good ones. You seemed to simply deny or question
my anecdotes.

> >> BOMs are nicely handled hierarchically (and, in fact, would be very
> >> unwieldy if handled as flat lists)
> >
> > I never proposed a "flat list". (How does a flat list differ from a
> > non-flat list?)
>
> (It's not flat....it's hierarchical, like an outline or BOM.)
>
> > And tree BOM's have long been known to have rough spots. For
> > example, accounting may want to search and sort by multiple factors;
>
> Searching trees is trivially easy.

Bull. Only if you use the tree's main factors. The "most recent files
regardless of directory" case I already gave is an example. Let's see
you search for the largest or smallest parts in a BOM tree.

> > factors probably different than what a material specialist would want
> > to see. A material specialist may not really care how stuff is
> > connected.
>
> And would not likely be working from a BOM.

So we have to duplicate the information to see it different ways? Not
very good factoring there.

> > There is NO One Right Taxonomy for non-trivial stuff.
>
> A BOM isn't a taxonomy, it's a hierarchical list.

Either way, similar problems creep in.

> This part of this
> discussion is in rebuttal to your claim that real life doesn't have
> many tree structures.

It doesn't. A tree BOM is a half-lie.

> >> CAD/CAM hierarchies are, well, hierarchical, and thankfully so.
> >
> > If bolt C holds part A and part B together, does C belong to part A or
> > part B?
> > If you answer "both", you just broke away from a pure tree.
> > Congratulations!
>
> (Are you 12? Sheeze. Try to act like an adult.)

If I wanted to act 12, I would've said, "Trees suck, neener neener!
Smart people use sets."

> A, B and C are typically all children of an assembly.

If everything is connected, where does one assembly start and another
end? Suppose we have a chassis and many things are bolted to it. What
are the bolt's parents?

> >> Hierarchies and fractals have a lot in common in that there are self-
> >> similar *levels* and in the concept of "drilling down" (or "zooming
> >> back"). The tree structure--as much as you try to find fault with
> >> it--IS a fundamental, natural relationship.
> >
> > Wrong. It is a "useful lie" on a small scale, a barely tolerable lie on
> > a medium scale, and a boat heading for Disaster Island on a large scale.
>
> I see your claim and hand waving. Now let's see some analysis or cogent
> thinking to back it up.

Back it up? The default is chaos. If you claim tree-ness, you have to
show tree-ness. You claim a pattern, you have to demonstrate that
pattern.

> >> Databases fail to efficiently model trees, because trees are "triangular"
> >> in nature. Databases are tabular, or "rectangular", in nature. These
> >> are mutually exclusive natures. In places where one shines, the other
> >> usually does not, and vice versa.
> >
> > RDB's are whatever shape you want.
>
> Not at their core. Each table is 2D. Each query returns a 2D dataset.
> You can build upon that, but at their root, databases are 2D.

Wrong. They are not bound to any dimension limit. The 2D thing is how
we typically present them on paper, but tables can represent hundreds
of dimensions.

> > RDB's can represent trees also,...
>
> Inefficiently.

You are right that they won't be able to do tree operations as fast as
a tree-only system. But DB's are more general purpose so that they can
handle new "shapes" without painful overhauls.
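
As a rough illustration of the "new shapes" point (a sketch only, not
code from this thread): the part names, the weight_kg column, and the
"contains" link table below are hypothetical, and SQLite through
Python's sqlite3 module is just a convenient stand-in for any
relational engine. The same two tables support a tree-style walk, show
bolt C sitting under two parents, and answer a size question that
ignores the structure entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: parts, plus a link table so that a part
    -- may appear under any number of parents.
    CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT, weight_kg REAL);
    CREATE TABLE contains (parent_id INTEGER, child_id INTEGER);
    INSERT INTO part VALUES (1, 'assembly', 0.0), (2, 'part A', 4.0),
                            (3, 'part B', 2.5), (4, 'bolt C', 0.1);
    INSERT INTO contains VALUES (1, 2), (1, 3), (2, 4), (3, 4);
""")

# Tree-style view: walk down from the assembly with a recursive query.
tree_rows = conn.execute("""
    WITH RECURSIVE walk(id, depth) AS (
        SELECT 1, 0
        UNION ALL
        SELECT c.child_id, walk.depth + 1
        FROM contains c JOIN walk ON c.parent_id = walk.id
    )
    SELECT p.name, walk.depth FROM walk JOIN part p ON p.id = walk.id
""").fetchall()

# Non-tree question: which parts have more than one parent?
shared = conn.execute("""
    SELECT p.name FROM contains c JOIN part p ON p.id = c.child_id
    GROUP BY p.id, p.name HAVING COUNT(*) > 1
""").fetchall()

# Set-style question that ignores the structure: the heaviest part.
heaviest = conn.execute(
    "SELECT name FROM part ORDER BY weight_kg DESC LIMIT 1").fetchone()

print(sorted(tree_rows))
print(shared)
print(heaviest)
```

Note that the walk reports bolt C once per parent, which is exactly the
"duplicate node" a pure tree would be forced to store.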

> > ...but trees can't do non-trees very easily.
>
> Of course not, nor should they.

So what if a tree degenerates to a non-tree? Pay for a huge overhaul?

> >> If you mean some branches re-connect, that doesn't stop it from being
> >> a tree (just stops it from being certain kinds of tree).
> >
> > A tree with duplicate nodes. Genetic copy-and-paste.
>
> No, not duplicate nodes. Nodes that connect.

If the connections are cross-branch, it is not a tree. Remember,
everybody has/had a mother and father. (Gay births have not been
perfected yet.) In a strict tree, each node has only one parent.

> >> There are all sorts of trees. And on the small scale, such as when you
> >> graph the genealogy of a family, it is entirely treelike.
> >
> > No, it is not. It may appear tree-like on a small scale, but as you add
> > more nodes it is less and less tree-like. It is a directed graph, not a
> > tree.
>
> It's a *type* of directed graph, yes. A d/g is more general than a tree,

But it is *not* a tree.

> and many trees *are* directed graphs.
>
> >> Now imagine you have all the power of SQL, PLUS
> >> part of your filtering ability is to restrict the search to a subset of
> >> records (versus the entire dataset).
> >>
> >> Wouldn't you call that a win?
> >
> > Please clarify. Query languages *already* provide filtering ability.
>
> SELECT *
> FROM table_a
> WHERE (CREATED >= '2005-06-01') AND (CREATED < '2005-07-01')
>
> Needs to examine every record in the database to produce the query, right?

Not necessarily. Read about "indexes". BTW, SQL has a BETWEEN operator
that simplifies ranges. A nice thing about indexes is that you can add
them without changing existing queries.

> Now imagine a magical database that is hierarchical with a root, a first
> level of "By Year" and a second level of "By Month".
>
> SELECT * FROM table_a.2005.June
>
> Only has to search a subset of the total.

Databases can take shortcuts on frequently-used factors by using
indexes.
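
The index point can be sketched like this (an illustration under
assumptions, not anything posted in the thread): table_a and its
created column follow the quoted query, the sample dates and the index
name idx_created are made up, and SQLite through Python's sqlite3
module stands in for whatever engine is actually in use. The query
string is never edited; only the plan the engine reports for it changes
once the index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER PRIMARY KEY, created TEXT)")
conn.executemany(
    "INSERT INTO table_a (created) VALUES (?)",
    [("2005-05-31",), ("2005-06-01",), ("2005-06-15",), ("2005-07-01",)],
)

# Same half-open June range as the quoted query (made-up sample rows).
query = ("SELECT created FROM table_a "
         "WHERE created >= '2005-06-01' AND created < '2005-07-01' "
         "ORDER BY created")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

before_rows = conn.execute(query).fetchall()
before_plan = plan(query)

# Add the index afterwards; the query text itself is untouched.
conn.execute("CREATE INDEX idx_created ON table_a (created)")

after_rows = conn.execute(query).fetchall()
after_plan = plan(query)

print(before_plan)
print(after_plan)
```

One caution on BETWEEN: it includes both endpoints, so it is not a
drop-in replacement for the half-open range above unless the upper
bound is adjusted.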
The difference is that trees MUST use pre-determined factors. Databases
allow shortcuts, but also easily allow more factors.

> >> Yes, really. I can trace a unique path from me to the CEO. It's 100%
> >> tree shaped. As for armies, have you ever heard the phrase "chain of
> >> command"? What do you think that involves, if not a tree?
> >
> > Most personnel org representations are in RDBMS in practice.
>
> You're losing focus. Remember, we were discussing where there are natural
> trees in real life.

How are you defining "natural"?

> > Why?
>
> Because you need to store stuff *somewhere*. And one of the reasons you
> use an *R*DB here is that the actual structure you're modeling IS tree
> shaped. You leverage normalization to avoid the duplication of data that
> would occur in a non relational DB.

I don't remember the context of this answer. I'll have to dig through
the history to answer on the weekend.

> > Because many of the searches and processes are not by tree paths even
> > if some are. The Hedge Factor again. If you are looking at who to
> > blame, a tree VIEW makes sense.
>
> Of course, because the reality is tree shaped.
>
> > But if you are trying to figure out if there is a copper processing
> > expert in Cleveland, the tree is mostly moot.
>
> Actually, the tree is very useful, since it lets me look ONLY in
> Cleveland, rather than everywhere in the world.

Perhaps, but that says nothing about copper processors.

> >> That's true of software construction in general. It's an extremely complex
> >> task, intellectually on par, I believe, with law and medicine. All three
> >> fields, in their daily practice, break new ground (like researchers), but
> >> also need to produce approximately immediate useful results (not like
> >> researchers!). And all three fields deal with a large network of
> >> interconnected data that often reacts holistically.
> >
> > In that case, why not use a tool that handles
> > unexpected patterns better than trees?
>
> In places where the problem isn't hierarchical, I do.

But one cannot know ahead of time what will stay hierarchical and what
will not.

> >>>> I'm coming to understand you're an evangelist. I don't do evangelists.
> >>>
> >>> Well, you sure did a damned good *impression* of one.
> >>
> >> If you pay attention, you'll see I'm standing here in the middle of the
> >> road. You're the one over there in the ditch trying to baptize everyone.
> >
> > No, you are not. You are a tree evangelist. A tree hugger, so to speak.
>
> Nope. Just debunking your evangelism. I use trees when they make sense,
> and I use tables where THEY make sense. I'm not partial to either.

You are too focused on *initial* requirements. Remember, things change.
And, sometimes our initial analysis is wrong. Sets are more forgiving
of classification faux pas.

-T-