[HN Gopher] The case against dual axis charts (and what to use i...
       ___________________________________________________________________
        
       The case against dual axis charts (and what to use instead) (2018)
        
       Author : Leftium
       Score  : 77 points
       Date   : 2024-05-17 16:30 UTC (6 hours ago)
        
 (HTM) web link (blog.datawrapper.de)
 (TXT) w3m dump (blog.datawrapper.de)
        
       | cs702 wrote:
       | I'm 100% in agreement.
       | 
       | To anyone here who thinks plots with two different scales in the
       | same direction sometimes are appropriate:
       | 
       | Please read this.
       | 
       | ---
       | 
       | EDIT: Changed "dual axis charts" to "plots with two different
       | scales in the same direction," which more accurately describes
       | the OP's topic.
        
         | Scene_Cast2 wrote:
         | What about Bode plots?
        
           | cs702 wrote:
           | I have nothing against them. Please note, I edited my comment
           | to change "dual axis charts" (common spreadsheet terminology)
           | to "plots with two different scales in the same direction,"
           | which more accurately describes the plots with which the OP
           | -- and I -- disagree.
        
         | isidor3 wrote:
         | Plotting the average or top percentile latency of an API on the
         | left axis and the number of calls to that API on the right is
         | pretty much standard practice where I work. I would argue it
         | makes things more clear. You get to see exactly how the latency
         | changed as the traffic does, or where more noise is visible
         | because the traffic was low.
         | 
         | Because both scales are using completely different units it's
         | more difficult to confuse the two.
        
       | Leftium wrote:
       | Found this while trying to create an observable plot with
       | multiple scales[1][2].
       | 
       | I'd argue multiple scales are OK if the multiple axes have
       | different units that can't be easily compared/confused and are
       | used for greater information density (instead of relative
       | comparison purposes).
       | 
       | For example: I'd like to plot weather stats like hourly
       | temperature, precipitation, and AQI throughout the day, so
       | several different days can be compared with each other. (And fit
       | all this information on a mobile screen.)
       | 
       | [1]: https://github.com/observablehq/plot/issues/147
       | 
       | [2]: https://github.com/observablehq/plot/discussions/626
        
         | bradford wrote:
         | I was coming here to say something similar.
         | 
         | The article only shows examples of dual axis charts where line
         | series are used for both axes. This will clearly cause
         | confusion (especially when tooltips are not available).
         | 
         | I've generally found that when displaying a percentage, it is
         | helpful to show the individual counts for
         | numerator/denominator. I believe that showing percentage as a
         | line series on one axis, and raw counts, represented as a
         | column on the other axis, can be a helpful visual.
        
       | o0-0o wrote:
       | Great article. I used to be into charting a lot and ran a
       | charting product at a famous firm. Would love to see the thoughts
       | of the author on other charts like radar and treemap. :) Great
       | read.
        
         | einpoklum wrote:
         | We could really use your help in LibreOffice! I mean, if you
         | are still into coding. There is quite a bunch of work to do on
         | charting:
         | 
         | Desired chart enhancements in general
         | https://bugs.documentfoundation.org/showdependencytree.cgi?i...
         | 
         | Desired additional chart types
         | https://bugs.documentfoundation.org/showdependencytree.cgi?i...
        
       | goldemerald wrote:
       | Solution 4 is so hilariously bad I am shocked it was suggested.
       | Building a 2d landscape where the time dimension seems to take a
       | random walk made me laugh a lot. Ignoring the standard convention of
       | "independent variable on x-axis" and instead embedding it as
       | datapoints is a particularly clever way to obfuscate the data and
       | confuse the reader.
        
         | sixthlime wrote:
         | I thought so at first too, but if you look at the link they
         | included [1] it seems like it can actually be quite clear for
         | some datasets
         | 
         | [1]
         | https://archive.nytimes.com/www.nytimes.com/imagepages/2010/...
        
         | msm_ wrote:
         | I don't agree. It's a great way to visualise data when you want
         | to focus on a trend. It makes it very obvious which "direction"
         | the data is heading. But of course it is not used very often,
         | is not a great fit for every use case (in particular, bad for
         | the data in OP) and may be confusing when seen for the first time.
        
       | einpoklum wrote:
       | The first couple of arguments are weak:
       | 
       | 1. It's possible to mislead by playing with a single series'
       | scale, you don't need two series to lie-with-statistics...
       | 
       | 2. The argument that people will think the data are identical
       | despite the different scale? Don't buy it.
        
         | parpfish wrote:
         | Agreed. Dual scale plots are a great visualization to emphasize
         | correlation between time series.
         | 
         | I think of it as depicting an intermediate step in computing a
         | Pearson R when the data have been z-scored but before you've
         | collapsed across data points
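
The z-scoring view described above can be sketched in a few lines. This is a minimal illustration, not anyone's production code: the function names and the sample data are hypothetical, and it uses the population standard deviation so that the mean of the pointwise products equals Pearson's r.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Rescale a series to mean 0 and population standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("a constant series cannot be z-scored")
    return [(v - mu) / sigma for v in values]


def pearson_r(xs, ys):
    """Pearson r as the mean of the pointwise products of two z-scored series."""
    zx, zy = z_scores(xs), z_scores(ys)
    return mean([a * b for a, b in zip(zx, zy)])


# Hypothetical series on very different scales (e.g. milliseconds vs. counts).
latency_ms = [120.0, 135.0, 150.0, 138.0, 160.0]
calls = [1000.0, 1300.0, 1500.0, 1450.0, 1800.0]

# These two lists are what a dual-scale plot effectively overlays
# when each axis is fitted to its own series.
z_latency = z_scores(latency_ms)
z_calls = z_scores(calls)

# Collapsing across data points gives the single correlation number.
r = pearson_r(latency_ms, calls)
print(round(r, 3))
```

Plotting `z_latency` and `z_calls` on one shared axis gives roughly the picture a dual-scale chart shows; `r` is what remains after averaging the products.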
        
         | hex4def6 wrote:
         | As an engineer with an oscilloscope, not being able to plot two
         | probes against each other on the same chart would be severely
         | limiting.
         | 
         | For instance, imagine a 10x attenuator / amplifier. Maybe the
         | input has a DC offset. Being able to plot the two against each
         | other to look for (e.g.) distortion is invaluable. This is
         | committing the two cardinal sins (judging from some comments
         | here) of not starting at zero and different scales, and yet
         | it's not misleading at all.
         | 
         | I can believe dual axis charts allow misleading results, but
         | that doesn't mean they don't have completely legitimate uses.
        
         | tofof wrote:
         | And the possibility of fiddling with scale to mislead still
         | exists with side-by-side charts, their #1 alternative. In fact,
         | they use the same misleading scale start and stop points as
         | they criticize in the dual-axis version, so that the "one went
         | up 80%, the other went up only 40%, but it looks like they went
         | up equally" still applies to their replacement.
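
The point about side-by-side panels can be checked numerically. The sketch below is an illustration with made-up numbers: `plotted_height` is a hypothetical helper that returns how far up the plot area a value is drawn for a given axis range.

```python
def plotted_height(value, axis_min, axis_max):
    """Fraction of the plot height at which a value is drawn (0 = bottom, 1 = top)."""
    return (value - axis_min) / (axis_max - axis_min)


# Hypothetical series: A grows 80%, B grows 40%.
a_start, a_end = 100.0, 180.0
b_start, b_end = 100.0, 140.0

# Side-by-side panels, each axis fitted tightly to its own data:
# both lines climb the full height of their panel.
a_rise = plotted_height(a_end, 100.0, 180.0) - plotted_height(a_start, 100.0, 180.0)
b_rise = plotted_height(b_end, 100.0, 140.0) - plotted_height(b_start, 100.0, 140.0)

# One shared axis from 0 to 180: A's rise is drawn twice as tall as B's.
a_rise_shared = plotted_height(a_end, 0.0, 180.0) - plotted_height(a_start, 0.0, 180.0)
b_rise_shared = plotted_height(b_end, 0.0, 180.0) - plotted_height(b_start, 0.0, 180.0)

print(a_rise, b_rise)
print(a_rise_shared, b_rise_shared)
```

Splitting the chart into two panels does not remove the effect by itself; the panels would also need a common or comparable scale.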
        
       | ajuc wrote:
       | In case of German GDP vs Global GDP I'd argue the correct thing
       | to do is to draw a graph of a new variable "German GDP as a
       | percentage of Global GDP" and a separate graph of Global GDP.
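
The derived variable suggested above is a one-line transformation. In this sketch the GDP figures are invented placeholders, not real data, and `share_percent` is a hypothetical helper name.

```python
def share_percent(part, whole):
    """Return part as a percentage of whole, element by element."""
    if len(part) != len(whole):
        raise ValueError("series must have the same length")
    return [100.0 * p / w for p, w in zip(part, whole)]


# Placeholder values for illustration only (not actual GDP figures).
years = [2004, 2008, 2012, 2016]
german_gdp = [3.0, 4.0, 3.5, 3.6]
global_gdp = [40.0, 64.0, 70.0, 72.0]

# First chart: the share series. Second chart: global_gdp on its own axis.
german_share = share_percent(german_gdp, global_gdp)

for year, share in zip(years, german_share):
    print(year, round(share, 2))
```

Each of the two resulting charts has a single y-axis with a single unit (percent, and currency).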
        
       | pasc1878 wrote:
       | I would be the one in the sample who did not find the charts
       | confusing.
       | 
       | The two separate graphs are much more difficult to compare - you
       | can't see which elements compare to the same year so lose a lot
       | of information.
       | 
       | The information in the chart is whether a change in one time
       | series coincides with a change in the other. That is probably all
       | you can infer, as without error bars you can't see if the
       | differences are material. (I.e. I know they are on different
       | scales, so when they cross they obviously aren't the same.) If so,
       | there might be a correlation worth looking into, remembering that
       | correlation does not equal causation (so the examples in the link
       | are just laughable).
       | 
       | The prioritisation just shows nothing.
       | 
       | The scatterplot shows nothing
       | 
       | The indexed chart does make sense and in this case I would agree
       | would be better.
        
         | ericpauley wrote:
         | The connected scatter plot is so cursed... talk about hard to
         | interpret!
         | 
         | The only conclusion that's hard to argue with here is zero-
         | indexing plots, but that's not exactly a new finding.
        
           | pasc1878 wrote:
           | Also the scaling - in this case the original had reasonable
           | scaling but it can be manipulated. The changes could be small
           | enough to be random fluctuations on one series and so no real
           | match.
           | 
           | However the graph does show that a slightly deeper look would
           | be worthwhile - even if it is a very quick one to see that
           | the data is manipulated, e.g. climate deniers' graphs of
           | temperature all starting on the same year. If you change the
           | starting year you get rather different results.
        
           | parpfish wrote:
           | I could see that connected scatter MAYBE working if it were
           | an animation or interactive plot. Maybe.
           | 
           | But on its own, it's horrid
        
         | slow_typist wrote:
         | The problem with indexed charts is that the base year is
         | arbitrarily set and can change the whole picture a lot.
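
The base-year sensitivity is easy to demonstrate. The following sketch uses two invented series and a hypothetical `index_to_base` helper that rescales a series so the chosen position equals 100.

```python
def index_to_base(values, base_pos):
    """Rescale a series so that the value at base_pos becomes 100."""
    base = values[base_pos]
    return [100.0 * v / base for v in values]


# Invented data for three consecutive years.
series_a = [50.0, 100.0, 150.0]
series_b = [100.0, 100.0, 120.0]

# Indexed to the first year, A ends far above B.
a_first = index_to_base(series_a, 0)   # [100.0, 200.0, 300.0]
b_first = index_to_base(series_b, 0)   # [100.0, 100.0, 120.0]

# Indexed to the second year, the same data end much closer together.
a_second = index_to_base(series_a, 1)  # [50.0, 100.0, 150.0]
b_second = index_to_base(series_b, 1)  # [100.0, 100.0, 120.0]

print(a_first[-1] - b_first[-1])    # final gap with base = first year
print(a_second[-1] - b_second[-1])  # final gap with base = second year
```

The ratios within each series are unchanged by the choice of base; what changes is where the lines meet and how large the gap between them looks.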
        
       | jrd259 wrote:
       | I'd argue that the zero value should always be shown. Otherwise
       | you get different impressions of the rate depending how you scale
       | and subset the Y axis.
        
         | PheonixPharts wrote:
         | This is not a good practice at all. Do you think atmospheric
         | CO2 charts should show 0? How about daily temperature reading
         | for human body temperature? Should daily stock tickers all
         | start at 0?
         | 
         | Why is 0 magical?
         | 
         | Adding 0 to the vast majority of plots shows the data at an
         | unnatural scale that can obscure genuinely important
         | information. Human body temperature readings on a scale from 0
         | to 107F would make all the important information hard to see.
         | 
         | A much better rule is that charts should have reasonable bounds
         | based on knowledge of the system. For human temperature in F
         | anything less than 95 or greater than 107 basically means
         | you're dead. For processes in nature, good bounds are the lowest
         | recorded value minus some delta to the highest recorded value
         | plus that delta. For things like daily stock prices, a few
         | standard deviations each way from historic volatility works.
         | 
         | The dogma that charts should all start at 0 is complete
         | nonsense and tries to sidestep reasoning about your data. Yes,
         | scales can be used to misrepresent data, but forcing 0 onto the
         | axis does not solve this.
        
           | yau8edq12i wrote:
           | Fahrenheit is not an absolute scale, so there is nothing
           | special about 0F, you're right about that. As for your other
           | two examples (atmospheric CO2 and stock tickers)... Yes, the
           | scale should start at 0. Why shouldn't they?
        
             | parpfish wrote:
             | Because starting at zero can cause scaling issues that mask
             | meaningful trends and variation. That can also be abused to
             | mislead, but a simple rule like "always include zero" ain't
             | the solution to that.
        
               | jrd259 wrote:
               | All fair points about zero. Sorry, I acknowledge now I
               | was overly influenced by the metrics dashboards I use for
               | alerting. I've seen people panic at a seemingly steep rise
               | in error rate or increase in latency because the chart
               | was not showing the full range (0 to 1 for rates, or 0 to
               | 2x SLA for latency). I was only thinking of operational
               | alerting dashboards.
        
             | PheonixPharts wrote:
             | > Fahrenheit is not an absolute scale
             | 
             | So if someone showed body temperature measured in _Kelvin_
             | you would argue that it _should start at 0_? That seems
             | even more ridiculous.
             | 
             | > Why shouldn't they?
             | 
             | Because for the vast majority of stocks it would appear to
             | be a straight line every single day? Can you find me an
             | example of a stock trading app for a company whose price is
             | > $100/share that shows intraday price activity on a zero
             | scale?
             | 
             | Likewise most co2 charts start around 300ppm since that has
             | been roughly the lower bound of atmospheric co2 levels for
             | all of human history.
             | 
             | The last time co2 was 0 on the planet earth it was just a
             | molten rock so what's the _meaning_ of showing this value?
             | It's not even theoretically possible that co2 could be
             | that low barring alien life sucking the atmosphere off the
             | planet.
             | 
             | Can you clarify why the scale _should_ start at 0 for these
             | things? How is that anywhere close to an honest
             | representation?
        
             | hex4def6 wrote:
             | In that case, we should report body temperature in Kelvin.
             | However, now the dead-alive range (95degF - 107degF)
             | becomes 308K to 315K.
             | 
             | Starting at zero, that range (7K) is now only about 2% of
             | the graph. Or in other words, if your chart is 10cm tall,
             | the entirety of the useful range is compressed into a space
             | that is about 2mm tall.
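
The conversion and the share of the axis can be worked through directly. This is a sketch under one stated assumption: the axis runs from 0 K up to the upper bound of the range (107degF) with no headroom above it. The function and variable names are illustrative.

```python
def fahrenheit_to_kelvin(temp_f):
    """Convert a temperature from degrees Fahrenheit to kelvin."""
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15


low_k = fahrenheit_to_kelvin(95.0)    # 308.15 K
high_k = fahrenheit_to_kelvin(107.0)  # about 314.82 K
span_k = high_k - low_k               # about 6.67 K

# Assumption: the y-axis covers 0 K to high_k exactly.
fraction_of_axis = span_k / high_k

chart_height_mm = 100.0  # a chart 10 cm tall
useful_height_mm = fraction_of_axis * chart_height_mm

print(round(span_k, 2), round(100 * fraction_of_axis, 1), round(useful_height_mm, 1))
```

Any extra headroom above the upper bound would shrink the useful band further.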
        
           | vharuck wrote:
           | Yes. Charts are communication devices. Any "rules" for charts
           | should be seen like similar "rules" for essays or emails:
           | good advice that almost always gives a satisfactory result
           | when followed. Reliable paths for infrequent authors.
           | 
           | But what matters most in charts is the same thing that
           | matters most with writing: pick one major point and stick to
           | it (if you're really good or can't avoid it, maybe a couple
           | points). This also explains why a lot of dual-axis charts
           | don't work: the author explains two sets of data that aren't
           | even measured on the same scale and then leaves the reader to
           | connect them _and_ understand the meaning of that connection.
           | You can't be sure the reader will end up at the point you
           | wanted to make.
           | 
           | That's not to say a dual-axis chart is always the wrong
           | choice. Just that, if you start making one, stop and ask if
           | there isn't a better way to show the data. Same with pie
           | charts.
        
       | rossdavidh wrote:
       | So, it's great that they try to actually get data on what kinds
       | of charts convey what information. However, you need to know who
       | your audience is. I, for example, found all of their suggested
       | alternatives to be harder to interpret than the dual axis chart.
       | If you're trying to see whether or not the ups and downs of two
       | different variables are similar, suggesting a connection between
       | the two, none of the suggested alternatives do as good a job
       | (although two charts could, if instead of having them side by
       | side you had them one above the other, with the same x-axis
       | scale, but that is really just a stealth dual axis chart).
       | 
       | Most of these "don't use this kind of chart" articles seem to be
       | trying to make it impossible to confuse or mislead your audience,
       | and that is just not plausible. You do, and probably usually
       | should, have some point in mind when you are showing someone else
       | a chart, and the format needs to make it easy to see that. Almost
       | any chart, even a pie chart, has some particular use case where
       | it is the best chart for that purpose. No chart is going to
       | always be the best way to present data. Like choosing what kind
       | of language to use in explaining something, you need to know
       | something about who your audience is, and what they are
       | accustomed to.
        
         | petsfed wrote:
         | Wasn't there an article the other day about a concept that's
         | similar to the incompleteness theorem? That any ambiguity-free
         | language is incapable of completely describing sufficiently
         | complex situations? Am I just imagining that? [0]
         | 
         | I feel like making a tool harder to use, just to prevent bad
         | actors, only punishes _good_ actors, while the bad actors find
         | some other way to act badly. Like, I don't want to participate
         | in your arms race against disinformation purveyors, I just want
         | to illustrate that it tends to rain on days that are cloudy and
         | have high humidity.
         | 
         | 0. Sort of. I recently encountered "Colorless green ideas sleep
         | furiously" (https://en.wikipedia.org/wiki/Colorless_green_ideas
         | _sleep_fu...), although where I can't recall, and sort of
         | inferred the rest.
        
       | joshe wrote:
       | Context is important, this is targeted at journalists. They are
       | usually trying to make a point to casual readers.
       | 
       | For readers with more interest or who are numerate in their day
       | jobs (engineers, finance, or economists), dual axis charts can
       | often be a great choice.
       | 
       | This is better graph style advice from the Economist, which
       | includes good dual axis examples and one bad one and how to
       | correct it. https://medium.economist.com/mistakes-weve-drawn-a-
       | few-8cdd8...
       | 
       | Since we are engineers or founders trying to deal with very
       | complex systems, adding detail and clarity like the Economist or
       | Edward Tufte does is the better way to go.
        
         | lisacmuth wrote:
         | Author here. Thanks for setting the context: Datawrapper - the
         | data vis tool I write articles like this for - is indeed for
         | people who want to make a point with their charts and maps,
         | often to a broad audience. I agree that people who have learned
         | to read dual axis charts can benefit greatly from them (the
         | same is true for rainbow color maps).
         | 
         | Financial Times journalist John Burn-Murdoch changed my mind on
         | dual axes charts - even for casual readers! - a bit over the
         | last six years, too. Here's a dual axis chart he created for
         | the FT: https://x.com/AlexSelbyB/status/1529039107732774913
         | 
         | The next article I write on dual axis charts will probably be a
         | "What to consider when you do use them" one.
        
           | joshe wrote:
           | What a great update, thanks for posting!
        
       | erehweb wrote:
       | I get it, and sympathize, but at many companies the decision
       | maker is someone who wants to see dual axis charts. If
       | Datawrapper can't do that, then that would be a point against
       | using it widely.
        
       | patrick451 wrote:
       | What a patronizing company. Your customers keep asking for a
       | feature that is widely supported and you refuse to add it because
       | it violates your sensibilities. Instead, you write this diatribe
       | lecturing us that the way we want to display data is wrong. Just
       | reading the opening paragraph, whatever interest I may have had
       | in your plotting capabilities evaporated.
        
       | jdeaton wrote:
       | I once worked with someone who was doing performance benchmarking
       | of two systems, and made a dual axis chart with the lines right
       | on top of each other when in fact one system was like 5x faster
       | than the other. It drove me nuts because I didn't even realize
       | the dual axis at first and thought that they literally had
       | identical performance.
        
       ___________________________________________________________________
       (page generated 2024-05-17 23:00 UTC)