Plots in Graphical Representation: 10 Quotes
“In working through graphics one has, however, to be exceedingly cautious in certain particulars, for instance, when a set of figures, dynamical or financial, are available they are, so long as they are tabulated, instinctively taken merely at their face value. When plotted, however, there is a temptation to extrapolation which is well nigh irresistible to the untrained mind. Sometimes the process can be safely employed, but it requires a rather comprehensive knowledge of the facts that lie back of the data to tell when to go ahead and when to stop.” (Allan C Haskell, “How to Make and Use Graphic Charts”, 1919)
“The wandering of a line is more powerful in its effect on the mind than a tabulated statement; it shows what is happening and what is likely to take place just as quickly as the eye is capable of working.” (A Lester Boddington, “Statistics And Their Application To Commerce”, 1921)
“We can gain further insight into what makes good plots by thinking about the process of visual perception. The eye can assimilate large amounts of visual information, perceive unanticipated structure, and recognize complex patterns; however, certain kinds of patterns are more readily perceived than others. If we thoroughly understood the interaction between the brain, eye, and picture, we could organize displays to take advantage of the things that the eye and brain do best, so that the potentially most important patterns are associated with the most easily perceived visual aspects in the display.” (John M Chambers et al, “Graphical Methods for Data Analysis”, 1983)
“The plotted points on a graph should always be made to stand out well. They are, after all, the most important feature of a graph, since any lines linking them are nearly always a matter of conjecture. These lines should stop just short of the plotted points so that the latter are emphasised by the space surrounding them. Where a point happens to fall on an axis line, the axis should be broken for a short distance on either side of the point.” (Linda Reynolds & Doig Simmonds, “Presentation of Data in Science” 4th Ed, 1984)
“Boxplots provide information at a glance about center (median), spread (interquartile range), symmetry, and outliers. With practice they are easy to read and are especially useful for quick comparisons of two or more distributions. Sometimes unexpected features such as outliers, skew, or differences in spread are made obvious by boxplots but might otherwise go unnoticed.” (Lawrence C Hamilton, “Regression with Graphics: A second course in applied statistics”, 1991)
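As a rough illustration of the quantities Hamilton lists, the sketch below computes the numbers a boxplot typically draws: median, quartiles, interquartile range, and flagged outliers. The function name `boxplot_summary`, the 1.5 × IQR fence rule, and the "inclusive" quartile method are assumptions made for this example and are not taken from the quoted book; other quartile conventions give slightly different values.

```python
import statistics


def boxplot_summary(values):
    """Return the summary numbers a boxplot usually displays.

    Assumptions (not from the quoted source): quartiles use the
    'inclusive' method, and outliers are points beyond 1.5 * IQR
    from the nearest quartile.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # spread: the interquartile range
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "outliers": outliers,
    }


summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
print(summary)
```

Here the value 50 sits far above the upper fence, so it is reported as an outlier while the median stays near the bulk of the data.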
“Construction refers to everything involved in the production of the graphical display, including questions of what to plot and how to plot. Deciding what to plot is not always easy and again depends on what we want to accomplish. In the initial phases of an analysis, two-dimensional displays of the response against each of the p predictors are obvious choices for gaining insights about the data, choices that are often recommended in the introductory regression literature. Displays of residuals from an initial exploratory fit are frequently used as well.” (R Dennis Cook, “Regression Graphics: Ideas for studying regressions through graphics”, 1998)
“A bar graph typically presents either averages or frequencies. It is relatively simple to present raw data (in the form of dot plots or box plots). Such plots provide much more information, and they are closer to the original data. If the bar graph categories are linked in some way — for example, doses of treatments — then a line graph will be much more informative. Very complicated bar graphs containing adjacent bars are very difficult to grasp. If the bar graph represents frequencies, and the abscissa values can be ordered, then a line graph will be much more informative and will have substantially reduced chart junk.” (Gerald van Belle, “Statistical Rules of Thumb”, 2002)
“A useful feature of a stem plot is that the values maintain their natural order, while at the same time they are laid out in a way that emphasises the overall distribution of where the values are concentrated (that is, where the longer branches are). This enables you easily to pick out key values such as the median and quartiles.” (Alan Graham, “Developing Thinking in Statistics”, 2006)
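The layout Graham describes can be sketched in a few lines. The example below is a minimal version that assumes non-negative integers, takes the tens as the stem and the units as the leaf, and prints empty stems so that gaps in the distribution remain visible. The function name `stem_and_leaf` is made up for this illustration.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Build stem-and-leaf rows for non-negative integers.

    Assumption for this sketch: stem = value // 10, leaf = value % 10.
    Returns one text row per stem, including stems with no leaves.
    """
    leaves = defaultdict(list)
    for v in sorted(values):  # sorting keeps leaves in natural order
        leaves[v // 10].append(v % 10)

    rows = []
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem} | " + " ".join(str(leaf) for leaf in leaves[stem])
        rows.append(row.rstrip())  # empty stems end with the bar only
    return rows


for row in stem_and_leaf([23, 12, 47, 21, 15, 23]):
    print(row)
```

The longest row marks where the values are concentrated, and because the leaves stay sorted, the median can be found by counting in from either end.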
“There are two main reasons for using graphic displays of datasets: either to present or to explore data. Presenting data involves deciding what information you want to convey and drawing a display appropriate for the content and for the intended audience. […] Exploring data is a much more individual matter, using graphics to find information and to generate ideas. Many displays may be drawn. They can be changed at will or discarded and new versions prepared, so generally no one plot is especially important, and they all have a short life span.” (Antony Unwin, “Good Graphics?” [in “Handbook of Data Visualization”], 2008)
“With time series though, there is absolutely no substitute for plotting. The pertinent pattern might end up being a sharp spike followed by a gentle taper down. Or, maybe there are weird plateaus. There could be noisy spikes that have to be filtered out. A good way to look at it is this: means and standard deviations are based on the naïve assumption that data follows pretty bell curves, but there is no corresponding ‘default’ assumption for time series data (at least, not one that works well with any frequency), so you always have to look at the data to get a sense of what’s normal. […] Along the lines of figuring out what patterns to expect, when you are exploring time series data, it is immensely useful to be able to zoom in and out.” (Field Cady, “The Data Science Handbook”, 2017)
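Cady mentions noisy spikes that have to be filtered out but does not name a method. One common choice, shown here purely as an assumed example, is a centered rolling median: it replaces each point with the median of its neighbours, which suppresses isolated spikes while leaving plateaus largely intact. The function name `rolling_median` and the default window of 3 are hypothetical, and plotting the raw and filtered series side by side is still the way to judge whether the filter is appropriate.

```python
import statistics


def rolling_median(series, window=3):
    """Centered rolling median with truncated windows at the edges.

    Assumption for this sketch: `window` is an odd positive integer.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be an odd positive integer")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        start = max(0, i - half)
        stop = min(len(series), i + half + 1)
        smoothed.append(statistics.median(series[start:stop]))
    return smoothed


raw = [1, 1, 9, 1, 1, 5, 5, 5, 5]
print(rolling_median(raw))
```

In this toy series the single spike at 9 is removed, while the plateau at 5 survives, which is the kind of difference that only becomes obvious when both versions are plotted.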
More quotes on “Plots” in Graphical Representation at sql-troubles.blogspot.com.