Valerie Lanard
IS 247
Assignment 4
10/9/2000

Dataset

I obtained the sleep dataset from CMU StatLib. It describes sleep patterns in 62 mammals, along with factors such as life span, body weight, brain weight, and overall risk of danger from other animals. The dataset had no nominal field, so I created one by researching each mammal online to determine the Order within the Class Mammalia to which it belongs. My primary resource was the Smithsonian's Mammal Species of the World Home Page; I also briefly consulted the Lycos Encyclopedia of Animals, the Rat & Mouse Gazette, and this dataset.

Here is my final ASCII dataset that I used in XmdvTool.

I changed the code for an "unknown" value in the dataset from -999.0 to -1.0, since all known values were greater than 0. This reduced the distortion that the extreme placeholder value caused in the visualizations. At the same time, it may have hidden some correlations in the data, since a -1.0 plotted alongside real values still misrepresents the actual (unknown) data values.
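
The recoding described above could be sketched as follows. This is only an illustration, assuming a whitespace-delimited ASCII file; the function name and the sample rows are hypothetical and the values are invented, not taken from the dataset.

```python
def recode_missing(line, old=-999.0, new=-1.0):
    """Replace every field whose numeric value equals the old missing code.

    Assumes whitespace-delimited fields; non-numeric fields are left as-is.
    """
    out = []
    for field in line.split():
        try:
            is_missing = float(field) == old
        except ValueError:
            is_missing = False  # e.g. a text label, keep unchanged
        out.append(str(new) if is_missing else field)
    return " ".join(out)


# Hypothetical rows (invented values, for illustration only)
rows = ["1.0 6.6 -999.0 3", "2.5 -999.0 -999.0 1"]
recoded = [recode_missing(r) for r in rows]
```

Comparing fields numerically rather than as text means that "-999" and "-999.0" are both treated as the missing code.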


Pre-visualization Expectations

My pre-visualization hypotheses:

  • Animals with the lowest danger index get the most dreaming sleep.
  • Animals with longer life spans get more total sleep per day.
  • Animals with heavier body weight get more total sleep per day.
  • Primates get the most dreaming sleep.

My Hypotheses and Outcomes

Animals with the lowest danger index get the most dreaming sleep. FALSE. As seen in Figure 1, the danger index does not appear to be a predictor of dream sleep.

Figure 1. The danger index (DangI) is not a predictor of the amount of dream sleep (DreSlp).


However, a relationship does appear in the other direction. Dream sleep and total sleep both appear to be predictors of the danger index, though total sleep shows a much stronger correlation. As seen in Figure 2, the animals that get the most total sleep (dreaming and non-dreaming combined) have the lowest danger index; that is, they are in the least danger.

Figure 2. The animals getting the most total sleep (not just the most dream sleep) have the lowest danger index.


For example, in the dataset, the little brown bat is the mammal that gets the most total sleep per day, but the vast majority of it is non-dreaming sleep. This animal has the lowest possible danger index, so the danger index obviously does not determine the amount of dream sleep. Rather, total sleep is a much better indicator of an animal's danger index.

Animals with longer life spans get more total sleep per day. FALSE. Contrary to my expectation, a longer life span does not mean that the animal gets more total sleep per day. Figure 3 compares life span with total sleep. The blue line shows the kind of slope I expected the distribution to follow had my hypothesis been correct.

Figure 3. There is not a strong correlation between life span and total sleep.

Animals with heavier body weight get more total sleep per day. FALSE. Figure 4 shows that the animals with the heaviest body weight, those selected in purple, do not fall among the animals who get a lot of total sleep--just the opposite, in fact.

Figure 4. Heavier animals (in purple) do not get the most total sleep.

Primates get the most dreaming sleep. FALSE. Figure 5 shows that the Primate Order (selected along the Order coordinate) gets some of the smallest amounts of dream sleep among the mammals in this dataset.

Figure 5. The "Primate" Order does not get the most dream sleep.


Data Exploration

Correlations I investigated:

  • brain weight and total sleep: no obvious correlation found
  • brain weight and dreaming sleep: no obvious correlation found
  • danger index and life span: no obvious correlation found
  • body weight and brain weight: appears to be a strong correlation between heavier body weight and heavier brain weight
  • life span and dream sleep: no obvious correlation found
  • order and life span: appears to be a moderate to strong correlation
  • order and body weight: appears to be a moderate to strong correlation
  • order and dreaming sleep: appears to be a moderate correlation
  • additional brain weight correlations: strong positive correlation between brain weight and both lifespan and gestation period. Primates have the greatest brain weights, lifespans, and gestation periods of the animals in this sample.
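
I judged the correlations above by eye in XmdvTool. A numeric check could be sketched as below, using the Pearson correlation coefficient and skipping any pair in which either value carries the -1.0 "unknown" code. This is not something I computed for this assignment; the function name and the sample values are hypothetical and invented for illustration.

```python
import math

MISSING = -1.0  # the "unknown" code used in this report


def pearson(xs, ys, missing=MISSING):
    """Pearson correlation over pairs where neither value is missing."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x != missing and y != missing]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("one variable is constant; correlation undefined")
    return sxy / math.sqrt(sxx * syy)


# Invented values for illustration only (not from the sleep dataset)
body = [1.0, 2.0, 3.0, -1.0, 4.0]
brain = [2.0, 4.0, 6.0, 5.0, 8.0]
r = pearson(body, brain)  # the fourth pair is skipped because body is unknown
```

A coefficient near +1 or -1 would support a "strong correlation" reading, and one near 0 a "no obvious correlation" reading. Because Order is a nominal field, a coefficient computed on its numeric codes would not be meaningful, and the danger index is an ordinal scale, so a rank-based measure may suit it better.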

Additional Discovery:

All of the primates in the dataset get very similar amounts of dream sleep per day. When I compared dream sleep to life span in Glyph view (see Figure 6a), I first noticed that Man was an extreme outlier at the top of the Life span axis, living roughly twice as long as the next nearest mammal in the list. Interestingly, this data view showed that all but two of the Primates on the list got very similar amounts of dream sleep, as seen by the vertical stack of red glyphs. On further investigation, I found that the two outlier Primates on the far left of the dream sleep axis had no recorded data for this variable, which accounts for the deviation. Figure 6b shows the revised Glyph view, highlighting only those Primates with recorded dream sleep values.

Figure 6a. Red glyphs show all primates.

Figure 6b. Red glyphs show all primates with known dreaming sleep values.

The three outliers in green at the far right of the X axis in Figures 6a and 6b, the animals getting the most dream sleep, were the North American opossum, the water opossum, and the giant armadillo. While no single variable correlated exclusively with getting the most dream sleep, all three of these animals had the lowest possible Danger Index score and were among the animals getting the most total sleep.


Comments on the tool

I found both parallel coordinates and glyphs very effective for analyzing the data. In particular, the Glyph Placement feature was very helpful. At least with my dataset, the Scatter and Stack views did not seem to yield as much information. The auxiliary display feature was also very useful for visualizing the same data in two different ways simultaneously.

Modifications that I feel would improve the tool:

  • A way to show the nominal value labels (as defined in the OKC file) that are associated with numeric representations, in all views of the data
  • A way to "unselect all" data points in Glyph view, e.g., by Shift-clicking on empty space between the glyphs
  • A way to create new variables computed from other variables (for example, I would have liked to derive the dream sleep/total sleep ratio)
  • A "Check All" option in the Data Summary and Statistics dialog box, to save time
  • Improved stability on Windows 98; I experienced all sorts of bugs and OS problems both during and after use of XmdvTool
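
Until the tool supports computed variables, a derived field such as the dream sleep/total sleep ratio could be added to the data file before loading it. The following is a minimal sketch under that assumption; the function name and the sample records are hypothetical, and the values are invented rather than taken from the dataset.

```python
MISSING = -1.0  # the "unknown" code used in this report


def dream_ratio(dream_sleep, total_sleep, missing=MISSING):
    """Return dream sleep as a fraction of total sleep.

    The result is the missing code when either input is unknown or when
    total sleep is not positive, so unknowns stay marked as unknown.
    """
    if dream_sleep == missing or total_sleep == missing or total_sleep <= 0:
        return missing
    return dream_sleep / total_sleep


# Hypothetical (dream sleep, total sleep) records in hours per day
records = [(2.0, 8.0), (-1.0, 10.0), (1.5, 6.0)]
ratios = [dream_ratio(d, t) for d, t in records]
```

Carrying the missing code through to the derived field keeps an unknown dream sleep value from producing a misleading negative ratio.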
