MATH 150  STATISTICAL DATA ANALYSIS  (on an as-is basis)


    S-PLUS related materials
    • Spotfire S+ 8 for Windows User's Guide --TIBCO Spotfire S+ 8.2 for Windows User's Guide (local: copy) NEW
    • Introduction to S-Plus for UNIX (with exercises), by Diego Kuonen, 2001 (exercises are useful for the Windows version too; local: copy)
    • S-Plus books from the Internet
      • SGuide.pdf -- Introductory Guide to S-PLUS, by B. D. Ripley, University of Oxford, 1994
      • Snotes.pdf (advanced) -- Notes on S-PLUS: A Programming Environment for Data Analysis and Graphics, by Bill Venables and Dave Smith, University of Adelaide, 1992
    • An HTML-based help system for S-PLUS 3.4--highly recommended!--unavailable as of January 18, 2005, NEW: working again! (local: copy--might not fully work though)
    • A Guide for the Unwilling S User by Patrick Burns, 23 February 2003 (or see local copy)
    • Resources to Help you Learn and Use R/S-PLUS, from UCLA, NEW (including textbook examples for R users from Visualizing Data by William S. Cleveland)
    • S POETRY (a free 439 page book on more advanced S-Plus programming,  for the very willing) by Patrick Burns, 1998 (or see local copy)
    • S-PLUS tips on how to retrieve data via the Internet (including links to StatLib)
    • An R and S-PLUS Companion to Applied Regression by J. Fox, Sage Publications--material to the book
    • Trellis user manuals (trellis.pdf, trellis.user.pdf, trellis.tour.col.ps)

    Other sites to look for statistical databases:
    • WWW Virtual Library - Statistics
    • Datasets from S-PLUS (official datasets) (or see the local version)
    • Occidental College: database resources with statistical content
    • Links to Statistical Data (datasets)
    • Inter-University Consortium for Political and Social Research 
    • Case studies at UCLA 
    • Datasets at UCSD
    • The World Bank database
    • Links to Data Sets
    • Dr. B's Wide World of Web Data (?)
    • Data Sets from Vanderbilt University: Data for Titanic passengers, 1996 Olympics medal counts, 40-observation sex-age-response data, Boston neighborhood housing prices data, U.S. counties and 1992 presidential election dataset, links to other datasets, etc. at http://biostat.mc.vanderbilt.edu/wiki/Main/DataSets

    Downloaded material related to class coverage:
    1. The Old Faithful or click here for a local version. [Try this lovely Java histogram applet too.]
      • The Yellowstone Net / The geysers of Yellowstone (videos!)
      • Picture 1 / Picture 2
    2. The Draft Lottery or click here for a local version (see the index page for text resources)

    Other material related to class coverage:
    • Average monthly temperatures in six big cities located at or around the latitude 34N and 34S (from Wikipedia: "The 34th parallel north is a circle of latitude that is 34 degrees north of the Earth's equatorial plane.")
    • Gallery of Data Visualization (the best and worst of statistical graphics)
      • Trellis Graphics: Case Studies (including the barley data); read this if nothing works
      • The Visual Design and Control of Trellis Display, by R. A. Becker, W. S. Cleveland, and M.-J. Shyu  (local copy)
    • Theoretical quantile-quantile plots and power normal transformations

    Playing with statistics [in order to view pages that do not seem to work properly, you might need a browser which supports Java 1.1] --disclaimer: some applets might be out of service!
    • guessing correlations (match the correlations with the (four) scatter plots)
    • linear regression with residuals: see the changes of the regression line and the evolving residuals as you design your own dataset
       
    • (normal approximation to binomial probabilities)
    •   Central Limit Theorem (CLT) 
    • confidence interval simulators: 1 and 2
    • a website with very useful material for probability theory and statistics (NEW) [same as Virtual Laboratories in Probability and Statistics at http://sciencenetlinks.com/tools/virtual-laboratories-in-probability-and-statistics/ [an excellent site with lots of goodies]]
    • Rice Virtual Lab in Statistics (the home of HyperStat)
    • interesting Java applets
    • CUWU (Champaign-Urbana Web University) Statistics Program
    •    The Sports Data page--compiled by Prof. Robin Lock
    • (More links to Statistical Data--compiled by Prof. Gary Smith)
    • awesome-public-datasets
    • some car data (interactive 360° Photos are no longer available)
    • 1993 New Car Data

    Texts, people and calculators
    • Electronic Statistics Textbook or see local copy: Electronic Statistics Textbook with glossary (on campus access only) [Note: please start it from a "no frame page" and, after starting it, adjust the vertical frame divider for better viewing. This feature is unavailable if you start, for example, from the framed version of the math home page.]
    • StatProb: The Encyclopedia Sponsored by Statistics and Probability Societies
    • Online statistics texts: online texts (e.g., HyperStat Online Statistics Textbook)
    • The ISI Glossary of Statistical Terms
    • John W. Tukey is a pioneer in exploratory data analysis.
    • TI-83 Guidebook (2nd->DISTR):
      continuous distributions (distribution = normal, t, χ², or F):
      • distributioncdf(a,b,parameters)
      • distributionpdf(x,parameters)
      • invNorm(q,parameters)

      discrete distributions (distribution = binom, poisson, or geomet):
      • distributioncdf(parameters,x)
      • distributionpdf(parameters,x)
    • TI-89 Stat Help (!), TI-89/92 Guidebook, The TI-89 Calculator in the Basic Probability and Statistics Course (local copy) (sorry, you are on your own with these calculators--(:we cannot help you beyond posting these files:))
    • TI Guidebooks: https://education.ti.com/en/guidebook/search?active=guidebooks
    • Larson and Farber, Elementary Statistics, 5th Edition, Pearson (opensource???) (local copy) -- USE THIS RESOURCE TO PRACTICE PROBLEM SOLVING IN STATISTICS
    • Statistics: LibreTexts  -- USE THIS RESOURCE TO PRACTICE PROBLEM SOLVING IN STATISTICS AND LOOK UP DEFINITIONS AND CONCEPTS (here is a book downloaded from the site)
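The TI-83 DISTR calling conventions listed above (interval endpoints first for continuous distributions, parameters first for discrete ones) can be mirrored in a few lines of Python for checking calculator answers. This is only a sketch using the Python standard library: the function names are chosen here to resemble the calculator's normalcdf, normalpdf, invNorm, binompdf, and binomcdf, the default mean 0 and standard deviation 1 are an assumption of this sketch, and nothing below is TI code.

```python
from math import comb
from statistics import NormalDist


def normalcdf(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b) for X ~ Normal(mu, sigma): the distributioncdf(a,b,parameters) pattern."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)


def normalpdf(x, mu=0.0, sigma=1.0):
    """Density of Normal(mu, sigma) at x: the distributionpdf(x,parameters) pattern."""
    return NormalDist(mu, sigma).pdf(x)


def invNorm(q, mu=0.0, sigma=1.0):
    """The value x with P(X <= x) = q for X ~ Normal(mu, sigma)."""
    return NormalDist(mu, sigma).inv_cdf(q)


def binompdf(n, p, x):
    """P(X = x) for X ~ Binomial(n, p): the distributionpdf(parameters,x) pattern."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomcdf(n, p, x):
    """P(X <= x) for X ~ Binomial(n, p): the distributioncdf(parameters,x) pattern."""
    return sum(binompdf(n, p, k) for k in range(x + 1))


# Example calls; compare the printed values with the calculator's output.
print(normalcdf(-1.96, 1.96))
print(invNorm(0.975))
print(binomcdf(10, 0.5, 5))
```

Note the argument order: the continuous functions take the point or interval first, while the discrete ones take the parameters first, just as in the list above.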

    Paradoxes (or just odd)
    • Simpson's Paradox--mediant fraction or see this one-page article for an easy visual explanation (needs some fixing of the notation though...--# of cured patients goes on the y-axis, # of total patients goes on the x-axis for the two diseases, separately and combined, at two hospitals)
    • Winning a game by halftime in pdf or ps format (needs Acrobat or postscript viewer installed on your machine) but see also
    • Late-game reversals (needs Acrobat reader installed on your machine)
    • AAAS Special from San Francisco: mathematics: Secret of sports thrills spilled 

    • Longest head or tail runs or see Probability, runs, longest exact matches & gapless local alignment (or see an earlier local copy) lectures by Professor M. Zuker of Rensselaer Polytechnic Institute, Algorithms in Computational Molecular Biology, Winter/Spring 2004, for a more refined approach 
    • The Monty Hall Problem: Which door has the Cadillac? Part I (local copy) and Part II (local copy) in the column "The real-life adventures of a decision scientist," Decision Line, December/January 1999, pp. 17-19, by Andrew Vazsonyi (a.k.a. Endre Weiszfeld)
      a surprising application (on cognitive dissonance) in experimental psychology: And Behind Door No. 1, a Fatal Flaw from The New York Times, April 08, 2008, Science column, Findings, by John Tierney
    • Back-to-back fouls caught by Dodger fans sitting side-by-side - Los Angeles Times, May 8, 2008
    • Life Through A Mathematician's Eyes--winning the lottery 14 times. Is this legal? (local copy) NEW
    • ...and more lottery related oddities
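The mediant picture described in the Simpson's Paradox item above (cured patients on the y-axis, total patients on the x-axis, so that each cure rate is a slope and pooling two diseases adds the vectors) can be tried out numerically. The counts below are hypothetical, invented purely for illustration, and the hospital and disease labels are placeholders; this is a sketch, not data from the linked article.

```python
from fractions import Fraction

# HYPOTHETICAL (cured, treated) counts, made up for illustration only.
COUNTS = {
    "A": {"disease 1": (9, 10), "disease 2": (30, 100)},
    "B": {"disease 1": (80, 100), "disease 2": (2, 10)},
}


def cure_rate(cured, treated):
    """Cure rate as an exact fraction: the slope of the vector (treated, cured)."""
    return Fraction(cured, treated)


def pooled(hospital):
    """Add the per-disease vectors; the pooled rate is the mediant of the two rates."""
    cured = sum(c for c, _ in COUNTS[hospital].values())
    treated = sum(t for _, t in COUNTS[hospital].values())
    return cured, treated


for hospital, by_disease in COUNTS.items():
    for disease, (cured, treated) in by_disease.items():
        print(hospital, disease, cure_rate(cured, treated))
    print(hospital, "pooled", cure_rate(*pooled(hospital)))
```

With these made-up counts, hospital A has the higher cure rate for each disease separately, yet hospital B has the higher pooled rate, because the two hospitals treat very different mixes of the two diseases.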

    Interesting statistics, their consequences, uses and misuses--controversial plans to outlaw gender based risk assessment in the EU
    • Why women live longer than men--downloaded from the Scientific American, June 1998
    • Euro vision gives bleak outlook for men's pensions--downloaded from www.pensionsworld.co.uk, July 2003
    • EU move set to raise insurance premiums for women--downloaded from PRACTIV News, www.moneymarketing.co.uk, 13/11/2003
    • Memo to Brussels: women are different--downloaded from Telegraph, Opinion column, 26/06/2003
    • Europe is a long way from a sexism directive--published in the Financial Times, 27/06/2003, by Anna Diamantopoulou, European Commissioner for Employment and Social Affairs
    • China's one child policy or see local copy

    Gambler's ruin problems, random walks, roulette, and discussion
    • Gambler's Ruin [read Chapter 2: Random Walks, Section 2.1: Gambler's Ruin, pp. 6-16; in particular, Corollary 2.8 on an upper bound on the winning probability in terms of the intended profit for a disadvantageous game; see also slides, pp. 4-6]--downloaded from Prof. Meyer's site at MIT, 6.042/18.062J, Fall 2003;
      see also: Kozek, A. A rule of thumb (not only) for gamblers, draft of Stoch. Proc. & Appl. 55 (1995), 169-181 (see page 8 for the analysis of the European roulette (without a double zero), while for the American roulette the expected value is -$2/38 and the variance is (36/38)^2 ((38/k)-1) for a $1-bet with k=1 (on a single number), 2, 3, 4, 6, 9, 12, 18 ("even-money bet" on red/black (rouge/noir), even/odd (pair/impair), small/large (manqué/passé), etc.));
      click here (local copy) for the definition of some original French roulette terms (the French roulette is even better than the European one, and you lose only $1/74 on the average on a $1 "even-money" bet);
      or read about the Martingale Roulette Betting System (local copy)--DISCLAIMER: I do not endorse any on-line gambling sites; in fact, take my advice and do not gamble at all!

      • see also: Wald's Identity
    • Virtual Laboratories in Probability and Statistics > Special Models > G. Random Walk (see Applets: Random Walk Experiment)
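The American roulette figures quoted above (expected value -$2/38 and variance (36/38)^2 ((38/k)-1) for a $1 bet on k numbers) can be checked with exact arithmetic, and the gambler's ruin probability can be computed alongside them. The sketch below assumes the usual payout of (36/k - 1) to 1 on a bet covering k of the 38 pockets. The ruin formula and the bound (p/q)^(target - start) are the standard textbook versions and are only assumed to match Corollary 2.8 in spirit; the function names are made up for this illustration.

```python
from fractions import Fraction

BETS = (1, 2, 3, 4, 6, 9, 12, 18)  # numbers covered by a $1 bet


def american_roulette_moments(k):
    """Exact mean and variance of the net gain on a $1 bet covering k of 38 pockets."""
    p_win = Fraction(k, 38)
    gain_win = Fraction(36, k) - 1  # assumed payout: (36/k - 1) to 1
    gain_lose = Fraction(-1)
    mean = p_win * gain_win + (1 - p_win) * gain_lose
    second_moment = p_win * gain_win**2 + (1 - p_win) * gain_lose**2
    return mean, second_moment - mean**2


def closed_form_variance(k):
    """The variance formula quoted in the text: (36/38)^2 ((38/k) - 1)."""
    return Fraction(36, 38) ** 2 * (Fraction(38, k) - 1)


def win_probability(start, target, p):
    """P(reach `target` before 0) from `start` dollars, betting $1 at win probability p."""
    if p == Fraction(1, 2):
        return Fraction(start, target)
    r = (1 - p) / p
    return (r**start - 1) / (r**target - 1)


for k in BETS:
    mean, variance = american_roulette_moments(k)
    print(k, mean, variance, variance == closed_form_variance(k))

# Even-money bet on American roulette: try to turn $10 into $20.
p = Fraction(18, 38)
print(float(win_probability(10, 20, p)), float((p / (1 - p)) ** 10))
```

The expected loss is the same for every k, while the variance grows as k shrinks, which is why single-number bets feel so different from even-money bets despite the identical house edge.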

    Other readings (some journal articles)
    • On Least Absolute Deviation (LAD) lines
    • Average vs. median: driving speed on the highway -- Do You Know Your Relative Driving Speed?

    • Robust Locally Weighted Regression and Smoothing Scatterplots, Theory and Methods, Journal of the American Statistical Association (JASA) 74 (1979), pp. 829-836, by W. S. Cleveland (search at the database JSTOR or see a local copy)

    • Articles related to Marilyn vos Savant--from the Chance database

    • The famous controversial betting from Casablanca (note that you can read the whole script here). The actual footage is available as a video clip, in two quality levels, along with 2 scenes from the movie, the 2 posters in full, and a photo of the Oscar-winning director Michael Curtiz (a.k.a. Mihály Kertész, born Manó Kaminer) with Ingrid Bergman and Humphrey Bogart.

    • Gambling at the movies (local copy) ... Most played roulette number

    • Do good hands attract? by Stan Gudder, Mathematics Magazine 54 (1981), 13-16

    Office hours and other info:
    • T. Lengyel's home page
    • Office hours
    • Calendar

    Last updated: Friday, May 20, 2022 by tl