DATASETS FOR DSCI 415 (these will be updated regularly!)
Datasets in Book
Section 1 - Graphics in R
- Boston Housing Data - Boston.JMP, Boston.txt
- US Cities - Old data on 77 largest U.S. cities - City77.JMP, City77.csv
- Olive Oils from Italy - Olives.JMP, Olives.txt; a slightly different version is contained in the data object Olives in mult.Rdata
- University of California, Berkeley admissions - UCBAdmissions in the datasets package that comes with base R
- Fisher's Iris Data - iris in the datasets library that comes with base R
- L.A. Ozone Data - Ozdata.JMP, Ozdata.csv, Ozdata in mult.Rdata
- Swiss Franc Data - Swiss.JMP, Swiss.csv, Swiss in mult.Rdata
- NHL Data - NHL.JMP, NHL.csv, NHL in mult.Rdata
- Crab data from Kodiak Island in Alaska - kodiak.csv, survey.csv, kodiak.crab and survey.crab in mult.Rdata
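Two of the datasets listed above (iris and UCBAdmissions) ship with base R, and several others are stored as objects in mult.Rdata. The following is a minimal sketch of how they might be loaded and inspected. It assumes mult.Rdata has been saved to the current working directory; the variable names (iris_dims, ucb_dims, admit_by_gender, loaded_objects) are illustrative only.

```r
# iris and UCBAdmissions ship with base R in the datasets package,
# so no file needs to be downloaded for them.
data(iris)
data(UCBAdmissions)

# iris is a data frame: 150 flowers, 4 measurements plus Species.
iris_dims <- dim(iris)
str(iris)

# UCBAdmissions is a 3-way contingency table: Admit x Gender x Dept.
ucb_dims <- dim(UCBAdmissions)
ftable(UCBAdmissions)  # flatten the table for easier reading

# Collapse over department to get admit-by-gender counts.
admit_by_gender <- margin.table(UCBAdmissions, c(1, 2))
print(admit_by_gender)

# Assumption: mult.Rdata sits in the working directory. load() returns
# the names of the objects it restores (e.g. Olives, Ozdata, Swiss, NHL).
if (file.exists("mult.Rdata")) {
  loaded_objects <- load("mult.Rdata")
  print(loaded_objects)
}
```

If mult.Rdata is stored elsewhere, pass its full path to load() instead.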
Section 2 - Measuring Distance
Section 3 - Multidimensional Scaling
Section 4 - Principal Components and Other Dimension Reduction Methods
Section 5 - Cluster Analysis
Section 6 - Correspondence Analysis
Section 7 - Association Rules
Section 8 - Recommender Systems
Section 9 - Text Mining and Sentiment Analysis
Datasets for Assignments
Assignment 1 - Matrix Algebra and Graphics in R
- US Cities - Old data on 77 largest U.S. cities - City77.JMP, City77.csv, City in mult.Rdata
- Olive Oils from Italy - Olives.JMP, Olives.txt; a slightly different version is contained in the data object Olives in mult.Rdata
- University of California, Berkeley admissions - UCBAdmissions in the datasets package that comes with base R
- Fisher's Iris Data - iris in the datasets library that comes with base R
- NHL Data - NHL.JMP, NHL.csv, NHL in mult.Rdata
- Swiss Franc Data - Swiss.JMP, Swiss.csv, Swiss in mult.Rdata
Assignment 2 - Distance and Multidimensional Scaling
Assignment 3 - Principal Component Analysis (PCA)
Assignment 4 - Independent Component Analysis (ICA)
Assignment 5 - Cluster Analysis
Assignment 6 - Multiple Correspondence Analysis (MCA)
Assignment 7 - Association Rules
Assignment 8 - Recommender Systems
- Movie and Genre (HW 8).csv - this file contains information about the movies in the rating matrix, such as title, month released, year released, run time, and the genres each movie falls under.
- Rating Matrix (HW 8).csv - this file contains the user ratings for the movies (# of users = 9515, # of movies = 1471)
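After downloading the rating matrix, it can be worth checking that it has the stated dimensions. The sketch below is one way this might be done. The helper names read_ratings and matches_expected are hypothetical. The file is assumed to be in the working directory with a header row. This page does not say whether users are rows or columns, so the check accepts either orientation.

```r
# Hypothetical helper: read a ratings CSV and report its shape.
# check.names = FALSE keeps column headers exactly as they appear in the file.
read_ratings <- function(path) {
  ratings <- read.csv(path, check.names = FALSE)
  message("Read ", nrow(ratings), " rows and ", ncol(ratings), " columns")
  ratings
}

# Hypothetical helper: TRUE if the dimensions match the stated user and
# movie counts in either orientation (users x movies or movies x users).
matches_expected <- function(ratings, n_users = 9515, n_movies = 1471) {
  d <- dim(ratings)
  all(d == c(n_users, n_movies)) || all(d == c(n_movies, n_users))
}

path <- "Rating Matrix (HW 8).csv"  # assumed location: working directory
if (file.exists(path)) {
  ratings <- read_ratings(path)
  if (!matches_expected(ratings)) {
    warning("Dimensions differ from the 9515 x 1471 stated above; ",
            "check for an ID column or a different layout.")
  }
}
```

A mismatch does not necessarily mean the download is bad; an extra identifier column would be enough to trigger the warning.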
===============================================================================================================================
Italian Olive Oils
Olives.JMP
Olives.txt
Brazil Faces
Brazil.csv (each column contains the pixels for one of the 200 faces in the database)
Car Images
Letter Recognition Data
Letter-recognition.JMP
Milk Truck Datasets (Milktruck, Milkdiesel, Milkgasoline in R)
Milktruck.JMP, Milkdiesel.JMP, Milkgasoline.JMP
Minnesota Districts, Teachers, and 8th Grade Test Scores (MNteachtest in R)
MNteachtest.JMP
NHL Skater Stats
PuckAnalytics.JMP
PuckAnalytics.csv
NCI Data
NCI.JMP
NCI transpose.JMP
Nutritional Data on Fast Food (Nutritional.Small and Nutritional.Large in R)
Nutritional (small).JMP and Nutritional (large).JMP
Orthopedic Sales Data
Orthopedic Sales.JMP
Orthopedic Sales.txt
Radiotherapy (Radiotherapy in R)
Radiotherapy.JMP
Salespeople
Salespeople.JMP
Sclerosis
Schlerosis.JMP
Sports Difficulty
Sports Difficulty.JMP (description file available)
Sports Difficulty.txt
Trackwomen (Trackwomen in R)
Trackwomen.JMP
Trackmen (Trackmen in R)
Trackmen.JMP
ZIP Code (Digit Recognition Data - zip.train and zip.test in ElemStatLearn package)
ZipTrain.JMP
ZipTest.JMP