Datasets for DSCI 425

These datasets are provided as comma-delimited (.csv) files, a format that both R and JMP read directly. Many are also provided as native JMP (.JMP) files.
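As a minimal sketch of reading one of these files into R, assuming the file has already been downloaded to a local folder. The helper name read_course_csv and its data_dir argument are illustrative only and are not part of the course materials.

```r
# Hypothetical helper: read one of the course .csv files from a local folder.
# `data_dir` is an assumed download location, not something the course specifies.
read_course_csv <- function(filename, data_dir = ".") {
  path <- file.path(data_dir, filename)
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  # File names containing spaces or parentheses need no special quoting here.
  read.csv(path, stringsAsFactors = FALSE)
}

# Example usage, assuming bodyfat.csv is in the working directory:
# bodyfat <- read_course_csv("bodyfat.csv")
# str(bodyfat)
```

Several file names in the lists below contain spaces and parentheses, for example King County Homes (train).csv. Passing the name as an ordinary quoted string, as above, should be enough.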

Datasets from Sections 2 and 3

Body Fat - bodyfat.csv, Bodyfat.JMP

Saratoga NY Homes - Saratoga NY Homes.csv, Saratoga NY Homes.JMP

 

Datasets from Section 4 - ACE/AVAS

PCB trout - PCBtrout.csv, PCBtrout.JMP

Mammals - Mammals.csv, Mammals.JMP

LA Basin Ozone - Ozone.csv, Ozone.JMP

Boston Housing Data - Boston_Housing.csv, Boston Housing.JMP


Assignment 1 - Datasets

King County Homes: King County Homes (train).csv, King County Homes (test).csv

 

Datasets from Section 5 - MARS

LA Basin Ozone - Ozone.csv, Ozone.JMP

Saratoga NY Homes - Saratoga NY Homes.csv, Saratoga NY Homes.JMP

Boston Housing Data - Boston_Housing.csv, Boston Housing.JMP

 

Datasets from Section 6 - Projection Pursuit Regression

Florida Largemouth Bass - Bass.csv, Bass.JMP

Abalone - Abalone.csv, Abalone.JMP

 

Datasets from Section 7 - Neural Networks

Boston Housing Data - Boston_Housing.csv, Boston Housing.JMP

Abalone - Abalone.csv, Abalone.JMP

California Homes - CAhomes.csv, CAhomes.JMP

Twin Cities Homes (from Redfin www.redfin.com) - TwinCitiesRedfin.csv


Assignment 2 - Dataset

Compressive Strength of Concrete - Concrete.csv, Concrete.JMP

 

Datasets from Section 8 - Regularized/Penalized Regression Methods

Body Fat - bodyfat.csv, Bodyfat.JMP

Seat Position - Seat Position.csv, Seat Position.JMP

Chemical Permeability - Permeability Orig.csv, Permeability LogY.csv, Permeability Orig.JMP, Permeability LogY.JMP


Assignment 3 - Datasets

Abalone - Abalone.csv, Abalone.JMP

Twin Cities Home Prices - TC Homes (Train).JMP, TC Homes (Test).JMP

 


Datasets from Section 9 - Dimension Reduction Methods - PCR and PLS Regression

yarn - contained in the pls package.

 

Assignment 4 - Datasets

College - contained in the ISLR package, which must be installed from CRAN.

Lu2004.csv
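For the datasets that ship inside R packages rather than as files, loading might look like the sketch below. The helper load_pkg_data is a hypothetical name introduced here for illustration. It relies only on base R's requireNamespace() and data(), and it assumes the package has been installed beforehand.

```r
# Hypothetical helper: load a named dataset from an installed package and return it.
load_pkg_data <- function(name, pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is not installed; try install.packages(\"", pkg, "\")")
  }
  # Load into a private environment so nothing in the workspace is overwritten.
  env <- new.env()
  data(list = name, package = pkg, envir = env)
  get(name, envir = env)
}

# Example usage (run install.packages("ISLR") once beforehand):
# College <- load_pkg_data("College", "ISLR")
# str(College)
```

Loading into a private environment first is a design choice rather than a requirement. It keeps an existing object with the same name from being overwritten without notice.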

 

Datasets from Section 10 - Tree-based Regression Models

LA Basin Ozone - Ozone.csv, Ozone.JMP

Cities - City 77.csv

CPUs - CPUs.csv, CPUs.JMP

Twin Cities Homes - TC Homes (train).csv, TC Homes (train).JMP and TC Homes (test).csv, TC Homes (test).JMP

San Francisco Homes - SF Homes.txt, SF Homes.JMP

Compressive Strength of Concrete - Concrete.csv, Concrete.JMP

QSAR Melting Point - QSARmtp.csv (XGBoost Example)

Assignment 5 - Datasets

Boston (with lat long).csv

cars - contained in the caret package, which must be installed from CRAN.
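One point worth checking with the cars data: base R's datasets package also provides a data frame named cars (speed and stopping distance), so a bare reference to cars may not be the caret version. The sketch below loads the caret copy explicitly, assuming caret is installed. The variable names base_cars and caret_cars are illustrative.

```r
# Base R's built-in cars data: two columns, speed and dist.
base_cars <- datasets::cars

if (requireNamespace("caret", quietly = TRUE)) {
  # Load caret's cars into a separate environment to keep the two apart.
  caret_env <- new.env()
  data(cars, package = "caret", envir = caret_env)
  caret_cars <- get("cars", envir = caret_env)

  # The two objects share a name but not a structure; compare their dimensions.
  print(dim(base_cars))
  print(dim(caret_cars))
} else {
  message("caret is not installed; run install.packages(\"caret\") first")
}
```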

 

Midterm Project Datasets

Solubility Training Data - SoluTrain.csv

Solubility Test Data - SoluTest.csv

 

Datasets from Section 11 - Nearest Neighbor Regression

City77.csv - used in example regarding statistical distance

Saratoga NY Homes - Saratoga NY Homes.csv, Saratoga NY Homes.JMP

Compressive Strength of Concrete - Concrete.csv, Concrete.JMP

 

Datasets from Section 13 - Nearest Neighbor Classification

Italian Olive Oils - OlivesOils.csv, Olives.JMP

Digit Recognition Data - ZIPtrain.csv, ZIPtest.csv

Water Bears - WaterBears.csv, WaterBears.JMP

 

Datasets from Section 14 - Naive Bayes Classification

Italian Olive Oils - OlivesOils.csv, Olives.JMP

Digit Recognition Data - ZIPtrain.csv, ZIPtest.csv

Water Bears - WaterBears.csv, WaterBears.JMP

Mushroom Data - Mushrooms.csv, Mushrooms.JMP

 

Assignment 6 - Datasets

Satellite Image - SATimage.csv

Music Genre Identification - GenreTrain.csv and GenreTest.csv

Oil Identification - Oils.csv

 

Assignment 7 - Datasets

Satellite Image - SATimage.csv

Music Genre Identification - Gtrain.csv and Gtest.csv

 

Datasets from Section 15 - Tree-based Models for Classification

Italian Olive Oils - OlivesOils.csv, Olives.JMP

Digit Recognition Data - ZIPtrain.csv, ZIPtest.csv

Water Bears - WaterBears.csv, WaterBears.JMP

Mushroom Data - Mushrooms.csv, Mushrooms.JMP

Credit Default Data - Credit Card Default.csv

Cleveland Heart Disease Data - Cleveland.csv

Credit Card Unbalanced - InbalancedCredit.csv


Assignment 8 - Datasets

Satellite Image - SATimage.csv

Music Genre Identification - GenreTrain.csv and GenreTest.csv

 

Assignment 9 - Datasets

Satellite Image - SATimage.csv

Alzheimers - Alzheimers.csv


 

Final Project Datasets

Water Solubility - WaterSol (Train).csv and WaterSol (Test).csv

Alzheimers - Alz Train (Final).csv and Alz Test (Final).csv

Polk County, IA Home Selling Prices - Polk Train (Final).csv and Polk Test (Final).csv