View datasets-numeric pollution (public)

2010-11-06 09:57 by mldata | Version 1
Summary

(No information yet)

License
unknown (from Weka repository)
Dependencies
Tags
arff slurped Weka
Attribute Types
Floating Point
Download
# Instances: 60 / # Attributes: 16
HDF5 (19.0 KB) XML CSV ARFF LibSVM Matlab Octave
Original Data Format
arff
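
Since the original format is ARFF, a minimal reader for numeric-only files can be sketched as below. The SAMPLE text is a hypothetical three-attribute excerpt assembled from the attribute names and preview values on this page, not the actual file contents, and parse_numeric_arff is an illustrative helper name. Quoted attribute names, missing values ('?') and sparse rows are deliberately not handled.

```python
# Hypothetical excerpt: three of the 16 attributes, values taken from the
# data preview on this page. The real pollution.arff layout is an assumption.
SAMPLE = """\
% comment lines start with a percent sign
@relation pollution
@attribute PREC numeric
@attribute JANT numeric
@attribute JULT numeric
@data
36,27,71
35,23,72
"""

NUMERIC_TYPES = ("numeric", "real", "integer")


def parse_numeric_arff(text):
    """Return (attribute_names, rows) for an ARFF text with numeric columns only."""
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            # Dense data row: comma-separated numeric values.
            rows.append([float(value) for value in line.split(",")])
        elif line.lower().startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            if kind.strip().lower() not in NUMERIC_TYPES:
                raise ValueError("non-numeric attribute: " + name)
            names.append(name)
        elif line.lower().startswith("@data"):
            in_data = True
        # Any other header line (for example @relation) is ignored.
    return names, rows


names, rows = parse_numeric_arff(SAMPLE)
```

For real use, an established ARFF reader is likely a better choice than this sketch.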
Name
pollution
Version mldata
0
Comment

Data from StatLib (ftp stat.cmu.edu/datasets)

This is the pollution data so loved by writers of papers on ridge regression.

Source: McDonald, G.C. and Schwing, R.C. (1973) 'Instabilities of regression estimates relating air pollution to mortality', Technometrics, vol.15, 463-482.

Variables in order:
  1. PREC: Average annual precipitation in inches
  2. JANT: Average January temperature in degrees F
  3. JULT: Same for July
  4. OVR65: % of 1960 SMSA population aged 65 or older
  5. POPN: Average household size
  6. EDUC: Median school years completed by those over 22
  7. HOUS: % of housing units which are sound & with all facilities
  8. DENS: Population per sq. mile in urbanized areas, 1960
  9. NONW: % non-white population in urbanized areas, 1960
  10. WWDRK: % employed in white collar occupations
  11. POOR: % of families with income < $3000
  12. HC: Relative hydrocarbon pollution potential
  13. NOX: Same for nitric oxides
  14. SO@: Same for sulphur dioxide
  15. HUMID: Annual average % relative humidity at 1pm
  16. MORT: Total age-adjusted mortality rate per 100,000
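
The comment above mentions ridge regression, which replaces the least-squares solution with beta = (X'X + lam*I)^-1 X'y. A minimal sketch for two predictors and no intercept follows. The numbers are synthetic and chosen only to make the two predictors nearly collinear; they are not taken from the pollution data, and ridge_two_predictors is an illustrative name.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Solve (X'X + lam*I) beta = X'y for two predictors, without an intercept."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 system
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Synthetic, nearly collinear predictors (assumption: illustration only).
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 2.1, 2.9, 4.2]
y = [a + 2.0 * b for a, b in zip(x1, x2)]  # exact relation y = x1 + 2*x2

ols = ridge_two_predictors(x1, x2, y, 0.0)     # lam = 0 is ordinary least squares
shrunk = ridge_two_predictors(x1, x2, y, 1.0)  # lam > 0 shrinks the coefficients
```

With lam = 0 the function reduces to ordinary least squares; a positive lam adds to the diagonal of X'X, which keeps the system well conditioned when predictors are strongly correlated.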

Names
PREC,JANT,JULT,OVR65,POPN,EDUC,HOUS,DENS,NONW,WWDRK,
Types
  1. numeric
  2. numeric
  3. numeric
  4. numeric
  5. numeric
  6. numeric
  7. numeric
  8. numeric
  9. numeric
  10. numeric
Data (first 10 data points)
    PREC JANT JULT OVR65 POPN EDUC HOUS DENS NONW WWDRK ...
    36.0 27.0 71.0 8.1 3.34 11.4 81.5 3243.0 8.8 42.6 ...
    35.0 23.0 72.0 11.1 3.14 11.0 78.8 4281.0 3.5 50.7 ...
    44.0 29.0 74.0 10.4 3.21 9.8 81.6 4260.0 0.8 39.4 ...
    47.0 45.0 79.0 6.5 3.41 11.1 77.5 3125.0 27.1 50.2 ...
    43.0 35.0 77.0 7.6 3.44 9.6 84.6 6441.0 24.4 43.7 ...
    53.0 45.0 80.0 7.7 3.45 10.2 66.8 3325.0 38.5 43.1 ...
    43.0 30.0 74.0 10.9 3.23 12.1 83.9 4679.0 3.5 49.2 ...
    45.0 30.0 73.0 9.3 3.29 10.6 86.0 2140.0 5.3 40.4 ...
    36.0 24.0 70.0 9.0 3.31 10.5 83.2 6582.0 8.1 42.5 ...
    36.0 27.0 72.0 9.5 3.36 10.7 79.3 4213.0 6.7 41.0 ...
    ... ... ... ... ... ... ... ... ... ... ...
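
The preview above can be read into named columns for a quick sanity check. The sketch below copies the ten preview rows verbatim, skips the '...' placeholders, and computes the minimum, mean and maximum per column. It covers only the ten rows and ten columns shown here, not the full 60 x 16 dataset; parse_preview is an illustrative name.

```python
PREVIEW = """\
PREC JANT JULT OVR65 POPN EDUC HOUS DENS NONW WWDRK ...
36.0 27.0 71.0 8.1 3.34 11.4 81.5 3243.0 8.8 42.6 ...
35.0 23.0 72.0 11.1 3.14 11.0 78.8 4281.0 3.5 50.7 ...
44.0 29.0 74.0 10.4 3.21 9.8 81.6 4260.0 0.8 39.4 ...
47.0 45.0 79.0 6.5 3.41 11.1 77.5 3125.0 27.1 50.2 ...
43.0 35.0 77.0 7.6 3.44 9.6 84.6 6441.0 24.4 43.7 ...
53.0 45.0 80.0 7.7 3.45 10.2 66.8 3325.0 38.5 43.1 ...
43.0 30.0 74.0 10.9 3.23 12.1 83.9 4679.0 3.5 49.2 ...
45.0 30.0 73.0 9.3 3.29 10.6 86.0 2140.0 5.3 40.4 ...
36.0 24.0 70.0 9.0 3.31 10.5 83.2 6582.0 8.1 42.5 ...
36.0 27.0 72.0 9.5 3.36 10.7 79.3 4213.0 6.7 41.0 ...
... ... ... ... ... ... ... ... ... ... ...
"""


def parse_preview(text):
    """Parse the whitespace-delimited preview into {column name: [floats]}."""
    lines = [line.split() for line in text.strip().splitlines()]
    header = [token for token in lines[0] if token != "..."]
    columns = {name: [] for name in header}
    for tokens in lines[1:]:
        values = [token for token in tokens if token != "..."]
        if not values:
            continue  # a row made only of '...' placeholders
        for name, token in zip(header, values):
            columns[name].append(float(token))
    return columns


columns = parse_preview(PREVIEW)
# (min, mean, max) per column, over the ten preview rows only.
summary = {
    name: (min(vals), sum(vals) / len(vals), max(vals))
    for name, vals in columns.items()
}
```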
Description

A jarfile containing 37 regression problems, obtained from various sources (datasets-numeric.jar, 169,344 Bytes).

URLs
(No information yet)
Publications
    Data Source
    Measurement Details
    Usage Scenario


    This item was downloaded 4649 times and viewed 2322 times.


    Acknowledgements

    This project is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning)
    http://www.pascal-network.org/.