View uci-20070111 fishcatch (public)

2011-09-14 15:25 by mldata | Version 1
Summary

(No information yet)

License
unknown (from Weka repository)
Dependencies
Tags
arff slurped Weka
Attribute Types
Integer,Floating Point
Download
# Instances: 158 / # Attributes: 8
HDF5 (21.5 KB) XML CSV ARFF LibSVM Matlab Octave
Original Data Format
arff
Name
'fishcatch'
Version mldata
0
Comment

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Weight treated as the class attribute. Identifier deleted.

As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems. Singapore: Springer-Verlag.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

NAME: fishcatch
TYPE: Sample
SIZE: 159 observations, 8 variables

DESCRIPTIVE ABSTRACT:

159 fish of 7 species were caught and measured. Altogether there are 8 variables. All the fish were caught from the same lake (Laengelmaevesi) near Tampere in Finland.

SOURCES: Brofeldt, Pekka: Bidrag till kaennedom om fiskbestondet i vaara sjoear. Laengelmaevesi. T.H.Jaervi: Finlands Fiskeriet Band 4, Meddelanden utgivna av fiskerifoereningen i Finland. Helsingfors 1917

VARIABLE DESCRIPTIONS:

 1  Obs      Observation number, ranges from 1 to 159
 2  Species  (Numeric)

      Code  Finnish  Swedish    English    Latin
      1     Lahna    Braxen     Bream      Abramis brama
      2     Siika    Iiden      Whitefish  Leuciscus idus
      3     Saerki   Moerten    Roach      Leuciscus rutilus
      4     Parkki   Bjoerknan  ?          Abramis bjoerkna
      5     Norssi   Norssen    Smelt      Osmerus eperlanus
      6     Hauki    Jaedda     Pike       Esox lucius
      7     Ahven    Abborre    Perch      Perca fluviatilis

 3  Weight   Weight of the fish (in grams)
 4  Length1  Length from the nose to the beginning of the tail (in cm)
 5  Length2  Length from the nose to the notch of the tail (in cm)
 6  Length3  Length from the nose to the end of the tail (in cm)
 7  Height%  Maximal height as % of Length3
 8  Width%   Maximal width as % of Length3
 9  Sex      1 = male, 0 = female

      ___/////___                  _
     /           \    ___          |
   /\             \_ /  /          H
 <   )            __)  \           |
   \/_\\_________/   \__\          _

 |------- L1 -------|
 |------- L2 ----------|
 |------- L3 ------------|

Values are aligned and delimited by blanks. Missing values are denoted with NA. There is one data line for each case.
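
The layout described above can be read with a few lines of Python. This is a minimal sketch and not part of the original submission: the function name `parse_line` and the field identifiers are assumptions derived from the variable descriptions, and the sample line is observation 143 as quoted in the pedagogical notes.

```python
# Field identifiers follow the VARIABLE DESCRIPTIONS; the exact names are illustrative.
FIELDS = ["Obs", "Species", "Weight", "Length1", "Length2",
          "Length3", "HeightPct", "WidthPct", "Sex"]
INT_FIELDS = {"Obs", "Species", "Sex"}

def parse_line(line):
    """Split one blank-delimited data line into a dict; 'NA' becomes None."""
    tokens = line.split()
    if len(tokens) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(tokens)}")
    record = {}
    for name, token in zip(FIELDS, tokens):
        if token == "NA":
            record[name] = None          # missing value
        elif name in INT_FIELDS:
            record[name] = int(token)    # codes and identifiers
        else:
            record[name] = float(token)  # measurements
    return record

record = parse_line("143  7  840.0 32.5  35.0  37.3  30.8  20.9  0")
print(record["Species"], record["Weight"], record["Sex"])
```

Splitting on arbitrary runs of whitespace keeps the parser independent of the exact column alignment.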

SPECIAL NOTES: I have usually calculated Height = Height% * Length3 / 100 and Width = Width% * Length3 / 100.
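
As a small illustration of this conversion, the sketch below turns the two percentage columns into absolute sizes. The function name `absolute_size` is hypothetical, and the result is assumed to be in cm because Length3 is given in cm. The input values are those of observation 143 from the pedagogical notes.

```python
def absolute_size(length3_cm, height_pct, width_pct):
    """Convert Height% and Width% (percent of Length3) to absolute sizes."""
    height_cm = height_pct * length3_cm / 100.0
    width_cm = width_pct * length3_cm / 100.0
    return height_cm, width_cm

# Observation 143: Length3 = 37.3, Height% = 30.8, Width% = 20.9
height_cm, width_cm = absolute_size(37.3, 30.8, 20.9)
print(round(height_cm, 2), round(width_cm, 2))
```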

PEDAGOGICAL NOTES: I have mainly used only Species=7 (Perch). Here are some of the models and tests we have used:

  Weight=a+b*(Length3*Height*Width)+epsilon
     Ho: a=0;
     Heteroscedastic case. Question: What is the proper weighting
     if you use Length3 as a weighting variable?

  Log(Weight)=a+b1*Length3+epsilon

  Weight^(1/3)=a+b1*Length3+epsilon
  (Given by Box-Cox-transformation)
     Ho: a=0;

  Log(Weight)=a+b1*Length3+b2*Height+b3*Width+epsilon
     Ho: b1+b2+b3=3;  
     i.e. dimension of the fish = 3

  Weight^(1/3)=a+b1*Length3+b2*Height+b3*Width+epsilon
  (Given by Box-Cox-transformation)
     Ho: a=0;

  Weight=a*Length3^b1*Height^b2*Width^b3+epsilon
     Nonlinear, heteroscedastic case.
     What is the proper weighting?

  Is obs 143

  143  7  840.0 32.5  35.0  37.3  30.8  20.9  0

  an outlier? It had 6 roach in its stomach.
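
Two of the models above can be fitted by ordinary least squares, as sketched here with NumPy. This is an illustration under stated assumptions, not the submitter's analysis: the helper `ols` is hypothetical, the fit is unweighted (so it ignores the heteroscedasticity question), and the ten rows are the bream (Species=1) values shown in the data preview on this page, used only because no perch rows are reproduced here.

```python
import numpy as np

# Columns: Length3 (cm), Height%, Width%, Weight (g).
# First ten data points from the preview on this page (Species=1, bream).
rows = np.array([
    [30.0, 38.4, 13.4, 242.0],
    [31.2, 40.0, 13.8, 290.0],
    [31.1, 39.8, 15.1, 340.0],
    [33.5, 38.0, 13.3, 363.0],
    [34.0, 36.6, 15.1, 430.0],
    [34.7, 39.2, 14.2, 450.0],
    [34.5, 41.1, 15.3, 500.0],
    [35.0, 36.2, 13.4, 390.0],
    [35.1, 39.9, 13.8, 450.0],
    [36.2, 39.3, 13.7, 500.0],
])
length3, height_pct, width_pct, weight = rows.T

# Absolute height and width as described in the SPECIAL NOTES.
height = height_pct * length3 / 100.0
width = width_pct * length3 / 100.0

def ols(y, *predictors):
    """Unweighted least-squares fit of y = a + b1*x1 + ...; returns (coef, residuals)."""
    X = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

# Weight = a + b*(Length3*Height*Width) + epsilon
volume_coef, volume_resid = ols(weight, length3 * height * width)

# Weight^(1/3) = a + b1*Length3 + epsilon
cuberoot_coef, cuberoot_resid = ols(np.cbrt(weight), length3)

print("volume model    a, b :", volume_coef)
print("cube-root model a, b1:", cuberoot_coef)
```

Inspecting the intercepts gives a first impression of the a=0 hypotheses, but a formal test and a weighting scheme would need the full data set.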

REFERENCES: Brofeldt, Pekka: Bidrag till kaennedom om fiskbestondet i vaara sjoear. Laengelmaevesi. T.H.Jaervi: Finlands Fiskeriet Band 4, Meddelanden utgivna av fiskerifoereningen i Finland. Helsingfors 1917

SUBMITTED BY: Juha Puranen, Department of Statistics, PL33 (Aleksanterinkatu 7), 000014 University of Helsinki, Finland. E-mail: jpuranen@noppa.helsinki.fi

Names
Species,Length1,Length2,Length3,Height,Width,Sex,class,
Types
  1. nominal:1,2,3,4,5,6,7
  2. numeric
  3. numeric
  4. numeric
  5. numeric
  6. numeric
  7. nominal:1,0
  8. numeric
Data (first 10 data points)
    Spec... Leng... Leng... Leng... Height Width Sex class
    1.0 23.2 25.0 30.0 38.4 13.4 nan 242.0
    1.0 24.0 26.0 31.2 40.0 13.8 nan 290.0
    1.0 23.9 26.0 31.1 39.8 15.1 nan 340.0
    1.0 26.3 29.0 33.5 38.0 13.3 nan 363.0
    1.0 26.5 29.0 34.0 36.6 15.1 nan 430.0
    1.0 26.8 29.0 34.7 39.2 14.2 nan 450.0
    1.0 26.8 29.0 34.5 41.1 15.3 nan 500.0
    1.0 27.6 30.0 35.0 36.2 13.4 nan 390.0
    1.0 27.6 30.0 35.1 39.9 13.8 nan 450.0
    1.0 28.5 30.0 36.2 39.3 13.7 nan 500.0
    ... ... ... ... ... ... ... ...
Description

A gzip'ed tar containing UCI and UCI KDD datasets (uci-20070111.tar.gz, 17,952,832 Bytes)

URLs
(No information yet)
Publications
    Data Source
    http://www.ics.uci.edu/~mlearn/MLRepository.html http://kdd.ics.uci.edu/