View datasets-UCI iris (public)
- Summary
The classic iris flower data
- License
- unknown (from Weka repository)
- Dependencies
- Tags
- arff slurped Weka
- Attribute Types
- Floating Point,String
- Download
- # Instances: 150 / # Attributes: 5
- HDF5 (24.4 KB), XML, CSV, ARFF, LibSVM, Matlab, Octave
- Original Data Format
- arff
- Name
- iris
- Version mldata
- 0
- Comment
Title: Iris Plants Database
Sources: (a) Creator: R.A. Fisher (b) Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov) (c) Date: July, 1988
Past Usage:
Publications: too many to mention!!! Here are a few.
Fisher, R.A. "The use of multiple measurements in taxonomic problems". Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).
Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis. (Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System Structure and Classification Rule for Recognition in Partially Exposed Environments". IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-2, No. 1, 67-71. -- Results: -- very low misclassification rates (0% for the setosa class)
Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule". IEEE Transactions on Information Theory, May 1972, 431-433. -- Results: -- very low misclassification rates again
See also: 1988 MLC Proceedings, 54-64. Cheeseman et al's AUTOCLASS II conceptual clustering system finds 3 classes in the data.
Relevant Information: This is perhaps the best-known database in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other.
Predicted attribute: class of iris plant.
This is an exceedingly simple domain.
Number of Instances: 150 (50 in each of three classes)
Number of Attributes: 4 numeric, predictive attributes and the class
Attribute Information:
sepal length in cm
sepal width in cm
petal length in cm
petal width in cm
class: Iris Setosa, Iris Versicolor, Iris Virginica
Missing Attribute Values: None
Summary Statistics:
              Min  Max  Mean  SD    Class Correlation
sepal length: 4.3  7.9  5.84  0.83   0.7826
sepal width:  2.0  4.4  3.05  0.43  -0.4194
petal length: 1.0  6.9  3.76  1.76   0.9490 (high!)
petal width:  0.1  2.5  1.20  0.76   0.9565 (high!)
Class Distribution: 33.3% for each of 3 classes.
- Names
- sepallength,sepalwidth,petallength,petalwidth,class
- Types
- numeric
- numeric
- numeric
- numeric
- nominal:Iris-setosa,Iris-versicolor,Iris-virginica
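The names and types above map directly onto an ARFF header, the original format of this dataset. The following is a minimal sketch of reading such a file in plain Python, not the loader used by this site. `ARFF_TEXT` and `parse_arff` are hypothetical names introduced for illustration. The two data rows reuse the first two previewed measurements; their class label `Iris-setosa` is an assumption, since the preview truncates it to `Iris...`. The sketch handles only unquoted numeric and nominal attributes.

```python
# Hypothetical in-memory ARFF sample built from the Names and Types fields.
# The class label on the data rows is an assumption (the page truncates it).
ARFF_TEXT = """\
@relation iris
@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}
@data
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
"""


def parse_arff(text):
    """Parse a simple ARFF document into (attributes, rows).

    attributes is a list of (name, kind) pairs, where kind is the string
    "numeric" or a list of allowed nominal values. rows is a list of dicts
    keyed by attribute name.
    """
    attributes = []
    rows = []
    in_data = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        if in_data:
            values = [value.strip() for value in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                # Numeric attributes become floats; nominal ones stay strings.
                row[name] = float(value) if kind == "numeric" else value
            rows.append(row)
        elif line.lower().startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            if kind.startswith("{"):
                kind = [v.strip() for v in kind.strip("{}").split(",")]
            else:
                kind = kind.lower()
            attributes.append((name, kind))
        elif line.lower().startswith("@data"):
            in_data = True
    return attributes, rows


ATTRIBUTES, ROWS = parse_arff(ARFF_TEXT)
print([name for name, _ in ATTRIBUTES])
print(ROWS[0])
```

For real files with quoted values, sparse rows, or date attributes, an established ARFF reader is the safer choice.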
- Data (first 10 data points)
sepa...  sepa...  peta...  peta...  class
5.1      3.5      1.4      0.2      Iris...
4.9      3.0      1.4      0.2      Iris...
4.7      3.2      1.3      0.2      Iris...
4.6      3.1      1.5      0.2      Iris...
5.0      3.6      1.4      0.2      Iris...
5.4      3.9      1.7      0.4      Iris...
4.6      3.4      1.4      0.3      Iris...
5.0      3.4      1.5      0.2      Iris...
4.4      2.9      1.4      0.2      Iris...
4.9      3.1      1.5      0.1      Iris...
...      ...      ...      ...      ...
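As a rough illustration of how per-attribute figures such as Min, Max, Mean and SD can be derived, the sketch below summarises the four numeric columns of the 10 previewed data points. `PREVIEW_ROWS`, `NAMES` and `column_summary` are hypothetical names, and the use of the sample standard deviation is an assumption. Because only 10 of the 150 instances are included, the results will differ from the full-dataset statistics quoted in the Comment field.

```python
import statistics

# The 10 previewed data points, numeric columns only
# (sepal length, sepal width, petal length, petal width, in cm).
PREVIEW_ROWS = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (4.7, 3.2, 1.3, 0.2),
    (4.6, 3.1, 1.5, 0.2),
    (5.0, 3.6, 1.4, 0.2),
    (5.4, 3.9, 1.7, 0.4),
    (4.6, 3.4, 1.4, 0.3),
    (5.0, 3.4, 1.5, 0.2),
    (4.4, 2.9, 1.4, 0.2),
    (4.9, 3.1, 1.5, 0.1),
]
NAMES = ("sepallength", "sepalwidth", "petallength", "petalwidth")


def column_summary(rows, names):
    """Return {name: (min, max, mean, sample SD)} for each numeric column."""
    summary = {}
    for index, name in enumerate(names):
        column = [row[index] for row in rows]
        summary[name] = (
            min(column),
            max(column),
            statistics.mean(column),
            statistics.stdev(column),  # assumption: sample (n - 1) SD
        )
    return summary


SUMMARY = column_summary(PREVIEW_ROWS, NAMES)
for name, (low, high, mean, sd) in SUMMARY.items():
    print(f"{name}: min={low} max={high} mean={mean:.2f} sd={sd:.2f}")
```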
- Description
This is the classic Iris flower data set, collected by Edgar Anderson and used as an example of linear discriminant analysis by Ronald Fisher. See the Wikipedia page on the Iris flower data set.
Briefly, there are 150 instances, 50 each of Iris setosa, Iris versicolor, and Iris virginica. For each instance, there are measures of sepal length, sepal width, petal length, and petal width, in addition to the class indicator.
- URLs
- (No information yet)
- Publications
- Data Source
- http://www.ics.uci.edu/~mlearn/MLRepository.html
- Measurement Details
For details see: Edgar Anderson (1935). "The irises of the Gaspé Peninsula". Bulletin of the American Iris Society 59: 2–5.
- Usage Scenario
This dataset has been widely used as a test case for classification algorithms.
- revision 1
- by mldata on 2010-04-29 20:06
- revision 2
- by phoyer on 2010-08-31 09:12
- revision 3
- by phoyer on 2010-08-31 09:25
- revision 4
- by phoyer on 2010-08-31 07:27
- revision 5
- by phoyer on 2010-08-31 07:27
- revision 6
- by phoyer on 2010-08-31 09:30
- revision 7
- by phoyer on 2011-09-14 15:17
Disclaimer
We are acting in good faith to make datasets submitted for the use of the scientific community available to everybody, but if you are a copyright holder and would like us to remove a dataset please inform us and we will do it as soon as possible.
Acknowledgements
This project is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning)
http://www.pascal-network.org/.