View datasets-UCI vowel (public)
- Summary
(No information yet)
- License
- unknown (from Weka repository)
- Dependencies
- Tags
- arff slurped Weka
- Attribute Types
- Floating Point,String
- Download
-
# Instances: 990 / # Attributes: 14
HDF5 (248.8 KB), XML, CSV, ARFF, LibSVM, Matlab, Octave
Files are converted on demand and the process can take up to a minute. Please wait until download begins.
- Original Data Format
- arff
- Name
- vowel
- Version mldata
- 0
- Comment
Introduction
============
In my work on context-sensitive learning, I used the "Deterding Vowel Recognition Data", but I found it necessary to reformulate the data. Implicit in the original data is contextual information on the speaker's gender and identity. For my work, it was necessary to make this information explicit. The file "vowel-context.data" adds the speaker's sex and identity as new features. The format of the data file is described below.
Peter Turney peter@ai.iit.nrc.ca
References
==========
P. Turney. "Robust Classification With Context-Sensitive Features." Proceedings of the Sixth International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (IEA/AIE-93): 268-276. 1993.
URL: ftp://ai.iit.nrc.ca/pub/ksl-papers/NRC-35074.ps.Z
P. Turney. "Exploiting Context When Learning to Classify." Proceedings of the European Conference on Machine Learning (ECML-93): 402-407. 1993.
URL: ftp://ai.iit.nrc.ca/pub/ksl-papers/NRC-35058.ps.Z
File Structure
==============

Column    Description
-------------------------------
0         Train or Test
1         Speaker Number
2         Sex
3         Feature 0
4         Feature 1
5         Feature 2
6         Feature 3
7         Feature 4
8         Feature 5
9         Feature 6
10        Feature 7
11        Feature 8
12        Feature 9
13        Class

Numerical Codes
===============

Speaker   Code Number
---------------------------
Andrew    0
Bill      1
David     2
Mark      3
Jo        4
Kate      5
Penny     6
Rose      7
Mike      8
Nick      9
Rich      10
Tim       11
Sarah     12
Sue       13
Wendy     14

Set       Number
---------------------------
Train     0
Test      1

Sex       Number
---------------------------
Male      0
Female    1

Class     Number
---------------------------
hid       0
hId       1
hEd       2
hAd       3
hYd       4
had       5
hOd       6
hod       7
hUd       8
hud       9
hed       10

Speaker   Code Number   Sex   Train/Test
---------------------------------------------------------------
Andrew    0             0     0
Bill      1             0     0
David     2             0     0
Mark      3             0     0
Jo        4             1     0
Kate      5             1     0
Penny     6             1     0
Rose      7             1     0
Mike      8             0     1
Nick      9             0     1
Rich      10            0     1
Tim       11            0     1
Sarah     12            1     1
Sue       13            1     1
Wendy     14            1     1
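The column layout and code tables above can be applied mechanically to a record. The following is a minimal sketch, assuming the records in "vowel-context.data" are whitespace-separated with the 14 columns in the listed order; the separator is an assumption, and `decode_record` is a hypothetical helper, not part of the dataset. The sample record is illustrative only: its first seven feature values match the first row of the data preview on this page, while the last three features and the class code are placeholders.

```python
# Code tables copied from the "Numerical Codes" section; list index = code number.
SETS = ["Train", "Test"]
SEXES = ["Male", "Female"]
SPEAKERS = ["Andrew", "Bill", "David", "Mark", "Jo", "Kate", "Penny", "Rose",
            "Mike", "Nick", "Rich", "Tim", "Sarah", "Sue", "Wendy"]
CLASSES = ["hid", "hId", "hEd", "hAd", "hYd", "had",
           "hOd", "hod", "hUd", "hud", "hed"]

# Speaker code -> (sex code, set code), from the combined speaker table.
SPEAKER_CONTEXT = {
    0: (0, 0), 1: (0, 0), 2: (0, 0), 3: (0, 0),
    4: (1, 0), 5: (1, 0), 6: (1, 0), 7: (1, 0),
    8: (0, 1), 9: (0, 1), 10: (0, 1), 11: (0, 1),
    12: (1, 1), 13: (1, 1), 14: (1, 1),
}


def decode_record(line):
    """Hypothetical helper: decode one 14-column numeric record into labels."""
    fields = line.split()  # assumption: whitespace-separated columns
    if len(fields) != 14:
        raise ValueError(f"expected 14 columns, got {len(fields)}")
    set_code, speaker, sex = (int(f) for f in fields[:3])
    # The sex and set columns are determined by the speaker, so check them.
    if SPEAKER_CONTEXT[speaker] != (sex, set_code):
        raise ValueError("sex or set code disagrees with the speaker table")
    return {
        "set": SETS[set_code],
        "speaker": SPEAKERS[speaker],
        "sex": SEXES[sex],
        "features": [float(f) for f in fields[3:13]],
        "class": CLASSES[int(fields[13])],
    }


# Illustrative record: the trailing "0.0 0.0 0.0 0" values are placeholders.
sample = "0 0 0 -3.639 0.418 -0.67 1.779 -0.168 1.627 -0.388 0.0 0.0 0.0 0"
record = decode_record(sample)
print(record["set"], record["speaker"], record["sex"], record["class"])
```

Checking the sex and set columns against the speaker table is optional, but it catches rows whose columns have been shifted or mislabeled, since both values follow from the speaker code.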
Num Instances: 990 Num Attributes: 14 Num missing: 0 / 0.0%
     name              type   enum   ints   real   missing   distinct     (1)
 1   'Train or Test'   Enum   100%     0%     0%   0 / 0%      2 / 0%      0%
 2   'Speaker Number'  Enum     0%   100%     0%   0 / 0%     15 / 2%      0%
 3   'Sex'             Enum     0%   100%     0%   0 / 0%      2 / 0%      0%
 4   'Feature 0'       Real     0%     0%   100%   0 / 0%    853 / 86%    74%
 5   'Feature 1'       Real     0%     0%   100%   0 / 0%    877 / 89%    78%
 6   'Feature 2'       Real     0%     0%   100%   0 / 0%    815 / 82%    67%
 7   'Feature 3'       Real     0%     0%   100%   0 / 0%    836 / 84%    71%
 8   'Feature 4'       Real     0%     0%   100%   0 / 0%    803 / 81%    66%
 9   'Feature 5'       Real     0%     0%   100%   0 / 0%    798 / 81%    64%
10   'Feature 6'       Real     0%     0%   100%   0 / 0%    748 / 76%    57%
11   'Feature 7'       Real     0%     0%   100%   0 / 0%    794 / 80%    64%
12   'Feature 8'       Real     0%     0%   100%   0 / 0%    788 / 80%    63%
13   'Feature 9'       Real     0%     0%   100%   0 / 0%    775 / 78%    60%
14   'Class'           Enum     0%   100%     0%   0 / 0%     11 / 1%      0%
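The percentage in the "distinct" column appears to be the distinct-value count divided by the number of instances (990), rounded to the nearest whole percent. That reading is an assumption, not something the page states, but it reproduces the listed figures. A short sketch of the arithmetic, with `percent_of_instances` as a hypothetical helper name:

```python
NUM_INSTANCES = 990  # "Num Instances" reported for this dataset

# Distinct-value counts for a few attributes, copied from the table above.
distinct_counts = {
    "Train or Test": 2,
    "Speaker Number": 15,
    "Feature 0": 853,
    "Feature 6": 748,
    "Class": 11,
}


def percent_of_instances(count, total=NUM_INSTANCES):
    """Hypothetical helper: share of instances as a whole-number percentage."""
    return round(100 * count / total)


for name, count in distinct_counts.items():
    print(f"{name}: {count} / {percent_of_instances(count)}%")
```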
Relabeled values in attribute 'Speaker Number'
From: 0 To: Andrew
From: 1 To: Bill
From: 2 To: David
From: 3 To: Mark
From: 4 To: Jo
From: 5 To: Kate
From: 6 To: Penny
From: 7 To: Rose
From: 8 To: Mike
From: 9 To: Nick
From: 10 To: Rich
From: 11 To: Tim
From: 12 To: Sarah
From: 13 To: Sue
From: 14 To: Wendy
Relabeled values in attribute 'Sex'
From: 0 To: Male
From: 1 To: Female
Relabeled values in attribute 'Class'
From: 0 To: hid
From: 1 To: hId
From: 2 To: hEd
From: 3 To: hAd
From: 4 To: hYd
From: 5 To: had
From: 6 To: hOd
From: 7 To: hod
From: 8 To: hUd
From: 9 To: hud
From: 10 To: hed
- Names
- Train or Test,Speaker Number,Sex,Feature 0,Feature 1,Feature 2,Feature 3,Feature 4,Feature 5,Feature 6,
- Types
- nominal:Train,Test
- nominal:Andrew,Bill,David,Mark,Jo,Kate,Penny,Rose,Mike,Nick,Rich,Tim,Sarah,Sue,Wendy
- nominal:Male,Female
- numeric
- numeric
- numeric
- numeric
- numeric
- numeric
- numeric
- Data (first 10 data points)
Trai...  Spea...  Sex   Feat...  Feat...  Feat...  Feat...  Feat...  Feat...  Feat...  ...
Train    Andrew   Male  -3.639   0.418    -0.67    1.779    -0.168   1.627    -0.388   ...
Train    Andrew   Male  -3.327   0.496    -0.694   1.365    -0.265   1.933    -0.363   ...
Train    Andrew   Male  -2.12    0.894    -1.576   0.147    -0.707   1.559    -0.579   ...
Train    Andrew   Male  -2.287   1.809    -1.498   1.012    -1.053   1.06     -0.567   ...
Train    Andrew   Male  -2.598   1.938    -0.846   1.062    -1.633   0.764    0.394    ...
Train    Andrew   Male  -2.852   1.914    -0.755   0.825    -1.588   0.855    0.217    ...
Train    Andrew   Male  -3.482   2.524    -0.433   1.048    -1.995   0.902    0.322    ...
Train    Andrew   Male  -3.941   2.305    0.124    1.771    -1.815   0.593    -0.435   ...
Train    Andrew   Male  -3.86    2.116    -0.939   0.688    -0.675   1.679    -0.512   ...
Train    Andrew   Male  -3.648   1.812    -1.378   1.578    0.065    1.577    -0.466   ...
...      ...      ...   ...      ...      ...      ...      ...      ...      ...      ...
- Description
A jarfile containing 37 classification problems, originally obtained from the UCI repository (datasets-UCI.jar, 1,190,961 Bytes).
- URLs
- (No information yet)
- Publications
- Data Source
- http://www.ics.uci.edu/~mlearn/MLRepository.html
- Measurement Details
- Usage Scenario
- revision 1
- by mldata on 2010-11-06 09:57
This item was downloaded 4418 times and viewed 6458 times.
Disclaimer
We are acting in good faith to make datasets submitted for the use of the scientific community available to everybody, but if you are a copyright holder and would like us to remove a dataset please inform us and we will do it as soon as possible.
Acknowledgements
This project is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning)
http://www.pascal-network.org/.