View statlib-20050214 pbcseq (public)
- Summary
(No information yet)
- License
- unknown (from Weka repository)
- Dependencies
- Tags
- arff slurped Weka
- Attribute Types
- Integer,Floating Point,String
- Download
-
# Instances: 1945 / # Attributes: 19
- Original Data Format
- arff
- Name
- pbcseq
- Version mldata
- 0
- Comment
Primary Biliary Cirrhosis. This data set is a follow-up to the original PBC data set, as discussed in appendix D of Fleming and Harrington, Counting Processes and Survival Analysis, Wiley, 1991. An analysis based on the enclosed data is found in Murtaugh PA, Dickson ER, Van Dam GM, Malinchoc M, Grambsch PM, Langworthy AL, Gips CH. "Primary biliary cirrhosis: prediction of short-term survival based on repeated patient visits." Hepatology. 20(1.1):126-34, 1994.
Quoting from F&H: "The following pages contain the data from the Mayo Clinic trial in primary biliary cirrhosis (PBC) of the liver conducted between 1974 and 1984. A description of the clinical background for the trial and the covariates recorded here is in Chapter 0, especially Section 0.2. A more extended discussion can be found in Dickson, et al., Hepatology 10:1-7 (1989) and in Markus, et al., N Eng J of Med 320:1709-13 (1989).
"A total of 424 PBC patients, referred to Mayo Clinic during that ten-year interval, met eligibility criteria for the randomized placebo controlled trial of the drug D-penicillamine. The first 312 cases in the data set participated in the randomized trial and contain largely complete data. The additional 112 cases did not participate in the clinical trial, but consented to have basic measurements recorded and to be followed for survival. Six of those cases were lost to follow-up shortly after diagnosis, so the data here are on an additional 106 cases as well as the 312 randomized participants. Missing data items are denoted by `.'."
The F&H data set contains only baseline measurements of the laboratory parameters. This data set contains multiple laboratory results, but only on the first 312 patients. Some baseline values in this file differ from the original PBC file, for instance the prothrombin time and age values in which data errors were discovered after the original analysis, during research work on dfbeta residuals. (These two data points are discussed in F&H, figure 4.6.7.) Another major difference is that significantly more follow-up was available for many of the patients at the time this data set was assembled.
One "feature" of the data deserves special comment. The last observation before death or liver transplant often has many more missing covariates than other data rows. The original clinical protocol for these patients specified visits at 6 months, 1 year, and annually thereafter. At these protocol visits lab values were obtained for a large pre-specified battery of tests. "Extra" visits, often undertaken because of worsening medical condition, did not necessarily have all this lab work. The missing values are thus potentially informative, and violate the usual "missing at random" (MCAR or MAR) assumptions made in most analyses. Because of the earlier published results on the Mayo PBC risk score, however, the 5 variables involved in that computation were usually obtained, i.e., age, bilirubin, albumin, prothrombin time, and edema score.
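The pattern described above can be checked by comparing how often covariates are missing on each patient's final recorded visit versus earlier visits. The following is a minimal sketch under stated assumptions: the row layout (dicts with `case`, `day`, and covariate keys, with `None` for a missing value), the function name, and the toy rows are all hypothetical and are not taken from the data set.

```python
from collections import defaultdict


def missing_rate_by_position(rows, covariates):
    """Return (rate_earlier, rate_last): the fraction of missing covariate
    values on earlier visits and on each patient's final recorded visit."""
    by_case = defaultdict(list)
    for row in rows:
        by_case[row["case"]].append(row)

    counts = {"earlier": [0, 0], "last": [0, 0]}  # [missing, total]
    for visits in by_case.values():
        visits.sort(key=lambda r: r["day"])
        for i, visit in enumerate(visits):
            key = "last" if i == len(visits) - 1 else "earlier"
            for name in covariates:
                counts[key][1] += 1
                if visit[name] is None:
                    counts[key][0] += 1

    def rate(pair):
        missing, total = pair
        return missing / total if total else float("nan")

    return rate(counts["earlier"]), rate(counts["last"])


# Made-up rows for illustration only; not values from pbcseq.
toy_rows = [
    {"case": 1, "day": 0, "bili": 1.1, "chol": 200.0},
    {"case": 1, "day": 180, "bili": 1.4, "chol": 210.0},
    {"case": 1, "day": 300, "bili": 3.0, "chol": None},
    {"case": 2, "day": 0, "bili": 0.8, "chol": 180.0},
    {"case": 2, "day": 365, "bili": 0.9, "chol": None},
]
earlier, last = missing_rate_by_position(toy_rows, ["bili", "chol"])
print(earlier, last)
```

A clearly higher rate on final visits than on earlier ones would be consistent with the informative missingness described above, although this comparison alone does not establish it.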
Variables:
case number
number of days between registration and the earlier of death, transplantation, or study analysis time
status: 0=alive, 1=transplanted, 2=dead
drug: 1=D-penicillamine, 0=placebo
age in days, at registration
sex: 0=male, 1=female
day: number of days between enrollment and this visit date; the remaining values on the line of data refer to this visit
presence of ascites: 0=no, 1=yes
presence of hepatomegaly: 0=no, 1=yes
presence of spiders: 0=no, 1=yes
presence of edema: 0 = no edema and no diuretic therapy for edema; .5 = edema present without diuretics, or edema resolved by diuretics; 1 = edema despite diuretic therapy
serum bilirubin in mg/dl
serum cholesterol in mg/dl
albumin in gm/dl
alkaline phosphatase in U/liter
SGOT in U/ml (serum glutamic-oxaloacetic transaminase; the enzyme has since been renamed "AST" in the medical literature)
platelets per cubic ml / 1000
prothrombin time in seconds
histologic stage of disease
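The codings listed above can be turned into readable labels with a few lookup tables. This is a sketch only: the key names (`status`, `drug`, `sex`, `edema`, `age_days`) and the 365.25 days-per-year conversion are assumptions made for illustration. The converted file previewed on this page appears to store some of these fields already as labels (for example `female`), so the numeric codes apply to the original coding.

```python
# Lookup tables built from the variable descriptions above.
STATUS = {0: "alive", 1: "transplanted", 2: "dead"}
DRUG = {0: "placebo", 1: "D-penicillamine"}
SEX = {0: "male", 1: "female"}
EDEMA = {
    0.0: "no edema and no diuretic therapy",
    0.5: "edema without diuretics, or resolved by diuretics",
    1.0: "edema despite diuretic therapy",
}
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years


def decode(record):
    """Return a copy of `record` with coded fields replaced by labels
    and age converted from days to years (rounded to one decimal)."""
    decoded = dict(record)
    decoded["status"] = STATUS[record["status"]]
    decoded["drug"] = DRUG[record["drug"]]
    decoded["sex"] = SEX[record["sex"]]
    decoded["edema"] = EDEMA[float(record["edema"])]
    decoded["age_years"] = round(record["age_days"] / DAYS_PER_YEAR, 1)
    return decoded


# Status, drug, sex and age follow case 1 in the data preview on this page;
# the edema value is made up for illustration.
example = {"status": 2, "drug": 1, "sex": 1, "edema": 0.5, "age_days": 21464}
print(decode(example))
```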
Information about the dataset CLASSTYPE: numeric CLASSINDEX: 3
- Names
- case_number,number_of_days,status,drug,age,sex,day,presence_of_asictes,presence_of_hepatomegaly,presence_of_spiders,
- Types
- numeric
- numeric
- numeric
- nominal:0,D-penicillamine
- numeric
- nominal:female,male
- nominal:1002,1013,1015,1027,1028,1030,1035,1036,1038,1050,1054,1055,1056,1058,1061,1062,1063,1064,1065,1066,1067,1070,1071,1072,1075,1076,1077,1078,108,1080,1081,1082,1083,1084,1085,1086,1087,1089,1090,1091,1092,1093,1094,1095,1096,1097,1098,1099,1100,1101,1102,1103,1104,1105,1106,1108,1109,1111,1112,1113,1115,1116,1118,1119,1120,1121,1122,1125,1126,1128,113,1132,1134,1137,1139,1140,1141,1142,1143,1145,1147,1150,1151,1153,1157,1160,1161,1168,1175,1179,1180,1182,1187,1189,1190,1191,1192,1194,1196,1204,1211,1219,1222,1229,1231,1233,1243,1254,1266,1274,1280,1288,1290,1296,1301,1302,1306,1309,1316,1324,1336,1342,1344,1352,1353,1357,1359,1361,1362,1363,1366,1372,1380,1381,1383,1385,1390,1396,1399,1408,1411,142,1420,1421,1423,1425,1426,1432,1433,1434,1435,1437,144,1440,1441,1444,1448,145,1452,1453,1454,1455,1456,1459,1460,1461,1462,1463,1464,1466,1467,1468,1469,147,1470,1471,1472,1474,1475,1476,1477,1478,1479,1481,1482,1483,1484,1488,1489,1491,1492,1497,1500,1503,1505,1506,1509,1510,1511,1512,1513,1517,152,1521,1523,1524,1526,1530,1532,1539,1541,1542,1544,1545,1547,1549,155,1554,1558,1559,156,1568,1569,157,1575,158,1588,1589,159,1600,161,162,1624,1629,163,164,1644,1645,1646,1651,1652,1653,166,1669,167,1671,1673,1679,168,169,170,171,1714,1717,1718,172,1729,173,1734,174,1741,1748,1749,175,1750,1751,1754,1758,176,1763,1767,177,1771,1775,1778,1779,178,1783,1787,179,1790,1791,1792,1797,1798,180,1800,1805,1806,181,1813,1817,1818,1819,182,1820,1821,1822,1824,1827,1829,183,1830,1831,1832,1834,1835,1836,1838,184,1841,1844,1847,1848,1849,185,1851,1854,1855,1858,1859,186,1861,1862,1866,1868,187,1870,1873,1875,188,1880,1881,1883,1886,189,1890,1892,1898,190,1904,1905,1906,1908,191,1910,1915,1918,192,1922,1923,1924,1926,1928,193,1932,1933,1939,194,1942,1944,1945,1946,195,196,197,1979,198,1980,1985,1987,1989,199,1992,1996,200,2001,2003,2007,2008,201,202,2020,2029,203,2033,2035,2036,204,2047,205,2050,2056,206,2065,207,2072,2079,208,2082,2089,209,2093,2096,2099,2103,2106,211,2119,212,21
21,2135,2136,214,2143,2144,215,2151,2157,216,2165,2169,217,2173,2174,2175,2176,2177,2179,218,2185,2189,219,2190,2191,2192,2193,2194,2195,2196,2198,2199,220,2200,2202,2204,2205,2211,2212,2214,2215,2217,2218,2219,2220,2224,2225,2228,2232,2233,2236,2237,2246,2247,225,2253,2255,2258,2262,2264,2267,2269,2271,2275,2276,2278,2289,2290,2295,2303,2304,231,2311,2313,2320,2332,234,2341,2344,2351,2358,2364,2367,238,2380,2397,2399,2402,2406,241,2414,2420,243,2431,2434,2444,2453,2454,2465,2474,2478,2483,249,2490,2495,2499,2500,2503,2512,2513,2515,2516,2517,2522,2529,2533,2535,2539,2540,2541,2542,2544,2548,2550,2554,2555,2557,2559,2561,2563,2564,2566,2568,2577,258,2582,2587,2588,259,2590,2591,2596,2598,260,2600,2602,2603,2609,2611,2619,2627,2629,2639,2640,2643,2650,2651,2654,2659,2664,267,2670,2671,2677,2682,2688,2691,2696,271,2714,2722,2735,275,2766,2767,2768,2778,2785,2787,2792,2803,2808,2811,282,2834,2866,2867,2869,2870,2871,2875,2882,2885,2889,2890,2891,2895,2897,2908,2913,2917,2919,2921,2922,2924,2926,2928,2929,2932,2936,2941,2942,2945,2947,2948,2961,2965,2968,2977,2981,2983,2985,2988,2989,299,2993,2994,2997,3004,3007,3020,3021,3043,3045,3046,305,3050,3051,3073,308,3086,3109,311,3113,3131,3137,3150,3157,3178,318,320,3203,3209,321,3218,3219,3226,3228,3239,3241,3254,3255,3256,3258,326,3261,3272,3275,3280,3282,3284,3285,3286,3290,3291,3298,330,3311,3313,3318,333,3332,334,3340,3341,3342,335,3352,3354,3357,3377,338,3381,3390,3394,3397,3402,341,3414,342,3428,343,3430,344,345,3464,347,3477,348,349,3494,350,351,3513,352,3521,3529,353,3537,354,355,356,357,358,359,3590,3592,360,361,3611,3616,362,3626,3627,363,3631,3637,364,3647,3649,365,366,3663,3669,367,368,3682,3683,3687,369,3692,3694,370,3700,3703,371,3716,3718,372,3720,373,3734,374,375,376,3766,377,378,379,380,381,3812,382,383,384,3841,385,386,387,388,389,3892,390,391,392,393,395,3956,396,3966,397,3986,399,3996,401,4011,4012,4014,4018,403,4034,4036,404,4046,4049,4054,4059,407,4075,4091,410,4105,4115,412,4123,4126,414,416,418,4181,4
19,4206,421,4243,425,4263,428,430,431,433,4333,4336,4347,4351,4389,441,4417,4418,442,4435,4438,444,4457,4458,446,447,454,4565,4580,463,465,4696,4704,4714,4715,4775,483,4832,4845,4865,4877,495,496,503,5076,5118,512,514,5152,516,525,530,535,538,545,550,553,564,566,596,610,623,626,645,650,654,655,656,663,673,675,676,678,679,680,689,691,692,693,695,696,700,704,705,709,711,713,714,715,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,739,740,741,742,743,744,745,746,747,748,749,750,751,752,753,754,755,758,759,760,761,762,763,764,768,769,772,773,774,775,781,782,784,787,790,793,795,796,797,798,804,807,808,812,816,817,819,821,822,825,826,832,838,840,843,847,851,860,862,868,881,889,890,899,904,905,906,913,916,920,925,930,931,942,952,978,979,980,983,984,988,996,no
- nominal:no,yes
- nominal:no,yes
- nominal:0,1
- Data (first 10 data points)
case...  numb...  status  drug     age    sex     day   pres...  pres...  pres...  ...
1        400      2       D-pe...  21464  female  no    yes      yes      1.0      ...
1        400      2       D-pe...  21464  female  192   yes      yes      1.0      ...
2        5169     0       D-pe...  20617  female  no    no       yes      1.0      ...
2        5169     0       D-pe...  20617  female  182   no       yes      1.0      ...
2        5169     0       D-pe...  20617  female  365   no       yes      1.0      ...
2        5169     0       D-pe...  20617  female  768   no       yes      1.0      ...
2        5169     0       D-pe...  20617  female  1790  yes      yes      1.0      ...
2        5169     0       D-pe...  20617  female  2151  yes      yes      1.0      ...
2        5169     0       D-pe...  20617  female  2515  yes      yes      1.0      ...
2        5169     0       D-pe...  20617  female  2882  yes      yes      1.0      ...
...      ...      ...     ...      ...    ...     ...   ...      ...      ...      ...
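Because each patient contributes several rows, one common way to prepare such data for survival analysis is to turn the visit days into (start, stop, event) intervals. The sketch below is not the method used by the data set's authors. It assumes that a `day` value shown as `no` in the preview stands for day 0, and that only death (status 2) counts as the event while transplant is treated as censoring; both are assumptions, and the function name is hypothetical.

```python
def visit_intervals(visit_days, followup_days, status):
    """Turn one patient's visit days into (start, stop, event) intervals.

    Each interval runs from one visit to the next; the last one ends at the
    follow-up time. `event` is 1 only on the final interval, and only when
    status == 2 (dead); any other status is treated as censored (assumption).
    """
    days = sorted(visit_days)
    stops = days[1:] + [followup_days]
    intervals = []
    for i, (start, stop) in enumerate(zip(days, stops)):
        is_last = i == len(days) - 1
        event = 1 if (is_last and status == 2) else 0
        intervals.append((start, stop, event))
    return intervals


# Case 1 from the preview: visits on day 0 (shown as "no") and day 192,
# 400 days of follow-up, status 2.
print(visit_intervals([0, 192], 400, 2))
# [(0, 192, 0), (192, 400, 1)]
```

Covariates measured at a visit would then be attached to the interval that starts at that visit.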
- Description
A gzip'ed tar containing StatLib datasets (statlib-20050214.tar.gz, 12,785,582 Bytes)
- URLs
- (No information yet)
- Publications
- Data Source
- http://lib.stat.cmu.edu/datasets/
- Measurement Details
- Usage Scenario
- revision 1
- by mldata on 2010-11-06 10:00
This item was downloaded 2767 times and viewed 2037 times.
Disclaimer
We are acting in good faith to make datasets submitted for the use of the scientific community available to everybody, but if you are a copyright holder and would like us to remove a dataset please inform us and we will do it as soon as possible.
Acknowledgements
This project is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning)
http://www.pascal-network.org/.