Data Analysis

I. Data Base and Preprocessing

Our data analysis is based on the MEDAN database, which consists of septic shock patient data. Data of up to 583 patients will be analysed for the design of an alarm system (see II.). The data include numerical measurements, operations, therapies and medications.

Preprocessing of the data is a very important part of the work [1].

II. Classification and Alarm System

We found that mainly the measurement data can be used to classify patients as "survived" or "deceased". We use neural network techniques as a classification tool, mainly for their advantages in classifying nonlinear data [4]. We showed and quantified that the data can be classified better at the end of the intensive care unit (ICU) stay than at the beginning, i.e. we demonstrated the time dependency of the data. Fig. 1 shows the general architecture of a neural network.

Figure 1 A neural network architecture.
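As an illustration of how such a network maps measurement values to a class, the following is a minimal sketch of a forward pass. It assumes a plain feedforward network with one hidden layer and sigmoid units, which is an assumption made here for illustration; the text does not specify the architecture actually used. The names forward and predict_class, the weights, and the 0.5 decision threshold are all hypothetical.

```python
import math


def sigmoid(x):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass through one hidden layer and a single output unit.

    x        : list of input measurements (one value per variable)
    w_hidden : one weight row per hidden unit, each as long as x
    b_hidden : one bias per hidden unit
    w_out    : one weight per hidden unit
    b_out    : bias of the output unit
    Returns a score in (0, 1).
    """
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def predict_class(score, threshold=0.5):
    """Hypothetical decision rule: scores at or above the threshold
    are read as "deceased", all others as "survived"."""
    return "deceased" if score >= threshold else "survived"
```

In practice the weights would be learned from the patient data; here they are simply passed in, so the sketch only shows how an input vector is turned into a class label.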

Table 1 shows the classification performance of different subsystems, using the data of the last 3 days of the ICU stay (138 patients).


Subsystem                                      ROC area
thrombocytes, RRsys and RRdia (3 variables)
blood (14 variables)
SOFA (using 10 single variables)
heart (10 variables)
lung (11 variables)
infect (7 variables)
liver (13 variables)
kidney (9 variables)
acid/base (6 variables)

Table 1 Classification Results.
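The ROC area used as the performance measure in Table 1 can be computed directly from classifier scores and true outcomes. The sketch below uses the standard pairwise (Mann-Whitney) formulation: the ROC area equals the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name roc_area and the choice of coding "deceased" as 1 are assumptions for illustration, not taken from the original system.

```python
def roc_area(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : classifier outputs, higher meaning "more likely positive"
    labels : 1 for the positive class (assumed here: "deceased"),
             0 for the negative class ("survived")
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

A value of 1.0 means the two classes are perfectly separated by the score, while 0.5 corresponds to a classifier that ranks no better than chance.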

From the classification results we found that variables such as diastolic and systolic blood pressure and thrombocytes are, in combination, a very good indicator of uncritical, critical or very critical conditions of the patient. On this basis we built an alarm system prototype that can be used online in the ICU to support the physician. Our system is at least as efficient as the SOFA score, but uses fewer variables and a purely data-driven alarm scheme. Consequently, more reliable feedback for the physician is expected.

III. Rule Generation

In addition to the classification process, we generated rules that explain the classification results of the neural network, using improved neuro-fuzzy techniques [2] and techniques for symbolic rule generation [3]. For example, numerical rules have the format

if v1 in [a1 , b1] and v2 in [a2 , b2] and ... and vm in [am , bm] then class c

where v1, ..., vm are m variables out of the set of all variables, ai < bi are real measurement values, and c is a class label ("survived" or "deceased").
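The rule format above can be sketched as a small data structure: a rule fires when every listed variable lies inside its closed interval [ai, bi]. The class IntervalRule, the function classify, the variable names and all interval bounds below are hypothetical and only illustrate the format; they are not rules generated from the patient data.

```python
from dataclasses import dataclass


@dataclass
class IntervalRule:
    """Rule of the form: if v1 in [a1, b1] and ... and vm in [am, bm] then class c."""
    intervals: dict   # variable name -> (a, b) with a < b
    label: str        # class label, "survived" or "deceased"

    def matches(self, sample):
        """True if every variable of the rule is present in the sample
        and lies inside its closed interval."""
        return all(
            name in sample and a <= sample[name] <= b
            for name, (a, b) in self.intervals.items()
        )


def classify(rules, sample, default=None):
    """Return the label of the first matching rule, or default if none fires."""
    for rule in rules:
        if rule.matches(sample):
            return rule.label
    return default


# Hypothetical rule over two generic variables v1 and v2.
rules = [IntervalRule({"v1": (0.0, 1.0), "v2": (5.0, 10.0)}, "survived")]
result = classify(rules, {"v1": 0.5, "v2": 7.0})
```

Taking the first matching rule is one simple design choice; a rule set could equally be evaluated by voting or by rule confidence, which this sketch does not cover.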

Selected publications (in English):

[1] Proc. of the 1st Int. Symp. of Medical Data Analysis (ISMDA 2000), Frankfurt am Main, Germany, R. Brause, E. Hanisch (Eds.), LNCS Vol. 1933, Springer-Verlag, 130-137, 2000

"About the Analysis of Septic Shock Patient Data" (J. Paetz, F. Hamker, and S. Thöne)

The © copyright of the publication is held by Springer-Verlag.

[2] Proc. of the 1st IEEE Int. Conf. on Data Mining (ICDM 2001), San Jose, CA, USA, IEEE Computer Society Press, 637-638, 2001

"Metric Rule Generation with Septic Shock Patient Data" (J. Paetz)

[3] Proc. of the 2nd Int. Symp. of Medical Data Analysis (ISMDA 2001), Madrid, Spain, J. Crespo, V. Maojo and F. Martin (Eds.), LNCS Vol. 2199, Springer-Verlag, 207-212, 2001

"A Frequent Patterns Tree Approach for Rule Generation with Categorical Septic Shock Patient Data" (J. Paetz, R. Brause)

[4] Computational Intelligence Techniques in Medical Diagnosis and Prognosis, L.C. Jain (Ed.), Springer-Verlag, in print

"Septic Shock Diagnosis by Neural Networks and Rule Based Systems" (R. Brause, F. Hamker, J. Paetz)