3. Related work

4. Suppression of data
5. Destroy data quality
6. Adding mathematical noise

2.2. K-anonymity

A release of data is said to have the k-anonymity property if the information for each person contained in the release cannot be distinguished from at least k-1 other individuals whose information also appears in the release. For example, suppose you try to identify a person in a released dataset but know only his or her birth date and gender. If at least k people in the release match that birth date and gender, you cannot tell which of them is the person you are looking for. This is k-anonymity [6, 7].

2.2.1. Classification of attributes

Key attributes, such as name, address, and cell phone number, can uniquely identify an individual directly. They are always removed before release.

Quasi-identifiers, such as zip code, birth date, and gender, are a set of attributes that can potentially be linked with external information to re-identify entities. Eighty-seven percent of the population in the USA can be uniquely identified based on these attributes, according to the census summary data in 1991. Two tables are shown below: Table 1 is a hospital dataset and Table 2 is voter data.

Table 1. Medical dataset.

| DOB | Sex | Zip code | Disease |
|---|---|---|---|
| 1/21/1976 | M | 65715 | Heart disease |
| 4/13/1986 | F | 65715 | Hepatitis |
| 2/28/1976 | M | 65703 | Bronchitis |
| 1/21/1976 | M | 65703 | Broken arm |
| 4/13/1986 | F | 65706 | Flu |
| 2/28/1976 | F | 65706 | Hang nail |

Table 2. Voter dataset.

| Name | DOB | Sex | Zip code |
|---|---|---|---|
| Andre | 1/21/1976 | Male | 53715 |
| Beth | 1/10/1981 | Female | 55410 |
| Carol | 10/1/1944 | Female | 90210 |
| Dan | 2/21/1984 | Male | 02174 |
| Ellen | 4/19/1972 | Female | 02237 |
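As a rough illustration of the k-anonymity definition in Section 2.2, the following Python sketch computes the largest k for which the rows of Table 1 are k-anonymous with respect to a chosen set of quasi-identifier columns. The function name `anonymity_level` and the use of column indices to select attributes are assumptions made for this example only; they are not taken from the cited works.

```python
from collections import Counter

# Rows of Table 1 (medical dataset): (DOB, sex, zip code, disease).
MEDICAL = [
    ("1/21/1976", "M", "65715", "Heart disease"),
    ("4/13/1986", "F", "65715", "Hepatitis"),
    ("2/28/1976", "M", "65703", "Bronchitis"),
    ("1/21/1976", "M", "65703", "Broken arm"),
    ("4/13/1986", "F", "65706", "Flu"),
    ("2/28/1976", "F", "65706", "Hang nail"),
]


def anonymity_level(rows, qi_columns):
    """Return the largest k such that `rows` is k-anonymous on `qi_columns`.

    This is the size of the smallest group of rows that share the same
    values in all quasi-identifier columns. Hypothetical helper, written
    for illustration. Returns 0 for an empty release.
    """
    if not rows:
        return 0
    # Count how many rows fall into each quasi-identifier combination.
    groups = Counter(tuple(row[i] for i in qi_columns) for row in rows)
    return min(groups.values())


# Quasi-identifier = (DOB, sex, zip code): every combination is unique.
k_full = anonymity_level(MEDICAL, (0, 1, 2))

# Quasi-identifier = sex only: three records per value.
k_sex = anonymity_level(MEDICAL, (1,))

print(k_full, k_sex)  # prints: 1 3
```

With all three quasi-identifiers, every record in Table 1 forms its own group, so the release is only 1-anonymous and offers no protection against linking. Releasing a coarser view, here sex alone, raises k at the cost of detail.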

### 3.1. FANNST algorithm
