SCATTER: Fully Automated Classification System across Multiple Databases

Date

2019-07

Publisher

Université de M'sila

Abstract

Recent data mining approaches use data from a single table and are not suited to multiple tables. Moreover, the expansion of computer networks and the diversity of data sources call for new data mining systems that handle database heterogeneity in multi-database systems. In this paper, we propose SCATTER, a fully automated system for classification across multiple heterogeneous databases. SCATTER has three components. The first uses schema matching techniques to find foreign-key links across the multi-database system. The second identifies the most useful of these links, those that are critical for producing accurate classes across multiple databases. The third is a decision tree classification algorithm that exploits the useful links discovered automatically across the databases. Experiments on real databases yielded an average accuracy of 86.5% and showed that SCATTER achieves fully automated classification from multiple heterogeneous databases.
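
To make the first two components more concrete, the following is a minimal illustrative sketch and not the method described in the paper. It assumes that a candidate foreign-key link can be approximated by value containment (every value of a source column appears in a unique-valued column of another table), and that the usefulness of a link can be approximated by the best information gain that any attribute reachable through the link provides on the class label. The function names (candidate_links, link_usefulness), the table layout (non-empty lists of dictionaries), and the toy data are all hypothetical. The third component, the decision tree itself, is not shown.

```python
from collections import Counter
from math import log2

def candidate_links(tables):
    """Return (src_table, src_col, dst_table, dst_col) tuples where every
    source value appears in a unique-valued destination column.
    Assumption: value containment stands in for real schema matching."""
    links = []
    for s_name, s_rows in tables.items():
        for d_name, d_rows in tables.items():
            if s_name == d_name:
                continue
            for s_col in s_rows[0]:
                s_vals = {row[s_col] for row in s_rows}
                for d_col in d_rows[0]:
                    d_list = [row[d_col] for row in d_rows]
                    d_vals = set(d_list)
                    # Destination must look like a key and cover the source values.
                    if len(d_vals) == len(d_list) and s_vals <= d_vals:
                        links.append((s_name, s_col, d_name, d_col))
    return links

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(pairs):
    """Information gain of an attribute, given (attribute_value, label) pairs."""
    groups = {}
    for value, label in pairs:
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(pairs) * entropy(g) for g in groups.values())
    return entropy([label for _, label in pairs]) - remainder

def link_usefulness(tables, link, label_col):
    """Best information gain on the label among attributes reached via the link.
    Assumption: this score is a hypothetical proxy for link usefulness."""
    s_name, s_col, d_name, d_col = link
    index = {row[d_col]: row for row in tables[d_name]}
    best = 0.0
    for attr in tables[d_name][0]:
        if attr == d_col:
            continue
        pairs = [(index[row[s_col]][attr], row[label_col]) for row in tables[s_name]]
        best = max(best, information_gain(pairs))
    return best

# Toy data: two tables that could live in different databases (hypothetical).
tables = {
    "customers": [
        {"cust_id": 1, "city_code": "A", "label": "yes"},
        {"cust_id": 2, "city_code": "A", "label": "yes"},
        {"cust_id": 3, "city_code": "B", "label": "no"},
        {"cust_id": 4, "city_code": "C", "label": "no"},
    ],
    "cities": [
        {"code": "A", "region": "north"},
        {"code": "B", "region": "south"},
        {"code": "C", "region": "south"},
    ],
}

links = candidate_links(tables)
scores = {link: link_usefulness(tables, link, "label") for link in links}
```

In this toy setting the only candidate link runs from customers.city_code to cities.code, and the region attribute reached through it separates the two classes. A classifier in the spirit of the third component would then consider only attributes reached through links whose score exceeds some chosen threshold.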

Keywords

Classification, Heterogeneous databases, Link discovery, Link usefulness, Multi-database mining
