SCATTER: Fully Automated Classification System across Multiple Databases
Date
2019-07
Publisher
Université de M'sila
Abstract
Most recent data mining approaches take their input from a single table and are not suited to data spread across multiple tables.
Moreover, the expansion of computer networks and the growing diversity of data sources call for new data mining systems that can
handle database heterogeneity in multi-database systems. In this paper, we propose SCATTER, a fully automated system for
classification across multiple heterogeneous databases. SCATTER is composed of three components. The first uses schema matching
techniques to find foreign-key links across the multi-database system. The second identifies the most useful links, those that are
critical for producing accurate classes across multiple databases. The third is a decision tree classification algorithm that exploits
the useful links discovered automatically across the databases. In experiments on real databases, SCATTER achieved an average
accuracy of 86.5%, showing that it can perform fully automated classification across multiple heterogeneous databases.
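
To make the idea of discovering foreign-key links concrete, the sketch below shows one simple instance-based heuristic: a column is treated as a candidate foreign key when nearly all of its values appear in a key-like (unique, non-null) column of another table. This is only an illustration under stated assumptions and is not SCATTER's actual matching procedure, which the abstract does not specify. The function names, the 0.9 threshold, and the toy tables are all hypothetical.

```python
from itertools import permutations


def containment(child_values, parent_values):
    """Fraction of distinct non-null child values that also occur in the parent column."""
    child = {v for v in child_values if v is not None}
    if not child:
        return 0.0
    return len(child & set(parent_values)) / len(child)


def is_unique(values):
    """True if the column has no nulls and no duplicates, i.e. it looks like a key."""
    return None not in values and len(set(values)) == len(values)


def candidate_links(tables, threshold=0.9):
    """Return (child_table, child_col, parent_table, parent_col, score) tuples.

    `tables` maps a table name to {column name: list of values}.
    The threshold is an assumed cut-off, not a value taken from the paper.
    """
    links = []
    for child_name, parent_name in permutations(tables, 2):
        for p_col, p_vals in tables[parent_name].items():
            if not is_unique(p_vals):
                continue  # only key-like columns can be referenced
            for c_col, c_vals in tables[child_name].items():
                score = containment(c_vals, p_vals)
                if score >= threshold:
                    links.append((child_name, c_col, parent_name, p_col, score))
    return links


# Hypothetical toy data standing in for two tables from different databases.
tables = {
    "customers": {"id": [1, 2, 3, 4], "city": ["Oslo", "Rome", "Oslo", "Lyon"]},
    "orders": {"order_no": [100, 101, 102], "cust": [1, 1, 3]},
}
links = candidate_links(tables)
print(links)
```

On this toy input the only candidate reported is the link from `orders.cust` to `customers.id`. A real system would also have to compare column names and types and cope with differing value formats across databases, which this sketch ignores.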
Keywords
Classification, Heterogeneous databases, Link discovery, Link usefulness, Multi-database mining