Volume 5 Number 7 (Jul. 2010)
JCP 2010 Vol.5(7): 995-1002 ISSN: 1796-203X
doi: 10.4304/jcp.5.7.995-1002

Utility Maximization Model for Deep Web Source Selection and Integration

Xuefeng Xian1, 2, Zhiming Cui1, 2, Pengpeng Zhao1, 2, Yuanfeng Yang1, 2, and Guangming Zhang2
1 JiangSu Province Support Software Engineering R&D Center for Modern Information Technology Application in Enterprise, Suzhou, China
2 The Institute of Intelligent Information Processing and Application, Soochow University, Suzhou, China


Abstract—The World Wide Web is witnessing an increase in the amount of structured content: vast collections of structured data are on the rise due to the deep web. As a result, Internet-scale deep web data integration tasks are becoming increasingly common. In such tasks, a primary challenge is determining which web databases to include in the integration system. This paper presents a utility maximization model for source selection in deep web data integration. The model offers an efficient and effective way to estimate the approximate utility that a web database brings to an integration system, given the system's current state, when it is integrated. The utility of a web database is synthesized from its positive and negative utility. With the estimated utility information, web databases can be selected by explicitly optimizing for high utility (including as much data, and as important data, as possible in the selected databases while keeping their query cost as low as possible) in an iterative manner, in which web databases are integrated incrementally. We experimentally demonstrate that our approach is efficient and finds high-utility data integration solutions.

Index Terms—deep web; data integration; utility maximization model; web database selection
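
The incremental selection described in the abstract can be illustrated with a small greedy sketch. This is not the authors' model: the abstract does not give the utility formulas, so everything here is an assumption made for illustration. The hypothetical `WebDatabase` type, the use of exact record sets in place of the paper's estimated utility, the count of newly covered records as positive utility (importance weighting is omitted), and `cost_weight * query_cost` as negative utility are all stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebDatabase:
    """Hypothetical description of a candidate deep web source."""
    name: str
    records: frozenset   # assumed: identifiers of the records the source exposes
    query_cost: float    # assumed: cost of querying the source


def marginal_utility(db, covered, cost_weight=1.0):
    """Utility of adding `db` to a system that already covers `covered`.

    Positive utility: records not yet covered (an assumed proxy).
    Negative utility: weighted query cost (an assumed proxy).
    """
    positive = len(db.records - covered)
    negative = cost_weight * db.query_cost
    return positive - negative


def select_databases(candidates, max_sources, cost_weight=1.0):
    """Greedily integrate one source at a time while utility still increases."""
    selected = []
    covered = set()
    remaining = list(candidates)
    while remaining and len(selected) < max_sources:
        # Pick the source with the highest utility for the current system state.
        best = max(remaining, key=lambda db: marginal_utility(db, covered, cost_weight))
        if marginal_utility(best, covered, cost_weight) <= 0:
            break  # no remaining source improves the overall utility
        selected.append(best)
        covered |= best.records
        remaining.remove(best)
    return selected, covered


candidates = [
    WebDatabase("A", frozenset({1, 2, 3, 4}), 1.0),
    WebDatabase("B", frozenset({3, 4, 5}), 0.5),
    WebDatabase("C", frozenset({1, 2}), 1.0),
]
selected, covered = select_databases(candidates, max_sources=3)
print([db.name for db in selected], sorted(covered))
```

In this toy run, source C is never integrated because, once A is selected, it contributes no new records and only adds query cost. Re-evaluating utility against the current state of the system at every step is what makes the selection incremental.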


Cite: Xuefeng Xian, Zhiming Cui, Pengpeng Zhao, Yuanfeng Yang, and Guangming Zhang, "Utility Maximization Model for Deep Web Source Selection and Integration," Journal of Computers, vol. 5, no. 7, pp. 995-1002, 2010.
