Volume 11 Number 2 (Mar. 2016)
JCP 2016 Vol.11(2): 99-108 ISSN: 1796-203X
doi: 10.17706/jcp.11.2.99-108

Efficient Extraction for Mobile Web Access Log with Caching Strategy

Lifeng Gao1, Min Zhu1, Mengying Li1, Yu Cao2, Weixue Zhang1
1College of Computer Science, Sichuan University, Chengdu, China.
2EMC Labs, Beijing, China.


Abstract—Mobile web access log files play an important role in analyzing mobile terminal market demand and user behavior. However, log data is high-dimensional, disorganized, and semi-structured, which makes accurate extraction difficult, and it is generated and transmitted continuously, which makes efficient extraction a challenge. It is therefore highly desirable to extract the information that is scattered or hidden in log files both efficiently and accurately. This paper proposes an efficient extraction method for mobile web access logs with a cache strategy. First, a data dictionary set is built for each kind of complex field before extraction. Then, data is extracted based on the dictionaries, and the dictionaries are completed at the same time. Furthermore, having found that some data in the log generally follows a Zipf-like distribution, the authors adopt a cache strategy as an auxiliary way to reduce mapping time, and choose the classical LFU (least frequently used) strategy. The experiments show that data can be extracted from the log accurately and that extraction speeds up remarkably with the LFU cache strategy.

Index Terms—Data extraction, mobile web access log, data dictionary, cache strategy, Zipf-like distribution.
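
The idea summarized in the abstract, placing an LFU cache in front of a dictionary lookup so that frequently repeated field values skip the slower mapping step, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `LFUCache`, the helper `make_mapper`, the substring-matching rule, and the sample user-agent patterns are all hypothetical, chosen only to show why a frequency-based cache suits values that follow a Zipf-like distribution. The sketch also leaves out the step in which the dictionaries are completed during extraction.

```python
from collections import defaultdict, OrderedDict


class LFUCache:
    """Fixed-capacity cache that evicts the least frequently used key.

    Ties between keys with the same hit count are broken by evicting
    the key that entered that count bucket first.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}                         # key -> cached value
        self.freq = {}                           # key -> hit count
        self.buckets = defaultdict(OrderedDict)  # hit count -> keys, oldest first
        self.min_freq = 0

    def _touch(self, key):
        """Move key from its current frequency bucket to the next one."""
        f = self.freq[key]
        del self.buckets[f][key]
        if not self.buckets[f]:
            del self.buckets[f]
            if self.min_freq == f:
                self.min_freq = f + 1
        self.freq[key] = f + 1
        self.buckets[f + 1][key] = None

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self.values:
            return None
        self._touch(key)
        return self.values[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.values:
            self.values[key] = value
            self._touch(key)
            return
        if len(self.values) >= self.capacity:
            # Evict the oldest key in the lowest-frequency bucket.
            victim, _ = self.buckets[self.min_freq].popitem(last=False)
            if not self.buckets[self.min_freq]:
                del self.buckets[self.min_freq]
            del self.values[victim]
            del self.freq[victim]
        self.values[key] = value
        self.freq[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1


def make_mapper(dictionary, cache):
    """Build a lookup function that consults the cache before the dictionary.

    dictionary: list of (pattern, label) pairs (hypothetical format),
    scanned linearly on a cache miss.
    """
    stats = {"hits": 0, "misses": 0}

    def lookup(raw):
        cached = cache.get(raw)
        if cached is not None:
            stats["hits"] += 1
            return cached
        stats["misses"] += 1
        label = "unknown"
        for pattern, name in dictionary:
            if pattern in raw:  # assumed matching rule: plain substring test
                label = name
                break
        cache.put(raw, label)
        return label

    return lookup, stats


# Hypothetical dictionary for a user-agent field.
device_dictionary = [("iPhone", "Apple iPhone"), ("SM-G900", "Samsung Galaxy S5")]
lookup, stats = make_mapper(device_dictionary, LFUCache(capacity=2))
for ua in ["UA iPhone", "UA iPhone", "UA SM-G900", "UA iPhone"]:
    lookup(ua)
```

Under a Zipf-like distribution a small number of values account for most records, so even a small cache that retains the most frequently seen values can absorb a large share of lookups. How large that share is in practice depends on the data and the cache size.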


Cite: Lifeng Gao, Min Zhu, Mengying Li, Yu Cao, Weixue Zhang, "Efficient Extraction for Mobile Web Access Log with Caching Strategy," Journal of Computers, vol. 11, no. 2, pp. 99-108, 2016.

General Information

ISSN: 1796-203X
Abbreviated Title: J.Comput.
Frequency: Bimonthly
Editor-in-Chief: Prof. Liansheng Tan
Executive Editor: Ms. Nina Lee
Abstracting/Indexing: DBLP, EBSCO, ProQuest, INSPEC, ULRICH's Periodicals Directory, WorldCat, etc.
E-mail: jcp@iap.org