
Web Data Mining: Mining Web Content Patterns, Structure, and Usage (text edition) [PDF]

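  • Status: Featured resource
  • Summary:
    Book category: Networking
    Publisher: Wiley Blackwell
    Publication date: April 1, 2007
    Language: English
  • Time: Posted 2012/09/23 12:34:19 | Updated 2012/09/23 14:33:42
  • Category: Books, Computers and Networking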

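Chinese title: Web数据挖掘:挖掘Web内容模式、结构和用途 (Web Data Mining: Mining Web Content Patterns, Structure, and Usage)
Book category: Networking
Resource format: PDF
Version: Text edition
Publisher: Wiley Blackwell
ISBN: 0471666556
Publication date: April 1, 2007
Region: United States
Language: English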
Overview

Description:

This book introduces the reader to methods of data mining on the web, including uncovering patterns in web content (classification, clustering, language processing), structure (graphs, hubs, metrics), and usage (modeling, sequence analysis, performance).

Table of Contents

PREFACE
PART I: WEB STRUCTURE MINING
1 INFORMATION RETRIEVAL AND WEB SEARCH
Web Challenges
Web Search Engines
Topic Directories
Semantic Web
Crawling the Web
Web Basics
Web Crawlers
Indexing and Keyword Search
Document Representation
Implementation Considerations
Relevance Ranking
Advanced Text Search
Using the HTML Structure in Keyword Search
Evaluating Search Quality
Similarity Search
Cosine Similarity
Jaccard Similarity
Document Resemblance
References
Exercises
2 HYPERLINK-BASED RANKING
Introduction
Social Networks Analysis
PageRank
Authorities and Hubs
Link-Based Similarity Search
Enhanced Techniques for Page Ranking
References
Exercises
PART II: WEB CONTENT MINING
3 CLUSTERING
Introduction
Hierarchical Agglomerative Clustering
k-Means Clustering
Probability-Based Clustering
Finite Mixture Problem
Classification Problem
Clustering Problem
Collaborative Filtering (Recommender Systems)
References
Exercises
4 EVALUATING CLUSTERING
Approaches to Evaluating Clustering
Similarity-Based Criterion Functions
Probabilistic Criterion Functions
MDL-Based Model and Feature Evaluation
Minimum Description Length Principle
MDL-Based Model Evaluation
Feature Selection
Classes-to-Clusters Evaluation
Precision, Recall, and F-Measure
Entropy
References
Exercises
5 CLASSIFICATION
General Setting and Evaluation Techniques
Nearest-Neighbor Algorithm
Feature Selection
Naive Bayes Algorithm
Numerical Approaches
Relational Learning
References
Exercises
PART III: WEB USAGE MINING
6 INTRODUCTION TO WEB USAGE MINING
Definition of Web Usage Mining
Cross-Industry Standard Process for Data Mining
Clickstream Analysis
Web Server Log Files
Remote Host Field
Date/Time Field
HTTP Request Field
Status Code Field
Transfer Volume (Bytes) Field
Common Log Format
Identification Field
Authuser Field
Extended Common Log Format
Referrer Field
User Agent Field
Example of a Web Log Record
Microsoft IIS Log Format
Auxiliary Information
References
Exercises
7 PREPROCESSING FOR WEB USAGE MINING
Need for Preprocessing the Data
Data Cleaning and Filtering
Page Extension Exploration and Filtering
De-Spidering the Web Log File
User Identification
Session Identification
Path Completion
Directories and the Basket Transformation
Further Data Preprocessing Steps
References
Exercises
8 EXPLORATORY DATA ANALYSIS FOR WEB USAGE MINING
Introduction
Number of Visit Actions
Session Duration
Relationship between Visit Actions and Session Duration
Average Time per Page
Duration for Individual Pages
References
Exercises
9 MODELING FOR WEB USAGE MINING: CLUSTERING, ASSOCIATION, AND CLASSIFICATION
Introduction
Modeling Methodology
Definition of Clustering
The BIRCH Clustering Algorithm
Affinity Analysis and the A Priori Algorithm
Discretizing the Numerical Variables: Binning
Applying the A Priori Algorithm to the CCSU Web Log Data
Classification and Regression Trees
The C4.5 Algorithm
References
Exercises
INDEX
