BK: Data mining
data ------> knowledge
Are all patterns interesting?
No. Only a small fraction of the patterns a system could potentially generate would actually be of interest to a given user.
What makes a pattern interesting?
- easily understood by humans
- valid on new or test data with some degree of certainty
- potentially useful
- novel
An interesting pattern represents knowledge.
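The criteria above are qualitative. As a rough illustration of how "valid" and "potentially useful" are sometimes turned into numbers, the sketch below computes the support and confidence of one association rule. These two measures, the toy transactions, and the threshold values are assumptions chosen for illustration; they are not taken from these notes.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by support of its antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base else 0.0


# Hypothetical market-basket data, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Rule under test: {bread} -> {milk}
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})

# Hypothetical user-chosen thresholds; a real user would tune these.
MIN_SUPPORT = 0.4
MIN_CONFIDENCE = 0.6
is_interesting = rule_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE
```

Objective measures like these cover only part of the picture. Whether a pattern is novel or easy to understand still depends on the user.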
Can a data mining system generate all of the interesting patterns?
It is often unrealistic and inefficient for a data mining system to generate all possible patterns.
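A quick count shows why exhaustive generation is unrealistic. The sketch below assumes the patterns are itemsets (non-empty subsets of a set of items), which is only one kind of pattern. The function names are hypothetical.

```python
from itertools import combinations


def count_itemsets(n_items):
    """Number of non-empty subsets of n_items distinct items: 2^n - 1."""
    return 2 ** n_items - 1


def all_itemsets(items):
    """Enumerate every non-empty itemset; only feasible for tiny inputs."""
    items = sorted(items)
    return [
        set(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


small = all_itemsets(["bread", "butter", "milk"])
print(len(small), count_itemsets(3))  # both counts agree for 3 items
print(count_itemsets(100))            # far too many to enumerate
```

With 100 items the count is already 2^100 - 1, which is why systems usually rely on user-specified constraints and interestingness measures to narrow the search.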
1.7 Major issues in data mining
Major issues:
- mining methodology
- user interaction
- efficiency and scalability
- diversity of data types
- data mining and society