Problem: clustering

A clustering network transforms the data into another space and then selects one of the clusters. Next, the autoencoder associated with this cluster is used to reconstruct the data-point.

Introduction:

traditional method: data------> extract a feature vector from each object --------> aggregate groups of vectors in a feature space.

cluster is represented by an autoencoder network. ??how

common method: k-means; but for the high-dimensional dataset, it's less useful because inter-point distances become less informative in high-dimensional spaces.

如果对于找一个序列的pattern来说,是不是就是时间维度作为高维情况,每个pattern作为一个cluster,而有的子序列不能归到cluster当中。

representation learning has been used to map the input data into a low-dimensional feature space.

Attempts: apply unsupervised deep learning approaches for clustering.  ??how

However, most focus on clustering over a low-dimensional feature space.

Transform the data into more clustering-friendly representations:

A deep version of k-means is based on learning a data representation and applying k-means in the embedded space.

How to represent a cluster:

a vector VS an autoencoder network.

Data collapsing problem: 数据崩溃问题,对于每个数据库,你必须重新调一遍程序。

for multivariate time series, how to find patterns.

1. find patterns: SAX; TICC; slide windows; 导数

2. VG, statistic features.

3.

Supplementary knowledge: 

1. Pattern recognition and clustering

Pattern recognition is a mature field in computer science with well-established techniques for the assignment of unknown patterns to categories, or classes. A pattern is defined as a vector of some number of measurements, called features. Usually, a pattern recognition system uses training samples from known categories to form a decision rule for unknown patterns. The unknown pattern is assigned to one of the categories according to the decision rule. Since we are interested in the classes of documents that have been assigned by the user, we can use pattern recognition techniques to try to classify previously unseen documents into the user's categories. While pattern recognition techniques require that the number and labels of categories are known, clustering techniques are unsupervised, requiring no external knowledge of categories. Clustering methods simply try to group similar patterns into clusters whose members are more similar to each other (according to some distance measure) than to members of other clusters. There is no a priori knowledge of patterns that belong to certain groups, or even how many groups are appropriate. Refer to basic pattern recognition and clustering texts such as [567] for further information.

We first employ pattern recognition techniques on documents to attempt to find features for classification, then focus on clustering the raw features of the documents.

PP: Deep clustering based on a mixture of autoencoders的更多相关文章

  1. 基于图嵌入的高斯混合变分自编码器的深度聚类(Deep Clustering by Gaussian Mixture Variational Autoencoders with Graph Embedding, DGG)

    基于图嵌入的高斯混合变分自编码器的深度聚类 Deep Clustering by Gaussian Mixture Variational Autoencoders with Graph Embedd ...

  2. 论文翻译:2021_Towards model compression for deep learning based speech enhancement

    论文地址:面向基于深度学习的语音增强模型压缩 论文代码:没开源,鼓励大家去向作者要呀,作者是中国人,在语音增强领域 深耕多年 引用格式:Tan K, Wang D L. Towards model c ...

  3. 论文解读SDCN《Structural Deep Clustering Network》

    前言 主体思想:深度聚类需要考虑数据内在信息以及结构信息. 考虑自身信息采用 基础的 Autoencoder ,考虑结构信息采用 GCN. 1.介绍 在现实中,将结构信息集成到深度聚类中通常需要解决以 ...

  4. 【论文阅读】Deep Clustering for Unsupervised Learning of Visual Features

    文章:Deep Clustering for Unsupervised Learning of Visual Features 作者:Mathilde Caron, Piotr Bojanowski, ...

  5. 【RS】Deep Learning based Recommender System: A Survey and New Perspectives - 基于深度学习的推荐系统:调查与新视角

    [论文标题]Deep Learning based Recommender System: A Survey and New Perspectives ( ACM Computing Surveys  ...

  6. 论文笔记: Deep Learning based Recommender System: A Survey and New Perspectives

    (聊两句,突然记起来以前一个学长说的看论文要能够把论文的亮点挖掘出来,合理的进行概括23333) 传统的推荐系统方法获取的user-item关系并不能获取其中非线性以及非平凡的信息,获取非线性以及非平 ...

  7. Predicting effects of noncoding variants with deep learning–based sequence model | 基于深度学习的序列模型预测非编码区变异的影响

    Predicting effects of noncoding variants with deep learning–based sequence model PDF Interpreting no ...

  8. Deep Clustering Algorithms

    Deep Clustering Algorithms 作者:凯鲁嘎吉 - 博客园 http://www.cnblogs.com/kailugaji/ 本文研究路线:深度自编码器(Deep Autoen ...

  9. Paper Reading——LEMNA:Explaining Deep Learning based Security Applications

    Motivation: The lack of transparency of the deep  learning models creates key barriers to establishi ...

随机推荐

  1. 《自拍教程5》Python自动化测试学习思路

    前提:熟悉测试业务及流程 任何Python自动化测试的前提,都是必须先熟悉实际测试业务. 任何脱离实际测试业务的自动化都是噱头且无实际意义! 测试的基本流程基本是: 测试需求分析,测试用例设计与评审, ...

  2. Linux系统下常见的数据盘分区丢失的问题以及对应的处理方法

    在修复数据前,您必须先对分区丢失的数据盘创建快照,在快照创建完成后再尝试修复.如果在修复过程中出现问题,您可以通过快照回滚将数据盘还原到修复之前的状态. 前提条件 在修复数据前,您必须先对分区丢失的数 ...

  3. CentOS 7.6下安装 NVM 管理不同版本的 Node.js

    学习网站:https://www.linuxidc.com/Linux/2019-10/160918.htm

  4. python环境开发

    Python3 下载 Python3 最新源码,二进制文档,新闻资讯等可以在 Python 的官网查看到: Python 官网:https://www.python.org/ 你可以在以下链接中下载 ...

  5. margin合并及解决办法

    外边距合并指的是,当两个垂直外边距相遇时,它们将形成一个外边距. 合并后的外边距的高度等于两个发生合并的外边距的高度中的较大者 水平方向不会发生合并 只有普通文档流中块框的垂直外边距才会发生外边距合并 ...

  6. C# 如何实现完整的INI文件读写类

    作者: 魔法软糖 日期: 2020-02-27 引言 ************************************* .ini 文件是Initialization File的缩写,即配置文 ...

  7. AB实验人群定向HTE模型5 - Meta Learner

    Meta Learner和之前介绍的Casual Tree直接估计模型不同,属于间接估计模型的一种.它并不直接对treatment effect进行建模,而是通过对response effect(ta ...

  8. 转载 CXF动态调用webservice

    /** * * @param wsdlUrl wsdl的地址:http://localhost:8001/demo/HelloServiceDemoUrl?wsdl * @param methodNa ...

  9. 牛客网剑指offer第13题——调整数组顺序使得奇数位于偶数前面

    题目来源:剑指offer 题目: 输入一个整数数组,实现一个函数来调整该数组中数字的顺序,使得所有的奇数位于数组的前半部分,所有的偶数位于数组的后半部分,并保证奇数和奇数,偶数和偶数之间的相对位置不变 ...

  10. Wannafly Winter Camp 2020 Day 6H 异或询问 - 二分

    给定一个长 \(n\) 的序列 \(a_1,\dots,a_n\),定义 \(f(x)\) 为有多少个 \(a_i \leq x\) 有 \(q\) 次询问,每次给定 \(l,r,x\),求 \(\s ...