题目如下:

Given a list of directory info including directory path, and all the files with contents in this directory, you need to find out all the groups of duplicate files in the file system in terms of their paths.

A group of duplicate files consists of at least two files that have exactly the same content.

A single directory info string in the input list has the following format:

"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)"

It means there are n files (f1.txtf2.txt ... fn.txt with content f1_contentf2_content ... fn_content, respectively) in directory root/d1/d2/.../dm. Note that n >= 1 and m >= 0. If m = 0, it means the directory is just the root directory.

The output is a list of group of duplicate file paths. For each group, it contains all the file paths of the files that have the same content. A file path is a string that has the following format:

"directory_path/file_name.txt"

Example 1:

Input:
["root/a 1.txt(abcd) 2.txt(efgh)", "root/c 3.txt(abcd)", "root/c/d 4.txt(efgh)", "root 4.txt(efgh)"]
Output:
[["root/a/2.txt","root/c/d/4.txt","root/4.txt"],["root/a/1.txt","root/c/3.txt"]]

Note:

  1. No order is required for the final output.
  2. You may assume the directory name, file name and file content only has letters and digits, and the length of file content is in the range of [1,50].
  3. The number of files given is in the range of [1,20000].
  4. You may assume no files or directories share the same name in the same directory.
  5. You may assume each given directory info represents a unique directory. Directory path and file info are separated by a single blank space.

解题思路:我的方法非常简单,解析输入的字符串,把文件内容作为key,文件路径作为value存入字典中,当然value是用list保存的。最后遍历字典,value超过两个元素即为符合条件的结果。

代码如下:

class Solution(object):
def findDuplicate(self, paths):
"""
:type paths: List[str]
:rtype: List[List[str]]
"""
dic = {}
for p in paths:
item = p.split(' ')
for i in range(1,len(item)):
file_name = item[i].split('(')[0]
file_content = item[i].split('(')[1][:-1]
dic[file_content] = dic.setdefault(file_content,[]) + [item[0] + '/' + file_name]
res = []
for v in dic.itervalues():
if len(v) >= 2:
res.append(v)
return res

【leetcode】609. Find Duplicate File in System的更多相关文章

  1. 【LeetCode】609. Find Duplicate File in System 解题报告(Python & C++)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述 题目大意 解题方法 日期 题目地址:https://leetcode.c ...

  2. 【LeetCode】652. Find Duplicate Subtrees 解题报告(Python)

    [LeetCode]652. Find Duplicate Subtrees 解题报告(Python) 标签(空格分隔): LeetCode 作者: 负雪明烛 id: fuxuemingzhu 个人博 ...

  3. LC 609. Find Duplicate File in System

    Given a list of directory info including directory path, and all the files with contents in this dir ...

  4. 609. Find Duplicate File in System

    Given a list of directory info including directory path, and all the files with contents in this dir ...

  5. 【LeetCode】316. Remove Duplicate Letters 解题报告(Python & C++)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述 题目大意 解题方法 日期 题目地址:https://leetcode.c ...

  6. 【LeetCode】388. Longest Absolute File Path 解题报告(Python)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述: 题目大意 解题方法 日期 题目地址:https://leetcode. ...

  7. 【LeetCode】217. Contains Duplicate (2 solutions)

    Contains Duplicate Given an array of integers, find if the array contains any duplicates. Your funct ...

  8. 【leetcode】316. Remove Duplicate Letters

    题目如下: Given a string which contains only lowercase letters, remove duplicate letters so that every l ...

  9. 【leetcode】388. Longest Absolute File Path

    题目如下: Suppose we abstract our file system by a string in the following manner: The string "dir\ ...

随机推荐

  1. 13 October

    树链剖分 http://www.lydsy.com/JudgeOnline/problem.php?id=1036 https://oi.men.ci/tree-chain-split-notes/. ...

  2. [CSP-S模拟测试]:石头剪刀布(rps)(概率DP)

    题目传送门(内部题9) 输入格式 第一行一个整数$n$.接下来$n$行每行$3$个非负整数$r_i,p_i,s_i$. 输出格式 一行一个实数表示答案.当你的答案与标准答案的绝对或相对误差不超过${1 ...

  3. leetcode 190. 颠倒二进制位(c++)

    颠倒给定的 32 位无符号整数的二进制位. 示例 1: 输入: 00000010100101000001111010011100输出: 00111001011110000010100101000000 ...

  4. Invoke和BeginInvoke的区别(转载)

    转自http://www.cnblogs.com/c2303191/articles/826571.html Control.Invoke 方法 (Delegate) :在拥有此控件的基础窗口句柄的线 ...

  5. applicationContext.xml无错有红叉,Error occured processing XML 'Provider org.apache.xerces.parsers.解决方案

    applicationContext.xml无错有红叉,网上讲的取消xml验证的方法没用... 甚至我的myeclipse10连windows-->perferences-->myecli ...

  6. spring cloud网关gateway

    spring gateway使用基于netty异步io,第二代网关:zuul 1使用servlet 3,第一代网关,每个请求一个线程,同步Servlet,多线程阻塞模型.而spring貌似不想在支持z ...

  7. jmeter之cookies登录

    现在很多网站的登录都要验证码了,验证码的值是动态的,值不易获取.使用jmeter测试一个需要登录的接口就有困难,这时候,我们就可以使用cookies管理器来记住这个登录信息. 目录 1.jmeter的 ...

  8. [转载]Java中异常的捕获顺序(多个catch)

    http://blog.sina.com.cn/s/blog_6b022bc60101cdbv.html [转载]Java中异常的捕获顺序(多个catch) (2012-11-05 09:47:28) ...

  9. HTML--JS 定时刷新、时钟、倒计时

    <html> <head> <title>定时刷新时间</title> <script language="JavaScript&quo ...

  10. Learning OSG programing---osgShape

    本例示范了osg中Shape ---- 基本几何元素的绘制过程.参照osg官方文档,Shape 类包含以下子类: 在示例程序中,函数createShapes函数用于生成需要绘制的几何形状. osg:: ...