题目如下:

Given a list of directory info including directory path, and all the files with contents in this directory, you need to find out all the groups of duplicate files in the file system in terms of their paths.

A group of duplicate files consists of at least two files that have exactly the same content.

A single directory info string in the input list has the following format:

"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)"

It means there are n files (f1.txtf2.txt ... fn.txt with content f1_contentf2_content ... fn_content, respectively) in directory root/d1/d2/.../dm. Note that n >= 1 and m >= 0. If m = 0, it means the directory is just the root directory.

The output is a list of group of duplicate file paths. For each group, it contains all the file paths of the files that have the same content. A file path is a string that has the following format:

"directory_path/file_name.txt"

Example 1:

Input:
["root/a 1.txt(abcd) 2.txt(efgh)", "root/c 3.txt(abcd)", "root/c/d 4.txt(efgh)", "root 4.txt(efgh)"]
Output:
[["root/a/2.txt","root/c/d/4.txt","root/4.txt"],["root/a/1.txt","root/c/3.txt"]]

Note:

  1. No order is required for the final output.
  2. You may assume the directory name, file name and file content only has letters and digits, and the length of file content is in the range of [1,50].
  3. The number of files given is in the range of [1,20000].
  4. You may assume no files or directories share the same name in the same directory.
  5. You may assume each given directory info represents a unique directory. Directory path and file info are separated by a single blank space.

解题思路:我的方法非常简单,解析输入的字符串,把文件内容作为key,文件路径作为value存入字典中,当然value是用list保存的。最后遍历字典,value超过两个元素即为符合条件的结果。

代码如下:

class Solution(object):
def findDuplicate(self, paths):
"""
:type paths: List[str]
:rtype: List[List[str]]
"""
dic = {}
for p in paths:
item = p.split(' ')
for i in range(1,len(item)):
file_name = item[i].split('(')[0]
file_content = item[i].split('(')[1][:-1]
dic[file_content] = dic.setdefault(file_content,[]) + [item[0] + '/' + file_name]
res = []
for v in dic.itervalues():
if len(v) >= 2:
res.append(v)
return res

【leetcode】609. Find Duplicate File in System的更多相关文章

  1. 【LeetCode】609. Find Duplicate File in System 解题报告(Python & C++)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述 题目大意 解题方法 日期 题目地址:https://leetcode.c ...

  2. 【LeetCode】652. Find Duplicate Subtrees 解题报告(Python)

    [LeetCode]652. Find Duplicate Subtrees 解题报告(Python) 标签(空格分隔): LeetCode 作者: 负雪明烛 id: fuxuemingzhu 个人博 ...

  3. LC 609. Find Duplicate File in System

    Given a list of directory info including directory path, and all the files with contents in this dir ...

  4. 609. Find Duplicate File in System

    Given a list of directory info including directory path, and all the files with contents in this dir ...

  5. 【LeetCode】316. Remove Duplicate Letters 解题报告(Python & C++)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述 题目大意 解题方法 日期 题目地址:https://leetcode.c ...

  6. 【LeetCode】388. Longest Absolute File Path 解题报告(Python)

    作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述: 题目大意 解题方法 日期 题目地址:https://leetcode. ...

  7. 【LeetCode】217. Contains Duplicate (2 solutions)

    Contains Duplicate Given an array of integers, find if the array contains any duplicates. Your funct ...

  8. 【leetcode】316. Remove Duplicate Letters

    题目如下: Given a string which contains only lowercase letters, remove duplicate letters so that every l ...

  9. 【leetcode】388. Longest Absolute File Path

    题目如下: Suppose we abstract our file system by a string in the following manner: The string "dir\ ...

随机推荐

  1. .NETFramework:System.Net.WebClient.cs

    ylbtech-.NETFramework:System.Net.WebClient.cs 提供用于将数据发送到和接收来自通过 URI 确认的资源数据的常用方法 1.返回顶部 1. #region 程 ...

  2. 不缓存JS

    方法一:告诉浏览器不要缓存(不一定好使) <meta http-equiv="Pragma" content="no-cache"> <met ...

  3. Selenium WebDriver 常用API

    public class Demo1 { WebDriver driver; @BeforeMethod public void visit(){ //webdriver对象的声明 System.se ...

  4. UML 类图快速入门

    UML 图形 官方定义 UML 类图(Class Diagram) UML 时序图(Sequence Diagram) 领域 UML 类图和实现 UML 类图 领域 UML 类图 实现 UML 类图 ...

  5. Ubuntu安装byzanz截取动态效果图

    byzanz-record主要参数选项 用法: byzanz-record [选项...] 录制您的当前桌面会话 帮助选项: -?, --help 显示帮助选项 --help-all 显示全部帮助选项 ...

  6. EasyUI的时间控件禁止输入

    <td class="right">制单日期:</td>  <td class="left"> <input name ...

  7. jQuery将字符串转换为数字

    1:parseInt(string)  parseInt("1234blue"); //returns 1234 parseInt("123"); //retu ...

  8. js 函数 写法

    // function ckeckName(){}; // function checkUser(){}; // function checkPassWorld(){}; // var checkNa ...

  9. Java_1.Java符号体系

    Java符号包含五类:标识符.关键字.常量及字面量.运算符.分隔符 1.标识符 定义:用于标明程序中元素的名字,如类.方法和变量 命名规则: ·由字母.数字.下划线(_)和美元符号($)构成的字母序列 ...

  10. RMQ(鸽巢原理或字符串操作)

    http://acm.hdu.edu.cn/showproblem.php?pid=3183 A Magic Lamp Time Limit: 2000/1000 MS (Java/Others)   ...