The Way of Spark Mastery (Basics): Linux Fundamentals for Big Data Development, Section 2: The Linux File System and Directories (Part 1)

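Main topics in this section

How to get help documentation
A brief introduction to the Linux file system
Directory operations
Access permissions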

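1. How to get help documentation

In day-to-day work it is easy to forget how a command is used, for example which options ls accepts. The man command displays a command's manual page:

//Use man to view a command's manual page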

xtwy@ubuntu:~$ man ls

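Use the up and down arrow keys to scroll the manual one line at a time, and Page Up or Page Down to move one page (screen) at a time; the space bar also advances one screen. To leave the manual, press q. Pressing h displays a summary of less commands, because man shows its output through the less pager.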

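2. A brief introduction to the Linux file system

(1) Files and directories

This section introduces the Linux file system from a user's point of view. Linux classifies files by form into directories and ordinary files, as shown in the figure below: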

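A file or directory name may be up to 255 characters long (255 bytes on most Linux file systems). Technically a name may contain any character except / and the null byte, but it is safest to build names from the following characters: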

Uppercase letters (A–Z)
Lowercase letters (a–z)
Numbers (0–9)
Underscore ( _ )
Period (.)
Comma (,)

File and directory names are case-sensitive: names that differ only in case refer to different files or directories.
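
A minimal sketch of case sensitivity, assuming a case-sensitive file system such as ext4 (the usual default on Linux). The scratch directory and the names Notes.txt and notes.txt are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
case_dir=$(mktemp -d)
cd "$case_dir"

# Two names that differ only in case are two separate files.
echo "upper" > Notes.txt
echo "lower" > notes.txt

ls
file_count=$(ls | wc -l)
echo "files created: $file_count"
```

On a case-insensitive file system (for example the macOS default) the second echo would overwrite the first file instead.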

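(2) File extensions and hidden file names

Unlike Windows, Linux places no mandatory requirement on file extensions. For example, a C source file can be named complier.c, but it can just as well be named complier, complier.ccc, or anything else. This is not recommended, however: associating an extension with a particular kind of file makes the content easier to recognize. The conventional Linux file extensions are listed in the table below: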

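File name with extension	Meaning of the extension
max.c	C source file
max.o	Compiled object code file
max	Executable file built from max.c
memo.txt	Text file
memo.pdf	PDF file; view it in a GUI with a viewer such as xpdf or kpdf
memo.ps	PostScript file; view it in a GUI with ghostscript or kpdf
memo.Z	File compressed with the compress program; decompress with uncompress or gunzip
memo.gz	File compressed with gzip; decompress with gunzip
memo.tar.gz or memo.tgz	tar archive compressed with gzip; decompress with gunzip, or extract directly with tar -xzf
memo.bz2	File compressed with bzip2; decompress with bunzip2
memo.html	HTML file; view it with Firefox in a GUI environment
memo.jpg, etc.	Image file; open it with an image viewer in a GUI environment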
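
The compression rows of the table can be tried out roughly as follows. This is a sketch that assumes the gzip and tar utilities are installed; the scratch directory and the restored directory are illustrative names.

```shell
# Scratch directory with one small text file.
zip_dir=$(mktemp -d)
cd "$zip_dir"
echo "hello linux" > memo.txt

# gzip replaces memo.txt with memo.txt.gz; gunzip restores it.
gzip memo.txt
ls                      # memo.txt.gz
gunzip memo.txt.gz
ls                      # memo.txt

# tar -czf creates a gzip-compressed archive; tar -xzf extracts it.
tar -czf memo.tar.gz memo.txt
mkdir restored
tar -xzf memo.tar.gz -C restored
cat restored/memo.txt   # hello linux
```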

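As we saw in the previous section, Linux also has a large number of hidden files, which ls -a displays. To make a file or directory hidden, simply start its name with a period (.).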
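
A short sketch of the difference between ls and ls -a, run in a scratch directory. The file names visible.txt and .hidden.txt are made up for illustration.

```shell
hidden_dir=$(mktemp -d)
cd "$hidden_dir"

# The leading period makes the second file hidden.
touch visible.txt .hidden.txt

plain_listing=$(ls)       # hidden names are omitted
full_listing=$(ls -a)     # -a also lists ., .. and hidden names

echo "ls    : $plain_listing"
echo "ls -a :"
echo "$full_listing"
```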

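(3) Absolute and relative paths

Absolute and relative paths are a very important concept in Linux. The figure below shows what an absolute path is

Every path that starts at the root directory "/" is an absolute path; all other paths are relative and are resolved against the current working directory

//Access by absolute path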

xtwy@ubuntu:~/Public$ cd /home/

xtwy@ubuntu:/home$ ls

xtwy

//Access by relative path

xtwy@ubuntu:/home$ cd xtwy/
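
The same two kinds of access can be sketched in a scratch directory, so that it runs without touching the real /home. The variable path_root and the directory layout under it are assumptions made for the example.

```shell
# pwd -P prints the physical path, so the sketch also works when the
# temporary directory is reached through a symbolic link.
path_root=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$path_root/home/xtwy"

# Absolute path: starts with / and means the same thing from anywhere.
cd "$path_root/home"
abs_result=$(pwd -P)

# Relative path: resolved against the current working directory.
cd xtwy
rel_result=$(pwd -P)

echo "$abs_result"
echo "$rel_result"
```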

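3. Directory operations

(1) Creating directories: mkdir

For ease of demonstration, the examples below use the following directory structure:

1 Creating with an absolute path

//Create using an absolute path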

root@ubuntu:/home# mkdir /home/max

root@ubuntu:/home# ls

max  xtwy

root@ubuntu:/home#

2 Creating with a relative path

//Create using relative paths

root@ubuntu:/home# mkdir max/names

root@ubuntu:/home# mkdir max/temp

root@ubuntu:/home# mkdir max/literature

root@ubuntu:/home# cd max

root@ubuntu:/home/max# mkdir demo

root@ubuntu:/home/max# ls

demo  literature  names  temp

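Sometimes you do not want to create directories one level at a time. In that case, add the -p (parents) option to mkdir, which creates any missing parent directories together with the child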

root@ubuntu:/home/max# mkdir -p literature/promo

root@ubuntu:/home/max# ls

demo  literature  names  temp

root@ubuntu:/home/max# cd literature/

root@ubuntu:/home/max/literature# ls

promo
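
The effect of -p is easiest to see when the parent directories do not exist yet. A minimal sketch in a scratch directory; the path reports/2015/q3 is made up for illustration.

```shell
mk_dir=$(mktemp -d)
cd "$mk_dir"

# Without -p, mkdir refuses to create a child whose parent is missing.
if mkdir reports/2015/q3 2>/dev/null; then
    plain_status="created"
else
    plain_status="failed"
fi

# With -p, the missing parents reports/ and reports/2015/ are created too.
mkdir -p reports/2015/q3

# -p is also safe to repeat: an existing directory is not an error.
mkdir -p reports/2015/q3

echo "plain mkdir: $plain_status"
ls -R reports
```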

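(2) Changing directories: cd

The difference between the working directory and the home directory

The directory a user lands in after each login is the home directory. It stays the same for the whole session and is written as ~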

xtwy@ubuntu:/root$ cd ~

xtwy@ubuntu:~$ pwd

/home/xtwy

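The working directory, also called the current directory, is the directory you are in after a cd command completes; it can be changed freely.

//. denotes the current (working) directory

//.. denotes the parent of the current directory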

xtwy@ubuntu:~$ cd .

xtwy@ubuntu:~$ cd ..

xtwy@ubuntu:/home$
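
A small sketch of how . and .. behave, using a scratch directory; the names parent and child are illustrative.

```shell
nav_root=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$nav_root/parent/child"
cd "$nav_root/parent/child"

cd .                      # . is the current directory, so nothing changes
same_dir=$(pwd -P)

cd ..                     # .. is the parent directory
parent_dir=$(pwd -P)

cd child/../..            # . and .. can appear anywhere in a path
top_dir=$(pwd -P)

echo "$same_dir"
echo "$parent_dir"
echo "$top_dir"
```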

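(3) Deleting directories: rmdir

rmdir is short for remove directory. It deletes a directory, but only if the directory is empty. If the directory still contains files or subdirectories, rmdir fails and you need rm -r instead, for example

//Delete the temp directory

//This succeeds because temp is empty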

root@ubuntu:/home/max# rmdir temp/

root@ubuntu:/home/max# rmdir literature/

rmdir: failed to remove `literature/': Directory not empty

root@ubuntu:/home/max# rm -r literature/

root@ubuntu:/home/max# ls

demo  names

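The r in rm -r stands for recursive: the directory and everything inside it is deleted. The command is therefore highly destructive and should be used with care.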
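
The difference between the two commands can be sketched as follows, in a scratch directory with made-up names (empty_dir, full_dir).

```shell
rm_dir=$(mktemp -d)
cd "$rm_dir"
mkdir empty_dir full_dir
touch full_dir/notes.txt

# rmdir removes only empty directories.
rmdir empty_dir

# On a non-empty directory rmdir fails and leaves everything in place.
if rmdir full_dir 2>/dev/null; then
    rmdir_status="removed"
else
    rmdir_status="refused"
fi

# rm -r deletes the directory and everything below it; there is no undo.
rm -r full_dir

echo "rmdir on non-empty directory: $rmdir_status"
ls -A      # prints nothing: both directories are gone
```

When working interactively, rm -ri asks for confirmation before each removal, which is a safer habit than a bare rm -r.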

(4) Moving directories: mv

//Move the demo directory into /home/xtwy/

root@ubuntu:/home/max# mv demo/ /home/xtwy/

root@ubuntu:/home/max# cd /home/xtwy/

root@ubuntu:/home/xtwy# ls

demo     Documents  examples.desktop  Pictures  Templates

Desktop  Downloads  Music             Public    Videos

root@ubuntu:/home/xtwy# rmdir demo

//The demo directory no longer exists in its original location

root@ubuntu:/home/xtwy# cd /home/max/

root@ubuntu:/home/max# ls

names
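
A compact sketch of the same move in a scratch directory; the layout mirrors the example above, and readme.txt is a made-up file that shows the contents travel with the directory.

```shell
mv_dir=$(mktemp -d)
cd "$mv_dir"
mkdir -p max/demo xtwy
touch max/demo/readme.txt

# The destination xtwy/ exists, so demo is moved into it with its contents.
mv max/demo xtwy/

ls max          # prints nothing: demo is no longer here
ls xtwy/demo    # readme.txt
```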

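(5) Copying directories: cp

The mv command above moves a directory. Sometimes you need to copy a directory instead, which works as follows:

//First create a demo directory; with -p, a parent directory that does not exist is created as well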

root@ubuntu:/home/max# mkdir -p literature/demo

//literature is a directory, so cp without -r skips it and the copy fails

root@ubuntu:/home/max# cp literature/ /home/xtwy/

cp: omitting directory `literature/'

//To copy a directory and everything in it, add the -r option, which copies recursively

root@ubuntu:/home/max# cp -r literature/ /home/xtwy/

root@ubuntu:/home/max# cd /homt

bash: cd: /homt: No such file or directory

root@ubuntu:/home/max# cd /home/xtwy/

root@ubuntu:/home/xtwy# ls

Desktop    Downloads         literature  Pictures  Templates

Documents  examples.desktop  Music       Public    Videos

root@ubuntu:/home/xtwy# cd literature/

root@ubuntu:/home/xtwy/literature# ls

demo
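
The following sketch repeats the copy in a scratch directory; backup and draft.txt are names invented for the example.

```shell
cp_dir=$(mktemp -d)
cd "$cp_dir"
mkdir -p literature/demo backup
touch literature/demo/draft.txt

# Without -r, cp skips directories and reports an error.
if cp literature backup/ 2>/dev/null; then
    plain_cp="copied"
else
    plain_cp="skipped"
fi

# With -r the whole tree is copied; unlike mv, the source stays in place.
cp -r literature backup/

echo "cp without -r: $plain_cp"
ls -R backup
```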

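4. File operations

(1) Creating files

There are several ways to create a file directly from the command line. The most common ones are:

//Use echo and redirect its output to a file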

root@ubuntu:/home/xtwy# echo "hello linux" > hello.txt

root@ubuntu:/home/xtwy# ls

Desktop    Downloads         hello.txt   Music     Public     Videos

Documents  examples.desktop  literature  Pictures  Templates

//The touch command creates the file if it does not exist

root@ubuntu:/home/xtwy# touch hell1.txt

root@ubuntu:/home/xtwy# ls

Desktop    Downloads         hell1.txt  literature  Pictures  Templates

Documents  examples.desktop  hello.txt  Music       Public    Videos
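
One detail worth seeing in action is how > and >> differ, and what touch does to a file that already exists. A sketch with made-up file names in a scratch directory:

```shell
new_dir=$(mktemp -d)
cd "$new_dir"

echo "first"  > notes.txt     # > creates the file, or truncates it if it exists
echo "second" > notes.txt     # the earlier content is replaced
echo "third" >> notes.txt     # >> appends instead of truncating

touch empty.txt               # creates an empty file
touch notes.txt               # on an existing file only the timestamp changes

cat notes.txt                 # second, then third
wc -c < empty.txt             # 0
```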

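(2) Displaying file contents

The cat command displays the contents of a file. Its name is short for catenate, meaning to join things together one after another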

root@ubuntu:/home/xtwy# cat hello.txt

hello linux
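
The "catenate" part of the name shows when cat is given more than one file. A minimal sketch; part1.txt, part2.txt and joined.txt are illustrative names.

```shell
cat_dir=$(mktemp -d)
cd "$cat_dir"
echo "hello linux" > part1.txt
echo "hello spark" > part2.txt

# Given several files, cat writes them one after another.
cat part1.txt part2.txt

# Redirecting that output joins the files into a new one.
cat part1.txt part2.txt > joined.txt
wc -l < joined.txt    # 2
```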

cat prints the entire contents of a file in one go, for example

root@ubuntu:/home/xtwy# cat /etc/profile

# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))

# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).

if [ -d /etc/profile.d ]; then

  for i in /etc/profile.d/*.sh; do

    if [ -r $i ]; then

      . $i

    fi

  done

  unset i

  ......

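Sometimes you want to view a file one screen at a time. For that, use a pager such as less or more. The two work in much the same way: press the space bar to show the next screen. The difference is that less displays an END message when it reaches the end of the file, whereas more returns straight to the shell prompt.

[Screenshot: the less command]

[Screenshot: the more command]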

(3) Copying files with cp

root@ubuntu:/home/xtwy# ls

Desktop    Downloads         hell1.txt  literature  Pictures  Templates

Documents  examples.desktop  hello.txt  Music       Public    Videos

//Copy a file

root@ubuntu:/home/xtwy# cp hell1.txt literature/demo

root@ubuntu:/home/xtwy# cd literature/demo

//cd - returns to the previous working directory

root@ubuntu:/home/xtwy/literature/demo# cd -

/home/xtwy

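Note that when cp copies a file and a file with the same name already exists in the destination directory, it gives no warning and simply overwrites that file, so data can be lost. To guard against this, use the -i option, which makes cp ask for confirmation before overwriting, for example: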

root@ubuntu:/home/xtwy# cp -i hell1.txt literature/demo

cp: overwrite `literature/demo/hell1.txt'?
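
The sketch below contrasts the two behaviors. The answer to the prompt is piped in only so that the example can run unattended; source.txt and target.txt are made-up names.

```shell
ow_dir=$(mktemp -d)
cd "$ow_dir"
echo "new" > source.txt
echo "old" > target.txt

# Answering "n" to the -i prompt keeps the existing target.
echo n | cp -i source.txt target.txt 2>/dev/null || true
after_decline=$(cat target.txt)

# Plain cp overwrites without asking.
cp source.txt target.txt
after_plain=$(cat target.txt)

echo "after declining the prompt: $after_decline"
echo "after plain cp: $after_plain"
```

The `|| true` is there because some cp versions report a non-zero exit status when the overwrite is declined.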

(4) Moving or renaming files with mv

//Within the same directory, mv is effectively a rename; afterwards hell1.txt no longer exists

root@ubuntu:/home/xtwy# mv hell1.txt hell2.txt

root@ubuntu:/home/xtwy# ls

Desktop    Downloads         hell2.txt  literature  Pictures  Templates

Documents  examples.desktop  hello.txt  Music       Public    Videos

//Move hell2.txt into literature/demo

root@ubuntu:/home/xtwy# mv hell2.txt literature/demo

root@ubuntu:/home/xtwy# cd literature/demo/

root@ubuntu:/home/xtwy/literature/demo# ls

hell1.txt  hell2.txt

root@ubuntu:/home/xtwy/literature/demo# cd -

/home/xtwy

//hell2.txt no longer exists in the source directory

root@ubuntu:/home/xtwy# ls

Desktop    Downloads         hello.txt   Music     Public     Videos

Documents  examples.desktop  literature  Pictures  Templates

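(5) Displaying the beginning or end of a file

Use head to display the beginning of a file and tail to display the end. Both show 10 lines by default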

root@ubuntu:/home/xtwy# head ~/.bashrc

# ~/.bashrc: executed by bash(1) for non-login shells.

# see /usr/share/doc/bash/examples/startup-files (in the package bash-doc)

# for examples

# If not running interactively, don't do anything

[ -z "$PS1" ] && return

# don't put duplicate lines in the history. See bash(1) for more options

# ... or force ignoredups and ignorespace

HISTCONTROL=ignoredups:ignorespace

root@ubuntu:/home/xtwy# tail ~/.bashrc

if [ -f ~/.bash_aliases ]; then

    . ~/.bash_aliases

fi

# enable programmable completion features (you don't need to enable

# this, if it's already enabled in /etc/bash.bashrc and /etc/profile

# sources /etc/bash.bashrc).

#if [ -f /etc/bash_completion ] && ! shopt -oq posix; then

#    . /etc/bash_completion

#fi

The number of lines that head and tail display can be changed, for example:

//Show only the first two lines

root@ubuntu:/home/xtwy# head -2 ~/.bashrc

# ~/.bashrc: executed by bash(1) for non-login shells.

# see /usr/share/doc/bash/examples/startup-files (in the package bash-doc)
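
The short form head -2 used above is the older spelling of head -n 2, which is the form standardized by POSIX. A sketch with a generated 12-line file (sample.txt is a made-up name):

```shell
ht_dir=$(mktemp -d)
cd "$ht_dir"

# Build a 12-line sample file: "line 1" ... "line 12".
i=1
while [ "$i" -le 12 ]; do
    echo "line $i"
    i=$((i + 1))
done > sample.txt

default_count=$(head sample.txt | wc -l)   # 10 lines by default
first_two=$(head -n 2 sample.txt)          # -n sets the line count
last_line=$(tail -n 1 sample.txt)

echo "default head prints $default_count lines"
echo "$first_two"
echo "$last_line"
```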

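tail is used frequently to watch a log file grow. For example, Hadoop produces many logs once it has started; when something goes wrong, tail -f lets you follow a log file as it grows and see where the problem occurs.

//Initial output; tail -f keeps running and waits for new content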

root@ubuntu:/home/xtwy# tail -f hello.txt

hello linux

//Append content to the file (from a second terminal, or after stopping tail with Ctrl+C)

root@ubuntu:/home/xtwy# echo "hello linux linux" >> hello.txt

//Output after the append

root@ubuntu:/home/xtwy# tail -f hello.txt

hello linux

hello linux linux
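
To see the following behavior without a second terminal, tail -f can be run in the background. This is only a sketch: app.log, seen.txt and the log messages are invented, and the sleep durations are rough assumptions about how quickly tail picks up the change.

```shell
log_dir=$(mktemp -d)
cd "$log_dir"
echo "service started" > app.log

# Follow the log in the background and capture what tail prints.
tail -f app.log > seen.txt &
tail_pid=$!

sleep 1
echo "ERROR: disk full" >> app.log   # stands in for the service writing a log line
sleep 2

# tail -f never exits on its own; interactively you would press Ctrl+C.
kill "$tail_pid"
wait "$tail_pid" 2>/dev/null || true

cat seen.txt
```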

(6) Other common file commands

The sort and diff commands shown below only read their input files; they never modify them

root@ubuntu:/home/xtwy# cp hello.txt hello1.txt

root@ubuntu:/home/xtwy# ls

Desktop    Downloads         hello1.txt  literature  Pictures  Templates

Documents  examples.desktop  hello.txt   Music       Public    Videos

//Sort the lines of the file

root@ubuntu:/home/xtwy# sort hello1.txt

hello linux

hello linux linux

//Output in reverse order

root@ubuntu:/home/xtwy# sort -r  hello1.txt

hello linux linux

hello linux
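
A slightly larger sketch of sort, using a made-up file fruits.txt. LC_ALL=C is set so that the ordering does not depend on the locale of the machine.

```shell
sort_dir=$(mktemp -d)
cd "$sort_dir"
printf '%s\n' banana apple cherry apple > fruits.txt

sorted=$(LC_ALL=C sort fruits.txt)
reversed=$(LC_ALL=C sort -r fruits.txt)
unique=$(LC_ALL=C sort -u fruits.txt)     # -u drops duplicate lines

echo "$sorted"
echo "$reversed"
echo "$unique"

# sort writes to standard output; fruits.txt itself is unchanged.
head -n 1 fruits.txt    # banana
```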

//Compare file contents with diff (no output means the files are identical)

root@ubuntu:/home/xtwy# diff hello1.txt hello.txt

//Append content to the file

root@ubuntu:/home/xtwy# echo "hello linux linux" >> hello.txt

//Compare the contents again

root@ubuntu:/home/xtwy# diff hello1.txt hello.txt

2a3

> hello linux linux

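//Unified output format

//The -u option prints the differences as hunks with surrounding context lines

//The two files being compared are marked with - and +

//In this example - denotes hello1.txt and + denotes hello.txt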

root@ubuntu:/home/xtwy# diff -u hello1.txt hello.txt

--- hello1.txt  2015-08-22 17:28:44.071202558 -0700

+++ hello.txt   2015-08-22 17:29:49.131181281 -0700

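//@@ ... @@ gives the starting line number and line count of each hunk

//-1,2 means the hunk starts at line 1 of hello1.txt and covers 2 lines

//+1,3 means the hunk starts at line 1 of hello.txt and covers 3 lines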

@@ -1,2 +1,3 @@

 hello linux

 hello linux linux

+hello linux linux
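
diff also reports its result through its exit status, which is convenient in scripts: 0 means the files are identical and 1 means they differ. A sketch with made-up files old.txt and new.txt:

```shell
diff_dir=$(mktemp -d)
cd "$diff_dir"
printf '%s\n' "hello linux" "hello linux linux" > old.txt
cp old.txt new.txt

# Identical files: no output, exit status 0.
if diff old.txt new.txt > /dev/null; then same_before="yes"; else same_before="no"; fi

echo "hello linux linux" >> new.txt

# Different files: exit status 1 and a description of the change.
if diff old.txt new.txt > /dev/null; then same_after="yes"; else same_after="no"; fi

# In unified format, added lines start with + and removed lines with -.
diff -u old.txt new.txt > changes.diff || true
added_lines=$(grep -c '^+hello' changes.diff)

echo "identical before append: $same_before"
echo "identical after append: $same_after"
echo "lines added: $added_lines"
```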
