I. Download and extract the files

1) Download page: http://www.alluxio.org/download

2) Download and extract with the following commands:

$ wget http://alluxio.org/downloads/files/1.2.0/alluxio-1.2.0-bin.tar.gz
$ tar xvfz alluxio-1.2.0-bin.tar.gz
$ cd alluxio-1.2.0

II. Configuration file changes

Only the basic configuration is changed for now:

1) In /data/spark/software/alluxio-1.2.0/conf, copy alluxio-env.sh.template to alluxio-env.sh.

Edit it as follows:

#!/usr/bin/env bash
#
# The Alluxio Open Foundation licenses this work under the Apache License, version 2.0
# (the "License"). You may not use this work except in compliance with the License, which is
# available at www.apache.org/licenses/LICENSE-2.0
#
# This software is distributed on an "AS IS" basis, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
# either express or implied, as more fully set forth in the License.
#
# See the NOTICE file distributed with this work for information regarding copyright ownership.
#

# Copy it as alluxio-env.sh and edit that to configure Alluxio for your
# site. This file is sourced to launch Alluxio servers or use Alluxio shell
# commands.
#
# This file provides one way to configure Alluxio options by setting the
# following listed environment variables. Note that, setting this file will not
# affect jobs (e.g., Spark job or MapReduce job) that are using Alluxio client
# as a library. Alternatively, you can edit alluxio-site.properties file, where
# you can set all the configuration options supported by Alluxio
# (http://alluxio.org/documentation/) which is respected by both external jobs
# and Alluxio servers (or shell).

# The directory where Alluxio deployment is installed. (Default: the parent directory of libexec/).
export ALLUXIO_HOME=/data/spark/software/alluxio-1.2.0

# The directory where log files are stored. (Default: ${ALLUXIO_HOME}/logs).
# ALLUXIO_LOGS_DIR

# Hostname of the master.
# ALLUXIO_MASTER_HOSTNAME
export ALLUXIO_MASTER_HOSTNAME=spark29

# This is now deprecated. Support will be removed in v2.0
# ALLUXIO_MASTER_ADDRESS
#export ALLUXIO_MASTER_ADDRESS=spark29

# The directory where a worker stores in-memory data. (Default: /mnt/ramdisk).
# E.g. On linux, /mnt/ramdisk for ramdisk, /dev/shm for tmpFS; on MacOS, /Volumes/ramdisk for ramdisk
# ALLUXIO_RAM_FOLDER
export ALLUXIO_RAM_FOLDER=/data/spark/software/alluxio-1.2.0/ramdisk

# Address of the under filesystem address. (Default: ${ALLUXIO_HOME}/underFSStorage)
# E.g. "/my/local/path" to use local fs, "hdfs://localhost:9000/alluxio" to use a local hdfs
# ALLUXIO_UNDERFS_ADDRESS
export ALLUXIO_UNDERFS_ADDRESS=hdfs://spark29:9000

# How much memory to use per worker. (Default: 1GB)
# E.g. "1000MB", "2GB"
# ALLUXIO_WORKER_MEMORY_SIZE
export ALLUXIO_WORKER_MEMORY_SIZE=12GB

# Config properties set for Alluxio master, worker and shell. (Default: "")
# E.g. "-Dalluxio.master.port=39999"
# ALLUXIO_JAVA_OPTS

# Config properties set for Alluxio master daemon. (Default: "")
# E.g. "-Dalluxio.master.port=39999"
# ALLUXIO_MASTER_JAVA_OPTS

# Config properties set for Alluxio worker daemon. (Default: "")
# E.g. "-Dalluxio.worker.port=49999" to set worker port, "-Xms2048M -Xmx2048M" to limit the heap size of worker.
# ALLUXIO_WORKER_JAVA_OPTS

# Config properties set for Alluxio shell. (Default: "")
# E.g. "-Dalluxio.user.file.writetype.default=CACHE_THROUGH"
# ALLUXIO_USER_JAVA_OPTS
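
The copy-and-edit step can also be scripted. The following is only a sketch: make_alluxio_env is a hypothetical helper name (not part of Alluxio), and it appends the exports to the end of the copied template instead of editing the commented placeholders in place. The values are the ones used in this deployment.

```shell
# Hypothetical helper (not part of Alluxio): create alluxio-env.sh from the
# template in the given conf directory and append this site's settings.
make_alluxio_env() {
  conf_dir="$1"
  # Leave the template untouched and work on a copy.
  cp "$conf_dir/alluxio-env.sh.template" "$conf_dir/alluxio-env.sh" || return 1
  # Quoted heredoc delimiter: the text is written literally, nothing is expanded.
  cat >> "$conf_dir/alluxio-env.sh" <<'EOF'
export ALLUXIO_HOME=/data/spark/software/alluxio-1.2.0
export ALLUXIO_MASTER_HOSTNAME=spark29
export ALLUXIO_RAM_FOLDER=/data/spark/software/alluxio-1.2.0/ramdisk
export ALLUXIO_UNDERFS_ADDRESS=hdfs://spark29:9000
export ALLUXIO_WORKER_MEMORY_SIZE=12GB
EOF
}
```

It would be called as: make_alluxio_env /data/spark/software/alluxio-1.2.0/conf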

2) In the workers file under conf, add the addresses of the worker nodes:

spark24

spark30

spark31

spark32

spark33

III. Host configuration changes

1) Edit .bash_profile in the home directory and add the following:

export TACHYON_HOME=/data/spark/software/alluxio-1.2.0

PATH=$PATH:$HOME/bin:$HADOOP/bin:$JAVA_HOME/bin:$TACHYON_HOME/bin

2) Apply the configuration:

source .bash_profile
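
If this step is repeated on several hosts, or re-run on the same host, the lines can end up in .bash_profile more than once. A minimal sketch of an idempotent append follows. The helper name append_once is hypothetical; the two lines it would add are the ones shown above.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
append_once() {
  file="$1"
  line="$2"
  # -F: fixed string, -x: match the whole line, -q: no output.
  # If the file does not exist yet, grep fails and the line is appended.
  grep -Fxq -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (single quotes keep $PATH and friends unexpanded in the file):
# append_once "$HOME/.bash_profile" 'export TACHYON_HOME=/data/spark/software/alluxio-1.2.0'
# append_once "$HOME/.bash_profile" 'PATH=$PATH:$HOME/bin:$HADOOP/bin:$JAVA_HOME/bin:$TACHYON_HOME/bin'
```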

IV. Add the dependency JAR to Spark

1. On every Spark host, in the conf directory under the Spark installation directory, edit spark-env.sh and append the following at the end:

export SPARK_CLASSPATH="/data/spark/software/spark-1.5.2-bin-hadoop2.6/lib/alluxio-core-client-spark-1.2.0-jar-with-dependencies.jar:$SPARK_CLASSPATH"
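
One detail of this export: when SPARK_CLASSPATH is empty beforehand, the resulting value ends in a colon, which leaves an empty classpath entry (Java may interpret an empty entry as the current directory). A small sketch that avoids the trailing colon is shown here; prepend_classpath is a hypothetical helper name, not something Spark or Alluxio provides.

```shell
# Hypothetical helper: print a jar path prepended to an existing classpath.
prepend_classpath() {
  jar="$1"
  existing="$2"
  # ${existing:+:$existing} expands to ":<existing>" only when existing is
  # non-empty, so no trailing colon is produced for an empty classpath.
  printf '%s\n' "$jar${existing:+:$existing}"
}

# Usage in spark-env.sh:
# export SPARK_CLASSPATH="$(prepend_classpath /data/spark/software/spark-1.5.2-bin-hadoop2.6/lib/alluxio-core-client-spark-1.2.0-jar-with-dependencies.jar "$SPARK_CLASSPATH")"
```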

V. Distribute to the worker nodes

1. Copy the Alluxio software to each worker node, for example:

scp -r ./alluxio-1.2.0 spark30:/data/spark/software/
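
The scp command here targets a single host (spark30). Repeating it for every host in the workers file might look like the sketch below. The function name print_scp_commands is hypothetical, and it only prints the commands so they can be reviewed before anything is copied.

```shell
# Hypothetical helper: print one scp command per host listed in a workers file.
# Nothing is copied; review the output first, then pipe it to sh to run it.
print_scp_commands() {
  workers_file="$1"   # one hostname per line
  src_dir="$2"        # local directory to copy
  dest_dir="$3"       # parent directory on the remote host
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip empty lines and comment lines.
    case "$host" in
      ''|'#'*) continue ;;
    esac
    printf 'scp -r %s %s:%s\n' "$src_dir" "$host" "$dest_dir"
  done < "$workers_file"
}
```

For example, from the directory that contains alluxio-1.2.0: print_scp_commands ./alluxio-1.2.0/conf/workers ./alluxio-1.2.0 /data/spark/software/ | sh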

VI. Format and start

1. Go to the bin directory under the Alluxio installation directory and run alluxio format to format the Alluxio storage.

2. Start the cluster: ./alluxio-start.sh all
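
The two steps can be chained so that the cluster is started only when formatting succeeded. This is a sketch under that assumption; format_and_start is a hypothetical wrapper name, and it simply calls the two commands mentioned above. Since formatting wipes existing Alluxio data and metadata, it is meant for a first-time setup.

```shell
# Hypothetical wrapper: format first, and start the cluster only if
# formatting succeeded.
format_and_start() {
  alluxio_home="$1"
  "$alluxio_home/bin/alluxio" format || {
    echo "format failed, not starting the cluster" >&2
    return 1
  }
  "$alluxio_home/bin/alluxio-start.sh" all
}
```

It would be called as: format_and_start /data/spark/software/alluxio-1.2.0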

VII. Possible problems

1. Starting a worker fails with the error: Pseudo-terminal will not be allocated because stdin is not a terminal.

Fix: change line 44 of alluxio/bin/alluxio-workers.sh.

The original line is:

nohup ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no -t ${worker} ${LAUNCHER} \

Change it to:

nohup ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no -tt ${worker} ${LAUNCHER} \
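
The change from -t to -tt makes ssh force pseudo-terminal allocation even when its own stdin is not a terminal. Applying it by script instead of by hand could look like the sketch below; force_tty_in_workers_script is a hypothetical helper name, and it matches the text of the line rather than relying on line number 44.

```shell
# Hypothetical helper: rewrite "-t ${worker}" to "-tt ${worker}" in the
# given script. Running it twice is harmless: " -tt ${worker}" no longer
# matches the pattern " -t ${worker}".
force_tty_in_workers_script() {
  script="$1"
  tmp_file="$script.tmp.$$"
  # In a basic regular expression, \$ matches a literal dollar sign and
  # the braces are literal characters.
  sed 's/ -t \${worker}/ -tt ${worker}/' "$script" > "$tmp_file" || return 1
  # Write back with cat (not mv) so the script keeps its original permissions.
  cat "$tmp_file" > "$script" && rm -f "$tmp_file"
}
```

It would be called as: force_tty_in_workers_script /data/spark/software/alluxio-1.2.0/bin/alluxio-workers.sh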

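2. If startup fails with sudo-related errors, the user starting Alluxio is not listed in sudoers and must be added to that file. To add it, search the file for the root entry and add a line for the user right after it.

The entries look like this: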

root ALL=(ALL) ALL
spark ALL=(ALL) ALL

Also comment out the Defaults requiretty line in the same file, so that it reads: #Defaults requiretty
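
To confirm that requiretty is no longer active after the edit, a check along these lines can be used. This is a sketch: has_requiretty is a hypothetical helper name, and it only looks for a plain, uncommented Defaults requiretty line in the file it is given.

```shell
# Hypothetical check: succeed if the file contains an active (uncommented)
# "Defaults requiretty" line.
has_requiretty() {
  # Lines starting with "#" do not match, because only leading whitespace
  # is allowed before "Defaults".
  grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty' "$1"
}
```

Run as root, for example: has_requiretty /etc/sudoers && echo "requiretty is still active". Editing the file through visudo is the safer route, since it checks the syntax before saving.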

3. If errors persist, start the master first and then start the workers one node at a time.

VIII. Official installation guide

Official installation guide: http://www.alluxio.org/docs/master/cn/Running-Alluxio-on-a-Cluster.html (this is the Chinese version and is worth reading).
