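A customer required it, so I had to give OpenShift 3.6 a try. Unfortunately the only offline installation document I had on hand was for 3.7. On top of that, my earlier 3.11 installation had gone almost without a hitch thanks to a colleague's excellent document, so I started without studying things carefully.

Sure enough, a badly written /etc/ansible/hosts file led to a pile of problems, which I am recording here.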

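1. Problems encountered

  • Images not ready

The images were not ready. Everything had been pulled, but I had not read the document carefully and ran docker save -o only for the few images it listed. That caused the error below, and I had to start downloading again.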

One or more required container images are not available:
openshift3/registry-console:v3.,
registry.example.com/openshift3/ose-deployer:v3.6.173.0.130,
registry.example.com/openshift3/ose-docker-registry:v3.6.173.0.130,
registry.example.com/openshift3/ose-haproxy-router:v3.6.173.0.130,
registry.example.com/openshift3/ose-pod:v3.6.173.0.130
Checked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>
Default registries searched: registry.example.com, registry.access.redhat.com
Failed connecting to: registry.example.com, registry.access.redhat.com
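A quick way to catch this before a long install run is to compare the list of images the installer needs against what is actually available. The following is a minimal sketch, not part of the original procedure; the helper name missing_images and the file required.txt in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every required image that is absent from the
# list of available images. Both arguments are whitespace-separated lists
# of image references such as registry.example.com/openshift3/ose-pod:v3.6.173.0.130.
missing_images() (
    required=$1
    available=$2
    for img in $required; do
        found=no
        for have in $available; do
            [ "$img" = "$have" ] && found=yes
        done
        [ "$found" = no ] && echo "$img"
    done
    return 0
)
```

Assuming required.txt holds one image reference per line, something like missing_images "$(cat required.txt)" "$(docker images --format '{{.Repository}}:{{.Tag}}')" would list what still has to be pulled or saved.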
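  • Registry port 443 not configured

Copying what I did for the 3.11 installation, I configured port 80 and thought I could sneak by, but it failed with this error: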

[root@master ~]# oc logs  registry-console--deploy -n default
--> Scaling registry-console- to
--> Waiting up to 10m0s for pods in rc registry-console- to become ready
E1114 ::58.912499 reflector.go:] github.com/openshift/origin/pkg/deploy/strategy/support/lifecycle.go:: Failed to watch *api.Pod: Get https://172.30.0.1:443/api/v1/namespaces/default/pods?labelSelector=deployment%3Dregistry-console-1%2Cdeploymentconfig%3Dregistry-console%2Cname%3Dregistry-console&resourceVersion=1981&timeoutSeconds=412&watch=true: dial tcp 172.30.0.1:443: getsockopt: connection refused
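  • registry-catalog needs to be retagged

Pulling the service-catalog image failed. This one is a big pitfall, because every install attempt takes more than an hour. The errors look like this: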

15m        13m            kubelet, master.example.com    spec.containers{apiserver}    Normal        Pulling        pulling image "registry.access.redhat.com/openshift3/ose-service-catalog:v3.6"
15m 13m kubelet, master.example.com spec.containers{apiserver} Warning Failed Failed to pull image "registry.access.redhat.com/openshift3/ose-service-catalog:v3.6": rpc error: code = desc = All endpoints blocked.
15m 13m kubelet, master.example.com spec.containers{apiserver} Normal BackOff Back-off pulling image "registry.access.redhat.com/openshift3/ose-service-catalog:v3.6"
15m 4m kubelet, master.example.com Warning FailedSync Error syncing pod

The fix is as follows:

docker pull registry.example.com/openshift3/registry-console:v3.6.173.0.130
docker tag registry.example.com/openshift3/registry-console:v3.6.173.0.130 registry.example.com/openshift3/registry-console:v3.
docker push registry.example.com/openshift3/registry-console:v3.
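The same pull, tag, push sequence is needed for every image whose short tag the installer asks for, so it can help to generate the commands rather than type them. This is only a sketch under my own assumptions: the function name retag_commands is hypothetical, and it prints the commands instead of running them so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the pull/tag/push commands needed to
# republish an image under a second tag in the same repository.
retag_commands() (
    repo=$1      # e.g. registry.example.com/openshift3/registry-console
    from_tag=$2  # tag that already exists in the private registry
    to_tag=$3    # tag that the installer or kubelet asks for
    echo "docker pull $repo:$from_tag"
    echo "docker tag $repo:$from_tag $repo:$to_tag"
    echo "docker push $repo:$to_tag"
)
```

For illustration, retag_commands registry.example.com/openshift3/ose-service-catalog v3.6.173.0.130 v3.6 prints the three commands for the v3.6 tag seen in the events above, assuming that image was mirrored to the private registry under the long tag. Piping the output to sh would execute them.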
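  • yum configured but docker not found

docker could not be found when installing it on the master, even though every host was configured with the same yum repository. In the end I had to work around it by registering with subscription-manager over the network.

  • The apiserver pod starts but cannot be reached. The error message: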
curl: () Could not resolve host: apiserver.kube-service-catalog.svc; Unknown error

Fixed by changing /etc/resolv.conf to:

[root@node2 ~]# cat /etc/resolv.conf
# nameserver updated by /etc/NetworkManager/dispatcher.d/-origin-dns.sh
# Generated by NetworkManager
search cluster.local example.com
nameserver 192.168.0.105
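Since a short name like apiserver.kube-service-catalog.svc only resolves when cluster.local is on the search path, it is worth checking that on every node. Below is a minimal sketch of such a check; the function name resolv_ok is hypothetical and the test is deliberately simple.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the given resolv.conf lists
# cluster.local on a search line and has at least one nameserver entry.
resolv_ok() (
    file=$1
    awk '
        $1 == "search" { for (i = 2; i <= NF; i++) if ($i == "cluster.local") s = 1 }
        $1 == "nameserver" && NF >= 2 { n = 1 }
        END { exit !(s && n) }
    ' "$file"
)
```

Running resolv_ok /etc/resolv.conf || echo "fix resolv.conf" on each node gives a quick yes or no before retrying the install.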

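Unlike 3.11, 3.6 has no prerequisites check. The installation just starts, and you have to wait to see whether it eventually reports an error, so every attempt takes a long time.

For the options in the hosts file, see the link below. It is a must-read if you are hitting problems.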

https://docs.okd.io/3.6/install_config/install/advanced_install.html#enabling-service-catalog

  • Metrics and other components missing after installation

The final log when the installation finished:

TASK [openshift_excluder : Enable openshift excluder] *******************************************************************************************************************
changed: [node1.example.com]
changed: [master.example.com]
changed: [node2.example.com]

PLAY RECAP **************************************************************************************************************************************************************
localhost : ok= changed= unreachable= failed=
master.example.com : ok= changed= unreachable= failed=
nfs.example.com : ok= changed= unreachable= failed=
node1.example.com : ok= changed= unreachable= failed=
node2.example.com : ok= changed= unreachable= failed=

A check showed only these few pods. The metrics components I had configured never came up, so something had to be wrong in the hosts file.

[root@master ~]# oc get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default docker-registry--x0hlq / Running 2d
default registry-console--p84p6 / Running 1d
default router--ttqq9 / MatchNodeSelector 1d
default router--rfpxc / Running 1d
kube-service-catalog apiserver-3ls5x / Running 1d
kube-service-catalog controller-manager-7zdbc / CrashLoopBackOff 1d
[root@master ~]# oc get nodes
NAME STATUS AGE VERSION
master.example.com Ready 2d v1.6.1+5115d708d7
node1.example.com Ready 2d v1.6.1+5115d708d7
node2.example.com Ready 2d v1.6.1+5115d708d7
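With more pods the unhealthy ones are easy to miss, so a small filter over the oc output helps. This is a sketch of my own, assuming the usual column order (NAMESPACE NAME READY STATUS RESTARTS AGE); the function name unhealthy_pods is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: read `oc get pods --all-namespaces` output on stdin
# and print namespace/name plus status for every pod whose STATUS column
# (assumed to be the 4th field) is neither Running nor Completed.
unhealthy_pods() {
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Usage would be oc get pods --all-namespaces | unhealthy_pods. In the listing above it would flag the MatchNodeSelector router pod and the CrashLoopBackOff controller-manager.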
  • Uninstall script
ansible-playbook  /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml;
  • DNS fails to start, so atomic-openshift-node.service fails to start
Nov  :: master.example.com atomic-openshift-node[]: I1117 ::51.787479    mount_linux.go:] Detected OS with systemd
Nov :: master.example.com atomic-openshift-node[]: I1117 ::51.787497 docker.go:] Connecting to docker on unix:///var/run/docker.sock
Nov :: master.example.com atomic-openshift-node[]: I1117 ::51.787510 docker.go:] Start docker client with request timeout=2m0s
Nov :: master.example.com atomic-openshift-node[]: W1117 ::51.789279 cni.go:] Unable to update cni config: No networks found in /etc/cni/net.d
Nov :: master.example.com atomic-openshift-node[]: F1117 ::51.798668 start_node.go:] could not start DNS, unable to read config file: open /etc/origin/node/resolv.conf: no such file or directory
Nov :: master.example.com systemd[]: atomic-openshift-node.service: main process exited, code=exited, status=/n/a
Nov :: master.example.com systemd[]: Failed to start OpenShift Node.
-- Subject: Unit atomic-openshift-node.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit atomic-openshift-node.service has failed.

Solution: copy a resolv.conf file into /etc/origin/node

[root@master ansible]# cd /etc/origin/node
[root@master node]# ls
ca.crt node-dnsmasq.conf server.key system:node:master.example.com.key
node-config.yaml server.crt system:node:master.example.com.crt system:node:master.example.com.kubeconfig
[root@master node]# cp /etc/resolv.conf .
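If the same fix has to be repeated on several nodes, a guarded copy avoids overwriting a file the installer may have written in the meantime. The following is only a sketch; the function name ensure_node_resolv is hypothetical and the default paths are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: copy a resolv.conf into the node config directory
# only when the target is missing, so an existing file is never overwritten.
ensure_node_resolv() (
    src=${1:-/etc/resolv.conf}
    dst_dir=${2:-/etc/origin/node}
    if [ -e "$dst_dir/resolv.conf" ]; then
        echo "exists: $dst_dir/resolv.conf"
    else
        cp "$src" "$dst_dir/resolv.conf" && echo "copied: $dst_dir/resolv.conf"
    fi
)
```

After the copy, the node service still has to be started again, for example with systemctl restart atomic-openshift-node.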
  • Router fails to start. Analysis showed that the deployment to node2.example.com failed because it could not bind port 443.
[root@master node]# oc get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
router--deploy / Error 30m 10.129.0.14 node2.example.com
router--55bpf / Running 5m 192.168.0.104 node1.example.com
router--deploy / Running 5m 10.128.0.14 node1.example.com
router--dw31q / Running 5m 192.168.0.103 master.example.com
router--xn9cp / CrashLoopBackOff 5m 192.168.0.105 node2.example.com
[root@master node]# oc logs router--xn9cp
I1117 ::27.665452 template.go:] Starting template router (v3.6.173.0.130)
I1117 ::27.679413 metrics.go:] Router health and metrics port listening at 0.0.0.0:
I1117 ::27.700732 router.go:] Router is including routes in all namespaces
E1117 ::27.777551 ratelimiter.go:] error reloading router: exit status
[ALERT] / () : Starting frontend public_ssl: cannot bind socket [0.0.0.0:]

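Analysis: the registry on node2 also binds port 443, so the two probably conflict. I edited the Ansible hosts file and removed the router label from node2.

To bring up the monitoring components I edited the hosts file once more. The hosts file that finally installed successfully is shown below for reference: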

# Create an OSEv3 group that contains the masters and nodes groups
[OSEv3:children]
masters
nodes
etcd
nfs

[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=openshift-enterprise
osm_cluster_network_cidr=10.128.0.0/
openshift_portal_net=172.30.0.0/
openshift_master_api_port=
openshift_master_console_port=
openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_access_modes=['ReadWriteMany']
openshift_hosted_registry_storage_nfs_directory=/exports
openshift_hosted_registry_storage_nfs_options='*(rw,root_squash)'
openshift_hosted_registry_storage_volume_name=registry
openshift_hosted_registry_storage_volume_size=10Gi
oreg_url=registry.example.com/openshift3/ose-\${component}:\${version}
openshift_docker_additional_registries=registry.example.com
openshift_docker_insecure_registries=registry.example.com
openshift_docker_blocked_registries=registry.access.redhat.com,docker.io
openshift_image_tag=v3.6.173.0.130
openshift_enable_service_catalog=true
openshift_service_catalog_image_prefix=registry.example.com/openshift3/ose-
openshift_service_catalog_image_version=v3.6.173.0.130
ansible_service_broker_image_prefix=registry.example.com/openshift3/ose-
ansible_service_broker_etcd_image_prefix=registry.example.com/rhel7/
template_service_broker_prefix=registry.example.com/openshift3/
oreg_url=registry.example.com/openshift3/ose-${component}:${version}
openshift_examples_modify_imagestreams=true
openshift_clock_enabled=true
openshift_metrics_storage_kind=nfs
openshift_metrics_install_metrics=true
openshift_metrics_storage_access_modes=['ReadWriteOnce']
openshift_metrics_storage_host=nfs.example.com
openshift_metrics_storage_nfs_directory=/exports
openshift_metrics_storage_volume_name=metrics
openshift_metrics_storage_volume_size=10Gi
openshift_metrics_hawkular_hostname=hawkular-metrics.apps.example.com
#openshift_metrics_cassandra_storage_type=emptydir
openshift_metrics_image_prefix=registry.example.com/openshift3/
openshift_hosted_metrics_deploy=true
openshift_hosted_metrics_public_url=https://hawkular-metrics.apps.example.com/hawkular/metrics
openshift_metrics_image_version=v3.6.173.0.130
openshift_template_service_broker_namespaces=['openshift']
template_service_broker_selector={"node": "true"}
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
# Default login account: admin / handhand
openshift_master_htpasswd_users={'admin': '$apr1$gfaL16Jf$c.5LAvg3xNDVQTkk6HpGB1'}
#openshift_repos_enable_testing=true
openshift_disable_check=docker_image_availability,disk_availability,memory_availability,docker_storage
docker_selinux_enabled=false
openshift_docker_options=" --selinux-enabled --insecure-registry 172.30.0.0/16 --log-driver json-file --log-opt max-size=50M --log-opt max-file=3 --insecure-registry registry.example.com --add-registry registry.example.com"
osm_etcd_image=rhel7/etcd
openshift_logging_image_prefix=registry.example.com/openshift3/
openshift_hosted_router_selector='region=infra,router=true'
openshift_master_default_subdomain=app.example.com

# host group for masters
[masters]
master.example.com
# host group for etcd
[etcd]
master.example.com

# host group for nodes, includes region info
[nodes]
master.example.com openshift_node_labels="{'region': 'infra', 'router': 'true', 'zone': 'default'}" openshift_schedulable=true
node1.example.com openshift_node_labels="{'region': 'infra', 'router': 'true', 'zone': 'default'}" openshift_schedulable=true
node2.example.com openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_schedulable=true

[nfs]
nfs.example.com
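Because most of my trouble came from the hosts file, a small lint before each run can save an hour-long install attempt. The sketch below only looks for variables assigned more than once in [OSEv3:vars] (the inventory above sets oreg_url twice, for example). The function name duplicate_vars is hypothetical, and it is no substitute for reading the documentation.

```shell
#!/bin/sh
# Hypothetical lint: print every variable that is assigned more than once
# inside the [OSEv3:vars] section of an Ansible INI inventory.
duplicate_vars() (
    inventory=$1
    awk '
        # A section header switches the flag on only for [OSEv3:vars].
        /^\[/ { in_vars = ($0 == "[OSEv3:vars]"); next }
        # Uncommented key=value lines inside that section are counted by key.
        in_vars && /^[A-Za-z_][A-Za-z0-9_]*=/ {
            key = substr($0, 1, index($0, "=") - 1)
            if (++seen[key] == 2) print key
        }
    ' "$inventory"
)
```

Usage: duplicate_vars /etc/ansible/hosts. An empty result means no variable is defined twice in that section.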

After that I ran the installation once more with the final hosts file, and this time everything finally came up:

[root@master ~]# oc get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default docker-registry--p8p0s / Running 2h
default registry-console--t4bw2 / Running 1h
default router--1nnt3 / Running 2h
default router--4h8tg / Running 2h
kube-service-catalog apiserver-z6nmz / Running 1h
kube-service-catalog controller-manager-d2jgc / Running 1h
openshift-infra hawkular-cassandra--m6r4x / Running 1h
openshift-infra hawkular-metrics-4j828 / Running 1h
openshift-infra heapster-rgwrw / Running 2h

Check the PVs and PVCs:

[root@master ~]# oc get pv,pvc
NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM STORAGECLASS REASON AGE
pv/registry-volume 10Gi RWX Retain Bound default/registry-claim 26m

NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE
pvc/registry-claim Bound registry-volume 10Gi RWX 26m

2. Script for saving images in bulk

# Save every local image to /root/images/<name>.tar.gz.
# Assumes image names of the form registry/namespace/name:tag.
mkdir -p /root/images
for i in $(docker images | awk 'NR>1 {print $1":"$2}'); do   # NR>1 skips the header row
  # Third /-separated field is the image name; strip the tag after ':'
  imagename=$(echo "$i" | awk -F '/' '{print $3}' | awk -F ':' '{print $1}')
  echo "$imagename"
  docker save "$i" | gzip -c > "/root/images/$imagename.tar.gz"
done
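The script above takes the third /-separated field as the file name, which comes out empty for images with fewer path components (rhel7/etcd, for instance). A variant based on shell parameter expansion avoids that. This is just a sketch, and image_file_name is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: derive a file name from an image reference by taking
# the last path component and dropping the tag, whatever the path depth.
image_file_name() (
    ref=$1
    name=${ref##*/}      # drop everything up to and including the last slash
    echo "${name%%:*}"   # drop the tag, if any
)
```

Inside the loop, imagename=$(image_file_name "$i") could replace the two awk calls. Note that two images with the same final name but different namespaces or tags would still overwrite each other's archive.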

3. Putting images on a separate disk

Add a new disk in VirtualBox, then use

fdisk -l

to find the corresponding device, for example /dev/sdb.

Partition the disk:

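# n = new partition, p = primary; the three empty answers accept the default
# partition number, first sector and last sector; w = write the table
printf 'n\np\n\n\n\nw\n' | fdisk /dev/sdb;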

Create the VG:

pvcreate /dev/sdb1;
vgcreate docker-vg /dev/sdb1;

Make docker use docker-vg:

vgs;

cat <<EOF > /etc/sysconfig/docker-storage-setup
VG=docker-vg
EOF
docker-storage-setup
lvextend -l %VG /dev/docker-vg/docker-pool
touch /etc/containers/registries.conf
systemctl start docker
systemctl enable docker
lvs
getenforce

4. The ocp.repo file

[root@master ~]# cat /etc/yum.repos.d/ocp.repo
[server]
name=server
baseurl=http://192.168.56.103:8080/repo/rhel-7-server-rpms/
enabled=
gpgcheck=
[datapath]
name=datapath
baseurl=http://192.168.56.103:8080/repo/rhel-7-fast-datapath-rpms/
enabled=
gpgcheck=
[extra]
name=extra
baseurl=http://192.168.56.103:8080/repo/rhel-7-server-extras-rpms/
enabled=
gpgcheck=
[ose]
name=ose
baseurl=http://192.168.56.103:8080/repo/rhel-7-server-ose-3.6-rpms/
enabled=
gpgcheck=
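Given the earlier problem of docker not being found despite an identical repo file on every host, it can be useful to confirm that each baseurl is actually reachable from each host. Below is a small sketch that extracts the URLs; the function name repo_baseurls is hypothetical, and the reachability loop in the comment assumes the repositories publish the standard repodata/repomd.xml.

```shell
#!/bin/sh
# Hypothetical helper: print the baseurl of every repository defined in a
# yum .repo file, one URL per line.
repo_baseurls() {
    sed -n 's/^baseurl=//p' "$1"
}

# Possible use on each host (needs network access to the repo server):
#   for u in $(repo_baseurls /etc/yum.repos.d/ocp.repo); do
#       curl -sfI "${u}repodata/repomd.xml" >/dev/null || echo "unreachable: $u"
#   done
```

If a URL is reachable but a package is still missing, yum clean all followed by yum repolist on that host is the next thing I would look at.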

5. Main installation steps

systemctl stop firewalld
systemctl disable firewalld
systemctl mask firewalld
setenforce ;
sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config
yum clean all
yum repolist
yum install -y docker
yum -y install wget git net-tools bind-utils iptables-services bridge-utils bash-completion vim atomic-openshift-excluder atomic-openshift-docker-excluder lrzsz unzip atomic-openshift-utils;
yum -y install python-setuptools
yum -y update;
ssh-keygen
ssh-copy-id root@master.example.com
ssh-copy-id root@node1.example.com
ssh-copy-id root@node2.example.com
# n = new partition, p = primary, three empty answers accept the defaults, w = write
printf 'n\np\n\n\n\nw\n' | fdisk /dev/sdb;
pvcreate /dev/sdb1;
vgcreate docker-vg /dev/sdb1;
vgs;
cat <<EOF > /etc/sysconfig/docker-storage-setup
VG=docker-vg
EOF
docker-storage-setup
lvextend -l %VG /dev/docker-vg/docker-pool
touch /etc/containers/registries.conf
systemctl start docker
systemctl enable docker
lvs
getenforce
yum -y install docker-distribution;
systemctl enable docker-distribution;
systemctl start docker-distribution;

The service catalog is greyed out in the console, so you can tell at a glance that it is a technology preview.
