docker: Error response from daemon: failed to start shim: exec: "docker-containerd-shim": executable file not found in $PATH: unknown.
docker: Error response from daemon: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/b123c36ddea60f7bf562946b7ffb9520023fcb4946b3e7f9941a564a46bad28c/log.json: no such file or directory): exec: "docker-runc": executable file not found in $PATH: unknown.
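The root cause has not been confirmed. What is known is that Docker was upgraded from 18.09.2 to 18.09.7, and the upgrade reorganized some executables; the errors above show that docker-containerd-shim and docker-runc can no longer be found in $PATH.
To prevent this problem, disable the APT automatic upgrade service on Ubuntu / CentOS machines by running the script below, which stops and disables the service. The script is already built into the environment deployment script. When you configure the Diamond deployment public key, you can find that key in the deployment manual.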
# Disable apt package auto upgrade service
sudo systemctl stop apt-daily.service apt-daily.timer apt-daily-upgrade.timer apt-daily-upgrade.service
sudo systemctl disable apt-daily.service apt-daily.timer apt-daily-upgrade.timer apt-daily-upgrade.service
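To confirm that the units above are really off, the following sketch lists any of the four units that systemd still reports as enabled. The helper name enabled_apt_units and the APT_UNITS variable are illustrative and are not part of the deployment scripts; the only real command used is systemctl is-enabled.

```shell
#!/usr/bin/env bash
# Sketch: report which APT auto-upgrade units are still enabled.
# APT_UNITS and enabled_apt_units are hypothetical names used for illustration.
APT_UNITS="apt-daily.service apt-daily.timer apt-daily-upgrade.timer apt-daily-upgrade.service"

# Prints the name of every unit for which `systemctl is-enabled` answers "enabled".
# Any other answer (disabled, static, masked, or an error) is treated as not enabled.
enabled_apt_units() {
  local unit state
  for unit in $APT_UNITS; do
    state="$(systemctl is-enabled "$unit" 2>/dev/null || true)"
    if [ "$state" = "enabled" ]; then
      echo "$unit"
    fi
  done
}

# Example: an empty result means none of the units is enabled.
# enabled_apt_units
```

If the function prints nothing after the stop/disable commands have run, the automatic upgrade units are no longer enabled on that machine.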
...
TASK [Create a ext4 filesystem] ********************************************************************************************************************************************************************************
ok: [mercury-work-01]
ok: [mercury-work-02]
ok: [mercury-work-03]
TASK [Mount up device] *****************************************************************************************************************************************************************************************
fatal: [mercury-work-02]: FAILED! => {"changed": false, "msg": "Error mounting /mnt/locals/afd/volume0: mount: mount /dev/vdb5 on /mnt/locals/afd/volume0 failed: Structure needs cleaning\n"}
fatal: [mercury-work-01]: FAILED! => {"changed": false, "msg": "Error mounting /mnt/locals/afd/volume0: mount: mount /dev/vdb5 on /mnt/locals/afd/volume0 failed: Structure needs cleaning\n"}
fatal: [mercury-work-03]: FAILED! => {"changed": false, "msg": "Error mounting /mnt/locals/afd/volume0: mount: mount /dev/vdb5 on /mnt/locals/afd/volume0 failed: Structure needs cleaning\n"}
to retry, use: --limit @/data/mercury/init_data_fs_vdb.retry
PLAY RECAP *****************************************************************************************************************************************************************************************************
mercury-work-01 : ok=24 changed=11 unreachable=0 failed=1
mercury-work-02 : ok=24 changed=11 unreachable=0 failed=1
mercury-work-03 : ok=24 changed=11 unreachable=0 failed=1
...
The cause is unknown. The problem does not occur every time: with the same deployment script, the error sometimes appears on the first run and sometimes only when the script is run repeatedly.
TASK [engine-alert-feature-db : App engine-alert-feature-db | Config & Pod & Service] ******************************************************************************************
ok: [mercury-work-01] => (item=config.yml)
ok: [mercury-work-01] => (item=proxy.yml)
ok: [mercury-work-01] => (item=worker.yml)
ok: [mercury-work-01] => (item=ingress.yml)
ok: [mercury-work-01] => (item=init.sh)
ok: [mercury-work-01] => (item=worker-monitor.yml)
ok: [mercury-work-01] => (item=proxy-monitor.yml)
TASK [engine-alert-feature-db : App engine-alert-feature-db | Init] ************************************************************************************************************
fatal: [mercury-work-01]: FAILED! => {"changed": true, "cmd": ["bash", "/etc/kubernetes/apps/engine-alert-feature-db/init.sh"], "delta": "0:00:14.564060", "end": "2019-10-17 22:29:22.053773", "msg": "non-zero return code", "rc": 2, "start": "2019-10-17 22:29:07.489713", "stderr": "+ (( i=0 ))\n+ (( i<20 ))\n+ kubectl exec -it -n component cassandra-alert-0 -- cqlsh -u root -p d@6lo6kBjK%jllN -e 'CREATE KEYSPACE IF NOT EXISTS viper_test WITH replication = {'\\''class'\\'':'\\''SimpleStrategy'\\'', '\\''replication_factor'\\'' : 3};'\nUnable to use a TTY - input is not a terminal or the right kind of file\n+ ret_code=0\n+ [[ 0 -eq 0 ]]\n+ break\n+ [[ 0 -ne 0 ]]\n+ (( i=0 ))\n+ (( i<20 ))\n+ kubectl exec -it -n component cassandra-alert-0 -- cqlsh -u root -p d@6lo6kBjK%jllN -e '\n CREATE TABLE IF NOT EXISTS viper_test.static_feature_dbs(\n db_id uuid,\n object_type text,\n name text,\n feature_version int,\n description text,\n creation_time timestamp,\n indexes map<uuid, text>,\n deleted boolean,\n max_size bigint,\n PRIMARY KEY (db_id),\n );'\nUnable to use a TTY - input is not a terminal or the right kind of file\n+ ret_code=0\n+ [[ 0 -eq 0 ]]\n+ break\n+ [[ 0 -ne 0 ]]\n+ (( i=0 ))\n+ (( i<20 ))\n+ kubectl exec -it -n component cassandra-alert-0 -- cqlsh -u root -p d@6lo6kBjK%jllN -e '\n CREATE TABLE IF NOT EXISTS viper_test.static_features(\n index_id uuid,\n seq_id bigint,\n feature_version int,\n creation_time timestamp,\n metadata blob,\n image_id text,\n payload text,\n feature blob,\n PRIMARY KEY (index_id, seq_id),\n );'\nUnable to use a TTY - input is not a terminal or the right kind of file\n<stdin>:1:OperationTimedOut: errors={'10.244.1.8': 'Client request timeout. 
See Session.execute[_async](timeout)'}, last_host=10.244.1.8\ncommand terminated with exit code 2", "stderr_lines": ["+ (( i=0 ))", "+ (( i<20 ))", "+ kubectl exec -it -n component cassandra-alert-0 -- cqlsh -u root -p d@6lo6kBjK%jllN -e 'CREATE KEYSPACE IF NOT EXISTS viper_test WITH replication = {'\\''class'\\'':'\\''SimpleStrategy'\\'', '\\''replication_factor'\\'' : 3};'", "Unable to use a TTY - input is not a terminal or the right kind of file", "+ ret_code=0", "+ [[ 0 -eq 0 ]]", "+ break", "+ [[ 0 -ne 0 ]]", "+ (( i=0 ))", "+ (( i<20 ))", "+ kubectl exec -it -n component cassandra-alert-0 -- cqlsh -u root -p d@6lo6kBjK%jllN -e '", " CREATE TABLE IF NOT EXISTS viper_test.static_feature_dbs(", " db_id uuid,", " object_type text,", " name text,", " feature_version int,", " description text,", " creation_time timestamp,", " indexes map<uuid, text>,", " deleted boolean,", " max_size bigint,", " PRIMARY KEY (db_id),", " );'", "Unable to use a TTY - input is not a terminal or the right kind of file", "+ ret_code=0", "+ [[ 0 -eq 0 ]]", "+ break", "+ [[ 0 -ne 0 ]]", "+ (( i=0 ))", "+ (( i<20 ))", "+ kubectl exec -it -n component cassandra-alert-0 -- cqlsh -u root -p d@6lo6kBjK%jllN -e '", " CREATE TABLE IF NOT EXISTS viper_test.static_features(", " index_id uuid,", " seq_id bigint,", " feature_version int,", " creation_time timestamp,", " metadata blob,", " image_id text,", " payload text,", " feature blob,", " PRIMARY KEY (index_id, seq_id),", " );'", "Unable to use a TTY - input is not a terminal or the right kind of file", "<stdin>:1:OperationTimedOut: errors={'10.244.1.8': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=10.244.1.8", "command terminated with exit code 2"], "stdout": "", "stdout_lines": []}
$ kubectl -n component delete persistentvolume/local-pv-12bd5454 persistentvolume/local-pv-7d88103a persistentvolume/local-pv-db999675
persistentvolume "local-pv-12bd5454" deleted
persistentvolume "local-pv-7d88103a" deleted
persistentvolume "local-pv-db999675" deleted
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv-12bd5454 4793Mi RWO Delete Terminating default/engine-alert-feature-db-oplogs-engine-alert-feature-db-worker-1-0 storageclass-local-afd 39h
local-pv-1ea63c14 14Gi RWO Delete Bound component/localvolume0-minio-default-0 storageclass-local-minio-default 39h
local-pv-21daf7fb 23Gi RWO Delete Available storageclass-local-cassandra-alert 39h
local-pv-483597 14Gi RWO Delete Available storageclass-local-minio-default 39h
...
...
kind: PersistentVolume
metadata:
  creationTimestamp: 2019-10-09T10:11:41Z
  deletionGracePeriodSeconds: 0
  deletionTimestamp: 2019-10-11T01:03:21Z
  finalizers:
  - kubernetes.io/pv-protection
  name: local-pv-12bd5454
...
...
kind: PersistentVolume
metadata:
  creationTimestamp: 2019-10-09T10:11:41Z
  deletionGracePeriodSeconds: 0
  deletionTimestamp: 2019-10-11T01:03:21Z
  finalizers: null
  name: local-pv-12bd5454
...
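The two PersistentVolume excerpts above differ only in metadata.finalizers: the volume stays in Terminating while the kubernetes.io/pv-protection finalizer is present. The original does not say which command was used to make that change; the sketch below assumes it is applied with kubectl patch. The helper names terminating_pvs and clear_pv_finalizers are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: clear the finalizers on PersistentVolumes stuck in Terminating.
# Assumption: the finalizer change is applied with `kubectl patch`;
# both function names below are hypothetical helpers.

# Reads `kubectl get pv` output on stdin and prints the NAME of every row
# whose STATUS column (fifth field of a data row) is Terminating.
terminating_pvs() {
  awk '$5 == "Terminating" { print $1 }'
}

# Sets metadata.finalizers to null on one PersistentVolume.
clear_pv_finalizers() {
  local pv="$1"
  kubectl patch pv "$pv" --type merge -p '{"metadata":{"finalizers":null}}'
}

# Example:
# kubectl get pv | terminating_pvs | while read -r pv; do
#   clear_pv_finalizers "$pv"
# done
```

Clearing the finalizers bypasses the protection that keeps a volume that is still bound to a claim from being deleted, so this is only appropriate when nothing still needs the volume.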
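During deployment testing, restarting a test from scratch requires completely deleting every deployed resource and returning to the pre-deployment state. The project's installation package provides resource removal scripts for this purpose. These scripts make little allowance for compatibility: they simply reverse the steps that the existing Ansible deployment performs. If the Ansible deployment content changes, the removal scripts must be updated at the same time.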
ssh mercury-work-01 < ./mercury/bin/remove-services.sh
ssh mercury-work-01 < ./mercury/bin/remove-volumes.sh
ssh mercury-work-02 < ./mercury/bin/remove-volumes.sh
...
ssh mercury-work-XX < ./mercury/bin/remove-volumes.sh
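When there are many worker nodes, the commands above can be wrapped in a loop. The following is a minimal sketch, assuming remove-services.sh only needs to run on the first node (as in the listing above) and remove-volumes.sh on every node. The function name remove_deployment and the SCRIPT_DIR variable are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run the removal scripts on a list of worker nodes.
# SCRIPT_DIR and remove_deployment are hypothetical names; the default path
# matches the ./mercury/bin directory used in the commands above.
SCRIPT_DIR="${SCRIPT_DIR:-./mercury/bin}"

# remove_deployment FIRST_NODE [OTHER_NODES...]
# Runs remove-services.sh on the first node only, then remove-volumes.sh on
# every node in the order given, stopping at the first failure.
remove_deployment() {
  local first="$1" host
  ssh "$first" < "$SCRIPT_DIR/remove-services.sh" || return 1
  for host in "$@"; do
    ssh "$host" < "$SCRIPT_DIR/remove-volumes.sh" || return 1
  done
}

# Example (replace the hostnames with those of your own nodes):
# remove_deployment mercury-work-01 mercury-work-02 mercury-work-03
```

Because each ssh call takes its script from a file redirect, the loop does not consume the caller's standard input.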