Service Mesh Adoption Test (Linkerd)
Istio VS Linkerd
https://buoyant.io/linkerd-vs-istio
Istio has far more reference material, but it is a heavyweight system, so we chose Linkerd and deployed it to a cluster for testing.
1. Install the linkerd CLI
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install | sh
or
brew install linkerd
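When the curl installer is used, the binary is placed under `$HOME/.linkerd2/bin` and has to be added to `PATH` by hand (the Homebrew route does not need this). A minimal sketch for doing that idempotently, for example from a shell profile; the function name `add_linkerd_to_path` is made up for illustration:

```shell
# Append the Linkerd CLI directory to PATH unless it is already present.
# add_linkerd_to_path is a hypothetical helper name, not part of Linkerd.
add_linkerd_to_path() {
  linkerd_bin="$HOME/.linkerd2/bin"
  case ":$PATH:" in
    *":$linkerd_bin:"*) ;;               # already on PATH, nothing to do
    *) PATH="$PATH:$linkerd_bin" ;;      # append once
  esac
  export PATH
}

add_linkerd_to_path
```

Calling the function more than once is safe, so it can be sourced from several profile files without growing `PATH`.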
Verify the installation
% linkerd version
Client version: stable-2.11.2
Server version: unavailable
2. Check the cluster with linkerd
% linkerd check --pre
there are nodes using the docker container runtime and proxy-init container must run as root user.
try installing linkerd via --set proxyInit.runAsRoot=true
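The pre-check fails with the error above. We skipped the pre-check and proceeded with the installation using the flag suggested in the message.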
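The pre-check message means that at least one node reports Docker as its container runtime. One way to see which nodes are affected is to read `status.nodeInfo.containerRuntimeVersion` from the node objects. The sketch below filters "node, runtime" pairs; `docker_runtime_nodes` is a hypothetical helper name, and the kubectl call is shown only as a comment because it needs cluster access:

```shell
# Print the names of nodes whose container runtime is Docker.
# Reads "NODE<TAB>RUNTIME" lines on stdin, e.g. "node-a<TAB>docker://20.10.7".
# docker_runtime_nodes is a hypothetical helper name.
docker_runtime_nodes() {
  awk -F '\t' '$2 ~ /^docker:\/\// { print $1 }'
}

# Usage against a live cluster (not run here):
# kubectl get nodes \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' \
#   | docker_runtime_nodes
```

If the function prints nothing, no node uses Docker and `proxyInit.runAsRoot=true` should not be required for that reason.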
3. linkerd install
ybchoi@ybchoiui-MacBookPro linkerd % linkerd install | kubectl apply -f -
there are nodes using the docker container runtime and proxy-init container must run as root user.
try installing linkerd via --set proxyInit.runAsRoot=true
error: no objects passed to apply
ybchoi@ybchoiui-MacBookPro linkerd % linkerd install --set proxyInit.runAsRoot=true | kubectl apply -f -
namespace/linkerd created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-identity created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-identity created
serviceaccount/linkerd-identity created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-destination created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-destination created
serviceaccount/linkerd-destination created
secret/linkerd-sp-validator-k8s-tls created
validatingwebhookconfiguration.admissionregistration.k8s.io/linkerd-sp-validator-webhook-config created
secret/linkerd-policy-validator-k8s-tls created
validatingwebhookconfiguration.admissionregistration.k8s.io/linkerd-policy-validator-webhook-config created
clusterrole.rbac.authorization.k8s.io/linkerd-policy created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-destination-policy created
role.rbac.authorization.k8s.io/linkerd-heartbeat created
rolebinding.rbac.authorization.k8s.io/linkerd-heartbeat created
clusterrole.rbac.authorization.k8s.io/linkerd-heartbeat created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-heartbeat created
serviceaccount/linkerd-heartbeat created
customresourcedefinition.apiextensions.k8s.io/servers.policy.linkerd.io created
customresourcedefinition.apiextensions.k8s.io/serverauthorizations.policy.linkerd.io created
customresourcedefinition.apiextensions.k8s.io/serviceprofiles.linkerd.io created
customresourcedefinition.apiextensions.k8s.io/trafficsplits.split.smi-spec.io created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
serviceaccount/linkerd-proxy-injector created
secret/linkerd-proxy-injector-k8s-tls created
mutatingwebhookconfiguration.admissionregistration.k8s.io/linkerd-proxy-injector-webhook-config created
configmap/linkerd-config created
secret/linkerd-identity-issuer created
configmap/linkerd-identity-trust-roots created
service/linkerd-identity created
service/linkerd-identity-headless created
deployment.apps/linkerd-identity created
service/linkerd-dst created
service/linkerd-dst-headless created
service/linkerd-sp-validator created
service/linkerd-policy created
service/linkerd-policy-validator created
deployment.apps/linkerd-destination created
Warning: batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
cronjob.batch/linkerd-heartbeat created
deployment.apps/linkerd-proxy-injector created
service/linkerd-proxy-injector created
secret/linkerd-config-overrides created
Verify
kubectl get pod -n linkerd
NAME READY STATUS RESTARTS AGE
linkerd-destination-6dcfd495fd-g4p7w 4/4 Running 0 79s
linkerd-identity-847445f99d-7smq4 2/2 Running 0 80s
linkerd-proxy-injector-7695cf7f6-7st5z 2/2 Running 0 79s
linkerd check
Linkerd core checks
===================
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
‼ cluster networks can be verified
the following nodes do not expose a podCIDR:
ip-10-120-11-168.ap-northeast-2.compute.internal
ip-10-120-11-50.ap-northeast-2.compute.internal
ip-10-120-12-35.ap-northeast-2.compute.internal
see https://linkerd.io/2.11/checks/#l5d-cluster-networks-verified for hints
linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ proxy-init container runs as root user if docker container runtime is used
linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor
linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date
control-plane-version
---------------------
√ can retrieve the control plane version
√ control plane is up-to-date
√ control plane and cli versions match
linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
√ control plane proxies and cli versions match
Status check results are √
4. Testing
Deploy and test the demo app provided by Linkerd
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/emojivoto.yml \
| kubectl apply -f -
k get all -n emojivoto
NAME READY STATUS RESTARTS AGE
pod/emoji-66ccdb4d86-xvwzk 1/1 Running 0 20s
pod/vote-bot-69754c864f-v8k5k 1/1 Running 0 20s
pod/voting-f999bd4d7-xvv4n 1/1 Running 0 20s
pod/web-79469b946f-r9785 1/1 Running 0 19s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/emoji-svc ClusterIP 172.20.135.31 8080/TCP,8801/TCP 20s
service/voting-svc ClusterIP 172.20.39.59 8080/TCP,8801/TCP 20s
service/web-svc ClusterIP 172.20.34.9 80/TCP 20s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/emoji 1/1 1 1 20s
deployment.apps/vote-bot 1/1 1 1 20s
deployment.apps/voting 1/1 1 1 20s
deployment.apps/web 1/1 1 1 19s
NAME DESIRED CURRENT READY AGE
replicaset.apps/emoji-66ccdb4d86 1 1 1 21s
replicaset.apps/vote-bot-69754c864f 1 1 1 21s
replicaset.apps/voting-f999bd4d7 1 1 1 21s
replicaset.apps/web-79469b946f 1 1 1 20s
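After deploying the app and accessing it through a proxy, voting for certain emoji returns an error (this is intentional).
Now inject linkerd (the proxy) into the app.
kubectl get -n emojivoto deploy -o yaml \
| linkerd inject - \
| kubectl apply -f -
This pipeline dumps the current Deployment specs as YAML, has the linkerd CLI add the injection annotation to them, and applies the result back to the cluster with kubectl apply.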
k get pod -n emojivoto
NAME READY STATUS RESTARTS AGE
emoji-696d9d8f95-smqmr 2/2 Running 0 42s
vote-bot-6d7677bb68-zzbfb 2/2 Running 0 42s
voting-ff4c54b8d-w5qf6 2/2 Running 0 42s
web-5f86686c4d-tzsc9 2/2 Running 0 42s
The container count per pod has gone up from 1 to 2: a linkerd-proxy container has been added.
This container acts as the proxy.
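The READY column only shows a count, so to confirm that the extra container really is the sidecar you can list container names per pod. A sketch under the assumption that the sidecar is named `linkerd-proxy` as noted above; `pods_without_proxy` is a hypothetical helper name:

```shell
# Print pods that do NOT have a container named linkerd-proxy.
# Reads "POD<TAB>space-separated container names" lines on stdin.
# pods_without_proxy is a hypothetical helper name.
pods_without_proxy() {
  awk -F '\t' '
    {
      found = 0
      n = split($2, names, " ")
      for (i = 1; i <= n; i++) if (names[i] == "linkerd-proxy") found = 1
      if (!found) print $1
    }'
}

# Usage against a live cluster (not run here):
# kubectl get pod -n emojivoto \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}' \
#   | pods_without_proxy
```

Empty output means every pod in the namespace carries the proxy container.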
Verify that the linkerd injection succeeded
linkerd -n emojivoto check --proxy
Linkerd core checks
===================
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API
kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version
linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
‼ cluster networks can be verified
the following nodes do not expose a podCIDR:
ip-10-120-11-168.ap-northeast-2.compute.internal
ip-10-120-11-50.ap-northeast-2.compute.internal
ip-10-120-12-35.ap-northeast-2.compute.internal
see https://linkerd.io/2.11/checks/#l5d-cluster-networks-verified for hints
linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ proxy-init container runs as root user if docker container runtime is used
linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor
linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days
linkerd-identity-data-plane
---------------------------
√ data plane proxies certificate match CA
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date
linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
√ control plane proxies and cli versions match
linkerd-data-plane
------------------
√ data plane namespace exists
√ data plane proxies are ready
√ data plane is up-to-date
√ data plane and cli versions match
√ data plane pod labels are configured correctly
√ data plane service labels are configured correctly
√ data plane service annotations are configured correctly
√ opaque ports are properly annotated
Status check results are √
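Linkerd is now running on the deployed app.
To observe it, use the viz extension, which provides the dashboard.
Install viz
linkerd viz install | kubectl apply -f -
Check again after the installation
linkerd check
Only the viz section of the output is shown here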
linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
√ linkerd-viz pods are injected
√ viz extension pods are running
√ viz extension proxies are healthy
√ viz extension proxies are up-to-date
√ viz extension proxies and cli versions match
√ prometheus is installed and configured correctly
√ can initialize the client
√ viz extension self-check
Status check results are √
Run the dashboard
linkerd viz dashboard &
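The dashboard shows the actual service flow, and metrics can also be checked in the Grafana instance that linkerd deploys automatically.
This makes it possible to monitor exactly where a problem occurred.
As the deployed apps move to a microservice architecture, the relationships between services become more numerous and more complex,
but Kubernetes by itself provides no way to trace or monitor pod-to-pod traffic.
A service mesh fills that gap: it provides monitoring and also controls the service traffic flow.
The downside is that every service gains a sidecar, which adds more things to manage.
We plan to keep testing and collecting cases until the test cluster goes into use.
Testing linkerd on JanusGraph
kubectl get -n janusgraph deploy -o yaml \
| linkerd inject - \
| kubectl apply -f -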
% k get pod -n janusgraph
NAME READY STATUS RESTARTS AGE
jaunsgraph-deployment-57c68dcd5f-9lmzx 2/2 Running 0 6m41s
jaunsgraph-deployment-57c68dcd5f-bg9k7 2/2 Running 0 7m21s
jaunsgraph-deployment-57c68dcd5f-scc7v 2/2 Running 0 6m
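It applied cleanly without a single error. There seems to be a good reason Linkerd is so widely used.
To apply it to other pods, just add the annotation below.
annotations:
  linkerd.io/inject: enabled
If a pod carries this annotation, the linkerd proxy injector (a mutating admission webhook) detects it when the pod is created and inserts the sidecar.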
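For workloads whose manifests are not at hand, the same annotation can be set on the pod template with a merge patch, which also triggers a rollout so that new pods pass through the injector. This is a sketch, not the only way to do it; `inject_patch`, `my-app`, and `my-namespace` are placeholders introduced for illustration:

```shell
# Emit a JSON merge patch that sets linkerd.io/inject on the pod template.
# The annotation has to sit under spec.template.metadata so that newly
# created pods carry it. inject_patch is a hypothetical helper name.
inject_patch() {
  state="${1:-enabled}"
  printf '{"spec":{"template":{"metadata":{"annotations":{"linkerd.io/inject":"%s"}}}}}\n' "$state"
}

# Usage against a live cluster (my-app and my-namespace are placeholders):
# kubectl -n my-namespace patch deployment my-app --type merge -p "$(inject_patch)"
```

Passing `disabled` as the argument produces the patch that opts a workload back out.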
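We added the annotation above to Elasticsearch and ran an upgrade (ES was chosen for the test because it looked like the hardest case).
Before (values.yaml):
podAnnotations:
  {}
  # iam.amazonaws.com/role: es-cluster
After (values.yaml):
podAnnotations:
  linkerd.io/inject: enabled
% helm upgrade elasticsearch -n elasticsearch -f values.yaml elastic/elasticsearch
ES involves many kinds of complex communication, so we will test and monitor it for several days.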
k get pod -n elasticsearch
NAME READY STATUS RESTARTS AGE
elasticsearch-master-0 2/2 Running 0 89s
elasticsearch-master-1 2/2 Running 0 3m1s
elasticsearch-master-2 2/2 Running 0 4m33s
We also need to test integrating with the Prometheus already running in the cluster, instead of the one deployed automatically with viz.
https://linkerd.io/2.11/tasks/external-prometheus/
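Jaeger will not be adopted: it is too complex and requires application code changes.
For a real production deployment, as opposed to this test, the following options are needed.
--ha Enable HA deployment config for the control plane (default false)
Look up the --set options for Prometheus, Grafana, and so on, and specify them at install time
--set grafana.url=grafana.grafana:3000
--set prometheusUrl=xxxx:9090
--set prometheus.enabled=false   (not found in the manual; needs checking)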
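To keep these options together, the commands can be assembled and reviewed before anything is applied. The sketch below only prints them. It assumes that `prometheusUrl` and `prometheus.enabled` belong to `linkerd viz install` rather than to the core install, and it carries over `proxyInit.runAsRoot=true` from the earlier Docker runtime workaround; `print_prod_install` is a hypothetical helper name, and `prometheus.enabled=false` is still unverified as noted above:

```shell
# Print (do not run) install commands for a production-style setup.
# Argument: host:port of the existing Prometheus (placeholder value).
# print_prod_install is a hypothetical helper name; prometheus.enabled=false
# is the unverified option from these notes.
print_prod_install() {
  prom_url="${1:?usage: print_prod_install PROMETHEUS_HOST:PORT}"
  echo "linkerd install --ha --set proxyInit.runAsRoot=true | kubectl apply -f -"
  echo "linkerd viz install --set prometheusUrl=${prom_url} --set prometheus.enabled=false | kubectl apply -f -"
}

# Usage (the address is a placeholder):
# print_prod_install prometheus.monitoring:9090
```

Printing first makes it easy to diff the planned flags against the Linkerd documentation before running them.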
We also applied it to Scylla
kubectl get -n scylla statefulset -o yaml \
| linkerd inject - \
| kubectl apply -f -
Warning: resource statefulsets/scylla-scylla is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
k get pod -n scylla
NAME READY STATUS RESTARTS AGE
scylla-scylla-0 3/3 Running 0 2m1s
scylla-scylla-1 3/3 Running 0 2m56s
scylla-scylla-2 3/3 Running 0 4m2s
Both Scylla and Elasticsearch communicate over plain TCP, so they do not show up under Live Calls, but TCP-level monitoring is still available.
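TCP-level numbers can also be read from the CLI. The sketch below pulls the connection count out of a `linkerd viz stat` table; it assumes the table has a header row containing a `TCP_CONN` column, and it locates the column by name so that it prints nothing if that assumption does not hold. `tcp_conn_column` is a hypothetical helper name:

```shell
# Print "NAME TCP_CONN" pairs from a `linkerd viz stat` table on stdin.
# The column is found by its header name; if the header has no TCP_CONN
# column, nothing is printed. tcp_conn_column is a hypothetical helper name.
tcp_conn_column() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "TCP_CONN") col = i; next }
    col     { print $1, $col }'
}

# Usage against a live cluster with the viz extension (not run here):
# linkerd viz stat statefulset -n scylla | tcp_conn_column
```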