
Service Mesh Adoption Test (Linkerd)

ybchoi 2022. 5. 24. 21:19

Istio VS Linkerd

 

https://buoyant.io/linkerd-vs-istio

 


Istio has more reference material, but it is a heavy system, so I chose Linkerd and deployed it to a cluster for testing.

1. Install the linkerd CLI

curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install | sh

or

brew install linkerd

Verify the installation

% linkerd version
Client version: stable-2.11.2
Server version: unavailable

2. Check the cluster with linkerd

% linkerd check --pre
there are nodes using the docker container runtime and proxy-init container must run as root user.
try installing linkerd via --set proxyInit.runAsRoot=true

The pre-check fails. I skipped it and went ahead with the install using the option suggested in the message above.
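The check above fails because some nodes use the Docker container runtime. A minimal sketch for confirming that before choosing the install flags, assuming the runtime strings come from the nodes' `status.nodeInfo.containerRuntimeVersion` field; the helper name `uses_docker_runtime` is made up for illustration:

```shell
# Hypothetical helper: reads one container runtime string per line on stdin
# and succeeds (exit 0) if any of them starts with "docker://".
uses_docker_runtime() {
  grep -q '^docker://'
}

# Example against a live cluster (not executed here):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' \
#     | uses_docker_runtime && echo "install with --set proxyInit.runAsRoot=true"
```

If the helper succeeds, the install in the next step needs `--set proxyInit.runAsRoot=true`, as the error message says.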

3. linkerd install

ybchoi@ybchoiui-MacBookPro linkerd % linkerd install | kubectl apply -f -
there are nodes using the docker container runtime and proxy-init container must run as root user.
try installing linkerd via --set proxyInit.runAsRoot=true
error: no objects passed to apply
ybchoi@ybchoiui-MacBookPro linkerd % linkerd install --set proxyInit.runAsRoot=true | kubectl apply -f -
namespace/linkerd created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-identity created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-identity created
serviceaccount/linkerd-identity created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-destination created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-destination created
serviceaccount/linkerd-destination created
secret/linkerd-sp-validator-k8s-tls created
validatingwebhookconfiguration.admissionregistration.k8s.io/linkerd-sp-validator-webhook-config created
secret/linkerd-policy-validator-k8s-tls created
validatingwebhookconfiguration.admissionregistration.k8s.io/linkerd-policy-validator-webhook-config created
clusterrole.rbac.authorization.k8s.io/linkerd-policy created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-destination-policy created
role.rbac.authorization.k8s.io/linkerd-heartbeat created
rolebinding.rbac.authorization.k8s.io/linkerd-heartbeat created
clusterrole.rbac.authorization.k8s.io/linkerd-heartbeat created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-heartbeat created
serviceaccount/linkerd-heartbeat created
customresourcedefinition.apiextensions.k8s.io/servers.policy.linkerd.io created
customresourcedefinition.apiextensions.k8s.io/serverauthorizations.policy.linkerd.io created
customresourcedefinition.apiextensions.k8s.io/serviceprofiles.linkerd.io created
customresourcedefinition.apiextensions.k8s.io/trafficsplits.split.smi-spec.io created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
serviceaccount/linkerd-proxy-injector created
secret/linkerd-proxy-injector-k8s-tls created
mutatingwebhookconfiguration.admissionregistration.k8s.io/linkerd-proxy-injector-webhook-config created
configmap/linkerd-config created
secret/linkerd-identity-issuer created
configmap/linkerd-identity-trust-roots created
service/linkerd-identity created
service/linkerd-identity-headless created
deployment.apps/linkerd-identity created
service/linkerd-dst created
service/linkerd-dst-headless created
service/linkerd-sp-validator created
service/linkerd-policy created
service/linkerd-policy-validator created
deployment.apps/linkerd-destination created
Warning: batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
cronjob.batch/linkerd-heartbeat created
deployment.apps/linkerd-proxy-injector created
service/linkerd-proxy-injector created
secret/linkerd-config-overrides created

Verify

kubectl get pod -n linkerd
NAME                                     READY   STATUS    RESTARTS   AGE
linkerd-destination-6dcfd495fd-g4p7w     4/4     Running   0          79s
linkerd-identity-847445f99d-7smq4        2/2     Running   0          80s
linkerd-proxy-injector-7695cf7f6-7st5z   2/2     Running   0          79s
linkerd check
Linkerd core checks
===================

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
‼ cluster networks can be verified
    the following nodes do not expose a podCIDR:
	ip-10-120-11-168.ap-northeast-2.compute.internal
	ip-10-120-11-50.ap-northeast-2.compute.internal
	ip-10-120-12-35.ap-northeast-2.compute.internal
    see <https://linkerd.io/2.11/checks/#l5d-cluster-networks-verified> for hints

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ proxy-init container runs as root user if docker container runtime is used

linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ can retrieve the control plane version
√ control plane is up-to-date
√ control plane and cli versions match

linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
√ control plane proxies and cli versions match

Status check results are √
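The overall result is √ even though one check (the podCIDR one) printed a ‼ warning, so it is easy to miss. A small sketch that counts the markers in the `linkerd check` output; the function name is made up, and treating `×` as the failure marker is an assumption, since no failure appears in the output above:

```shell
# Hypothetical helper: reads `linkerd check` output on stdin and prints
# how many checks passed (√), warned (‼), or failed (×, assumed marker).
# Indented hint lines are ignored because they do not start with a marker.
summarize_check() {
  awk '
    /^√/ { pass++ }
    /^‼/ { warn++ }
    /^×/ { fail++ }
    END  { printf "pass=%d warn=%d fail=%d\n", pass, warn, fail }
  '
}

# Example against a live cluster (not executed here):
#   linkerd check | summarize_check
```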

4. Testing

Deploy and test the demo app provided by Linkerd

curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/emojivoto.yml \
  | kubectl apply -f -

k get all -n emojivoto
NAME                            READY   STATUS    RESTARTS   AGE
pod/emoji-66ccdb4d86-xvwzk      1/1     Running   0          20s
pod/vote-bot-69754c864f-v8k5k   1/1     Running   0          20s
pod/voting-f999bd4d7-xvv4n      1/1     Running   0          20s
pod/web-79469b946f-r9785        1/1     Running   0          19s

NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/emoji-svc    ClusterIP   172.20.135.31   <none>        8080/TCP,8801/TCP   20s
service/voting-svc   ClusterIP   172.20.39.59    <none>        8080/TCP,8801/TCP   20s
service/web-svc      ClusterIP   172.20.34.9     <none>        80/TCP              20s

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/emoji      1/1     1            1           20s
deployment.apps/vote-bot   1/1     1            1           20s
deployment.apps/voting     1/1     1            1           20s
deployment.apps/web        1/1     1            1           19s

NAME                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/emoji-66ccdb4d86      1         1         1       21s
replicaset.apps/vote-bot-69754c864f   1         1         1       21s
replicaset.apps/voting-f999bd4d7      1         1         1       21s
replicaset.apps/web-79469b946f        1         1         1       20s

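After deploying the app and opening it through a proxy, voting for certain emoji produces an error (this is intentional).

Now inject Linkerd into the app (proxy injection).

kubectl get -n emojivoto deploy -o yaml \
  | linkerd inject - \
  | kubectl apply -f -

The command above dumps the current Deployment configuration as YAML, the linkerd CLI adds the inject annotation to it, and kubectl apply applies the result to the cluster.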

k get pod -n emojivoto
NAME                        READY   STATUS    RESTARTS   AGE
emoji-696d9d8f95-smqmr      2/2     Running   0          42s
vote-bot-6d7677bb68-zzbfb   2/2     Running   0          42s
voting-ff4c54b8d-w5qf6      2/2     Running   0          42s
web-5f86686c4d-tzsc9        2/2     Running   0          42s

As shown above, the number of containers in each pod went up: a linkerd-proxy container was added.

This container acts as the proxy.
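The READY column is a quick way to spot pods that did not get the sidecar. A rough sketch that works on the `kubectl get pod` table; the function name is made up, and the "fewer than 2 containers" rule is only a heuristic, since a pod with two application containers of its own would slip through:

```shell
# Hypothetical helper: reads `kubectl get pod` table output on stdin and prints
# the names of pods whose READY column shows fewer than 2 containers in total,
# i.e. pods that most likely have no linkerd-proxy sidecar.
pods_without_sidecar() {
  awk 'NR > 1 { split($2, ready, "/"); if (ready[2] + 0 < 2) print $1 }'
}

# Example against a live cluster (not executed here):
#   kubectl get pod -n emojivoto | pods_without_sidecar
#
# A stricter check is to list the container names per pod and look for linkerd-proxy:
#   kubectl get pod -n emojivoto \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}'
```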

Verify that the Linkerd injection worked

linkerd -n emojivoto check --proxy
Linkerd core checks
===================

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version
√ is running the minimum kubectl version

linkerd-existence
-----------------
√ 'linkerd-config' config map exists
√ heartbeat ServiceAccount exist
√ control plane replica sets are ready
√ no unschedulable pods
√ control plane pods are ready
‼ cluster networks can be verified
    the following nodes do not expose a podCIDR:
	ip-10-120-11-168.ap-northeast-2.compute.internal
	ip-10-120-11-50.ap-northeast-2.compute.internal
	ip-10-120-12-35.ap-northeast-2.compute.internal
    see <https://linkerd.io/2.11/checks/#l5d-cluster-networks-verified> for hints

linkerd-config
--------------
√ control plane Namespace exists
√ control plane ClusterRoles exist
√ control plane ClusterRoleBindings exist
√ control plane ServiceAccounts exist
√ control plane CustomResourceDefinitions exist
√ control plane MutatingWebhookConfigurations exist
√ control plane ValidatingWebhookConfigurations exist
√ proxy-init container runs as root user if docker container runtime is used

linkerd-identity
----------------
√ certificate config is valid
√ trust anchors are using supported crypto algorithm
√ trust anchors are within their validity period
√ trust anchors are valid for at least 60 days
√ issuer cert is using supported crypto algorithm
√ issuer cert is within its validity period
√ issuer cert is valid for at least 60 days
√ issuer cert is issued by the trust anchor

linkerd-webhooks-and-apisvc-tls
-------------------------------
√ proxy-injector webhook has valid cert
√ proxy-injector cert is valid for at least 60 days
√ sp-validator webhook has valid cert
√ sp-validator cert is valid for at least 60 days
√ policy-validator webhook has valid cert
√ policy-validator cert is valid for at least 60 days

linkerd-identity-data-plane
---------------------------
√ data plane proxies certificate match CA

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

linkerd-control-plane-proxy
---------------------------
√ control plane proxies are healthy
√ control plane proxies are up-to-date
√ control plane proxies and cli versions match

linkerd-data-plane
------------------
√ data plane namespace exists
√ data plane proxies are ready
√ data plane is up-to-date
√ data plane and cli versions match
√ data plane pod labels are configured correctly
√ data plane service labels are configured correctly
√ data plane service annotations are configured correctly
√ opaque ports are properly annotated

Status check results are √

Linkerd is now running on the deployed app.

To see it in action, use the viz extension, which provides a dashboard.

Install viz

linkerd viz install | kubectl apply -f -

Check again after installing

linkerd check

Only the viz section of the output is shown

linkerd-viz
-----------
√ linkerd-viz Namespace exists
√ linkerd-viz ClusterRoles exist
√ linkerd-viz ClusterRoleBindings exist
√ tap API server has valid cert
√ tap API server cert is valid for at least 60 days
√ tap API service is running
√ linkerd-viz pods are injected
√ viz extension pods are running
√ viz extension proxies are healthy
√ viz extension proxies are up-to-date
√ viz extension proxies and cli versions match
√ prometheus is installed and configured correctly
√ can initialize the client
√ viz extension self-check

Status check results are √

Run the dashboard

linkerd viz dashboard &

 

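The dashboard shows the actual service traffic flow, and metrics can also be checked in the Linkerd Grafana that is deployed automatically.

It also makes it possible to see where a problem actually occurred.

As the deployed apps move to a microservice architecture (MSA), the relationships between services become more numerous and more complex.

Kubernetes on its own does not provide a way to trace or monitor pod-to-pod communication.

In that case, a service mesh like the one above provides monitoring and can also control the service traffic flow.

The downside is that a sidecar is added to every service, which increases the number of things to manage.

I plan to keep testing and collecting cases until the test cluster goes into use.

Testing Linkerd on JanusGraph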

kubectl get -n janusgraph deploy -o yaml \
  | linkerd inject - \
  | kubectl apply -f -

% k get pod -n janusgraph
NAME                                     READY   STATUS    RESTARTS   AGE
jaunsgraph-deployment-57c68dcd5f-9lmzx   2/2     Running   0          6m41s
jaunsgraph-deployment-57c68dcd5f-bg9k7   2/2     Running   0          7m21s
jaunsgraph-deployment-57c68dcd5f-scc7v   2/2     Running   0          6m

 

It really applied without any errors... there seems to be a reason so many people use Linkerd.

 

To apply it to other pods, just add the annotation below.

 

annotations:
        linkerd.io/inject: enabled

If a pod has this annotation, the linkerd proxy injector checks it when the pod is created and injects the sidecar.
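For a workload that is already running, the same annotation can be added to the pod template with a patch instead of editing the manifest by hand. A minimal sketch; the helper name `inject_patch` and the names `my-app` and `my-namespace` are placeholders, not anything from this cluster:

```shell
# Hypothetical helper: prints a patch that sets the linkerd.io/inject annotation
# on a workload's pod template. The value defaults to "enabled".
inject_patch() {
  printf '{"spec":{"template":{"metadata":{"annotations":{"linkerd.io/inject":"%s"}}}}}\n' \
    "${1:-enabled}"
}

# Example against a live cluster (not executed here; names are placeholders):
#   kubectl -n my-namespace patch deployment my-app -p "$(inject_patch)"
```

Changing the pod template triggers a rollout, and the new pods are the ones that get the sidecar.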

 

I added the annotation above to the Elasticsearch values and upgraded the release (tested with ES because it looked like the hardest case).

Before:

podAnnotations:
  {}
  # iam.amazonaws.com/role: es-cluster

After:

podAnnotations:
  linkerd.io/inject: enabled

% helm upgrade elasticsearch -n elasticsearch -f values.yaml elastic/elasticsearch

 

 

Elasticsearch involves a lot of complex communication, so I plan to test and monitor it for a few days.

k get pod -n elasticsearch
NAME                            READY   STATUS    RESTARTS   AGE
elasticsearch-master-0          2/2     Running   0          89s
elasticsearch-master-1          2/2     Running   0          3m1s
elasticsearch-master-2          2/2     Running   0          4m33s

I also need to test integrating with the Prometheus already in use in the cluster, instead of the one that is deployed automatically with viz.

https://linkerd.io/2.11/tasks/external-prometheus/

Jaeger will not be adopted: it is too complex and requires changes to the application code.

 

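For an actual production deployment, as opposed to a test, the options below are needed.

--ha 	Enable HA deployment config for the control plane (default false)

Look up the options for Prometheus, Grafana, and so on, and pass them with --set at install time.

--set grafana.url=grafana.grafana:3000

--set prometheusUrl=xxxx:9090

--set prometheus.enabled=false   (this one is not in the manual; needs checking)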

 

 

I also applied it to Scylla.

kubectl get -n scylla statefulset -o yaml \
  | linkerd inject - \
  | kubectl apply -f -

Warning: resource statefulsets/scylla-scylla is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.

k get pod -n scylla
NAME                    READY   STATUS    RESTARTS   AGE
scylla-scylla-0   3/3     Running   0          2m1s
scylla-scylla-1   3/3     Running   0          2m56s
scylla-scylla-2   3/3     Running   0          4m2s

Both Scylla and Elasticsearch communicate over TCP, so they do not show up under live calls, but TCP monitoring is available.