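🏗️ M1 + M2 Multi-Cluster Architecture Design

Goal: Build a distributed Kubernetes setup across an M1 Mac mini (16GB) and an M2 MacBook (8GB)
Learning scope: Multi-cluster networking, hands-on distributed systems theory, workload optimization under hardware constraints
Estimated duration: 4-6 weeks (after finishing the Prometheus study)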

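📊 Hardware Profile Comparison

Performance Matrix

Item	M2 MacBook (8GB)	M1 Mac mini (16GB)	Winner	Notes
Single-core performance	~3.5 GHz (M2)	~3.2 GHz (M1)	🏆 M2	Favors CPU-bound work
Multi-core efficiency	8 cores	8 cores	Tie	Parallel processing
Memory capacity	8GB (tight!)	16GB (comfortable)	🏆 M1	Stateful workloads
Memory bandwidth	100 GB/s	68 GB/s	🏆 M2	Data processing throughput
Virtualization overhead	Low (newer chip)	Slightly higher	🏆 M2	minikube/kind performance
Stable VM count	2-3	4-5	🏆 M1	Cluster size
Portability	✅ Portable	❌ Fixed	🏆 M2	Edge simulation
24/7 operation	❌ Battery/heat	✅ Stable	🏆 M1	Always-on services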

Strategic Workload Placement

Cluster design from a distributed systems perspective
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

M2 MacBook (8GB) - "Edge / Compute Cluster"
┌─────────────────────────────────────────────────┐
│ Role: light, fast stateless workloads           │
├─────────────────────────────────────────────────┤
│ ✅ Suitable workloads:                          │
│   • API Gateway (stateless)                     │
│   • Compute-intensive (ML inference)            │
│   • Short-lived jobs (CI/CD runners)            │
│   • Dev/test environments (minikube/kind)       │
│   • Frontend apps (React, nginx)                │
├─────────────────────────────────────────────────┤
│ ❌ Unsuitable workloads:                        │
│   • Databases (memory-hungry)                   │
│   • Caching (Redis with large datasets)         │
│   • Stateful apps (many PVs needed)             │
└─────────────────────────────────────────────────┘

M1 Mac mini (16GB) - "Core / Data Cluster"
┌─────────────────────────────────────────────────┐
│ Role: stateful, memory-intensive workloads      │
├─────────────────────────────────────────────────┤
│ ✅ Suitable workloads:                          │
│   • Databases (PostgreSQL, MySQL)               │
│   • Message Queue (Kafka, RabbitMQ)             │
│   • Caching (Redis, Memcached)                 │
│   • Observability (Prometheus, Grafana)         │
│   • etcd, Control Plane (Kubernetes)           │
│   • Long-running services                      │
├─────────────────────────────────────────────────┤
│ ❌ Unsuitable workloads:                        │
│   • CPU-heavy ML training (memory is fine)      │
│   • Ephemeral workloads (not worth idle VMs)    │
└─────────────────────────────────────────────────┘

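🔀 Comparing Three Architecture Patterns

Pattern 1: Active-Passive (DR scenario)

Architecture: primary-backup
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

M1 Mac mini (Primary) ───────┐
  - Runs all services        │
  - Prometheus + Grafana     │  Replication
  - Database (Primary)       │  ─────────→
  - 24/7 operation           │
                             │
M2 MacBook (Standby) ────────┘
  - Database (Replica, Read-only)
  - Used for failover when needed
  - Otherwise used for development and experiments

Trade-off analysis

Pros:

✅ Simple structure, easy to manage

✅ Saves M2 battery (power it on only when needed)

Cons:

❌ Low utilization of the M2

❌ Few opportunities to learn distributed systems

Distributed systems learning points:

Experiencing replication lag
Failover mechanisms (manual/automatic)
Simple replication without consensus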

Pattern 2: Workload Partitioning (✨ Recommended!)

Architecture: role-based distribution
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

M1 Mac mini (Data Plane)
┌──────────────────────────────┐
│ Control Plane:               │
│ - etcd cluster (3 nodes)     │
│ - K8s API server             │
│                              │
│ Stateful Workloads:          │
│ - PostgreSQL (persistent)    │
│ - Kafka (message queue)      │
│ - Redis (cache, 4GB data)    │
│ - Prometheus (metrics store) │
│                              │
│ Long-running:                │
│ - Grafana                    │
│ - Jenkins controller         │
└──────────────────────────────┘
        ↕️ Service Mesh / VPN
┌──────────────────────────────┐
│ M2 MacBook (Compute Plane)   │
│                              │
│ Stateless Workloads:         │
│ - API Gateway (NGINX)        │
│ - Microservices (REST APIs)  │
│ - Frontend (React apps)      │
│                              │
│ Burst Compute:               │
│ - CI/CD runners              │
│ - ML inference               │
│ - Image processing           │
└──────────────────────────────┘

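Trade-off analysis

Pros:

✅ Plays to the strengths of each machine

✅ M2 is used only when needed (power efficient)

✅ Best option for learning distributed systems patterns

Cons:

⚠️ High network dependency (the API is unavailable when M2 is offline)

⚠️ More complex service discovery

Distributed systems learning points:

Service mesh (Istio/Linkerd)
Load balancing across clusters
Circuit breaker (fallback when M2 is down)
Distributed tracing (following calls between M1 and M2)

Real-world scenario:

# User request flow
Client
  → M2 (API Gateway)
  → M2 (Auth Service, stateless)
  → M1 (PostgreSQL, read user data)
  → M2 (assemble the response)
  → Client

# What if M2 is powered off?
→ Route to the fallback API Gateway on M1
→ Slower, but the service stays up (degraded mode)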
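
The degraded-mode fallback above is essentially a circuit breaker. A minimal sketch in POSIX shell, assuming a threshold of 3 consecutive failures and hypothetical names (`call_with_breaker`, `breaker_state`, `breaker_route`); it has no half-open state and is not how Istio or any particular library implements it:

```shell
#!/bin/sh
# Hypothetical circuit breaker sketch (names and threshold are assumptions).
BREAKER_THRESHOLD=3      # consecutive failures before the breaker opens
breaker_failures=0
breaker_state=closed     # closed = try the primary, open = skip it
breaker_route=none       # where the last call was routed

# Usage: call_with_breaker <command that reaches the primary (M2)>
call_with_breaker() {
  if [ "$breaker_state" = "open" ]; then
    breaker_route=fallback           # do not even try the primary
  elif "$@" >/dev/null 2>&1; then
    breaker_failures=0               # success resets the failure streak
    breaker_route=primary
  else
    breaker_failures=$((breaker_failures + 1))
    if [ "$breaker_failures" -ge "$BREAKER_THRESHOLD" ]; then
      breaker_state=open
    fi
    breaker_route=fallback
  fi
  echo "$breaker_route"
}
```

A call could look like `call_with_breaker curl -fsS --max-time 1 http://192.168.1.20/healthz`, where the `/healthz` path is a placeholder. Call the function directly rather than inside `$(...)`, because a subshell would discard the state updates.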

Pattern 3: Multi-Tenant Simulation (Advanced)

Architecture: per-tenant cluster isolation
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

M1 Mac mini (Production Cluster)
┌──────────────────────────────┐
│ Tenant A (Customer A)        │
│ - Dedicated namespace        │
│ - Resource quota: 8GB        │
│ - Production workloads       │
│                              │
│ Shared Services:             │
│ - Ingress controller         │
│ - Monitoring stack           │
└──────────────────────────────┘
        ↕️ Federation / Cross-cluster policy
┌──────────────────────────────┐
│ M2 MacBook (Dev/Edge Cluster)│
│                              │
│ Tenant B (Customer B, light) │
│ - Dev environment            │
│ - Edge computing use case    │
│                              │
│ Experimentation:             │
│ - Canary deployment testing  │
│ - New K8s version testing    │
└──────────────────────────────┘

Trade-off analysis

Pros:

✅ Multi-cluster management experience (Rancher, ArgoCD)

✅ Learning tenant isolation

Cons:

⚠️ Very high complexity

⚠️ Hard to host multiple tenants in 8GB

Distributed systems learning points:

Kubernetes Federation (KubeFed)
Multi-tenancy security
Cross-cluster resource scheduling

🧮 Distributed Systems Theory → Hands-On Mapping

1. Experiencing the CAP Theorem

Scenario: a network partition occurs between M1 and M2
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Setup:
  - M1: etcd cluster (3 nodes)
  - M2: etcd client (read-only)
  - Service: user profile lookup API

Test 1: Network Partition (P)
  1. Add a firewall rule between M1 and M2
     → drop all traffic from M2 (iptables on Linux nodes; pf on the macOS host)

  2. Call the API from M2
     → Timeout!

  3. Options:
     - Prefer Consistency (C): the M2 API refuses to answer (503)
     - Prefer Availability (A): M2 answers from its local cache (stale data)

Test 2: Eventual Consistency
  - Update a user profile on M1
  - M2 syncs 1 minute later (delivered via Kafka)
  - Until then, M2 returns outdated data
  → "Eventually consistent" becomes tangible!

Learning points

The CAP theorem is not just theory; it is a problem of real network environments

You make the Consistency vs Availability trade-off yourself

Partition tolerance is a requirement, not an option
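
The choice in Test 1 can be made concrete with a small decision function. This is only a sketch: `answer_during_partition` and its `cp`/`ap` mode argument are hypothetical names, and a real API would make this decision inside the service code rather than in shell.

```shell
#!/bin/sh
# Hypothetical sketch of the C-vs-A choice the M2 API faces.
#   $1 mode:     "cp" (prefer consistency) or "ap" (prefer availability)
#   $2 upstream: "up" or "down" (did M1 answer?)
#   $3 fresh:    value read from M1 (only meaningful when upstream is up)
#   $4 cached:   last value M2 saw before the partition
answer_during_partition() (
  mode=$1 upstream=$2 fresh=$3 cached=$4
  if [ "$upstream" = "up" ]; then
    echo "200 $fresh"
  elif [ "$mode" = "ap" ] && [ -n "$cached" ]; then
    echo "200 $cached (stale)"        # available, but possibly outdated
  else
    echo "503 Service Unavailable"    # consistent, but not available
  fi
)

for mode in cp ap; do
  echo "$mode: $(answer_during_partition "$mode" down profile-v2 profile-v1)"
done
```

With the same partition, the `cp` policy refuses the request while the `ap` policy serves the cached `profile-v1`, which is exactly the stale read described in Test 2.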

2. Consensus Algorithm (etcd, Raft)

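Lab: Leader election in an etcd cluster
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Setup:
  M1: etcd-1 (Leader), etcd-2, etcd-3
  M2: etcd client

Test:
  1. Force-stop etcd-1 (Leader)
     → watch -n 1 "etcdctl endpoint status --cluster"
     → etcd-2 or etcd-3 becomes the new Leader within 2-3 seconds

  2. Break quorum (2 of 3 nodes down)
     → etcd cluster unavailable!
     → All API calls from M2 fail

  3. Split-brain simulation
     → Partition the etcd members from each other (e.g. isolate etcd-1 from etcd-2 and etcd-3)
     → Only the majority side can elect a Leader; the isolated member cannot commit writes
     → On reconnect, the isolated member rejoins as a follower and catches up from the Leader's log

Learning points

How Raft consensus actually behaves

Why quorum matters (majority voting)

The split-brain problem and how Raft avoids it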
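
The quorum rule behind these tests is plain arithmetic: a cluster of n members needs floor(n/2) + 1 votes and tolerates n minus that many failures. A short sketch (the function names `quorum` and `fault_tolerance` are made up for illustration):

```shell
#!/bin/sh
# Majority quorum for an n-member Raft/etcd cluster.
quorum() { echo $(( $1 / 2 + 1 )); }

# How many members can fail while a majority still exists.
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

For the 3-member cluster used here, quorum is 2 and one failure is tolerated, which is why stopping two members in step 2 makes the cluster unavailable. A fourth member raises the quorum to 3 without tolerating any additional failure.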

3. Distributed Tracing

Lab: Tracing microservice calls between M1 and M2
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Setup:
  - Jaeger (deployed on M1)
  - OpenTelemetry instrumentation
 
Flow:
  1. Client → M2 (API Gateway)
     └─ Span ID: abc123
 
  2. M2 → M1 (User Service)
     └─ Parent Span: abc123, Span ID: def456
 
  3. M1 → M1 (Database)
     └─ Parent Span: def456, Span ID: ghi789
 
  4. Visualize the full trace in the Jaeger UI
     → Measure M1 ↔ M2 network latency
     → Find which segment is the bottleneck
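
The parent/child relationship in the flow above is what the W3C Trace Context `traceparent` header carries: every hop keeps the trace ID and mints a new span ID. In practice the OpenTelemetry SDK generates these headers; the sketch below only illustrates the format, and the helper names (`rand_hex`, `new_traceparent`, `child_traceparent`) are hypothetical.

```shell
#!/bin/sh
# Print 2*$1 lowercase hex characters from $1 random bytes.
rand_hex() {
  od -An -N"$1" -tx1 /dev/urandom | tr -d ' \n'
}

# Root span: version 00, 16-byte trace ID, 8-byte span ID, flags 01 (sampled).
new_traceparent() {
  printf '00-%s-%s-01\n' "$(rand_hex 16)" "$(rand_hex 8)"
}

# Child span: same trace ID as the parent header in $1, new span ID.
child_traceparent() {
  trace_id=$(printf '%s' "$1" | cut -d- -f2)
  printf '00-%s-%s-01\n' "$trace_id" "$(rand_hex 8)"
}

root=$(new_traceparent)             # Client → M2 (API Gateway)
child=$(child_traceparent "$root")  # M2 → M1 (User Service)
echo "gateway span:      $root"
echo "user-service span: $child"
```

To send such a header by hand, something like `curl -H "traceparent: $(new_traceparent)" http://192.168.1.20/` would work, assuming the receiving service is instrumented to read it.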

4. Load Balancing & Failover

Lab: Automatic failover to M1 when M2 goes down
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Setup:
  - M2: API Gateway (primary)
  - M1: API Gateway (secondary, standby)
  - External DNS: Round-robin or weighted

Test:
  1. Normal state: 80% of traffic → M2, 20% → M1

  2. M2 down:
     → Health check failure detected (within 3 seconds)
     → All traffic switches to M1 automatically

  3. M2 recovers:
     → Gradual rollback (10% → 50% → 80%)
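
One way to reach the "within 3 seconds" detection target is to probe once per second and fail over after three consecutive failures. The sketch below assumes exactly that policy; `probe_gateway`, `pick_gateway`, and the probe URL are hypothetical, and a real setup would use the load balancer's own health checks.

```shell
#!/bin/sh
# Probe one gateway URL; prints "ok" or "fail". Not called in this sketch.
probe_gateway() {
  if curl -fsS --max-time 1 "$1" >/dev/null 2>&1; then echo ok; else echo fail; fi
}

# Decide where to send traffic from recent probe results (oldest first).
# M2 stays primary unless the last 3 probes in a row failed.
pick_gateway() (
  threshold=3
  streak=0
  for result in "$@"; do
    if [ "$result" = "fail" ]; then
      streak=$((streak + 1))
    else
      streak=0
    fi
  done
  if [ "$streak" -ge "$threshold" ]; then echo m1; else echo m2; fi
)

echo "after ok ok fail:        $(pick_gateway ok ok fail)"
echo "after ok fail fail fail: $(pick_gateway ok fail fail fail)"
```

A single successful probe resets the streak, so a brief blip does not trigger a switch. The gradual return to M2 would be a separate step on top of this decision.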

🎯 Final Recommended Architecture (based on Pattern 2)

Overall layout

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🏠 Home-network distributed Kubernetes cluster
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

M1 Mac mini (16GB) - "Always-On Core"
┌────────────────────────────────────────────┐
│ Control Plane Cluster (kind, 3 nodes)      │
│ - RAM: 6GB (2GB × 3)                       │
│ - etcd, kube-apiserver, scheduler          │
├────────────────────────────────────────────┤
│ Data Layer (Stateful Workloads)           │
│ - PostgreSQL: 2GB RAM                      │
│ - Redis: 1GB RAM                           │
│ - Kafka (single broker): 2GB RAM          │
├────────────────────────────────────────────┤
│ Observability Stack                        │
│ - Prometheus: 2GB RAM (long-term metrics)  │
│ - Grafana: 512MB                           │
│ - Jaeger: 1GB (distributed tracing)       │
├────────────────────────────────────────────┤
│ Total: ~14.5GB (about 1.5GB headroom)      │
└────────────────────────────────────────────┘
 
M2 MacBook (8GB) - "On-Demand Compute"
┌────────────────────────────────────────────┐
│ Worker Cluster (minikube, 1-2 nodes)       │
│ - RAM: 4GB (adjust as needed)              │
│ - Own control plane; linked to M1 cluster  │
├────────────────────────────────────────────┤
│ Stateless Services                         │
│ - API Gateway (NGINX/Traefik): 256MB      │
│ - Backend APIs (Go/Node.js): 1GB          │
│ - Frontend (React bundle): 256MB          │
├────────────────────────────────────────────┤
│ Burst Workloads (as needed)                │
│ - CI/CD runners: 1-2GB                    │
│ - ML inference: 2GB                        │
├────────────────────────────────────────────┤
│ Total: ~5-6GB (2GB reserved for host)      │
└────────────────────────────────────────────┘
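
As a sanity check on the M1 budget, the per-component figures from the diagram (6GB control plane, 2GB PostgreSQL, 1GB Redis, 2GB Kafka, 2GB Prometheus, 512MB Grafana, 1GB Jaeger) can simply be summed. A small sketch in MB, with `sum_mb` as a made-up helper name:

```shell
#!/bin/sh
# Sum a list of allocations given in MB.
sum_mb() (
  total=0
  for mb in "$@"; do
    total=$((total + mb))
  done
  echo "$total"
)

m1_total_mb=16384
# control plane, PostgreSQL, Redis, Kafka, Prometheus, Grafana, Jaeger
m1_used_mb=$(sum_mb 6144 2048 1024 2048 2048 512 1024)
m1_free_mb=$((m1_total_mb - m1_used_mb))

echo "M1 planned: ${m1_used_mb}MB, headroom: ${m1_free_mb}MB"
```

The plan adds up to 14848MB (14.5GB), leaving 1536MB for macOS and the virtualization layer itself, so the M1 side has very little slack and may need one of the allocations trimmed.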

Network architecture

Physical Network: 192.168.1.0/24 (home router)
├─ M1: 192.168.1.10
│   ├─ Control Plane: 10.89.0.0/24
│   ├─ Pod Network: 10.244.0.0/16
│   └─ Service IPs: 10.96.0.0/16
│
├─ M2: 192.168.1.20
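│   ├─ Pod Network: separate CIDR, e.g. 10.245.0.0/16
│   ├─ Service IPs: separate CIDR, e.g. 10.97.0.0/16
│   └─ Note: Submariner needs non-overlapping CIDRs (or Globalnet)
│
└─ Inter-cluster Communication:
    - Option A: Submariner (L3 tunnel) ⭐ Recommended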
    - Option B: Cilium Cluster Mesh
    - Option C: Simple External IP routing
 
Service Mesh: Istio (optional)
├─ M1: Istiod (control plane)
├─ M2: Envoy sidecars
└─ mTLS between clusters
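
Because Submariner routes between clusters at L3, the pod and service CIDRs of the two clusters must not overlap unless Globalnet is enabled. A rough sketch for checking candidate ranges before creating the M2 cluster; the helper names are hypothetical and the M2 ranges in the example are placeholders, while 10.244.0.0/16 and 10.96.0.0/16 are M1's ranges from the tree above.

```shell
#!/bin/sh
# Convert a dotted IPv4 address to an integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Print "first last" (as integers) for a CIDR such as 10.244.0.0/16.
cidr_range() (
  addr=${1%/*}
  bits=${1#*/}
  size=$(( 1 << (32 - bits) ))
  first=$(( $(ip_to_int "$addr") / size * size ))
  echo "$first $(( first + size - 1 ))"
)

# Exit status 0 if the two CIDRs share at least one address.
cidrs_overlap() (
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)

for candidate in 10.244.0.0/16 10.245.0.0/16; do
  if cidrs_overlap 10.244.0.0/16 "$candidate"; then
    echo "$candidate overlaps M1's pod network"
  else
    echo "$candidate is safe to use on M2"
  fi
done
```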

Resource allocation strategy

Component	M1 allocation	M2 allocation	Reason
etcd	3 nodes (6GB)	0	Quorum guarantee, stability
API Server	Primary	Proxy	Minimize communication latency
Database	Primary	0	Stateful, memory-intensive
Cache (Redis)	1GB	0	Memory-intensive
API Gateway	Standby	Primary	Makes use of M2 performance
Backend APIs	0	Primary	Stateless, CPU-intensive
Prometheus	2GB	Agent only	Centralized monitoring
Grafana	512MB	0	Visualization server

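📅 Step-by-Step Implementation Guide

Phase 0: Keep the current state (while studying Prometheus)

Current state:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
M1: 4 VMs (VMware Fusion)
    - cp-k8s-1.30.3, w1, w2, w3
    - Prometheus installed ✅

M2: labs paused (not enough memory)

→ For now, finish the Prometheus study on M1 alone!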

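Phase 1: Rebuild M2 as a lightweight cluster (one weekend day)

# Option 1: minikube (simple)
# --memory applies per node: 2 nodes × 2048MB = 4GB total
minikube start --memory=2048 --cpus=2 --nodes=2

# Option 2: kind (lighter, recommended)
# kind needs at least one control-plane node, so use 1 control-plane + 1 worker
cat > kind-m2.yaml << 'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
EOF

kind create cluster --config kind-m2.yaml --name m2-cluster

# Verify
kubectl get nodes

Phase 1 success criteria

2-node cluster running normally on M2

Memory usage at or below 6GB

nginx pod deployed and reachable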
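
The first criterion can be checked mechanically by counting nodes whose STATUS column is exactly `Ready`. A small sketch that reads the output of `kubectl get nodes --no-headers` on stdin; `count_ready_nodes` is a made-up helper name and the sample node names are placeholders.

```shell
#!/bin/sh
# Count lines whose second column (STATUS) is exactly "Ready".
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Sample input in the same column layout as `kubectl get nodes --no-headers`.
sample='m2-cluster-control-plane   Ready      control-plane   5m   v1.30.0
m2-cluster-worker          NotReady   <none>          5m   v1.30.0'

printf '%s\n' "$sample" | count_ready_nodes
```

Against the live cluster this becomes `kubectl get nodes --no-headers | count_ready_nodes`, and Phase 1 passes when the result is 2.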

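Phase 2: Basic service communication test (1 day)

# Step 1: Deploy nginx on M2
kubectl run nginx --image=nginx --port=80

# Step 2: Create a LoadBalancer service (requires MetalLB)
kubectl expose pod nginx --type=LoadBalancer --name=nginx-lb

# Step 3: Check the External IP
kubectl get svc nginx-lb
# EXTERNAL-IP: 192.168.1.20 (example)
# Note: with kind/minikube on Docker for macOS, the MetalLB address is usually
# not reachable from the LAN. A fallback is to run this on M2:
#   kubectl port-forward --address 0.0.0.0 svc/nginx-lb 8080:80
# and then use http://192.168.1.20:8080 from M1.

# Step 4: Call the M2 service from M1
# In a terminal on M1:
curl http://192.168.1.20
# → confirm the nginx response!

Phase 2 success criteria

M1 → M2 HTTP communication succeeds

M2 → M1 HTTP communication succeeds

Ping round-trip time < 5ms (same network)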
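
The latency criterion can be checked by pulling the average out of the summary line that `ping` prints on both macOS and Linux. A sketch with hypothetical helper names (`avg_rtt_ms`, `under_5ms`); the sample line only mimics the macOS summary format.

```shell
#!/bin/sh
# Extract the average RTT (ms) from a ping summary line such as
#   round-trip min/avg/max/stddev = 1.234/2.345/3.456/0.500 ms   (macOS)
#   rtt min/avg/max/mdev = 0.045/0.052/0.060/0.006 ms            (Linux)
avg_rtt_ms() {
  awk -F'/' '/min\/avg\/max/ { print $5 }'
}

# Exit status 0 if the given value is below the 5ms target.
under_5ms() {
  awk -v v="$1" 'BEGIN { exit !(v < 5) }'
}

sample='round-trip min/avg/max/stddev = 1.234/2.345/3.456/0.500 ms'
avg=$(printf '%s\n' "$sample" | avg_rtt_ms)
if under_5ms "$avg"; then
  echo "avg ${avg}ms: within target"
else
  echo "avg ${avg}ms: too slow"
fi
```

On the real network the input would come from `ping -c 10 192.168.1.20 | avg_rtt_ms` on M1.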

Phase 3: Connect the clusters with Submariner (2 days)

# Step 1: Install subctl
curl -Ls https://get.submariner.io | bash
export PATH=$PATH:~/.local/bin

# Step 2: Set up M1 as the broker
subctl deploy-broker \
  --kubeconfig ~/.kube/config-m1 \
  --context kind-m1-cluster
 
# Step 3: M1 join
subctl join broker-info.subm \
  --kubeconfig ~/.kube/config-m1 \
  --clusterid m1 \
  --natt=false
 
# Step 4: M2 join
subctl join broker-info.subm \
  --kubeconfig ~/.kube/config-m2 \
  --clusterid m2 \
  --natt=false
 
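# Step 5: Verify the connection
subctl show all

# Step 6: Export the service (nginx on M2)
# Submariner exports a Service by creating a ServiceExport object; subctl does this for you
subctl export service --kubeconfig ~/.kube/config-m2 \
  --namespace default nginx-lb

# Step 7: Call the M2 service from M1 (DNS-based)
kubectl --kubeconfig ~/.kube/config-m1 run curl --image=curlimages/curl -it --rm -- \
  curl http://nginx-lb.default.svc.clusterset.local

Phase 3 success criteria

Submariner gateway connection status: Connected

M1 Pod → M2 Service (DNS-based call succeeds)

M2 Pod → M1 Service (DNS-based call succeeds)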
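
Submariner's service discovery follows the Kubernetes Multi-Cluster Services API, in which a Service becomes visible under `clusterset.local` once a ServiceExport with the same name and namespace exists. For a setup kept in Git, the export can be written as a manifest. The sketch below only generates the file; the file name is arbitrary.

```shell
#!/bin/sh
# Write a ServiceExport manifest for the nginx-lb Service in the default namespace.
cat > nginx-lb-export.yaml << 'EOF'
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: nginx-lb
  namespace: default
EOF

echo "wrote nginx-lb-export.yaml"
```

It would then be applied on the cluster that owns the Service, for example `kubectl --kubeconfig ~/.kube/config-m2 apply -f nginx-lb-export.yaml`, and `kubectl get serviceexport -n default` should list it.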

Phase 4: Deploy the distributed application (3-5 days)

# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# M1: set up the data layer
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

# Install PostgreSQL
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install postgresql bitnami/postgresql \
  --namespace data --create-namespace \
  --set auth.postgresPassword=mysecretpassword
 
# Install Redis
helm install redis bitnami/redis \
  --namespace data \
  --set auth.password=redispassword
 
# Service Export
subctl export service --namespace data postgresql
subctl export service --namespace data redis-master

# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# M2: deploy the API server
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 
cat > api-deployment.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
      - name: api
        image: your-api-image:latest
        env:
        - name: DB_HOST
          value: "postgresql.data.svc.clusterset.local"
        - name: DB_PASSWORD
          value: "mysecretpassword"
        - name: REDIS_HOST
          value: "redis-master.data.svc.clusterset.local"
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  type: LoadBalancer
  selector:
    app: api-server
  ports:
  - port: 80
    targetPort: 8080
EOF
 
kubectl apply -f api-deployment.yaml
 
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# M1: Prometheus scrapes M2 metrics
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 
cat > prometheus-scrape-m2.yaml << 'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
 
    scrape_configs:
    - job_name: 'm1-cluster'
      kubernetes_sd_configs:
      - role: pod
 
    - job_name: 'm2-cluster'
      static_configs:
      - targets:
        - 'api-server.default.svc.clusterset.local:80'  # Service port 80 forwards to containerPort 8080
      metrics_path: '/metrics'
EOF
 
kubectl apply -f prometheus-scrape-m2.yaml
kubectl rollout restart deployment/prometheus-server -n monitoring

Phase 4 success criteria

M2 API reaches PostgreSQL on M1

M2 API reaches Redis on M1

Prometheus collects metrics from both M1 and M2

Unified visualization in Grafana dashboards

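Phase 5: Distributed systems experiments (advanced, 1-2 weeks)

# Experiment 1: Network partition simulation
# On M1, inside a Linux VM or node (macOS itself ships pf, not iptables):
sudo iptables -A INPUT -s 192.168.1.20 -j DROP
sudo iptables -A OUTPUT -d 192.168.1.20 -j DROP

# Try to reach M1's PostgreSQL from a pod on M2
# (PostgreSQL does not speak HTTP, so use pg_isready rather than curl)
kubectl run pgcheck --image=postgres -it --rm --restart=Never -- \
  pg_isready -h postgresql.data.svc.clusterset.local -p 5432 -t 3
# → no response (timeout)!

# Check circuit breaker behavior
# → the M2 API returns a fallback response

# Restore
sudo iptables -D INPUT -s 192.168.1.20 -j DROP
sudo iptables -D OUTPUT -d 192.168.1.20 -j DROP

# Experiment 2: Leader election (etcd)
# etcd-0/etcd-1 are placeholder pod names; kubeadm/kind name them etcd-<node-name>,
# and etcdctl there also needs --cacert/--cert/--key from /etc/kubernetes/pki/etcd/.
kubectl exec -it etcd-0 -n kube-system -- etcdctl member list
# Identify the Leader, then force-stop it
# (for static-pod etcd, deleting the pod may not stop the container; stopping the node is more reliable)
kubectl delete pod etcd-0 -n kube-system --force

# Watch the new Leader being elected (within 2-3 seconds); no -it, since watch has no TTY
watch -n 1 "kubectl exec etcd-1 -n kube-system -- etcdctl endpoint status"

# Experiment 3: Distributed tracing
# Install Jaeger
# (these deploy/ paths match older jaeger-operator layouts; if they return 404,
#  follow the install steps of the current operator release instead)
kubectl create namespace observability
kubectl apply -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/crds/jaegertracing.io_jaegers_crd.yaml
kubectl apply -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/service_account.yaml
kubectl apply -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/role.yaml
kubectl apply -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/role_binding.yaml
kubectl apply -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/operator.yaml

# Access the Jaeger UI
# (svc/jaeger-query exists once a Jaeger instance named "jaeger" has been created)
kubectl port-forward -n observability svc/jaeger-query 16686:16686
# → http://localhost:16686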

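💰 Cost vs Learning Value Analysis

Time investment

Phase	Estimated time	Learning outcome	ROI
Phase 0 (current)	4 hours	Prometheus basics	⭐⭐⭐
Phase 1-2 (connectivity)	+8 hours	Multi-cluster fundamentals	⭐⭐⭐⭐
Phase 3 (Submariner)	+16 hours	Production-grade networking	⭐⭐⭐⭐⭐
Phase 4 (distributed app)	+24 hours	Portfolio-grade project	⭐⭐⭐⭐⭐
Phase 5 (experiments)	+40 hours	Distributed systems mastery	⭐⭐⭐⭐⭐

Comparison with cloud alternatives

Cloud cost (multi-cluster):
━━━━━━━━━━━━━━━━━━━━━━━━━━━
GKE Multi-cluster: $150/month
AWS EKS (2 clusters): $200/month
Azure AKS (2 clusters): $180/month

→ Over 6 months: about $900-1200 in cloud spend avoided!

Homelab cost:
━━━━━━━━━━━━━━━━━━━━━━━━━━━
M1 Mac mini: already owned ($0)
M2 MacBook: already owned ($0)
Electricity: ~$5/month (M1 running 24/7)

→ Effective cost: ~$30 per 6 months

Learning value

Tech stack	Homelab experience	Cloud alternative	Learning depth
Kubernetes	✅ Install and operate it yourself	Managed (abstracted)	Homelab > Cloud
Networking	✅ Configure L2/L3 yourself	VPC (automated)	Homelab >> Cloud
Troubleshooting	✅ Access to every layer	Limited logs	Homelab >>> Cloud
Cost Management	✅ Hardware constraints felt directly	Cost is the only consideration	Homelab > Cloud
Real-world Scale	❌ Limited	✅ Effectively unlimited	Cloud > Homelab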

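🎯 Key Takeaways

TL;DR: The value of this project

Technical learning:

Multi-cluster Kubernetes architecture (Submariner, Istio)

Distributed systems theory → practice (CAP, Raft, distributed tracing)

Workload optimization under hardware constraints

Experience with service mesh, load balancing, and failover

Career use:

Résumé: "Built a multi-cluster Kubernetes environment in a homelab"

Interview: "Hands-on experience solving real distributed systems problems (network partition, failover)"

Portfolio: "Architecture designed around M1/M2 hardware characteristics"

Cost savings:

About $900-1200 saved over 6 months compared with cloud costs

Unlimited experimentation (no cost worries)

Recommended starting point:

After finishing the Prometheus study (in 2-3 weeks)

Start with Phase 1-2 (1-2 weekend days)

Expand gradually to Phase 3-4 (4-6 weeks)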

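📊 Execution Checklist

Phase 1: M2 cluster setup

Install minikube or kind
Create a 2-node cluster (4GB memory)
Verify kubectl access
Deploy an nginx pod and test access

Phase 2: Basic communication

Create a LoadBalancer service on M2
Verify M1 → M2 HTTP communication
Verify M2 → M1 HTTP communication
Measure ping latency (< 5ms)

Phase 3: Submariner connection

Phase 4: Distributed app deployment

Phase 5: Distributed systems experiments

Simulate a network partition
Test etcd leader election
Set up Jaeger distributed tracing
Verify circuit breaker behavior
Test failover automation

Author: irix
Date: 2025-12-04
Status: 📋 Planning
Expected completion: late January 2025 (after finishing the Prometheus study)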

kubernetes multi-cluster distributed-systems homelab architecture submariner istio prometheus m1 m2
