When should you create another Kubernetes cluster?

  • Why are we on Kubernetes?
  • What does the natural evolution look like in enterprises?
  • What happens at first when someone asks for a new cluster? When do we know there’s a problem?
  • How do we get ready to solve this problem?
  • How do we then decide whether to grant a new cluster?

Why are we on Kubernetes?

What does the natural evolution look like in enterprises?

What happens at first when someone asks for a new cluster?

How do we know there’s a problem?

How do we get ready to solve this problem?

How do we then decide whether to grant a new cluster?

  • Regulatory reasons: Kubernetes does not work well over long geographical distances, so a cluster is mostly constrained to a single geographical region (I think this is due to the underlying etcd database, which is sensitive to network latency). If you have requirements that certain data must remain in certain geographical locations, then a new cluster is pretty much a must-have.
  • Different environments: anyone who has watched a Kubernetes tutorial will have seen the example “you can have one namespace for dev, and one for prod”. In my opinion, this is complete and utter nonsense. I guess tutorials do it because it’s an easy way to explain namespacing, but it’s still nonsense. If dev and production instances are on the same cluster, how do you safely test and upgrade the underlying infrastructure? Get separate clusters per environment.
  • “We don’t want to share with someone else”: tough! If we’ve got the segregation and access controls in place, there’s no reason we can’t provision computing infrastructure in Kubernetes that’s truly separate from other users. We can even reserve certain hardware for use by a single namespace, so this isn’t a good enough reason. If you go so far as to deploy a service mesh, you’ll have all of the controls of a regular compute instance architecture and then some.
  • Cloud provider redundancy: this is a complex topic. You can run workloads across multiple cloud providers, and any provider worth their salt will have a Kubernetes offering you can use, so in principle this makes your workloads fault tolerant. I’m extremely skeptical of it, though. Most cloud providers have an SLA in the range of 99.9% uptime, so multiple providers only pay off if the cloud provider is your most likely source of downtime. Unless you have an extremely mature ops team, that is unlikely. However, if you are a massive multinational corporation with material business risk in the case of downtime, then this might be a genuine reason to start spreading critical workloads across cloud providers. Do not underestimate the complexity of creating and maintaining this kind of configuration.
  • “We have highly sensitive data”: if you’re storing highly sensitive data in your cluster (beyond PII, more like the kind of information that, in the wrong hands, could be used to predict and manipulate your future stock price), then you might have a legitimate reason to provide a cluster separated from other workloads. But almost exactly the same security controls should go in place as on normal clusters. These controls are not that complex to implement, and even the toughest restrictions don’t take long to apply, even to normal workloads.
  • PCI DSS: processing payment data? Expect extra security controls and an audit. This audit will not take 5 minutes, and if your payment applications are lumped in with other apps, expect more questions and pain. Separating the workloads that process payment information might not be the worst idea. You may be asked to implement all kinds of logging around firewalls and network requests, which might otherwise be expensive in a shared cluster.
  • “This is just a quick PoC”: absolutely not. We’ve almost certainly got another cluster lying around somewhere that you can use for development purposes, with the correct security controls already in place.
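
To put some numbers on the cloud provider redundancy point, here is a rough sketch of the arithmetic behind a 99.9% SLA. It assumes the SLA figure is the provider's real availability and that two providers fail independently of each other. Both are optimistic simplifications of mine, and the function names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year for an availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


def combined_availability(*availabilities: float) -> float:
    """Availability of redundant providers.

    Assumption: failures are independent, and the workload stays up
    as long as at least one provider is up.
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)  # probability that every provider is down
    return 1.0 - all_down


single = 0.999  # the 99.9% SLA figure mentioned above
dual = combined_availability(single, single)

print(f"one provider:  {downtime_hours_per_year(single):.2f} hours/year")
print(f"two providers: {downtime_hours_per_year(dual) * 60:.2f} minutes/year")
```

Under these assumptions a single provider costs you somewhere under nine hours a year, and a second provider shaves that down to well under a minute. The useful question is whether your own deployments, misconfigurations and upgrades cause more than those nine hours. If they do, a second provider fixes the smaller problem while adding a lot of complexity.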

Mark McCracken
