I am attempting to build a Pod that runs a service which requires:
1. cluster-internal services to be resolvable and accessible by their FQDN (*.cluster.local),
2. an active OpenVPN connection to a remote cluster, with services from that remote cluster also resolvable and accessible by their FQDN (*.cluster.remote).
Without an OpenVPN sidecar, the service container within the Pod can access all services by FQDN within the *.cluster.local namespace. Here is /etc/resolv.conf in this case:
nameserver 169.254.25.10
search default.cluster.local svc.cluster.local cluster.local
options ndots:5

OpenVPN client overwrites resolv.conf

The OpenVPN sidecar is started in the following way:
containers:
{{- if .Values.vpn.enabled }}
- name: vpn
image: "ghcr.io/wfg/openvpn-client"
imagePullPolicy: {{ .Values.image.pullPolicy | quote }}
volumeMounts:
- name: vpn-working-directory
mountPath: /data/vpn
env:
- name: KILL_SWITCH
value: "off"
- name: VPN_CONFIG_FILE
value: connection.conf
securityContext:
privileged: true
capabilities:
add:
- "NET_ADMIN"
resources:
limits:
cpu: 100m
memory: 80Mi
requests:
cpu: 25m
memory: 20Mi
{{- end }}

and the OpenVPN client configuration contains the following lines:
script-security 2
up /etc/openvpn/up.sh
down /etc/openvpn/down.sh

Then the OpenVPN client will overwrite resolv.conf so that it contains the following:
nameserver 192.168.255.1
options ndots:5

In this case, any service in *.cluster.remote is resolved, but none from *.cluster.local. This is expected.
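The overwrite is performed by the up script. Here is a minimal sketch of what such a script typically does, modeled on the common update-resolv-conf pattern (hypothetical; not the exact script shipped in the ghcr.io/wfg/openvpn-client image). OpenVPN exports pushed DHCP options to the script as foreign_option_N environment variables:

```shell
#!/bin/sh
# Sketch of an OpenVPN "up" script (illustrative only).
# Simulated environment, as OpenVPN would provide it on connect:
foreign_option_1='dhcp-option DNS 192.168.255.1'
RESOLV_CONF="${RESOLV_CONF:-/tmp/resolv.conf.demo}"  # the real script writes /etc/resolv.conf

# Collect all pushed DNS servers from foreign_option_1, foreign_option_2, ...
out=""
n=1
while eval "opt=\${foreign_option_${n}:-}"; [ -n "$opt" ]; do
  case "$opt" in
    "dhcp-option DNS "*) out="${out}nameserver ${opt#"dhcp-option DNS "}
" ;;
  esac
  n=$((n + 1))
done

# Replace resolv.conf with only the VPN-pushed nameservers:
printf '%soptions ndots:5\n' "$out" > "$RESOLV_CONF"
cat "$RESOLV_CONF"
```

Note that the original search list is simply discarded, which is exactly why *.cluster.local stops resolving afterwards.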
OpenVPN client does not overwrite resolv.conf, but spec.dnsConfig is provided

Remove the following lines from the OpenVPN client configuration:
script-security 2
up /etc/openvpn/up.sh
down /etc/openvpn/down.sh

The spec.dnsConfig is provided as:
dnsConfig:
nameservers:
- 192.168.255.1
searches:
- cluster.remote

Then, resolv.conf will be the following:
nameserver 192.168.255.1
nameserver 169.254.25.10
search default.cluster.local svc.cluster.local cluster.local cluster.remote
options ndots:5

This would work for *.cluster.remote, but not for anything in *.cluster.local, because the second nameserver is only tried if the first one times out. I noticed that some folks get around this limitation by setting up nameserver rotation and a timeout of 1 second, but this behavior looks very hectic to me; I would not consider it, not even as a workaround. Or maybe I'm missing something. My first question would be: could rotation and timeout work in this case?
My second question would be: is there any way to make DNS resolution for both *.cluster.local and *.cluster.remote work reliably from the service container inside the Pod, without using something like dnsmasq?
My third question would be: if dnsmasq is required, how should I configure it and overwrite resolv.conf, while also making sure that the Kubernetes-provided nameserver can be anything (it is 169.254.25.10 in this case)?
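For reference, the rotation-and-timeout workaround mentioned in the first question would presumably be expressed like this in spec.dnsConfig (untested; the option names are the standard resolv.conf ones):

```yaml
# Hypothetical sketch of the rotate/timeout workaround -- shown only
# for reference; I would still not rely on it.
dnsConfig:
  nameservers:
    - 192.168.255.1
  searches:
    - cluster.remote
  options:
    - name: rotate       # round-robin over the nameserver list
    - name: timeout
      value: "1"         # per-nameserver timeout in seconds
```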
Best, Zoltán
I solved the problem by running a sidecar DNS server instead.
Here is an example pod with CoreDNS:
apiVersion: v1
kind: Pod
metadata:
name: foo
namespace: default
spec:
volumes:
- name: config-volume
configMap:
name: foo-config
items:
- key: Corefile
path: Corefile
dnsPolicy: None # Signals Kubernetes that you want to supply your own DNS - otherwise `/etc/resolv.conf` will be overwritten by Kubernetes and there is then no way to update it.
dnsConfig:
nameservers:
- 127.0.0.1 # This will set the local CoreDNS sidecar as the DNS resolver. When `dnsPolicy` is `None`, `dnsConfig` must be provided.
containers:
- name: dns
image: coredns/coredns
env:
- name: LOCAL_DNS
value: 10.233.0.3 # insert the local DNS IP address (see the kube-dns Service ClusterIP)
- name: REMOTE_DNS
value: 192.168.255.1 # insert remote DNS IP address
args:
- '-conf'
- /etc/coredns/Corefile
volumeMounts:
- name: config-volume
readOnly: true
mountPath: /etc/coredns
- name: test
image: debian:buster
command:
- bash
- -c
- apt update && apt install -y dnsutils && cat /dev/stdout # `cat /dev/stdout` keeps the container running
---
apiVersion: v1
kind: ConfigMap
metadata:
name: foo-config
namespace: default
data:
Corefile: |
cluster.local:53 {
errors
health
forward . {$LOCAL_DNS}
cache 30
}
cluster.remote:53 {
errors
health
rewrite stop {
# rewrite cluster.remote to cluster.local and back
name suffix cluster.remote cluster.local answer auto
}
forward . {$REMOTE_DNS}
cache 30
}
The CoreDNS config above simply forwards cluster.local queries to the local DNS service and cluster.remote queries to the remote one. Using it, I was able to resolve the kubernetes service IP of both clusters:
❯ k exec -it -n default foo -c test -- bash
root@foo:/# dig @localhost kubernetes.default.svc.cluster.local +short
10.100.0.1
root@foo:/# dig @localhost kubernetes.default.svc.cluster.remote +short
10.43.0.1

Update:
Possibly, the following CoreDNS configuration is sufficient in case you also require access to the internet, since everything outside cluster.remote (including cluster.local) is resolved by the Kubernetes-provided DNS itself:
.:53 {
errors
health
forward . {$LOCAL_DNS}
cache 30
}
cluster.remote:53 {
errors
health
forward . {$REMOTE_DNS}
cache 30
}
Ad 1.) I am not sure I understand what you mean by rotation (do you mean round-robin rotation over the nameservers?), but you could set the timeout to 0, so the resolver sends DNS queries to both nameservers right away and returns the quicker response.
A better idea is to leverage the native Kubernetes DNS (CoreDNS or kube-dns) and just set a forwarding rule there. As per the documentation, you could add something like this to the coredns/kube-dns ConfigMap in the kube-system namespace:
cluster.remote:53 {
errors
cache 30
forward . <remote cluster dns ip>
}
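For context, embedded in a typical CoreDNS ConfigMap the addition would look roughly like this (a sketch: the default `.:53` block varies between distributions, only the cluster.remote block is new, and 192.168.255.1 stands in for the remote cluster's DNS IP):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa
        forward . /etc/resolv.conf
        cache 30
    }
    cluster.remote:53 {
        errors
        cache 30
        forward . 192.168.255.1   # remote cluster DNS IP (example)
    }
```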
This way you won't need to touch /etc/resolv.conf in the Pod at all; you just need to ensure the cluster DNS can reach the remote DNS server... or configure your application for iterative DNS resolution.
You can find more details in the official Kubernetes documentation (https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/) and the CoreDNS forward plugin docs (https://coredns.io/plugins/forward/).
Of course, modifying the kube-dns/CoreDNS configuration requires admin rights in the cluster.