
Disclaimer: Follow the below at your own risk. You shouldn’t work with mains power unless you are a competent person.

After building an Energy Storage System (ESS) and switching to an Agile electricity tariff (from Octopus Energy) my Marlec Solar iBoost was holding me back.

The iBoost kicks in when it detects excess electricity being fed back to the grid, but there are other situations when I might want to divert power to the immersion heater- for example, when electricity is cheap/free or in some instances I’m even being paid to consume.

I hunted for an off the shelf solution and came up short. The only potential contender was the myenergi eddi, but even that doesn’t seem to clearly document what it can/can’t do (and how well it integrates with Home Assistant)- oh and it’s probably £500+.

So what were my requirements?

  • WiFi / network connected.
  • Reports power consumption.
  • Can limit power output.
  • Can integrate with Home Assistant.

A simple Sonoff POW is network connected, reports power consumption and integrates well with Home Assistant (natively or by flashing with ESPHome or Tasmota). Great- three of the four requirements nailed, so how do we achieve the remaining one: limiting the power output?

A lot of research turned up something called a solid state relay (or contactor). I don’t pretend to fully understand the details, but loosely speaking it allows you to control the output by varying an (isolated) input. In this instance, the output is 240V AC and the input is 3V DC (which we can feed directly from the Sonoff).

There are different types of SSR/contactor which can be controlled by modifying the input in different ways, but the one I chose can be driven using pulse width modulation (PWM). I picked a (deliberately over-specced) SSR from Amazon (40A SSR-40DA DC-AC, Input 3-32VDC, Output 24-380VAC), a suitable heatsink, and an enclosure to house them (along with the Sonoff).

To connect these easily to the mains and immersion heater, I bought a single socket 13A extension lead.

How does it all go together?

  1. Open the Sonoff up and flash the Tasmota firmware (I used this USB UART-TTL device).
  2. Apply the POW template to the Sonoff.
  3. Configure GPIO3 as PWM (via the Tasmota web interface, configuration, configure template).
  4. Connect the Sonoff (GND from the header used to flash the firmware) to the SSR (- DC input).
  5. Connect the Sonoff (RX from the header used to flash the firmware) to the SSR (+ DC input).
  6. Cut the extension lead in half.
  7. Connect the live and neutral from the “plug end” to Sonoff inputs.
  8. Connect the live output from the Sonoff to either AC terminal of the SSR.
  9. Connect the remaining AC terminal from the SSR to the “socket end”.
  10. Connect the neutral output from the Sonoff to the “socket end”.
  11. Connect earth from the “plug end” to the “socket end” (if you’re careful, you could possibly avoid cutting this in the first place).
  12. Fit it all neatly in the enclosure (I used a few cable ties and some hot glue to ensure there would be no strain on any of the cables and no danger of movement).

Now after adding the Tasmota device to Home Assistant I can see the state and power consumption (alongside some data from my Victron inverter and the current electricity price):

It’d be nice if I could somehow get a temperature probe into the hot water tank, and maybe a reading from the SSR heatsink too (I have capped the PWM at 70% for now as the SSR does heat up)- but that’s for another day!
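For reference, the PWM duty cycle can be poked directly from the Tasmota web console while testing. Pwm1 takes a value between 0 and 1023, so ~70% works out at roughly 716- the exact values below are just for illustration:

Pwm1 0
Pwm1 716
Pwm1 1023

(0 is off, 716 is roughly 70% duty and 1023 is full output.)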

I have some automations set up to turn the immersion heater on when the electricity price is 0 or less, or when the batteries reach a certain level of charge.
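Here’s a rough sketch of what one of those automations could look like in Home Assistant- the entity names (sensor.octopus_agile_current_rate, sensor.battery_soc, switch.immersion_heater) and thresholds are just placeholders and will differ per setup:

- alias: "Immersion heater on cheap electricity or full batteries"
  trigger:
    - platform: numeric_state
      entity_id: sensor.octopus_agile_current_rate
      below: 0
    - platform: numeric_state
      entity_id: sensor.battery_soc
      above: 95
  action:
    - service: switch.turn_on
      target:
        entity_id: switch.immersion_heater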

Total parts cost was probably less than £50- roughly a tenth of the price of the eddi (which may not even have been capable of doing what I wanted).

Big shout out to Rich who reached out to give me some pointers when I was struggling with my search- thanks!

7 years ago, in our old house, we had a solar setup where we were paid for all electricity generated by our solar PV array. And for electricity we exported (generated, but didn’t use) we were paid double.

In our new house, unfortunately we do not get paid for generation or export. We only benefit from the electricity we generate and consume. This leaves us frustrated on a day where we generate electricity during the day, but use very little, then come home and consume a lot from the grid.

Since the Tesla Powerwall was released, I’ve been interested, but until now I’ve not felt the return on investment was there.

I’m a keen DIYer, and via a convenient Facebook ad (or perhaps while browsing eBay) I stumbled across something like a PowMr 10KW Power-Wall. I was close to pulling the trigger, but needed to do a bit more research first…

Lucky I did- upon further investigation, it became obvious this was just a battery. It definitely wouldn’t be plug and play (at minimum, some sort of inverter would be needed too).

The more I read, the more I realised I needed to plan thoroughly before making a purchase. One of the things that seemed especially unclear with all products: do they support two-way/bi-directional power on the AC input?

Every wiring diagram seemed to show “AC Loads” connected to a dedicated output. I was starting to think I would need a separate consumer unit, and that the inverter wouldn’t be able to cope if I wanted to boil the kettle and use the oven at the same time.

A lot of the questions and information I was searching for took me to the Victron community forum: https://community.victronenergy.com/index.html. Victron seemed like a solid bet for my equipment, now I needed a supplier and ideally an expert.

I was in luck- https://essandsolarsolutions.co.uk/ not only stock the Victron equipment, have an extremely clued-up Victron consultant (Etienne) and offer bundles/packages (which contain almost everything you need), but better yet, they’re based in Essex- close enough for me to sit down and talk things through!

After talking it through, and putting down what I needed on paper- I was ready to order. Here’s what I ended up with:

Parts (not available from ESS and Solar Solutions):

Parts (from ESS and Solar Solutions):

  • MultiPlus-II 48/5000/70-50 230V GX
  • 4x Pylontech batteries US5000
  • 2x Cable Pack for US2000 / US3000 Batteries 2m
  • VE.Can to CAN-bus BMS type A Cable 1.8 m
  • Energy Meter ET112 – 1 phase – max 100A
  • RS485 to USB interface 1.8m
  • Current Transformer 100A:50mA for MultiPlus-II (5m)
  • Lynx Distributor (M8)
  • 2x MEGA-fuse 125A/58V for 48V products

You may notice there is no MPPT solar charge controller. This is because our PV array is 16x 30V panels in series (around 480V total) and the Victron MPPT solar charge controllers don’t support that voltage (and I have no intention of getting up on the roof to split the array into multiple smaller strings).

For now, we’ll keep the Fronius IG TL inverter and live with the fact we’re losing a bit in the DC -> AC -> DC -> AC conversion.

Tools (I didn’t already have):

Wiring Diagram:

WORK IN PROGRESS- I have tried to keep the connections “in the right place” (e.g. ports in the correct positions on each device etc). The result, unfortunately, is a bit spaghetti-like.

I deliberately included a photo below before I closed everything up so you can see most of the connections (before I’d re-routed/tidied etc).

Install photo(s):

The cable management could still use a bit of work, but it’s good enough for the moment.

Some notes:

  • The current clamp/transformer is used to measure the electricity coming in from/going out to the grid (it is quite responsive).
  • The ET112 energy meter is used to measure the electricity coming in from the Fronius inverter (it has a bit more of a lag and requires “splicing” into the AC feed). There are a few faster Carlo Gavazzi devices and there are some current clamp/transformer type energy meters, but none of them are both fast & CT type (and many of them have a max current of 5A)- https://www.victronenergy.com/upload/documents/Datasheet-Energy-Meters-Selection-Guide-EN.pdf. I am quite surprised the GX device doesn’t support more energy meters (especially as I believe the underlying operating system is open source). Perhaps in the future this may be something I can contribute upstream? https://github.com/victronenergy/dbus-cgwacs/
  • The Lynx Distributor end cable entry holes were not big enough for the 70mm2 cables. I used a multi-tool to open them up a bit.
  • I should have ordered shorter 70mm2 cables, then I could have gone straight up and into the MultiPlus. I’m not comfortable with trying to cut and crimp to my own length yet, but that may be something I end up doing.
  • Getting the batteries connected to the MultiPlus was a bit confusing. I couldn’t understand whether to use the VE.Bus port or VE.Can. It ended up being the VE.Can port (with a terminator in the other port), and I needed to adjust the settings on the GX device: Settings > Services > VE.Can > CAN-bus BMS (500 kbit/s)
  • Getting the system working as desired (trying to maintain no import/export from the grid by charging batteries when generation > consumption and discharging them when consumption > generation) was far from obvious. Huge shout out to Etienne for sorting this. It requires dumping the MultiPlus config from the VRM portal- https://vrm.victronenergy.com > Device list > Remote VEConfigure > Download, then using the VEConfigure software to install the ESS Assistant and upload the modified config.
  • Getting the ET112 connected was trickier than anticipated too. The default baud rate was not compatible with the GX. I had to connect the USB RS485 cable to my laptop and use the UCS desktop app to find the device and change the baud rate to 9600- https://www.gavazziautomation.com/data_center/EN#download
  • After getting the ET112 to show up, the numbers weren’t adding up/making sense. I poked and prodded a little, and figured out Settings > Energy meters > Position (default AC Input 1) needed to be changed to AC Output.

Connecting everything up to the mains and splitting the incoming meter tails was something I wanted to leave to the professionals. Big thanks to Rob, Si and Noel at Lawson Electrical Services for sorting that at short notice- https://www.lawsonelectrical.co.uk/

Please feel free to reach out if you have any questions. I’m sure I missed a few questions/issues I stumbled across along the way.

A week or so back I wrote part 1: https://tickett.wordpress.com/2023/03/29/learning-kubernetes-with-gitlab-and-terraform-for-free/

I had a few “next steps” on my list. The first was to fix the 2nd node in the node pool (it was reportedly “not healthy”).

Connecting to the node I found some overlayfs errors, but no clear solution on Google or ChatGPT. I terminated the node and a new one automatically appeared- this time “healthy”. That was easy!

The next was to try and understand how to run multiple web-servers behind the single public IP.

I spent a lot of time being misled by ChatGPT (trying various techniques to get oci_network_load_balancer_network_load_balancer to route based on hostname, using oci_load_balancer_path_route_set for example) to eventually find that it’s a layer 3 / 4 load balancer and is not capable!

But, it’s all good, oci_load_balancer_load_balancer provides a layer 7 load balancer, which is capable.

It may not be the best solution, but after some poking around, I ended up with 2 nginx servers running in the cluster accessible via 2 different hostnames (using nip.io).

Here are the resources I removed:

free_nlb
free_nlb_backend_set
free_nlb_backend
free_nlb_listener

And the additions:

resource "oci_load_balancer_load_balancer" "free_lb" {
  compartment_id = var.compartment_id
  display_name   = "free-k8s-lb"
  shape          = "10Mbps"
  subnet_ids     = [oci_core_subnet.vcn_public_subnet.id]
  is_private = false
}

resource "oci_load_balancer_backend_set" "free_lb_backend_set" {
  load_balancer_id = oci_load_balancer_load_balancer.free_lb.id
  name             = "free-k8s-backend-set"
  health_checker {
    protocol = "TCP"
    port     = 10256
  }
  policy = "ROUND_ROBIN"
}

resource "oci_load_balancer_backend_set" "free_lb_backend_set_default" {
  load_balancer_id = oci_load_balancer_load_balancer.free_lb.id
  name             = "free-k8s-backend-set-default"
  health_checker {
    protocol = "TCP"
    port     = 10256
  }
  policy = "ROUND_ROBIN"
}

resource "oci_load_balancer_backend" "free_lb_backend" {
  for_each = {
    for index, node in local.active_nodes :
    node.private_ip => node
  }

  backendset_name  = oci_load_balancer_backend_set.free_lb_backend_set.name
  load_balancer_id = oci_load_balancer_load_balancer.free_lb.id
  port             = 31600
  ip_address       = each.value.private_ip
}

resource "oci_load_balancer_backend" "free_lb_backend_default" {
  for_each = {
    for index, node in local.active_nodes :
    node.private_ip => node
  }

  backendset_name  = oci_load_balancer_backend_set.free_lb_backend_set_default.name
  load_balancer_id = oci_load_balancer_load_balancer.free_lb.id
  port             = 31601
  ip_address       = each.value.private_ip
}

resource "oci_load_balancer_listener" "free_lb_listener" {
  default_backend_set_name = oci_load_balancer_backend_set.free_lb_backend_set.name
  load_balancer_id         = oci_load_balancer_load_balancer.free_lb.id
  name                     = "free-k8s-lb-listener"
  port                     = 80
  protocol                 = "HTTP"

  hostname_names = [oci_load_balancer_hostname.nginx.name]
}

resource "oci_load_balancer_listener" "free_lb_listener_default" {
  default_backend_set_name = oci_load_balancer_backend_set.free_lb_backend_set_default.name
  load_balancer_id         = oci_load_balancer_load_balancer.free_lb.id
  name                     = "free-k8s-lb-listener-default"
  port                     = 80
  protocol                 = "HTTP"

  hostname_names = [oci_load_balancer_hostname.nginx_default.name]
}

resource "oci_load_balancer_hostname" "nginx" {
  load_balancer_id = oci_load_balancer_load_balancer.free_lb.id
  hostname         = "nginx.130.162.168.181.nip.io"
  name             = "nginx"
}

resource "oci_load_balancer_hostname" "nginx_default" {
  load_balancer_id = oci_load_balancer_load_balancer.free_lb.id
  hostname         = "nginx-default.130.162.168.181.nip.io"
  name             = "nginx-default"
}

resource "kubernetes_service" "nginx_service_default" {
  metadata {
    name      = "nginx-service-default"
    namespace = kubernetes_namespace.free_namespace.id
  }
  spec {
    selector = {
      app = "nginx-default"
    }
    port {
      port        = 80
      target_port = 80
      node_port   = 31601
    }

    type = "NodePort"
  }
}

This achieves a few things:

  • Removes the network load balancer (and the associated bits)
  • Adds the load balancer, 2 load balancer listeners, 2 load balancer hostnames, 2 load balancer backend sets and 2 load balancer backends
  • Adds a second Kubernetes service (the second deployment is still currently being deployed via a different project)
  • Replaces the hardcoded target from the previous backend (this still needs a little work as there is a dependency which prevents the infrastructure being built in a single pass- e.g. the nodes need to be created first, then the output can be used in the for each loop)
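One possible workaround for that last point (a sketch rather than a polished solution) is a targeted first apply, so the node pool exists before the for_each over active nodes is evaluated:

terraform apply -target=oci_containerengine_node_pool.k8s_node_pool
terraform apply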

I noticed after a few days the load balancer was costing a bit- but I believe that’s because I set the shape to 100Mbps. Now I’ve switched to 10Mbps I think it should be “always free”.
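With the listeners and hostnames in place, a quick sanity check is to curl each nip.io hostname (matching the oci_load_balancer_hostname resources above) and confirm the right nginx responds:

curl http://nginx.130.162.168.181.nip.io
curl http://nginx-default.130.162.168.181.nip.io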

Now we have 2 healthy nodes, let’s try and increase the replicas?

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 4
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80

Great- this seems to be working.

lee@cloudshell:~ (uk-london-1)$ kubectl get all -n free-ns
NAME                                 READY   STATUS    RESTARTS   AGE
pod/nginx-6b75cb9cbb-4rjb9           1/1     Running   0          3d1h
pod/nginx-6b75cb9cbb-hb8r4           1/1     Running   0          7d15h
pod/nginx-6b75cb9cbb-wntbv           1/1     Running   0          3d
pod/nginx-6b75cb9cbb-zpjfw           1/1     Running   0          3d1h
pod/nginx-default-657bc4bf4d-6qcsx   1/1     Running   0          3d1h
pod/nginx-default-657bc4bf4d-hxvzd   1/1     Running   0          3d1h
pod/nginx-default-657bc4bf4d-lpf5t   1/1     Running   0          7d15h
pod/nginx-default-657bc4bf4d-rlvpt   1/1     Running   0          3d1h

NAME                            TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
service/nginx-service           NodePort   10.96.39.113    <none>        80:31600/TCP   14d
service/nginx-service-default   NodePort   10.96.105.209   <none>        80:31601/TCP   7d14h

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx           4/4     4            4           14d
deployment.apps/nginx-default   4/4     4            4           7d15h

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-6b75cb9cbb           4         4         4       11d
replicaset.apps/nginx-default-657bc4bf4d   4         4         4       7d15h

But I can’t see which node each replica lives on?

I found a good script to start digging into that:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  labels:
    app: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
        - name: test-container
          image: arm64v8/busybox
          command: [ "sh", "-c"]
          args:
          - while true; do
              echo -en '\n';
              printenv MY_NODE_NAME MY_POD_NAME MY_POD_NAMESPACE;
              printenv MY_POD_IP MY_POD_SERVICE_ACCOUNT;
              sleep 10;
            done;
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: MY_POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: MY_POD_SERVICE_ACCOUNT
              valueFrom:
                fieldRef:
                  fieldPath: spec.serviceAccountName

lee@cloudshell:~ (uk-london-1)$ kubectl logs deployment.apps/test -n free-ns
10.0.1.224
test-684655bf55-2zcdk
free-ns
10.244.1.9
default

It looks like MY_NODE_NAME (e.g. spec.nodeName) contains the node’s private IP address rather than a hostname. That’s as far as I’ve had time to dive so far.
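As an aside, the node each pod is scheduled on can also be listed directly- the -o wide flag adds a NODE column:

kubectl get pods -n free-ns -o wide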

Next steps:

  • Polish things a little (use better naming, potentially create some modules to remove all the duplication with listeners, backend sets, services etc)
  • Take a look at the GitOps workflow
  • Find a “good” SSL solution. I’ve managed to use Cloudflare DNS to add free SSL, but that requires creating a new DNS entry for each hostname- ideally I would use a wildcard DNS entry, but I don’t want to go down the LetsEncrypt route.
  • Improve visibility/observability (easily be able to list pods, services, deployments, view logs etc)
  • Learn more about Kubernetes and persistent storage (at the moment, all of the deployments pull docker images, spin them up and then destroy them- what if I want a database?)

Not too much right?!

Shout out to Raimund Hook for sharing, and Arnold Galovics for documenting how to spin up a free Kubernetes cluster in Oracle Cloud (via Terraform): https://arnoldgalovics.com/oracle-cloud-kubernetes-terraform/.

I will try and talk through my process of following/tweaking the above to work with GitLab and the Kubernetes Agent Service (KAS). Here is my current (probably not final) terraform config:

terraform {
  backend "http" {}
}

provider "oci" {
  region = var.region
}

variable "compartment_id" {
  type        = string
  description = "The compartment to create the resources in"
}

variable "region" {
  type        = string
  description = "The region to provision the resources in"
}

variable "ssh_public_key" {
  type        = string
  description = "The SSH public key to use for connecting to the worker nodes"
}

output "free_load_balancer_public_ip" {
  value = [for ip in oci_network_load_balancer_network_load_balancer.free_nlb.ip_addresses : ip if ip.is_public == true]
}

module "vcn" {
  source  = "oracle-terraform-modules/vcn/oci"
  version = "3.1.0"

  compartment_id = var.compartment_id
  region         = var.region

  internet_gateway_route_rules = null
  local_peering_gateways       = null
  nat_gateway_route_rules      = null

  vcn_name      = "free-k8s-vcn"
  vcn_dns_label = "freek8svcn"
  vcn_cidrs     = ["10.0.0.0/16"]

  create_internet_gateway = true
  create_nat_gateway      = true
  create_service_gateway  = true
}

resource "oci_core_security_list" "private_subnet_sl" {
  compartment_id = var.compartment_id
  vcn_id         = module.vcn.vcn_id

  display_name = "free-k8s-private-subnet-sl"

  egress_security_rules {
    stateless        = false
    destination      = "0.0.0.0/0"
    destination_type = "CIDR_BLOCK"
    protocol         = "all"
  }

  ingress_security_rules {
    stateless   = false
    source      = "10.0.0.0/16"
    source_type = "CIDR_BLOCK"
    protocol    = "all"
  }

  ingress_security_rules {
    stateless   = false
    source      = "10.0.0.0/24"
    source_type = "CIDR_BLOCK"
    protocol    = "6"
    tcp_options {
      min = 10256
      max = 10256
    }
  }

  ingress_security_rules {
    stateless   = false
    source      = "10.0.0.0/24"
    source_type = "CIDR_BLOCK"
    protocol    = "6"
    tcp_options {
      min = 31600
      max = 31600
    }
  }
}

resource "oci_core_security_list" "public_subnet_sl" {
  compartment_id = var.compartment_id
  vcn_id         = module.vcn.vcn_id

  display_name = "free-k8s-public-subnet-sl"

  egress_security_rules {
    stateless        = false
    destination      = "0.0.0.0/0"
    destination_type = "CIDR_BLOCK"
    protocol         = "all"
  }

  egress_security_rules {
    stateless        = false
    destination      = "10.0.1.0/24"
    destination_type = "CIDR_BLOCK"
    protocol         = "6"
    tcp_options {
      min = 31600
      max = 31600
    }
  }

  egress_security_rules {
    stateless        = false
    destination      = "10.0.1.0/24"
    destination_type = "CIDR_BLOCK"
    protocol         = "6"
    tcp_options {
      min = 10256
      max = 10256
    }
  }

  ingress_security_rules {
    protocol    = "6"
    source      = "0.0.0.0/0"
    source_type = "CIDR_BLOCK"
    stateless   = false

    tcp_options {
      max = 80
      min = 80
    }
  }

  ingress_security_rules {
    stateless   = false
    source      = "10.0.0.0/16"
    source_type = "CIDR_BLOCK"
    protocol    = "all"
  }

  ingress_security_rules {
    stateless   = false
    source      = "0.0.0.0/0"
    source_type = "CIDR_BLOCK"
    protocol    = "6"
    tcp_options {
      min = 6443
      max = 6443
    }
  }
}

resource "oci_core_subnet" "vcn_private_subnet" {
  compartment_id = var.compartment_id
  vcn_id         = module.vcn.vcn_id
  cidr_block     = "10.0.1.0/24"

  route_table_id             = module.vcn.nat_route_id
  security_list_ids          = [oci_core_security_list.private_subnet_sl.id]
  display_name               = "free-k8s-private-subnet"
  prohibit_public_ip_on_vnic = true
}

resource "oci_core_subnet" "vcn_public_subnet" {
  compartment_id = var.compartment_id
  vcn_id         = module.vcn.vcn_id
  cidr_block     = "10.0.0.0/24"

  route_table_id    = module.vcn.ig_route_id
  security_list_ids = [oci_core_security_list.public_subnet_sl.id]
  display_name      = "free-k8s-public-subnet"
}

resource "oci_containerengine_cluster" "k8s_cluster" {
  compartment_id     = var.compartment_id
  kubernetes_version = "v1.25.4"
  name               = "free-k8s-cluster"
  vcn_id             = module.vcn.vcn_id

  endpoint_config {
    is_public_ip_enabled = true
    subnet_id            = oci_core_subnet.vcn_public_subnet.id
  }

  options {
    add_ons {
      is_kubernetes_dashboard_enabled = false
      is_tiller_enabled               = false
    }
    kubernetes_network_config {
      pods_cidr     = "10.244.0.0/16"
      services_cidr = "10.96.0.0/16"
    }
    service_lb_subnet_ids = [oci_core_subnet.vcn_public_subnet.id]
  }
}

data "oci_identity_availability_domains" "ads" {
  compartment_id = var.compartment_id
}

resource "oci_containerengine_node_pool" "k8s_node_pool" {
  cluster_id         = oci_containerengine_cluster.k8s_cluster.id
  compartment_id     = var.compartment_id
  kubernetes_version = "v1.25.4"
  name               = "free-k8s-node-pool"

  node_config_details {
    placement_configs {
      availability_domain = data.oci_identity_availability_domains.ads.availability_domains[0].name
      subnet_id           = oci_core_subnet.vcn_private_subnet.id
    }
    placement_configs {
      availability_domain = data.oci_identity_availability_domains.ads.availability_domains[1].name
      subnet_id           = oci_core_subnet.vcn_private_subnet.id
    }
    placement_configs {
      availability_domain = data.oci_identity_availability_domains.ads.availability_domains[2].name
      subnet_id           = oci_core_subnet.vcn_private_subnet.id
    }
    size = 2
  }

  node_shape = "VM.Standard.A1.Flex"

  node_shape_config {
    memory_in_gbs = 6
    ocpus         = 1
  }

  node_source_details {
    image_id    = "ocid1.image.oc1.uk-london-1.aaaaaaaat3lzhmmq3qlvichfxnzcoekciestjyojmjny74gkd3sm5ny6whiq"
    source_type = "image"
  }

  initial_node_labels {
    key   = "name"
    value = "free-k8s-cluster"
  }

  ssh_public_key = var.ssh_public_key
}

data "oci_containerengine_cluster_kube_config" "kubeconfig" {
  cluster_id = oci_containerengine_cluster.k8s_cluster.id
  depends_on = [
    oci_containerengine_node_pool.k8s_node_pool
  ]
}

data "oci_containerengine_node_pool" "free_k8s_np" {
  node_pool_id = oci_containerengine_node_pool.k8s_node_pool.id
}

locals {
  active_nodes = [for node in data.oci_containerengine_node_pool.free_k8s_np.nodes : node if node.state == "ACTIVE"]
}

resource "oci_network_load_balancer_network_load_balancer" "free_nlb" {
  compartment_id = var.compartment_id
  display_name   = "free-k8s-nlb"
  subnet_id      = oci_core_subnet.vcn_public_subnet.id

  is_private                     = false
  is_preserve_source_destination = false
}

resource "oci_network_load_balancer_backend_set" "free_nlb_backend_set" {
  health_checker {
    protocol = "TCP"
    port     = 10256
  }
  name                     = "free-k8s-backend-set"
  network_load_balancer_id = oci_network_load_balancer_network_load_balancer.free_nlb.id
  policy                   = "FIVE_TUPLE"

  is_preserve_source = false
}

resource "oci_network_load_balancer_backend" "free_nlb_backend" {
  count                    = length(local.active_nodes)
  backend_set_name         = oci_network_load_balancer_backend_set.free_nlb_backend_set.name
  network_load_balancer_id = oci_network_load_balancer_network_load_balancer.free_nlb.id
  port                     = 31600
  target_id                = local.active_nodes[count.index].id
}

resource "oci_network_load_balancer_listener" "free_nlb_listener" {
  default_backend_set_name = oci_network_load_balancer_backend_set.free_nlb_backend_set.name
  name                     = "free-k8s-nlb-listener"
  network_load_balancer_id = oci_network_load_balancer_network_load_balancer.free_nlb.id
  port                     = "80"
  protocol                 = "TCP"
}

The main things I remember changing were:

The backend:

terraform {
  backend "http" {}
}

This tells Terraform to use GitLab for state management.

The node pool > node > source image:

image_id = "ocid1.image.oc1.uk-london-1.aaaaaaaat3lzhmmq3qlvichfxnzcoekciestjyojmjny74gkd3sm5ny6whiq"

This tells Terraform/Oracle to use the Oracle-Linux-8.7-aarch64-2023.01.31-3 image (from the UK region) instead of the Oracle Linux 7 image being used by Arnold.
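If you’re in a different region (or want a newer image), an OCI CLI query along these lines should list candidate OCIDs- the compartment OCID is a placeholder and you may still need to pick an image OKE actually supports:

oci compute image list \
  --compartment-id <tenancy_or_compartment_ocid> \
  --operating-system "Oracle Linux" \
  --operating-system-version "8" \
  --shape "VM.Standard.A1.Flex"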

I initially tried with the latest Oracle Linux 9 image, but received the following error:

│ Error: 400-InvalidParameter, Invalid nodeSourceDetails.imageId: Node image not supported.
│ Suggestion: Please update the parameter(s) in the Terraform config as per error message Invalid nodeSourceDetails.imageId: Node image not supported.

So I dropped back to the Oracle Linux 8 image. And lastly, the default deployment has been removed (you can see the service is still configured though).

I am using the default GitLab Terraform CI/CD template: https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/templates/Terraform/Base.latest.gitlab-ci.yml

After the pipeline runs in GitLab, and I manually run the deploy job, we see Terraform output the load balancer public IP address- we’ll need that later!

The cluster is up and running- what’s next? We need to get it talking to GitLab. I believe the logical approach is to use the GitLab Kubernetes Agent Service (KAS).

I had a read of the docs https://docs.gitlab.com/ee/user/clusters/agent/ and went straight on to the install https://docs.gitlab.com/ee/user/clusters/agent/install/index.html (using the Helm command provided when I registered the agent in GitLab).
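For reference, the generated command looks roughly like this- the agent name, namespace, token and KAS address below are placeholders (copy the real values from the GitLab UI):

helm repo add gitlab https://charts.gitlab.io
helm repo update
helm upgrade --install free-k8s-agent gitlab/gitlab-agent \
  --namespace gitlab-agent \
  --create-namespace \
  --set config.token=<agent token> \
  --set config.kasAddress=<kas address, e.g. wss://kas.gitlab.com>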

Note: I suspect I need to revisit this and incorporate the helm installation in the Terraform configuration at a later stage.

So what does this mean? Well, it looks like (as I’m not quite ready to dive in head first with the GitOps workflow) I can now interact with the cluster from GitLab using the trusty CI/CD workflow: https://docs.gitlab.com/ee/user/clusters/agent/ci_cd_workflow.html

I created a simple project with a simple .gitlab-ci.yml to see if this was true:

get-pods:
  image:
    name: bitnami/kubectl:latest
    entrypoint: ['']
  script:
    - kubectl config get-contexts
    - kubectl config use-context tel/oracle_cloud:default
    - kubectl get pods

Voila! Great- but what can we do now? Well, we’re missing a few small pieces in order to deploy a web server. We need to create a namespace, and a service (back in our Terraform project):

provider "kubernetes" {
  config_path    = "/builds/tel/oracle_cloud.tmp/KUBECONFIG"
  config_context = "tel/oracle_cloud:default"
}

resource "kubernetes_namespace" "free_namespace" {
  metadata {
    name = "free-ns"
  }
}

resource "kubernetes_service" "nginx_service" {
  metadata {
    name      = "nginx-service"
    namespace = kubernetes_namespace.free_namespace.id
  }
  spec {
    selector = {
      app = "nginx"
    }
    port {
      port        = 80
      target_port = 80
      node_port   = 31600
    }

    type = "NodePort"
  }
}

I tripped up badly here. I went deep down the rabbit hole trying to understand how GitLab knew how to connect to the Kubernetes cluster, and how to get the Kubernetes provider for Terraform to connect to it.

I knew the config_context (as we provided that in the kubectl command above), but I did not need a config file for the kubectl script. I went so deep I ended up building custom docker images based on Oracle Linux, containing the OCI client, Terraform, GitLab Terraform (and dependencies- jq, idns2 etc).

Eventually when I was about to admit defeat, I reached out on the GitLab Discord server and the legend Patrick Rice (one of the GitLab provider for Terraform maintainers) came to the rescue- thank you Patrick!

It turns out, we have access to $KUBECONFIG – an environment variable containing the path to the config file needed to connect to the cluster! This is documented:

All CI/CD jobs now include a KUBECONFIG with contexts for every shared agent connection. Choose the context to run kubectl commands from your CI/CD scripts.

But I guess I was a bit out of my depth and needed it spoon fed a little. I have taken the opportunity to raise a merge request to improve the documentation– let’s see what happens!

Right, now we’re back on track, let’s try and deploy a default nginx container. We create a deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80

And add a few lines to .gitlab-ci.yml:

deploy:
  image:
    name: bitnami/kubectl:latest
    entrypoint: ['']
  script:
    - kubectl config use-context tel/oracle_cloud:default
    - kubectl -n free-ns apply -f deployment.yaml

Sounds promising? Now let’s hit the load balancer IP we jotted down earlier:

Awesome! Right, just one more small step for this experiment “today” (or so I thought). How can I load some custom content into nginx?

I’m sure there are a number of techniques, but a common approach seems to be creating a custom docker image, simply adding a custom index.html on top of the default one. So let’s create a Dockerfile in our project:

FROM nginx
COPY index.html /usr/share/nginx/html/index.html

And a sample index.html

<!DOCTYPE html>
<html>
    <head>
        <title>Free Kubernetes</title>
    </head>
    <body>
        <h1>This is a custom page running on a Free Kubernetes cluster on Oracle Cloud</h1>
        <h1>GitLab CI/CD pipeline built a custom Docker image on top of nginx and deployed it to Kubernetes</h1>
    </body>
</html>

And we need our .gitlab-ci.yml to build the docker image:

build:
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker build -t $CI_REGISTRY/tel/k8s_test .
    - docker push $CI_REGISTRY/tel/k8s_test

I should add, I’m using a self hosted GitLab instance and at this stage I had to reconfigure it, as the container registry wasn’t even enabled. But the above was now working great!

So how do we get the custom docker image into Kubernetes? Let’s combine our build and deploy jobs in our .gitlab-ci.yml

build:
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker build -t $CI_REGISTRY/tel/k8s_test .
    - docker push $CI_REGISTRY/tel/k8s_test

deploy:
  image:
    name: bitnami/kubectl:latest
    entrypoint: ['']
  script:
    - kubectl config use-context tel/oracle_cloud:default
    - kubectl -n free-ns delete secret free-registry-secret
    - kubectl -n free-ns create secret docker-registry free-registry-secret --docker-server=$CI_REGISTRY --docker-username=gitlab+deploy-token-5 --docker-password=$read_registry_access_token
    - kubectl -n free-ns apply -f deployment.yaml
  needs:
    - build

Note we also needed to create a deploy token, and use that to create a Kubernetes secret which we can use to authorize against our container registry to pull our custom image. Here’s the updated deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: xxx:5050/tel/k8s_test
          ports:
            - containerPort: 80
      imagePullSecrets:
      - name: free-registry-secret

Now I should be able to hit the load balancer and see my custom page. Nope! Doh- what went wrong?

I should have taken some better notes/screenshots at the time, but I was seeing a pod status of CrashLoopBackOff (when running kubectl get pods) and the error exec /docker-entrypoint.sh: exec format error (when running kubectl logs pod-id).

Google and ChatGPT both told me it was likely an architecture issue (e.g. the Docker image being built for amd64 but my Kubernetes nodes running arm64).

This looked like a simple problem, and using docker buildx should have had me up and running in no time. But again, down the rabbit hole we go. Targeting arm64 didn’t seem to be making any difference. I tried all sorts- the image was even running AOK locally on my M1 MacBook Pro (arm64).

Eventually I noticed the docker push output seemed to be suggesting the built image was identical (even when changing the target platform):

And this got me to the problem/solution. For some reason the image I had just built was not being pushed- a previously built image was. I updated the Docker commands to build and push in a single operation, and bingo!

Here’s my updated build job from .gitlab-ci.yml

build:
  tags:
    - docker
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker buildx create --name multiarch-builder
    - docker buildx use multiarch-builder
    - docker buildx inspect --bootstrap
    - docker buildx build --platform linux/amd64,linux/arm64 -t $CI_REGISTRY/tel/k8s_test . --push

And hitting the load balancer:

Tada!

I hope to circle back to this in the coming weeks and get everything neatened up a little. Perhaps try and understand GitOps and how I might be able to use the Kubernetes load balancer to host multiple web applications behind the single public IP.

GitLab recently implemented functionality to manage Objectives and Key Results (OKRs): https://about.gitlab.com/company/okrs/#how-to-use-gitlab-for-okrs

GitLab also use Google Docs for meetings: https://about.gitlab.com/company/culture/all-remote/live-doc-meetings/

During my team’s weekly meeting we discuss our OKRs, and we were getting frustrated with the manual effort of collating (duplicating) the OKRs. We considered dropping it, but didn’t want to miss the opportunity to share and discuss progress.

Fortunately, GitLab has an awesome API (GraphQL is my favourite) and we can get the data out programmatically. Similarly, Google has Apps Script, which allows us to call APIs.

GitLab provides a handy “out of the box” UI for knocking up GraphQL queries (GraphiQL / GraphQL Explorer: https://gitlab.com/-/graphql-explorer). I wasn’t familiar with work items and widgets, but it didn’t take too long to put a query together:

Now we need a script to make the API request and process the response:

function getOkrSummary() {
  const query = `
query {
  project(fullPath: "gitlab-com/gitlab-OKRs") {
    workItems(iids: ["129", "130", "131", "132"]) {
      nodes {
        title
        widgets {
          ... on WorkItemWidgetProgress {
            progress
          }
          ... on WorkItemWidgetHealthStatus {
            healthStatus
          }
          ... on WorkItemWidgetHierarchy {
            children {
              nodes {
                iid
                title
                widgets {
                  ... on WorkItemWidgetAssignees {
                    assignees { nodes { username } }
                  }
                  ... on WorkItemWidgetProgress {
                    progress
                  }
                  ... on WorkItemWidgetHealthStatus {
                    healthStatus
                  }
                  ... on WorkItemWidgetNotes {
                    discussions(filter: ONLY_COMMENTS) {
                      nodes {
                        notes {
                          nodes {
                            createdAt
                            body
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      } 
    }
  }
}
  `;
  const response = executeQuery(query);
  const parents = JSON.parse(response).data.project.workItems.nodes;
  //Logger.log(parents[0]);
  let output = "";
  for (let i = 0; i < parents.length; i++) {
    const parent = parents[i];
    const progress = parent.widgets.find(w => Object.keys(w).includes('progress')).progress || "";
    const healthStatus = parent.widgets.find(w => Object.keys(w).includes('healthStatus')).healthStatus || "";
    output += `${parent.title} - ${progress}% (${healthStatus})\n`;
    const children = parent.widgets.find(w => Object.keys(w).includes('children')).children.nodes || [];
    for (let c = 0; c < children.length; c++) {
      const child = children[c];
      const dri = child.widgets.find(w => Object.keys(w).includes('assignees')).assignees.nodes.map(a => a.username).join(", ") || "";
      const childProgress = child.widgets.find(w => Object.keys(w).includes('progress')).progress || "";
      const childHealthStatus = child.widgets.find(w => Object.keys(w).includes('healthStatus')).healthStatus || "";
      output += `\t${dri}: ${child.title} - ${childProgress}% (${childHealthStatus})\n`;
    }
  }
  const cursor = DocumentApp.getActiveDocument().getCursor()
  cursor.insertText(output);
}

function executeQuery(query) {
  const url = "https://gitlab.com/api/graphql";
  const options = {"headers": {"Authorization": "Bearer " + getAccessToken(),
                             "Content-Type": "application/json"
                            },
                 "payload": JSON.stringify({query}),
                 "method": "POST"
                }
  const response = UrlFetchApp.fetch(url, options);
  return response.toString();
}

function getAccessToken() {
  try {
    const userProperties = PropertiesService.getUserProperties();
    let token = userProperties.getProperty('gl-token');
    if (token == null) {
      token = DocumentApp.getUi().prompt('Access token?').getResponseText();
      userProperties.setProperty('gl-token', token); 
    }
    return token;
  } catch (err) {
    Logger.log('Failed with error %s', err.message);
  }
}

I’m sure there are a lot of opportunities for improvement, but it does the job and didn’t take long to throw together (we can iterate later if we so desire).

Unfortunately, I couldn’t find a way to bind the script to a keyboard shortcut, but it was straightforward to create a custom menu to call it:

function onOpen() {
  var ui = DocumentApp.getUi();
  ui.createMenu('GitLab')
    .addItem('Insert Contributor Success OKR Update', 'getOkrSummary')
    .addToUi();
}

Then simply adding a trigger to call the onOpen function when the Google doc is opened.
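If you’d rather create that trigger from code than via the Apps Script Triggers UI, a small one-off function along these lines should do the job (createOpenTrigger is just a name I picked):

function createOpenTrigger() {
  // Installable trigger: run onOpen() whenever this document is opened
  ScriptApp.newTrigger('onOpen')
    .forDocument(DocumentApp.getActiveDocument())
    .onOpen()
    .create();
}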

Now it’s all together- it works a little something like this:

Not too long ago I was introduced to Terraform as a mechanism of managing Cloudflare DNS “as code”. It took quite a bit of trial and error, poking around, and some great support from the GitLab community over on Discord: https://discord.gg/gitlab – thank you!

During recent conversations with GitLab team members, we expressed a desire to manage Discord via Infrastructure as Code (IaC) e.g. Terraform. Others had taken a look and shared some concerns around the lack of a well maintained Terraform provider for Discord, but I wanted to do a little investigation and take another opportunity to learn.

So the first thing was to look at the Terraform provider options. It appears, at the time of writing, the best maintained provider is https://registry.terraform.io/providers/Lucky3028/discord/latest with commits most weeks (although small).

One of the big considerations was the underlying Discord API version being used. I think versions 6 and 8 have been, or soon will be, deprecated. I know very little about Golang, but from https://github.com/Lucky3028/terraform-provider-discord/blob/master/go.mod it looks like the Terraform provider is using https://github.com/bwmarrin/discordgo as an API wrapper. Checking for issues/pull requests, it appears to be using API version 9 with an open pull request to bump it to version 10, so I think we’re in safe hands!

Early on, while poking around I spotted a typo (copy/paste error) in the docs, raised a pull request and had it merged within a couple of hours- so this was quite reassuring.

What next? Well, we need to set up a GitLab project to manage the Discord server. I used the same .gitlab-ci.yml I’m using for my Cloudflare project:

# Terraform/Base.latest
#
# The purpose of this template is to provide flexibility to the user so
# they are able to only include the jobs that they find interesting.
#
# Therefore, this template is not supposed to run any jobs. The idea is to only
# create hidden jobs. See: https://docs.gitlab.com/ee/ci/yaml/#hide-jobs
#
# There is a more opinionated template which we suggest the users to abide,
# which is the lib/gitlab/ci/templates/Terraform.latest.gitlab-ci.yml

image:
  name: "$CI_TEMPLATE_REGISTRY_HOST/gitlab-org/terraform-images/stable:latest"

variables:
  TF_ROOT: ${CI_PROJECT_DIR}  # The relative path to the root directory of the Terraform project
  TF_STATE_NAME: default  # The name of the state file used by the GitLab Managed Terraform state backend

cache:
  key: "${TF_ROOT}"
  paths:
    - ${TF_ROOT}/.terraform/

.terraform:fmt: &terraform_fmt
  stage: validate
  script:
    - cd "${TF_ROOT}"
    - gitlab-terraform fmt
  allow_failure: true

.terraform:validate: &terraform_validate
  stage: validate
  script:
    - cd "${TF_ROOT}"
    - gitlab-terraform validate

.terraform:build: &terraform_build
  stage: build
  script:
    - cd "${TF_ROOT}"
    - echo $TF_ADDRESS
    - gitlab-terraform plan
    - gitlab-terraform plan-json
  resource_group: ${TF_STATE_NAME}
  artifacts:
    paths:
      - ${TF_ROOT}/plan.cache
    reports:
      terraform: ${TF_ROOT}/plan.json

.terraform:deploy: &terraform_deploy
  stage: deploy
  script:
    - cd "${TF_ROOT}"
    - gitlab-terraform apply
  resource_group: ${TF_STATE_NAME}
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      when: manual

.terraform:destroy: &terraform_destroy
  stage: cleanup
  script:
    - cd "${TF_ROOT}"
    - gitlab-terraform destroy
  resource_group: ${TF_STATE_NAME}
  when: manual

stages:
  - validate
  - test
  - build
  - deploy

fmt:
  extends: .terraform:fmt
  tags:
    - docker
  needs: []

validate:
  extends: .terraform:validate
  tags:
    - docker
  needs: []

build:
  extends: .terraform:build
  tags:
    - docker

deploy:
  extends: .terraform:deploy
  tags:
    - docker
  dependencies:
    - build
  environment:
    name: $TF_STATE_NAME

This deviates a little from the latest “off the shelf” template, but nothing too crazy. It essentially:

  • Validates the Terraform files
  • Lints the Terraform files
  • Displays changes (based on the state managed by the GitLab backend compared to the Terraform files)
  • Deploys the changes (via a manual job created once merged to the default branch)

I’ve yet to learn all the best practices, but opted for a backend.tf containing some basics:

terraform {
  backend "http" {}

  required_providers {
    discord = {
      source  = "Lucky3028/discord"
      version = "1.1.2"
    }
  }
}

provider "discord" {
  token = var.discord_token
}

data "discord_server" "discord_api" {
  server_id = "XXX"
}

variable "discord_token" {
  type = string
}

For this to work, we’ll need to define the TF_VAR_discord_token in GitLab CI/CD variables:

After pushing, the backend is initialized and we can grab the init command from the GitLab UI:

Now it starts to get interesting. We can run the terraform init command locally and start importing our existing Discord configuration into the state (and dump it into Terraform files).
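For anyone following along, the init command GitLab generates looks roughly like this- the host, project ID, username and token are placeholders:

terraform init \
  -backend-config="address=https://gitlab.example.com/api/v4/projects/<PROJECT_ID>/terraform/state/default" \
  -backend-config="lock_address=https://gitlab.example.com/api/v4/projects/<PROJECT_ID>/terraform/state/default/lock" \
  -backend-config="unlock_address=https://gitlab.example.com/api/v4/projects/<PROJECT_ID>/terraform/state/default/lock" \
  -backend-config="username=<gitlab username>" \
  -backend-config="password=<personal access token>" \
  -backend-config="lock_method=POST" \
  -backend-config="unlock_method=DELETE" \
  -backend-config="retry_wait_min=5"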

When working with CloudFlare, I had the benefit of a special tool to do this: https://github.com/cloudflare/cf-terraforming – so this is the first time I’ve had to try and understand what’s really going on.

I chose to start with text channels. I stubbed each text channel in a dedicated Terraform file, text_channels.tf:

resource "discord_text_channel" "welcome" {}
resource "discord_text_channel" "announcements" {}
resource "discord_text_channel" "dev" {}
resource "discord_text_channel" "support" {}

Now we can import the existing config into the state (and link it to the resource) with the terraform import command. Each import operation takes the resource address as the first parameter, and an identifier as the second (here, the channel id). e.g.

terraform import discord_text_channel.welcome XXX
terraform import discord_text_channel.announcements XXX
terraform import discord_text_channel.dev XXX
terraform import discord_text_channel.support XXX

However, this does not populate the resources in the Terraform files. We then need to examine the state OR push up the empty resources and allow GitLab to show us the diff. We can dump the state with terraform show:

# discord_text_channel.announcements:
resource "discord_text_channel" "announcements" {
    category                 = "XXX"
    id                       = "XXX"
    name                     = "announcements"
    nsfw                     = false
    position                 = 1
    server_id                = "XXX"
    sync_perms_with_category = false
    type                     = "text"
}

And simply copy the values into the relevant resources. However, this will include optional attributes with default values. I chose instead to push the empty resources up and use GitLab to see which attributes were actually needed:

Subsequently, I’ve realised this is simply the output of terraform plan, so I can run that locally to get the same feedback more quickly and easily.

I was able to rinse and repeat this formula to dump and import all voice channels and text channels. Next it was time to get the roles. But trying to import roles using the role_id threw an error:

Fortunately, it was reasonably self-explanatory/easy to guess. Switching the role_id for a composite id of the form server_id:role_id did the trick, and now we have text_channels.tf:

resource "discord_text_channel" "welcome" {
  name                     = "welcome"
  server_id                = data.discord_server.discord_api.server_id
  category                 = "XXX"
  position                 = 0
  sync_perms_with_category = false
}

resource "discord_text_channel" "announcements" {
  name                     = "announcements"
  server_id                = data.discord_server.discord_api.server_id
  category                 = "XXX"
  position                 = 1
  sync_perms_with_category = false
}

voice_channels.tf:

resource "discord_voice_channel" "reception" {
  name                     = "Reception (Public)"
  server_id                = data.discord_server.discord_api.server_id
  category                 = "XXX"
  position                 = 0
  sync_perms_with_category = false
}

resource "discord_voice_channel" "development" {
  name                     = "Development"
  server_id                = data.discord_server.discord_api.server_id
  category                 = "XXX"
  position                 = 1
  sync_perms_with_category = true
}

and roles.tf:

resource "discord_role" "manager" {
  name        = "manager"
  server_id   = data.discord_server.discord_api.server_id
  color       = 10181046
  mentionable = true
  permissions = 7247232849
  position    = 0
}

resource "discord_role" "support" {
  name        = "support"
  server_id   = data.discord_server.discord_api.server_id
  color       = 15105570
  hoist       = true
  mentionable = true
  permissions = 865540234816
  position    = 1
}

Next up, member roles (to give each user the relevant access). This time, another error:

Unfortunately, this time there wasn’t such a helpful message, although the displayed API endpoint did suggest we might need a server_id and user_id. Switching to server_id:user_id didn’t help though, and Terraform continued to identify the input as empty/0.

After some discussion with Patrick Rice (a maintainer of the Terraform provider for GitLab and all round community legend), it seems there is a bug or gap in the underlying functionality, so we’ll need to patch the provider ourselves.

Patrick explained how I could modify the source and instruct Terraform to use the local development build, by creating a ~/.terraformrc file containing:

provider_installation {

  dev_overrides {
      "Lucky3028/discord" = "../../github/terraform-provider-discord/bin"
  }

  # For all other providers, install them directly from their origin provider
  # registries as normal. If you omit this, Terraform will _only_ use
  # the dev_overrides block, and so no other providers will be available.
  direct {}
}

After modifying the source, I ran go build -o bin/terraform-provider-discord, then deleted the .terraform folder and re-ran the earlier terraform init command (obtained from the GitLab UI). We are now using the custom provider.

I have had some success, and depending on priorities, hope I might find some time to start contributing to the project. I can see Patrick has already started contributing too: https://github.com/Lucky3028/terraform-provider-discord/pull/78

I first need to learn Golang and some more about Terraform :)

Thanks for all of your help Patrick (and those others who stepped in to support me along the way).

A frustrating limitation of the AWS Amplify platform: it can “plug straight in” to GitLab.com SaaS, but doesn’t yet support self-managed GitLab. This issue is tracked at https://github.com/aws-amplify/amplify-hosting/issues/14

Wanting to automate all of our deployments, I had to “find a way”.

It took far longer than it should have because I tripped up and went down a few rabbit holes, but here’s the final, simple, working solution. This assumes you have created the app in the AWS console and just want to be able to deploy new releases.

Create an AWS IAM user with the following policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "amplify:StartDeployment",
            "Resource": "arn:aws:amplify:REGION:ACCOUNT_ID:apps/APP_ID/*"
        }
    ]
}

You will need to replace REGION, ACCOUNT_ID and APP_ID with the relevant values. You can also go further and limit to a specific branch if desired.

You may wish to grant your user less restrictive permission (for example, if you want the same credentials to be used to deploy multiple apps).

Configure GitLab CI/CD Variables:

I chose to configure: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_REGION and set them to be protected and masked. You could also define your APP_ID and BRANCH_NAME here if you wanted to (instead we do it in the yaml below).

Configure .gitLab-ci.yml:

build:
  script:
    - apt-get update
    - apt-get install zip -y
    - echo "JOB_ID=$CI_JOB_ID" > build.env
    - npm run build
    - zip -r -j dist.zip dist 
  artifacts:
    paths:
      - dist.zip
    reports:
      dotenv: build.env

deploy:
  image: 
    name: amazon/aws-cli
    entrypoint: [""]
  variables:
    APP_ID: app_id
    BRANCH_NAME: branch_name
    SOURCE_URL: $CI_API_V4_URL/projects/$CI_PROJECT_ID/jobs/$JOB_ID/artifacts/dist.zip?job=build&job_token=$CI_JOB_TOKEN
  script:
    - aws amplify start-deployment --app-id $APP_ID --branch-name $BRANCH_NAME --source-url $SOURCE_URL
  needs:
    - build

You will need to replace app_id and branch_name with the relevant values.

That’s it!

It’s fairly straightforward, but there’s a tiny bit of “magic”.

AWS Amplify requires a publicly accessible URL to download your project. We can achieve this by:

  • splitting the pipeline into multiple jobs
  • storing build artifacts and the build job id
  • using the $CI_JOB_TOKEN to authorize the “private” artifact download endpoint

Big shout out to Marco and Niklas from the GitLab community for helping me debug/troubleshoot and get this working.

I would love to know if you have a better/simpler/easier method, feel free to share.

While developing this solution I had some issues with the AWS CLI:

root@a1b5517efdf2:/# aws amplify start-deployment --app-id abc123 --branch-name test --source-url "https://www.example.net/wp-content/uploads/2022/08/dist.zip"

Error parsing parameter '--source-url': Unable to retrieve https://www.example.net/wp-content/uploads/2022/08/dist.zip: 'utf-8' codec can't decode byte 0x9c in position 11: invalid start byte

My solution initially was to use the AWS Ruby SDK:

require 'aws-sdk-amplify'

amplify = Aws::Amplify::Client.new
amplify.start_deployment({
  app_id: ENV["APP_ID"],
  branch_name: ENV["BRANCH_NAME"],
  source_url: ENV["SOURCE_URL"],
})

There may be a scenario where you would prefer to use the SDK, so I felt it worth including this alternate approach.

It turned out I was using an old version of the CLI, and updating to the latest fixed the issue, allowing me to get rid of the custom Ruby solution.

We have been transitioning from a GitLab runner on Windows Server 2016 (with a pre-configured/persisted environment) to a runner on Ubuntu 20.04 using throw-away/docker images.

Although we often use SQL Server, most of our recent development projects have been API-based without the need for a database, so we had avoided tackling this task until now.

Building the Database

Unfortunately the dotnet command isn’t capable of building SQL Server Database Projects. Fortunately, there’s an awesome project which can assist: https://github.com/rr-wfm/MSBuild.Sdk.SqlProj

I think it is possible to start with that project template and build the database entirely that way; however, we opted to keep our standard SQL Server Database Project and use this as a layer on top.

That essentially involves configuring the solution to not build the SQL Server Database Project, then creating a new project along the lines of:

<Project Sdk="MSBuild.Sdk.SqlProj/1.16.2">

  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <Content Include="..\Database\dbo\**\*.sql" />
    <PostDeploy Include="..\Database\seed.sql" />
  </ItemGroup>

</Project>

Deploying/Publishing the Database

The build step of the new project creates a .dacpac, but we now need a way of pushing this to SQL Server.
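For reference, building that project directly should drop the .dacpac into the normal build output folder (the exact path below is an assumption based on the MSBuild.Sdk.SqlProj docs rather than something taken from our pipeline):

# Build the wrapper project and locate the generated .dacpac
dotnet build Database.Build/Database.Build.csproj -c Release
ls Database.Build/bin/Release/netstandard2.0/Database.Build.dacpac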

Initially I found some instructions on Google which took me down a bit of a long-winded route and led to the Linux sqlpackage command.

Of course, this command isn’t available on our tickett/dotnet.core.selenium docker image, so it took a while to add the pre-requisites etc. (see https://gitlab.com/tickett/dotnet.core.selenium/-/blob/master/Dockerfile). After doing all that, building the new image and publishing it to Docker Hub, I then discovered a far simpler way (see https://github.com/rr-wfm/MSBuild.Sdk.SqlProj#publishing-support).
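For comparison, the sqlpackage route would have looked something like this (a sketch only; the parameters and .dacpac path are illustrative, not taken from our pipeline):

# Publish the built .dacpac with sqlpackage (the route we ended up not needing)
sqlpackage /Action:Publish \
  /SourceFile:Database.Build/bin/Release/netstandard2.0/Database.Build.dacpac \
  /TargetServerName:$MSSQL_HOST \
  /TargetDatabaseName:$DB_NAME \
  /TargetUser:sa \
  /TargetPassword:$SA_PASSWORD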

Essentially MSBuild.Sdk.SqlProj provides a dotnet publish option which just works out of the box. So we end up with something like:

script:
  - cp $NUGET_CONFIG ./NuGet.Config
  - dotnet build
  - dotnet publish Database.Build/Database.Build.csproj /p:TargetServerName=$MSSQL_HOST /p:TargetDatabaseName=$DB_NAME /p:TargetUser=sa /p:TargetPassword=$SA_PASSWORD

Spinning up the Database

So we can now build the .dacpac and publish/deploy it using our .gitlab-ci.yml, but in order to run integration/end-to-end tests, we need to spin up SQL Server in docker to host the database.

Google turned up some good resources which quickly got me up and running with the mcr.microsoft.com/mssql/server:2019-latest image.
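If you want to try the same image locally before wiring it into CI, a minimal sketch (using the same environment variables as the pipeline; the SA password must meet SQL Server's complexity requirements) would be:

# Spin up a throw-away SQL Server 2019 container locally
docker run -d --name mssql \
  -e ACCEPT_EULA=Y \
  -e SA_PASSWORD=$SA_PASSWORD \
  -p 1433:1433 \
  mcr.microsoft.com/mssql/server:2019-latest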

The only slight challenge I found was that the environment variables passed to the service need to be defined in the .gitlab-ci.yml and can’t simply be defined in the GitLab UI. I have raised a merge request to improve the docs, and there is an open issue tracking this behaviour (unsure if it’s a bug or intended).

Summary

We’re not actually deploying this project using GitLab CI/CD, but for the build/test our full .gitlab-ci.yml now looks like:

variables:
  MSSQL_HOST: mssql
  SA_PASSWORD: $SA_PASSWORD
  ACCEPT_EULA: "Y"
  DB_NAME: some_database
  SQL_CONNECTION_STRING: "Server=$MSSQL_HOST;Database=$DB_NAME;User id=sa;Password=$SA_PASSWORD;"

stages:
  - build
  - test

build:
  stage: build
  image: tickett/dotnet.core.selenium:latest
  tags:
    - docker
  script:
    - cp $NUGET_CONFIG ./NuGet.Config
    - dotnet build

test:
  stage: test
  image: tickett/dotnet.core.selenium:latest
  services:
    - name: mcr.microsoft.com/mssql/server:2019-latest
      alias: mssql
  tags:
    - docker
  script:
    - cp $NUGET_CONFIG ./NuGet.Config
    - dotnet build
    - dotnet publish Database.Build/Database.Build.csproj /p:TargetServerName=$MSSQL_HOST /p:TargetDatabaseName=$DB_NAME /p:TargetUser=sa /p:TargetPassword=$SA_PASSWORD
    - dotnet test -v=normal Tests/Tests.csproj /p:CollectCoverage=true --logger "junit;LogFilePath=TestOutput.xml"
  coverage: '/Average\s*\|.*\|\s(\d+\.?\d*)%\s*\|.*\|/'
  artifacts:
    reports:
      junit: Tests/TestOutput.xml

Note that we “redefine” SA_PASSWORD at the top: the CI variable defined in the GitLab UI is not available to services, but once “redefined” in the yaml it is.

We had one final tweak to our DbContext to tell it to grab the connection string from the environment variable if available, otherwise falling back to the standard connection string in the appsettings.json file:

    public partial class DatabaseNameDbContext : DbContext
    {
        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            // Prefer the connection string from the environment (set in CI);
            // otherwise fall back to the value from appsettings.json
            optionsBuilder.UseSqlServer(Environment.GetEnvironmentVariable("SQL_CONNECTION_STRING") ?? Settings.DbConnectionString);
        }
    }

I have no doubt there are still numerous improvements to be made, but it feels like good progress on our end now that the full environment can be spun up at a moment’s notice in docker.

We recently upgraded Android Studio and no longer seem to be able to generate signed APKs compatible with the Clover platform. Fortunately this coincided nicely with a drive to leverage more GitLab CI/CD.

We previously migrated our Maven package repository to GitLab and this would probably have been the next logical step. Not only would it fix our immediate problem, but also lock down access to the keystore and reduce the manual effort required getting our local development environment setup and building the signed APKs.

For completeness, I’ve included a full example of our app/build.gradle, but only two additions were necessary:

  • The android -> signingConfigs section
  • The signingConfig signingConfigs.release line under android -> buildTypes -> release
buildscript {
    repositories {
        google()
    }
}

plugins {
    id 'net.linguica.maven-settings' version '0.5'
}

apply plugin: 'com.android.application'
apply plugin: 'com.google.gms.google-services'
apply plugin: 'com.google.firebase.crashlytics'

repositories {
    maven {
        name = 'GitLab'
        url = System.getenv("MAVEN_URL")

        credentials(HttpHeaderCredentials) {
            name = System.getenv("MAVEN_USER")
            value = System.getenv("MAVEN_TOKEN")
        }

        authentication {
            header(HttpHeaderAuthentication)
        }
    }
}

android {
    compileSdkVersion 28
    defaultConfig {
        applicationId "tickett.net.xxx"
        minSdkVersion 17
        targetSdkVersion 22
        versionCode 135
        versionName '135.00'
        multiDexEnabled true
    }

    // Start of new section
    signingConfigs {
        release {
            storeFile file("/.keystore")
            storePassword System.getenv("KEYSTORE_PASSWORD")
            keyAlias System.getenv("KEY_NAME")
            keyPassword System.getenv("KEY_PASSWORD")
            v1SigningEnabled true
            v2SigningEnabled false
        }
    }
    // End of new section

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
            // New line
            signingConfig signingConfigs.release
        }
    }

    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }

    lintOptions {
        abortOnError false
    }
}

dependencies {
    implementation 'androidx.appcompat:appcompat:1.0.0'
    implementation 'androidx.constraintlayout:constraintlayout:1.1.3'
    implementation 'com.clover.sdk:clover-android-sdk:latest.release'
    implementation 'com.google.firebase:firebase-crashlytics:17.2.1'
    implementation 'com.android.support:multidex:1.0.3'
}

Setting v1SigningEnabled to true and v2SigningEnabled to false is specific to the Clover platform. I suspect for most Android platforms these settings can either be removed or both set to true.

So far nothing too complex. Next we need to configure the keystore and build script in GitLab CI.

Initially I found a “standard” Gradle docker image, but quickly realised it needed to be an image with the Android SDK. cangol/android-gradle was one of the first Google turned up and it seems to work perfectly.

Now we need to get the keystore into a CI variable. Because it’s a binary file, base64 encoding it seemed like the simplest approach. On Windows I used the following PowerShell command:

$base64string = [Convert]::ToBase64String([IO.File]::ReadAllBytes("keystore.jks"))
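On Linux (for example, on the runner itself), the coreutils equivalent should be something along these lines:

# -w 0 disables line wrapping so the output is a single line suitable for a CI variable
base64 -w 0 keystore.jks > keystore.b64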

I quickly found our master keystore (containing a number of signing keys) was too big, so each key was split into its own store (I used https://keystore-explorer.org/) – this is probably a good practice from a security standpoint anyway.
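If you prefer the command line over Keystore Explorer, keytool can do the same job; a sketch (the alias and file names are illustrative):

# Copy a single key (alias "release") from the master keystore into its own keystore;
# keytool will prompt for the store/key passwords
keytool -importkeystore \
  -srckeystore master.jks -srcalias release \
  -destkeystore release.jks -destalias release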

Adding the KEYSTORE_PASSWORD, KEY_NAME and KEY_PASSWORD variables didn’t require anything special, but now we need the CI script to:

  • Convert the KEYSTORE back to binary: base64 -d $KEYSTORE > /.keystore
  • Run the build: gradle assembleRelease
  • Extract the resulting .APK as a job artifact (see artifacts -> paths -> app-release.apk). The GitLab artifact zip retains the folder structure, so extracting the APK initially involved navigating into app, then build, then outputs, then apk, then release… I’m sure there must be a simpler approach, but you can see the workaround I used, which copies it to the root first: cp app/build/outputs/apk/release/app-release.apk ./
stages:
  - build
  
build:
  stage: build
  image: cangol/android-gradle
  tags:
  - docker
  script:
  - base64 -d $KEYSTORE > /.keystore
  - gradle assembleRelease
  - cp app/build/outputs/apk/release/app-release.apk ./
  artifacts:
    paths:
    - app-release.apk

That’s it!

Until recently we were using Sonatype Nexus repository manager to host our maven packages and NuGet Server (http://nugetserver.net/) to host our NuGet packages.

Whilst they were working reasonably well, they were two more systems added to our already bloated list, so we were keen to drop them in favour of the “all singing, all dancing” GitLab.

On top of this, we were building and publishing the packages by hand. Now seemed like the ideal opportunity to start to use the integrated package repository in GitLab at the same time as scripting the build/publish using CI/CD.

We only have a dozen or so packages in each system, but we did anticipate it being a pretty big challenge.

The first decision we made was to have two projects- one to host NuGet packages, and one to host Maven packages.

Being a primarily Microsoft shop (and more in our comfort zone with C#), we decided to attack NuGet first…

NuGet

The instructions in the GitLab docs are pretty thorough and we were already doing some CI with our .NET Core projects, so this turned out to be pretty straightforward. We deviated a tiny bit, but following https://docs.gitlab.com/ee/user/packages/nuget_repository/#publish-a-nuget-package-by-using-cicd would probably have done the job.

Here’s what our .gitlab-ci.yml ended up looking like:

stages:
  - test
  - deploy

variables:
  CI_DEBUG_TRACE: "false"
  ASSEMBLY_VERSION: "2.0.5"

test:
  stage: test
  image: tickett/dotnet.core.selenium:latest
  tags:
    - docker
  script:
    - cp $NUGET_CONFIG ./NuGet.Config
    - dotnet test Tests/Tests.csproj /p:CollectCoverage=true
  coverage: '/Average\s*\|.*\|\s(\d+\.?\d*)%\s*\|.*\|/'

deploy:
  stage: deploy
  image: tickett/dotnet.core.selenium:latest
  tags:
    - docker
  script:
    - cp $NUGET_CONFIG ./NuGet.Config
    - dotnet pack -c Release
    - dotnet nuget push ApiRepository/bin/Release/Tickett.ApiRepository.$ASSEMBLY_VERSION.nupkg --source gitlab
  dependencies:
    - test
  only:
    - master

$NUGET_CONFIG is a CI variable which was already in place for projects to consume NuGet packages, so it seemed more logical than using dotnet nuget add source...
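For reference, the add source alternative (based on the GitLab NuGet docs) would look roughly like this, run at the start of the job; we simply preferred re-using the existing file variable:

# Register the project's GitLab NuGet feed at job time using the predefined job token
dotnet nuget add source "$CI_API_V4_URL/projects/$CI_PROJECT_ID/packages/nuget/index.json" \
  --name gitlab \
  --username gitlab-ci-token \
  --password $CI_JOB_TOKEN \
  --store-password-in-clear-text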

Again, using a NuGet.Config for the project-level endpoint is semi-documented at https://docs.gitlab.com/ee/user/packages/nuget_repository/#project-level-endpoint-2, with the subtle difference that the docs suggest creating the file in the repo. That seemed a bit repetitive and laborious, so we opted for a CI variable (of type file).

Maven

Again, the docs were really helpful https://docs.gitlab.com/ee/user/packages/maven_repository/#create-maven-packages-with-gitlab-cicd-by-using-gradle but as most of our packages are Android libraries, we needed to find a docker image with the Android SDK, and cangol/android-gradle works great. Our .gitlab-ci.yml looks like:

build:
  image: cangol/android-gradle
  tags:
    - docker
  script:
    - gradle build
  except:
    - master

deploy:
  image: cangol/android-gradle
  tags:
    - docker
  script:
    - gradle build
    - gradle publish
  only:
    - master

Similarly with our build.gradle: the instructions on https://docs.gitlab.com/ee/user/packages/maven_repository/#publish-by-using-gradle were a great start, but seemed to build an empty .jar, whereas we need to build Android .aar packages. A bit of googling soon had us on track, and it now looks like:

buildscript {
    repositories {
        google()
        jcenter()
    }

    dependencies {
        classpath 'com.android.tools.build:gradle:3.1.3'
    }
}

plugins {
    id 'java'
    id 'maven-publish'
}

publishing {
    publications {
        library(MavenPublication) {
            groupId = 'net.tickett'
            artifactId = 'logger-repository'
            version = '6.0'
            artifact("/builds/tel_clover/logger_repository/LoggerRepository/build/outputs/aar/LoggerRepository.aar")
        }
    }

    repositories {
        maven {
            url System.getenv("MAVEN_URL")

            credentials(HttpHeaderCredentials) {
                name = System.getenv("MAVEN_DEPLOY_USER")
                value = System.getenv("MAVEN_DEPLOY_TOKEN")
            }

            authentication {
                header(HttpHeaderAuthentication)
            }
        }
    }
}

allprojects {
    repositories {
        google()
        jcenter()
    }
}

You can see we opted to pull the URL, user and password/token all from CI rather than just the password.
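For illustration (an assumption about typical values rather than our exact setup), when publishing with the CI job token those variables resolve to something like the following; a deploy token would use the Deploy-Token header name instead:

# Illustrative values for the CI/CD variables used by the publishing block above
export MAVEN_URL="$CI_API_V4_URL/projects/$CI_PROJECT_ID/packages/maven"
export MAVEN_DEPLOY_USER="Job-Token"
export MAVEN_DEPLOY_TOKEN="$CI_JOB_TOKEN"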

This is the root build.gradle, so there is a bit of duplication in the module/app build.gradle. I would love to make this a little more dynamic (pulling the artifact path, package name/version etc.), but for now this works.

We consume the packages by adding the following to our build.gradle:

repositories {
    maven {
        name = 'GitLab'
        url = gitlabMavenUrl
        credentials(HttpHeaderCredentials) {
            name = gitlabMavenUser
            value = gitlabMavenToken
        }
        authentication {
            header(HttpHeaderAuthentication)
        }
    }
}

Then we set the variables in our ~/.gradle/gradle.properties:
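For completeness, a sketch of what that gradle.properties might contain (the values are illustrative; a personal access token uses the Private-Token header name, a deploy token Deploy-Token):

# ~/.gradle/gradle.properties (illustrative values only)
gitlabMavenUrl=https://gitlab.com/api/v4/projects/PROJECT_ID/packages/maven
gitlabMavenUser=Private-Token
gitlabMavenToken=YOUR_PERSONAL_ACCESS_TOKEN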

Good luck, and if you have any suggestions, please shout!