Cowabunga to Cloud: With Migrations Comes Great Responsibility  

Vlad Petrean (Java Software Engineer) talks about the careful decisions involved in a cloud migration project, using a hypothetical project to showcase his thought process. This article was previously featured in the July 2024 issue of Today Software Magazine.

With the advent of the first cloud services, software development entered a new phase of managing, maintaining and utilising remote resources. Many IT services have undergone major changes as a result. Application management has become easier thanks to the services offered by cloud providers, and managed offerings can now replace some custom-made, in-house deployments. However, these benefits come with responsibilities, each representing an extra cost.

So, how do we migrate a project from an on-premises environment to the cloud in a way that simplifies processes and maintenance while also balancing costs? The answer to this question is itself a question: when should we use a cloud provider's services, and when should we build a custom solution?

Migrating a project to the cloud is a complex process with various hurdles, and it often involves compromises. We can choose between different migration strategies, depending on the customer's needs. The migration itself can involve rewriting the software from scratch, moving each application as-is with help from the cloud provider's tooling, or a combination of the two approaches.

In this context, we'll base the discussion on the following scenario: a project with several microservice applications (developed in Java and Spring) that must be migrated to the cloud. Besides those, the project uses many additional technologies: a messaging system (for example, Apache Kafka), a relational database (perhaps PostgreSQL) and possibly other monitoring and notification tools (Prometheus, an email server, etc.). The cloud provider we want to migrate to is one of the following: Amazon Web Services, Microsoft Azure, or Google Cloud Platform.

So, how would we go about that migration?

01. Migrating microservices

We start by migrating the microservices mentioned above.

In the unlikely case that the applications require no refactoring or other code-level changes, they can be migrated and run as Docker containers, assuming this is how they are currently deployed. We have three options:

  1. use servers: virtual machines (VMs) or unikernels;

  2. use a Kubernetes cluster;

  3. use specialised deployment services offered by the cloud provider.

For this case, we'll focus on the first two solutions. The three market giants (AWS, Azure, and Google) offer both VMs and Kubernetes services under different names, and the solution involving specialised services would generate even more debate.

Instead of asking why to choose VMs or Kubernetes, it's better to ask when to use each, as both can be advantageous. The first comparison criterion between the two options is the cost.

We mentioned in the scenario that our applications run on the Spring framework, which, in terms of hardware, requires a minimum of around 500 MB of RAM plus at least one CPU core per instance. A single VM with these minimum resources costs about $15 per month in one vendor's offer, and we have to keep in mind that the proposed project has several microservices to migrate.

In comparison, a vCPU in a standard-mode Kubernetes cluster costs around $6 per month from the same vendor. At first glance, this seems advantageous, but the price doesn't include cluster maintenance, RAM, or the initial setup and additional configuration needed to simplify the developers' work. You get all of this by default with the more advanced tier of the same Kubernetes service, which costs around $42 per month.

Prices may change depending on the region where we use these services/resources, the period for which we commit to use them, etc.
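To make this concrete with a deliberately rough, hypothetical calculation: eight microservices on individual minimal VMs would come to around 8 × $15 = $120 per month. A standard-mode Kubernetes cluster that co-locates those services on, say, four vCPUs would cost roughly 4 × $6 = $24 per month before RAM and our own maintenance effort, and even the fully managed tier at about $42 per month stays well below the VM total.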

At first glance, we are tempted to say that a Kubernetes cluster in standard mode is the right choice for our scenario.

If we chose to migrate the applications to VMs, we would end up with more machine instances and, therefore, higher costs. Alternatively, we could use a single VM with adequate hardware resources, but this approach would probably cost as much as Kubernetes and bring other disadvantages, making project maintenance more difficult.

Another thing to consider is that a virtual machine can't be scaled infinitely in terms of hardware. Once we reach its resource limit, we must distribute applications across multiple VMs.

The next comparison criterion is the ease of deploying and maintaining applications. From this point of view, if we had a single application to migrate to a cloud environment, both options would probably be equally efficient. However, most on-prem to cloud migrations involve multiple applications/microservices, and our scenario is no exception.

Therefore, Kubernetes is the optimal option. While some knowledge is required to configure a Kubernetes cluster and set up the deployment of applications in such a cluster, most cloud providers offer user-friendly interfaces within the Kubernetes services to perform these configurations.

On the maintenance side, unlike a classic VM, where you need an SSH connection to the machine to check, for example, the application status or logs, Kubernetes clusters are normally connected to the project's monitoring and logging services (for a fee). In Kubernetes, we can also configure the applications' auto-scaling based on their load.
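For instance, a Spring Boot application can expose its own status through an Actuator health endpoint, which both Kubernetes probes and the monitoring stack can consume. A minimal sketch, assuming the spring-boot-starter-actuator dependency; the backlog check and its threshold are hypothetical:

```java
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// Illustrative custom check: Kubernetes liveness/readiness probes can call
// /actuator/health instead of someone ssh-ing into a machine to read logs.
@Component
public class QueueBacklogHealthIndicator implements HealthIndicator {

    @Override
    public Health health() {
        long backlog = currentBacklog();
        if (backlog > 10_000) { // hypothetical threshold
            return Health.down().withDetail("backlog", backlog).build();
        }
        return Health.up().withDetail("backlog", backlog).build();
    }

    private long currentBacklog() {
        // Placeholder: in a real service this would query the work queue.
        return 0;
    }
}
```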

These facilities can also be implemented on VMs, but we should analyse whether the effort is cost-effective. With a free auto-scaling tool or a custom-built solution, the risk of failure is high. The same goes for the other necessary systems: monitoring, alerting, etc. Depending on the project constraints (e.g. the budget), they may be a suitable solution, but if the provider offers such services at reasonable prices, it is advisable to use them.

I want to stress that this is not about convenience or reducing the workload, but about improving the long-term maintainability of the applications. To support this argument, consider the following situation: to notify the team when an application encountered a problem, we once used various Python scripts to send notifications through different channels (Teams, email, SMS). A simple upgrade of the Python version was enough to break these scripts, and further changes were needed. We would also have needed an automation tool such as Jenkins, which itself requires upgrades over time.

In such situations, it is much simpler and more practical to use solutions already provided by the vendor for this purpose. These solutions may come with options such as "auto-upgrade", where the cloud provider takes responsibility for upgrading the component.

In conclusion, for the first part of this migration process, a suitable solution is to migrate the applications to a Kubernetes cluster in standard mode, which keeps some of the costs down. This solution requires some knowledge to perform the configurations, but the extra effort will lead to lower costs and simplified maintenance in the long run.

02. Migrating the messaging application

Let's move on to the next component to migrate - the messaging system. In our example scenario, this is Apache Kafka, but other applications, such as RabbitMQ, could take its place.

These types of software do not currently have direct counterparts among the native services of the popular cloud providers. Instead, the providers offer services that replicate their messaging functionality; without promoting a particular vendor, we can mention Google Pub/Sub, Azure Event Hubs and Amazon Kinesis. In addition, implementations of some of this software, including Apache Kafka, can be purchased from the providers' marketplaces. However, marketplace offerings often come with configuration constraints, higher costs, and lower uptime compared to the vendors' own services.

Excluding the last option, we are again left with two possible solutions: either we configure a VM on which we will install the software we need, or we switch to one of the products mentioned above.

In the case of the first solution, expect additional effort for the server setup as well as the installation of the software itself. For an easier installation, I recommend running Apache Kafka either as a Docker container or as a systemd service. As the installation is relatively easy, all that's left is configuring the Kafka cluster settings. Of course, we'll need to ensure that the new cluster is a faithful copy of the original one, that it's secure and, equally important, that it's accessible over the internet for the apps migrated in the previous step.
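Once the new cluster is up, a quick programmatic smoke test can confirm it is reachable from the network where the migrated services run. A minimal sketch, assuming the kafka-clients library; the bootstrap address is a placeholder:

```java
import java.util.Properties;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeClusterResult;

public class KafkaSmokeTest {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder address: the migrated cluster's bootstrap endpoint.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka.example.internal:9092");
        props.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, "10000");

        try (AdminClient admin = AdminClient.create(props)) {
            DescribeClusterResult cluster = admin.describeCluster();
            // Fails fast if the cluster is unreachable from this network.
            System.out.println("Cluster id: " + cluster.clusterId().get(10, TimeUnit.SECONDS));
            System.out.println("Brokers:    " + cluster.nodes().get(10, TimeUnit.SECONDS).size());
            System.out.println("Topics:     " + admin.listTopics().names().get(10, TimeUnit.SECONDS));
        }
    }
}
```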

We will also most likely need additional tools to monitor the state of the software and see in real time if there is a problem. For Apache Kafka, there are solutions such as Kafka Manager, Kafdrop, etc. For a virtual machine, the vendor provides by default only metrics for observing hardware resources, plus access to operating-system logs. Various notifications can be set based on these metrics, but they do not cover application-level errors. Such errors may eventually show up as increased hardware usage, but the user (a person or another app) may already be affected before the VM reaches the upper limit of its resources.

If, instead, we choose a technology made available by the cloud provider, we should first test and analyse its compatibility with our project, from both a technical and a business point of view. The easiest way to check is to carry out a small proof of concept or a small-scale test. If it proves compatible, we will have to change the code; the length and complexity of that work depend on how important the messaging software is to the project.

If only minor adjustments are needed, then it's all good. Most replacement products offer solutions on both the code side (dependencies, libraries) and the maintenance side (scaling, backup, message visualisation, lag monitoring).
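One way to keep such adjustments minor is to hide the broker behind a thin interface, so that swapping Kafka for a cloud product means rewriting a single class. An illustrative sketch, assuming spring-kafka; the interface and class names are hypothetical:

```java
import org.springframework.kafka.core.KafkaTemplate;

// Hypothetical seam: the rest of the code depends only on this interface,
// so replacing Kafka with a cloud messaging product touches one class.
public interface MessagePublisher {
    void publish(String topic, String payload);
}

class KafkaMessagePublisher implements MessagePublisher {
    private final KafkaTemplate<String, String> template;

    KafkaMessagePublisher(KafkaTemplate<String, String> template) {
        this.template = template;
    }

    @Override
    public void publish(String topic, String payload) {
        // spring-kafka handles serialisation and broker configuration.
        template.send(topic, payload);
    }
}
```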

On the other hand, if the messaging software is a core component that controls data transmission across the microservices, then migrating to a replacement technology becomes more challenging to execute and implicitly increases the testing effort.

In this case, I recommend we go with the first option to ensure project functionality. Even if the effort is higher from a maintenance point of view, the advantages are significant - ensuring application stability and reducing errors, migration time, and costs.

03. Database migration

Most vendors offer counterpart services to the databases (DBs) we have been using in the on-prem environment - Cloud SQL (GCP), Aurora PostgreSQL (AWS), and Azure Database for PostgreSQL (Microsoft Azure).

From a cost perspective, there aren't significant differences between these services and setting up VMs to run the database ourselves. Moreover, from a compatibility point of view, since these are the same types of databases, there is no problem connecting the migrated applications either to the new DBs in the cloud or to the existing on-prem ones.
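As an illustration, with Spring Boot the switch is typically just a matter of connection parameters. A minimal sketch in which the URL and credentials are placeholders:

```java
import javax.sql.DataSource;
import org.springframework.boot.jdbc.DataSourceBuilder;

public class DataSourceConfig {

    // Placeholder values: in practice these come from configuration or secrets.
    public DataSource cloudDataSource() {
        return DataSourceBuilder.create()
                // Only the host changes compared to the on-prem setup.
                .url("jdbc:postgresql://db.example.internal:5432/appdb")
                .username("app_user")
                .password(System.getenv("DB_PASSWORD")) // injected at deploy time
                .driverClassName("org.postgresql.Driver")
                .build();
    }
}
```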

In my opinion, the advantage of the vendor solutions is the plethora of features that come with them. Out of the box, we get, for example:

  • the number of connections to the database;

  • the duration and number of queries performed;

  • the logs about actions performed on a table or on the database;

  • the interface for querying the DB;

  • migration operations;

  • importing data from various sources.

Many of these features are included in the standard pricing, so setting up and maintaining dedicated virtual machines instead is not necessarily cost-effective.

Another important feature of the vendors' DB offerings is automatic data backup. All these functionalities are easy to enable, and most have user-friendly interfaces. For DB services, we can also designate a time window in which the provider may perform version upgrades without our intervention.

One aspect of the migration process not discussed so far is data security. From this point of view, there is no difference between storing the data in a database on a VM and in a database that is part of a dedicated service: both can use encryption based on a key held only by the project members.

04. Additional components

To migrate the additional tools for the project, we should first review the recommendations so far:

  • The microservices in Spring will be migrated to a Kubernetes cluster;

  • Apache Kafka will be installed on a virtual machine, either as a systemd service or as a Docker container;

  • We will use a pre-existing service offered by the cloud provider for the relational database.

As mentioned previously, cloud services can replace some of the tools. For example, there are services that centralise and monitor all the logs related to a project: Kubernetes application logs, operating-system logs from the VMs, database logs and so on. These services can also be configured to capture information from Docker containers, so the Apache Kafka container can be integrated as well.

The advantage of these services is that they minimise the time needed to investigate possible problems. For example, they allow filtering logs by time (a specific date), by the occurrence of certain keywords, or by particular errors. We can also set up notifications to ensure we take swift action when certain errors occur.
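A small sketch of how to make that filtering precise: attaching structured context to log lines with SLF4J's MDC, where the "orderId" field is a hypothetical example:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class OrderProcessor {
    private static final Logger log = LoggerFactory.getLogger(OrderProcessor.class);

    public void process(String orderId) {
        // Hypothetical field: the log service can then filter on "orderId".
        MDC.put("orderId", orderId);
        try {
            log.info("processing started");
            // ... business logic ...
            log.info("processing finished");
        } catch (RuntimeException e) {
            log.error("processing failed", e);
            throw e;
        } finally {
            MDC.remove("orderId");
        }
    }
}
```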

For monitoring the project's hardware, cloud vendors offer standard, free solutions: metrics measuring CPU usage, RAM, network traffic, etc. For more advanced metrics (e.g. I/O operations and storage-space observation), we will need to install specialised agents, which involves a fee.
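Application-level metrics, by contrast, usually have to come from the application itself. A minimal sketch using Micrometer, which the Prometheus setup from our scenario could scrape; the meter name is made up:

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;

public class PaymentMetrics {
    private final Counter failedPayments;

    public PaymentMetrics(MeterRegistry registry) {
        // Hypothetical meter name; the registry exposes it to Prometheus.
        this.failedPayments = Counter.builder("payments.failed")
                .description("Payments that could not be processed")
                .register(registry);
    }

    public void recordFailure() {
        failedPayments.increment();
    }
}
```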

Also on this topic, we have the possibility to back up all infrastructure resources, be it a VM, a Kubernetes cluster or a DB. Through various settings, we can back up both the components' configurations (of the VMs or of a Kubernetes cluster) and the data stored on them.

Last but not least, cloud providers allow us to store the various secrets our applications need in dedicated places such as secret stores and config maps, so the tooling around them can be replaced just as easily. If there are concerns about exceeding the budget allocated to a particular project, a possible solution is to set up alerts that notify the team when a spending limit is exceeded.
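A minimal illustration, assuming the secret is injected as an environment variable (the variable name is a placeholder): the same code then works whether the value comes from a Kubernetes Secret, a config map, or the provider's secret manager.

```java
public final class Secrets {
    private Secrets() {}

    // Placeholder variable name; populated at deploy time, e.g. from a
    // Kubernetes Secret or the provider's secret manager, never hardcoded.
    public static String dbPassword() {
        String value = System.getenv("DB_PASSWORD");
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("DB_PASSWORD is not set");
        }
        return value;
    }
}
```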

Not a conclusion

I would not like to end the article with a concrete conclusion, as the topic is subjective. Even for the scenario considered above, there may be better solutions, in terms of both effort and cost.

One thing is certain: before we start migrating a project to a cloud environment, we need to analyse the solutions the cloud provider offers. Based on this analysis, we can see which technologies we can adopt and which of the project's existing components should be migrated as they are.

I don't believe an absolute target should be imposed, such as achieving the lowest possible costs at the expense of complex long-term maintenance, or simplifying the project as much as possible through provider services at a price higher than the on-prem solution.

It's best to choose a solution that balances costs, effort required and long-term maintenance, taking into account the difficulty and complexity of the project to be migrated.

If you would like the help of an experienced, strong tech team to help you navigate the challenges of cloud migration, find the right balance, and migrate without disruption or downtime, we'd love to help. Let's get in touch and discuss the details!