In this article we are going to cover a number of concepts about containers, mainly focused on Docker Containers and Docker technologies.
The aim of this writing is to give you a fresh start, a high level guide on this young technology that is already changing the world of IT application architecture.
Docker containers have been around for only a few years, but they are continuously growing, changing and spreading, almost exponentially. Who knows, you and your company may already be using them without even knowing it.
Whether you are a tech person or a decision maker, this guide will help you better understand what containers are about and how they work, and ultimately decide whether containers are useful for you and your company, and how to start down this path.
So, let's get started!
At this point, you might not be entirely sure whether to continue reading and learning about containers, or to invest your time in something else. I mean, are containers for you and your organization? Why should you bother to learn about a new architecture and new technology? Should you wait a while longer, just to be sure containers are a safe bet? Catching up with a new technology takes a fair amount of effort, and there never seems to be a right time to do so.
The thing is, even if our businesses are fine without containers, they could definitely be better with them. If you’re not using containers at all, then you’re most likely wasting resources, time and money.
In simple words: containers are meant to run applications while keeping everything secure, using only the hardware and software resources they require, and remaining portable. They also reduce the issues that commonly arise when moving an app from the development environment to testing, and from testing to production.
Who should use containers? I think that everyone should at least learn the basics about containers, and ideally get them working in a lab. Whether or not to use them in your real-world applications will ultimately be your choice.
Those who come from the old IT days know what it was like to deploy a new application to production. Usually, it consisted of a long process of finding the right server and specifications: thoroughly choosing the CPU, memory and hard drive, and identifying the best operating systems, licensing options, costs, and trade-offs for each option.
After that, you would have to install the OS, apply the proper configurations, patches and updates, add security layers, and also think about power, redundancy, etc. All of this was needed to host one application (ideally, you would have one application per server).
This process involved a great deal of time, money and effort, and every time you needed a new server, you had to go through the same process all over again.
By adding new servers to your company’s infrastructure, you were implicitly increasing the need for more resources to maintain them: their hardware and software needed periodic maintenance.
Finally, a very high percentage of each server’s CPU power, memory and disk was wasted, because the installed application would typically use only a fraction of the server’s full potential.
In conclusion, keeping business applications running on on-premises hardware meant high costs and heavy time and resource consumption. In this scenario, hardware virtualization came along, and it was a great step forward because it offered big improvements in resource management.
The first improvement was hardware virtualization, introduced by the hypervisor, or Virtual Machine Monitor (VMM). This solution was revolutionary and gained widespread adoption. It has been around for many years now and is still prevalent today.
A hypervisor, or VMM, allows us to take better advantage of the hardware we already have (known as the host machine) by using it to run several virtual machines (called guest machines) inside the host. We can install a different operating system on each virtual machine, and ultimately run an application on top of each VM’s OS.
Each VM can be assigned CPU, memory and disk space while remaining isolated from the other virtual machines. Applications on a VM are confined to the resources assigned to that VM, not to the whole host.
So the benefit introduced here was reduced waste of processing power: we could run several virtual machines on a single physical server (the host), taking better advantage of the underlying hardware while reducing the maintenance effort.
But this still was not perfect: we still needed to perform software installation, configuration and maintenance, and often the VMs consumed too many resources for certain small tasks. Each VM carried the overhead of a full OS, and that dragged the host down. This overhead was impossible to avoid, at least with hardware virtualization.
Wouldn’t it be great if we could run several isolated server applications on the same OS, without a full VM for each one?
Finally, containers came along, and they are all about operating-system-level virtualization. This is not a new technology either, but the difference now is that it has become massively used, supported, maintained and expanded over the past few years. And containers are here to stay: the main players in the software industry are betting on them and contributing to their growth.
To be able to use containers, you just need one server and one operating system that supports the chosen container technology. Then, you simply create a new container anytime you need to run a new app (ideally you'll have one application per container). Containers are isolated and secure, and they use only the minimum software and hardware resources they need to run. These resources are also managed, and can be increased or decreased as needed.
Thus, containers are lightweight, extremely fast to boot, and portable. Lightweight because, compared to virtual machines, containers consume less CPU, less RAM and less disk space. Extremely fast to boot because they need only a fraction of a second to start running.
And portable because you can move and run your container from one host to another without having to change anything, as long as the new host supports the container runtime environment of choice. So, you could take it from your laptop or from your own managed server, to another developer machine, or to any other virtual machine, or even to a service in the cloud (like Azure, Amazon AWS, etc).
You can add as many containers as needed on the same server, as long as it still has resources available for each new container, of course. And because containers are lightweight and efficient at consuming resources, you can have hundreds or even thousands of containers running on a server that would only be able to run a few virtual machines. This can be seen in the following high-level diagram:
Hypervisors versus Containers:
Containers are virtualization at the operating-system level: they are self-contained application runtime environments, each one having its own isolated CPU, memory, block I/O and network resources, while sharing the kernel of the host operating system.
A container is like a package or bundle of an application and all of its dependencies, including files and configurations. This is what allows the container to be easily ported from one environment to another: not only is the app moved, but everything it needs to run is shipped with it too. This gives you fast, reliable portability and deliverability for your app. Forget about deploying the application on another host only to find that the host lacks certain resources or some other prerequisite.
Containers are isolated: a containerized application is isolated from any other containers and applications running on the same host OS. Not only is the application isolated, but its dependencies are too, so you won’t have conflicts between those dependencies either. In the same way, the container’s access to the file system, processes and network is isolated. In the end, inside a container you have something similar to a very small OS with the bare minimum functionality needed to run applications.
To understand how this isolation is accomplished, or at least to get a technical but still high-level glimpse of it: containers were born on Linux, and Linux runs user processes inside something called user space. Containers are isolated instances of the Linux user space:
In the beginning, Docker containers used LXC (Linux Containers), which was already built on two important Linux kernel features called namespaces and cgroups. These two features are the baseline for achieving isolation between containers.
Linux kernel namespaces virtualize and isolate system resources and assign them to a container. A few examples of resources being virtualized are process IDs, file systems, network access, and user and group IDs.
The Linux kernel cgroups (control groups) feature takes care of managing the available resources for a group of processes. It lets you set and change limits for memory, CPU, disk usage, etc. Each container has a cgroup that handles these resource limits.
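To make this concrete, here is a minimal sketch of how cgroup-backed limits surface in the Docker CLI (the flag values, the container name "limited" and the ubuntu:16.04 image are just examples; the commands assume a running Docker engine):

```shell
# Start a container with cgroup-backed resource limits:
# cap memory at 256 MB and CPU usage at half a core.
docker run -d --name limited --memory 256m --cpus 0.5 ubuntu:16.04 sleep 300

# Verify the memory limit (in bytes) that the daemon applied.
docker inspect --format '{{.HostConfig.Memory}}' limited

# Clean up.
docker rm -f limited
```

If the process inside the container tries to exceed its memory limit, the kernel's cgroup enforcement steps in, exactly as it would for any other group of processes.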
There is one more feature worth mentioning which, in conjunction with namespaces and cgroups, completes the container’s isolation and makes for a more robust security boundary: this Linux kernel feature is named capabilities.
Linux kernel capabilities constrain the permissions of processes and users by granting or removing individual privileges. Capabilities form a granular set of privileges, and container runtimes drop most of them by default, so you specifically grant your container only the permissions your application needs.
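As a rough sketch of how this looks in practice (assuming a running Docker engine; the nginx image and the container name are only examples):

```shell
# Drop every capability, then grant back only what the app needs.
# NET_BIND_SERVICE lets a process bind ports below 1024, and nginx's
# master process also needs SETUID, SETGID and CHOWN to spawn its workers.
docker run -d --name locked-down \
  --cap-drop ALL \
  --cap-add NET_BIND_SERVICE \
  --cap-add SETUID --cap-add SETGID --cap-add CHOWN \
  nginx
```

Starting from an empty privilege set and adding back the minimum makes the container a much smaller target if the application is ever compromised.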
There are several kinds of containerized environments and runtimes out there, all built on the same underlying concept explained earlier in this article: user-space segmentation and isolation. Some of these solutions have been available for many years now, and some are still young. Their levels of support, maturity and feature richness vary, and are constantly evolving. Containers are a hot topic in the IT world, and will continue to be, because they are shaping the future of applications.
As you already know, this guide is focused on Docker Containers, but here you have a small list of other available options: Unikernels, LXC, OpenVZ, Rkt (Rocket), Windows Server Containers, Hyper-V Containers, FreeBSD Jails, Solaris Zones.
Some of these benefits have already been covered in previous topics of this article. Nevertheless, here you will find a summary of the main benefits when using containers for your projects:
I've chosen Docker for this article because, at the moment I'm writing, it is the leading software container platform. It is currently being used and maintained massively, not only by small groups or startups, but also by the biggest companies and organizations like Intel, Microsoft, Dell, Google, IBM, Red Hat, Cisco, and more. They are not only placing their bets on it, but also taking action and participating actively in the project.
To name a fairly recent example, Windows Server 2016 just made Windows Containers available for production. Before that, if your host OS was Windows-based, you could only run containers with Docker running inside a VM with a Linux distribution pre-installed. That was natural, of course, because Docker was born on Linux. But needless to say, a better approach for a Windows-based machine was a native implementation of containers, avoiding the extra virtual machine.
Windows Containers are young and there is a lot of work to be done in that area, but I wanted to name them as an example of the importance of containers: a company as big as Microsoft decided to be part of the container world by supporting a Windows-native version of containers, with all the changes this sort of decision entails.
Docker Inc. is the company that initially created the Docker project. Nowadays, Docker Inc. no longer owns the project, but sponsors it instead. The Docker platform is currently an open source project with a large number of participants. Docker Inc. builds tools, as the rest of the community does, and also provides support for these products.
Docker Inc. currently offers two main products: Docker Community Edition (CE), which is free to use, and Docker Enterprise Edition (EE), which is paid. Each edition has a different set of features and support. There may also be differences in the platforms they support (e.g. availability for desktops, servers, and cloud provider platforms).
Docker started as a standard runtime for containers, bringing together everything needed to run them. Then Docker scaled up to become a platform with many more components. These components keep evolving as time passes, adding new features or changing existing ones, sometimes even renaming commands or tools.
This may sound a little discouraging now, but remember that Docker is still young and there is a lot of work to be done and room for improvement. Nevertheless, Docker has a set of applications and services that are mature enough to be considered production-ready.
When you download and install the Docker platform from their website, you'll be installing the Docker Engine and the Docker client together. This is the default, and it's great for starting to play around with Docker and containers right away. It's also great for quickly setting up your development environment: as simple as download, install, and start using it.
It is also possible to have the Docker runtime set up on one host and the Docker client on a different host, and make them communicate over the network.
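A small sketch of that split setup (the address is a placeholder; this assumes the remote daemon has been configured to listen on TCP, which in production should be protected with TLS):

```shell
# Point the local docker client at a remote engine over the network.
export DOCKER_HOST=tcp://192.168.1.50:2375

docker info   # now reports the remote host's containers and images

# Unset the variable to talk to the local engine again.
unset DOCKER_HOST
```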
Docker is available through different channels, which vary from one OS platform to another. As an example, at the moment I’m writing this article, Docker CE has two update channels: the stable channel and the edge channel.
Stable channel: This is the reliable channel where you get updates every quarter.
Edge channel: This channel gets experimental features sooner, with new features every month. But some features may change or not make it into the stable release.
In the next paragraphs, you’ll find an overview of the major Docker components available at the moment I’m writing this article. Remember that this information may change at any moment, as many of the tools and features listed here are changing, growing and evolving. But at least you’ll find a few key concepts to help you understand the basics and set you on the right track for further investigation.
The Docker Engine, also known as the Docker runtime or Docker daemon, is the software that takes care of handling images and running containers. It has all the technology needed to build and run containerized applications. It handles the communication with the underlying OS and gives your containers everything they need to run.
The Docker client allows you to run docker commands, which are mostly easy to write and remember. These commands are sent to the Docker runtime over a REST API.
These are a few basic docker commands:
docker pull: download an image, or a set of images
docker run: start running a container
docker ps: list running containers
docker stop: stop a running container
docker rm: delete a container
You’ll find a detailed list of commands available here: https://docs.docker.com/engine/reference/commandline/docker/
At a high level, a Docker image is a prebuilt template containing all the data needed to build and run a container from it. Images are for build time, while containers are for run time.
Images are stored (publicly or privately) at Docker Hub (or the new Docker Store), and you can use the pull command to download images from there at any time.
For example: docker pull ubuntu
This command downloads a copy of the image named "ubuntu". More specifically, it gives you the version tagged as latest on Docker Hub. (The -a flag, by contrast, would download all available tags of the image.)
You can pull a specific version by adding its tag at the end of the same command, like the following example:
docker pull ubuntu:16.04
where the tag :16.04 means that you want to download version 16.04 of the ubuntu image.
Even though the version tag is not mandatory, it is a recommended best practice to always include it in your commands, so you’ll always get the same behavior.
If you don’t specify a tag, the default :latest is used, so you may end up getting a different version at some point, which in turn may give you an undesired outcome.
Images are internally composed of a set of layers. You can think of these inner layers as a stack of images themselves. The lowest layer, or base image, contains a basic set of OS features. Every time you want to add functionality to your image, you add a new layer on top of the topmost one. Also, every layer is read-only once it has been added to the image; this keeps your image in a “stable” state, and every container generated from an image is as stable as the image itself.
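You can inspect this layer stack for any image you have locally; for example (assuming a running Docker engine and a previously pulled image):

```shell
# Show the layers of an image, newest layer first.
# Each row corresponds to one read-only layer and the step that created it.
docker history ubuntu:16.04
```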
When you request an image from your Docker client, if the Docker daemon can’t find that image locally on the host, it will automatically pull it from the cloud registry service. The default registry service is Docker Hub.
Docker Hub has a set of features and, like all the other container-related tools, it changes over time. The main features Docker Hub offers nowadays are: serving public and private registries; managing image repositories inside those registries (which means being able to find, push and pull images); automated builds and triggers that fire after pushes; and integration with Bitbucket and GitHub.
For more information you can go to: hub.docker.com, where you can register to get your Docker ID, and start using Docker Hub for free.
There is also a new service available named Docker Store, which is going to replace Docker Hub. You can find it at store.docker.com
Docker Machine was initially created with the goal of installing and managing the Docker Engine on a VM on top of the host OS. This was needed because Docker only ran on Linux, and with Docker Machine it became possible to use it on Windows and macOS too.
Later, Docker Machine added tools that let you remotely manage the engine on any host on your local network, or even on hosts in the cloud. Not literally any host in the cloud, but at least many of the most popular platforms, like DigitalOcean, Azure, AWS, etc.
The Docker Machine has its own command-line interface: docker-machine
For a detailed list of Docker Machine commands, go to: https://docs.docker.com/machine/reference/
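A minimal sketch of a typical Docker Machine session (assuming VirtualBox is installed locally; the machine name "dev" is arbitrary):

```shell
# Create a VM named "dev" that runs the Docker Engine.
docker-machine create --driver virtualbox dev

# Configure the current shell so the docker client talks
# to the engine inside that VM.
eval "$(docker-machine env dev)"

docker ps   # now lists containers running on the "dev" machine
```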
Docker Compose is meant for defining and managing applications made up of multiple containers. It comes with a handy set of commands covering the entire lifecycle of your app, from defining it, to building, starting, stopping and monitoring the status of your services, making Compose the ultimate toolset for your app development, testing, staging and continuous integration process.
One of the most recent features added to Docker Compose is the ability to deploy a multi-container application to a remote Docker Engine (both single instance or cluster).
You can find more details about Docker Compose commands here:
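As a rough sketch, a typical Compose workflow looks like this (run from the directory containing your docker-compose.yml; the service name "web" is just an example):

```shell
docker-compose up -d     # create and start all services in the background
docker-compose ps        # show the status of each service's container
docker-compose logs web  # print the output of the "web" service
docker-compose down      # stop and remove the containers and networks
```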
A swarm is a cluster of Docker Engines running in swarm mode. This mode is natively available with the engine, so you don’t need any extra tool or feature to be able to use it.
This technology is designed to deploy application services and to manage the swarm. You can’t run services without enabling swarm mode.
The swarm consists of nodes categorized into two roles: managers and workers. You can have several manager nodes and several worker nodes in a swarm, but only one of the managers is the leader.
When you pass a service definition to a manager node via the Docker API, the manager splits that service into tasks and schedules those tasks across the available workers in your swarm. A task is the smallest unit of work that can be assigned to a worker node.
Docker Swarm comes with a great set of features out of the box, like load balancing, multi-host networking, secure communication between each node, and more.
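A minimal sketch of getting a swarm going (assuming a running Docker engine; the service name, replica count and nginx image are just examples):

```shell
# Turn this engine into a single-node swarm; it becomes the manager (and leader).
docker swarm init

# Define a service: the manager splits it into tasks
# and schedules them on the available nodes.
docker service create --name web --replicas 3 -p 80:80 nginx

# See how the three tasks were distributed across the swarm.
docker service ps web
```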
This tool is an image storage solution meant for the enterprise. It works like the Docker Registry included in Docker Hub, with the difference that you can install DTR (Docker Trusted Registry) wherever you need it and, most importantly, with your own security policies. With DTR you’ll be able to push and pull images behind your own firewall, and have those images stored in your own datacenter.
Another available tool for the enterprise is the Universal Control Plane. You can install this solution in your server of choice and then use it to manage all your resources from one place: monitor and manage your clusters, images, containers, services and more.
Well done! You’re getting to the end of this article, but this is far from the end of the topic that brought you here in the first place: containers. Docker, and containers in general, are one of the hottest areas in IT nowadays, and I think they will continue to be so for a long while.
These kinds of changes, which affect all areas of the enterprise, from planning and development through to deployment and maintenance, are not made overnight. It also takes time for a new technology to mature enough to satisfy every business scenario. But don’t be complacent: only a couple of years have passed, at least in Docker’s life, and the community has already made tremendous progress.
Many of the Docker tools are already being used for real-life applications and services. Many startups are based entirely on containers. And many of the biggest IT companies are also putting their efforts and budgets into container-based solutions.
So the sooner you introduce yourself and your company to the container world, the better for you and your business.
Even though this article wasn’t meant to be too technical, I thought it would be nice to include a super basic guide on how to install Docker and run an app within a container.
Go to www.docker.com and click Get Docker.
For this article, I’m choosing Docker for Windows:
Click on the button “Download from Docker Store”:
You’ll be redirected to store.docker.com, and you’ll be getting the “Docker Community Edition for Windows” description page.
After clicking “Get Docker” button located on the right side of the screen, you’ll be asked to save your file in your machine. Click save.
After the download finishes, run the installer. It will ask you to confirm the installation process first:
Once you run the installer, the first screen you’ll see is the next one, where you’ll need to press “Install” button:
Wait a few moments for the installation to complete…
Click on Launch Docker, and Finish:
An information message will be shown, and you’ll find a new Docker icon in the Windows notification area. The icon is animated while Docker is starting, and becomes static once Docker is up and running.
At that moment, you’ll see a screen similar to the next one:
Press the Windows Start button, or press your keyboard’s Windows key, and start typing “powershell”. Then click on the shortcut:
Windows PowerShell will start up, and from there we’ll start running Docker commands:
Let’s start with the basic commands: type in “docker -v” and hit Enter:
Now, if you want to see more detailed information about the version of Docker installed on your machine, you can try this command: “docker version”:
With this command, you’ll find more information about the API versions of both the client and server, the Go version used to build the release, and more.
And the last useful command to start with is: “docker info”.
With this command you’ll quickly find out how many containers your host has and how many of them are running, stopped or paused, how many images you have, whether the host is running with swarm mode active, how much memory is available, how many CPUs, and much more.
Lastly, two more commands that you’ll be using a lot when working with containers are: “docker ps” and “docker images”
docker ps will show you a list of running containers. A variation shows all containers, both running and stopped: docker ps -a, or docker ps --all
In our case we shouldn’t have any containers listed, because we have a brand new installation of Docker, and we have not created any containers yet.
docker images will give you a list of all images that you have available in your local host:
Again, this list is going to be empty for us now, because we don’t have images stored locally.
There is already an image publicly available at Docker Store: https://store.docker.com/images/hello-world
We can easily pull that image and run a container from it.
Run the next command from your PowerShell window: docker pull hello-world
The docker pull command searches Docker Hub for the requested image and, if it finds it, pulls a copy of that image to the local hard drive. In this case we didn’t supply an image tag (version number), so the default “:latest” tag is used, and the latest available version of the hello-world image is pulled.
You can verify that the image is now available by running the command: docker images
Our local images repository list is giving us one result: the repository named “hello-world” and tagged as “latest” version.
Now that we have our image, we can run a container from it.
For that, simply type in: docker run hello-world
For now, just take a look at the first line of the output of this app. It says “Hello from Docker!”, meaning our demo executed flawlessly.
What the docker run command did was communicate with the Docker Engine and ask for an image named “hello-world”. Since the image was available locally, the daemon simply created a container from it and ran that brand new container immediately. The app inside the container started running, wrote a few lines of text to the console, and exited.
This was simple, we just needed to know two commands: docker pull image-name, and docker run image-name.
Well, the same result could have been accomplished in only one step: running one command instead of two. If we had only typed docker run image-name into the console, we would have gotten the same outcome, because the docker pull command is implicit in the docker run command.
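You can verify this yourself (assuming the image from the previous steps is still present locally):

```shell
# Remove the local copy of the image first...
docker rmi hello-world

# ...then run it directly: docker notices the image is missing,
# pulls it automatically, and then runs the container.
docker run hello-world
```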