Getting to know Docker… and God knows what else [Part 1]

ACGoff
12 min read · Apr 8, 2019


The more I learn in development, the more I understand how to build little projects, and that is great. However, I also keep seeing these shadowy characters on the perimeter which I am continually trying to avoid: DevOps.

The Docker logo: the first shadowy figure on the perimeter

Let’s dive into Docker

So already I know it is a whale symbol which somehow allows you to spin things up to the cloud with relative ease. My guess, and hence probably not functionally accurate, is that it boils things down to be much simpler, kind of like zipping a file: remove the things you don't need to make it easier to move.

An example here is that I could send my code over to someone and get them to run it (hoping they have installed all the prerequisites).

If they don't have the infrastructure (i.e. a computer in this example) I could post my laptop to them. To save the effort, though, I could give them a shopping list and directions, and they could follow those instructions to run the code.

Right, let’s see if that is right. [Me-from-the-future note: Not far wrong but it is much more than that.]

Lesson 1: What are Containers and Virtual Machines?

Fresh out of the lesson. So a virtual machine kind of explains itself: it is a machine which doesn't exist physically, but runs as if it does. The machine in this context is a computer. As such, it will have a full operating system running, some virtual drives, a virtual output maybe… heck, why not a virtual mouse too! Either way, think of it as a digital clone of a computer.

A container is similar but more stripped down.

Top Gear doing the same thing

If I wanted a car you might give me a Ford Focus. If I wanted that car to be fast we could start chucking out the seats, the stereo and so on. We can do the same with a virtual machine to make it a container: strip out all the parts of the OS, the drives and so on that we won't use.

Doing this makes containers quicker to start, means they take up less space, and makes them more agile to change.

What can be stripped out depends on the function. Is the car going off-road or on tarmac? Is the program we want to run going to use a given feature of the OS? If not, it goes. The container should have just enough to run the program we want, nothing more.

The breakdown of the analogy

If you have a Mac you may have at some point run a virtual machine to be able to run Windows programs.

As a virtual machine is the same as, and can do the same things as, a physical machine, a virtual machine also has the ability to run a virtual machine of its own.

A container can be run on a real machine. Likewise, a virtual machine also has the ability to run a container.

Cars don’t fit this very well.

Why is this useful? Well, the cloud is basically a big ol' server farm. Those servers may run virtual machines, which in turn will run your container.

What is Docker then?

Containers have been around for a long time but Docker has not. Docker is a tool that allows you to package and manage a program by making a container. This then allows it to run anywhere.

Why do I care?

Like NPM, which bundles up dependencies so you can share your code, Docker helps here (click here for me to explain NPM). Without NPM you could make a list of your dependencies and share that list, but that is difficult to maintain and relies on hoping that someone installs them all properly. Now try to make that list with not only package dependencies but hardware and OS dependencies too.

Why do I care? Because containers are lightweight and faster, they are cheaper and more efficient to run (perhaps another breakdown of the car example), and they allow your code to be run anywhere.

Container and VM Architecture

Containers are typically run on Linux or Windows.

Containers run a program or group of programs, and the containers themselves run on a container manager. Multiple containers sit under this manager and share items: items like the OS and libraries/packages. As they share libraries there is no duplication, and hence it is faster.

An image showing the difference, from https://www.backblaze.com/blog/vm-vs-containers/

Note on hypervisors: a hypervisor is essentially the piece of software that sits between the VM(s) and the actual physical server/hardware/infrastructure of the cloud provider. The hypervisor takes each VM's shopping list of requirements, provides what is needed to run it, and then starts it up.

Isolation

A key part of containers is that they isolate programs. This can be useful to stop one program affecting another, and it is typically used to ensure security and stability. It needs extra attention now that many programs run on a single shared machine, as there is the potential for one to affect, or watch, an unlucky other.

Quoting from Ed King: https://medium.com/@teddyking/linux-namespaces-850489d3ccf

It follows that you can’t interfere with something if it’s not visible to you. And that’s really what namespaces provide — a way to limit what a process can see, to make it appear as though it’s the only process running on a host.

Without namespaces, a process running in container A could, for example, umount an important filesystem in container B, or change the hostname of container C, or remove a network interface from container D. By namespacing these resources, the process in container A isn’t even aware that the processes in containers B, C and D exist.

Additionally, an example of namespaces being used outside the context of Docker, but still for security, comes from Toptal (https://www.toptal.com/linux/separation-anxiety-isolating-your-system-with-linux-namespaces):

Recently, there has been a growing number of programming contest and “hackathon” platforms, such as HackerRank, TopCoder, Codeforces, and many more. A lot of them utilize automated pipelines to run and validate programs that are submitted by the contestants. It is often impossible to know in advance the true nature of contestants’ programs, and some may even contain malicious elements. By running these programs namespaced in complete isolation from the rest of the system, the software can be tested and validated without putting the rest of the machine at risk.

So namespaces seem to me like putting up great walls. Some regions behind the walls have some special functions. Think of it like a map: one area has the sea and hence ports, one area has… other stuff.

There are seven types of namespace (mount, PID, network, IPC, UTS, user and cgroup) but I haven't yet worked out what each one does.
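You can actually play with this wall-building without Docker. Here is a minimal sketch of the idea, assuming a Linux machine with the unshare tool (part of util-linux); container-a is just an illustrative name:

# run a shell in a new UTS (hostname) namespace, rename the "machine"
# inside it, and print the result
sudo unshare --uts sh -c 'hostname container-a; hostname'

# back on the host, the real hostname is untouched
hostname

The process inside the namespace thinks the machine is called container-a; everything outside is none the wiser.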

Docker installation

The first lab is essentially a check of the Docker installation: it asks you to spin up a container based on the ubuntu image. An image is essentially an instruction set and the materials required to create a container. (Similarly, you can also get images of an OS, for example if you have ever played with a Raspberry Pi.)

First create one: docker container run -t ubuntu top. The -t "allocates a pseudo-TTY", which from reading means it will have a terminal, but frankly I am confused here. The top makes it print out a dashboard of process data. You need the former to get the latter.

After creating a container, the lab also shows how to jump into that instance with docker container exec -it <nameGoesHere> bash. The -it gives you an interactive terminal (-i keeps the session interactive, -t allocates the terminal) and bash is the shell we run inside it.

The only thing that wasn't taught is how to terminate a container once you are done. This is handily done with the command docker kill <nameGoesHere>.

You can find the name of the container again with the command docker ps (ps stands for list containers, go figure…), or just the IDs with docker ps -q.
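Stitching lab one together, the whole loop looks something like this (two terminals, with <nameGoesHere> being whatever name docker ps reports):

# terminal 1: start a container from the ubuntu image, running top
docker container run -t ubuntu top

# terminal 2: list running containers and jump inside this one
docker ps
docker container exec -it <nameGoesHere> bash
# (poke around, then type exit to leave bash)

# finally, terminate the container
docker kill <nameGoesHere>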

Using a Container

Much like NPM monitors and manages a library of packages, Docker manages a library of images. This means you can install Ubuntu very easily by downloading the image from it. In fact, that is what we did in the first lab.

That is also sort of what we did in the second lab. Look at the following command and let's break it down:

docker container run --detach --publish 8080:80 --name nginxContainer nginx

--detach means run in the background. What does that mean? It means that you can't see what is going on in the container without connecting to it. Why would you want that? Because if you are subscribed to everything, it is information overload and an extra process for the container. Read more here: https://medium.freecodecamp.org/dockers-detached-mode-for-beginners-c53095193ee9
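If you do want to peek at what a detached container is up to, you can read its output without attaching to it:

# print the logs the container has written so far
docker container logs nginxContainer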

--publish creates or opens up ports which are available for communication outside of the container. Essentially, it opens the ports up for other containers/apps/people to interact with. This means we can now go to port 8080 in our browsers and see the container. That is the significance of 8080:80: it publishes the container's internal port 80 as the external port 8080.

--name is a feature which allows you to name the container. We name it nginxContainer.

Finally we need to specify the image we want to run. In the Docker library the name is nginx, hence that is what we ask for here. Note that we can also specify the version of the image we want by adding a colon and a tag to the name, e.g. nginx:1.0
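A quick way to check the whole thing worked from the command line (curl here is just standing in for the browser):

# fetch the page nginx serves on the published port
curl http://localhost:8080
# you should get the nginx welcome page back as HTML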

Great…

Well, it kind of is. Just by having Docker installed you are able to run anything with no laborious installation or issues, and you have been able to connect to it.

Killing it

So in Lab one I may have got a bit OCD and jumped ahead into learning how to delete/remove containers:

The only thing that wasn't taught is how to terminate a container once you are done. This is handily done with the command docker kill <nameGoesHere>.

You can find the name of the container again with the command docker ps (ps stands for list containers, go figure…), or just the IDs with docker ps -q.

This is sort of right and sort of wrong. It is the difference between closing a program normally and using Task Manager to Force Stop. See the link below:

If you don’t want to read it:

  • docker stop first politely asks the process to get ready to shut down (aka SIGTERM), waits a little while, and then hits the red button (aka SIGKILL); see the sketch below.
  • docker kill just smashes that red button.
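You can even tune how long docker stop waits before reaching for the red button, using its --time flag (30 is just an illustrative number of seconds; the default is 10):

# send SIGTERM, wait up to 30 seconds, then send SIGKILL
docker stop --time 30 nginxContainer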

So I should have used docker stop with the names of the containers. How do I find the names of the containers? Well, you can use docker ps or docker container ls. Both do exactly the same thing.

Also, you don't have to type a container's full ID (which usually looks like a keyboard-headsmash, wtf67rs23hsa or similar): you can save yourself typing by giving just enough characters to be unique among your containers. This means I can type docker stop wtf and, as there is a high chance that prefix is unique, it will target that one container and terminate it.

Docker Images: What are they again?

So I did intentionally gloss over that earlier by simply saying:

An image is essentially an instruction set and the materials required to create a container.

This isn't wrong, but we need more detail. An image is a tar file (similar to a zip file, except zip files are compressed and tars aren't).

How do they work? Essentially, we give images to Docker and they act as the instruction set. This is fed into the Docker Engine, which spits out our containers. Docker will store (cache) images once you have downloaded them, ready for future use.
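You can see what Docker has cached at any time:

# list locally stored images with their tags and sizes
docker image ls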

If you want your work to be open to the public you can create a Docker Hub account and upload your image to the Docker registry. You can also run private registries.

Can we make an image?

100% yes. You need a few things.

You need to make a Dockerfile. A Dockerfile is a list of instructions stored as a text file. An example looks like:

FROM ubuntu
ADD myapp /
EXPOSE 80
ENTRYPOINT /myapp

Once we have a Dockerfile we can simply call the command docker build, which runs the setup activities as well as the Dockerfile's instructions.
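For the example Dockerfile above, that would look something like this (a sketch: it assumes you are in a directory containing the Dockerfile and a myapp program, and myapp is just an illustrative image name):

# build an image from the Dockerfile in the current directory, named myapp
docker build -t myapp .

# run it, publishing the port the Dockerfile EXPOSEd
docker run --publish 8080:80 myapp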

Why are images good?

Well, in addition to being a really simple way to transfer working code and run it on any computer/server, images also have a neat feature called Docker Image Layers, which to my eye looks like a git integration. I say this because when you update an image, on a server for example, Docker doesn't remove the previous image and install yours from scratch; instead it recognises the main differences and just adds those as a layer. Because of this, changes can be very, very fast. For example, if you had a very large program, removing it and uploading the new version would take ages; doing it this way is a magnitude quicker.

Another benefit is a reduction in duplication. If two apps have the same base layers (i.e. they are both made of five layers and the bottom two layers are the same), there is no duplication of those layers in storage or when running.
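You can even see these layers for yourself (myapp being the illustrative image name from the sketch above):

# show the layers that make up an image, and the size each one adds
docker history myapp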

Back to it: Dockerfiles

We need to be comfortable with writing Dockerfiles. They are simple text files, but they have a particular form: a command in capitals followed by that command's details. We ran this:

FROM python:3.6.1-alpine
RUN pip install flask
CMD ["python","app.py"]
COPY app.py /app.py

Almost always we start with a FROM, as this defines the base layer. In this example we use a particular Python image, the alpine variant, which is smaller in file size than the typical one. You can see other options on Docker Hub.

Next we have RUN commands, which are used as if you were writing in the terminal during the build phase. In this example we install a package with pip (a Python package manager).

CMD is similar to RUN but it only runs once the container starts. It is the first command that is run, and as such there can only be one CMD in a Dockerfile. In our example we use python to run app.py. Docker turns CMD ["python","app.py"] into a command by simply writing it into the container's equivalent of a console. In a normal terminal, if you wanted to run the program app.py you would write python app.py, and that is exactly what happens here.

Example of running the program on a normal console.

COPY is a weird one. It does copy the app.py file; more specifically, it copies the program into the container. What is counter-intuitive to me is that it is the last instruction. The reason for this glitch in sense is caching. The CMD will only run when the container is built and ready, so it will always be the last thing to take effect. That makes sense, so why is it not written last? Because COPY should sit as low as possible in a Dockerfile: it is the thing that will change the most in future updates. If you keep making changes to your program, COPY is the instruction that changes most, as it picks up the updated program file. When a Dockerfile is rebuilt, Docker caches as much as possible: everything up to the first change compared to the previous build. As such, if we know something will be regularly updated, we want it low down in the Dockerfile.

More commands are available in the docs: https://docs.docker.com/engine/reference/builder/
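For completeness, the app.py that this Dockerfile copies in could be a minimal Flask app along these lines (a sketch, not necessarily the lab's exact file):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "hello world!"

if __name__ == "__main__":
    # listen on all interfaces so the port can be reached from outside
    # the container (Flask serves on port 5000 by default)
    app.run(host="0.0.0.0")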

Now build the image

Now we have written the Dockerfile we can build an image/.tar file. NB: ignore the <<< >>>; they are just placeholders.

docker image build -t <<<name goes here>>> .

The -t allows you to name the image, and the trailing . tells Docker to build from the current directory. Make sure you are in the location of the Dockerfile and the Python app (app.py in this case) when you run this command, otherwise it will not know where to get the files.
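So for our Flask example the full loop looks something like this (python-hello-world and the 5001 host port are just illustrative; Flask listens on 5000 by default):

# build the image from the Dockerfile and app.py in this directory
docker image build -t python-hello-world .

# run it detached, publishing Flask's port 5000 as 5001 on the host
docker container run --detach --publish 5001:5000 python-hello-world

# check it responds
curl http://localhost:5001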

Summary of the next steps:

In the lab we next pushed our image file to Docker Hub. We did this by creating an account and then, within the command line, using the docker commands login, tag and push. Similar to GitHub, we can now make changes and push those changes to Docker Hub simply by rebuilding the image and pushing it again.

This also means we can retrieve the Docker image easily too. We just need to know its unique name (it is suggested you always use yourUsername/nameOfApp as the title, to ensure names are unique and easy to find).
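The push-and-pull loop, sketched out (yourUsername/nameOfApp and the :1.0 tag are placeholders for your own names):

# log in to Docker Hub with your account
docker login

# give the local image a unique, findable name, then push it
docker tag python-hello-world yourUsername/nameOfApp:1.0
docker push yourUsername/nameOfApp:1.0

# anyone with access can now retrieve it
docker pull yourUsername/nameOfApp:1.0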

Finally we stop all the containers we have created and then remove them all:

docker container ls
docker container stop <<<idOfContainer>>>
docker system prune

Prune will go and delete the stopped containers (along with other unused bits, like dangling images).

Why are containers good?

Let’s pause and review:

  • Containers can be run on any device/server without issue
  • They are lightweight and fast
  • You can use more containers on a host than you could with a Virtual Machine
  • Developing with them is similar to the npm and GitHub ways of working

I hope you agree and can see this from what we have done above!

Part two is here
