Stay updated with our latest articles, subscribe to our newsletter
Start with Development
What does any development project start with? It starts with software developers writing code in a chosen programming language. If the project is simple and one person works on it, storing the source code in files and folders on local or remote storage is sufficient. But when a project is developed and maintained by more than one person, challenges appear and questions need to be answered.
Questions like: who made what change, when, and why, and how does it affect the rest of the project core or the changes made by other maintainers and contributors? And what should be done with these changes: accept them, reject them, or send them back for fixes and refactoring?
Fig. 1. Challenges and questions to be addressed by VCS
Let us dig a little deeper into these questions and see how version control in DevOps makes a big difference in the software development process.
Who made what change?
Each developer working on the same codebase in a collaborative environment adds new lines, deletes existing ones, and rewrites or refactors code according to the task at hand (a reported bug to fix or a new feature to build). So it is important to be able to identify the author of each change in the code and to tell at what point in time the change was made.
Why would we want to know it? Well, let us move on to the next questions and it will become clear.
Why did this developer make these changes? Were they related to an issue ticket, or were they a planned feature? Knowing when and why the changes were made, we can understand how they affected the solution and decide what to do with them.
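As a rough illustration of the "who, when, why" questions above, the following sketch models the metadata a VCS keeps for each change. The `ChangeRecord` class, the `changes_by` helper and the sample history are hypothetical names invented for this example; they are not the data model of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeRecord:
    """Hypothetical record of one change; real tools store similar metadata."""
    author: str          # who made the change
    timestamp: datetime  # when it was made
    message: str         # why it was made (e.g. a ticket reference)
    files: tuple         # what was touched


def changes_by(history, author):
    """Return all changes made by one author, oldest first."""
    return sorted((c for c in history if c.author == author),
                  key=lambda c: c.timestamp)


# Made-up sample data for illustration only.
history = [
    ChangeRecord("alice", datetime(2024, 1, 10, tzinfo=timezone.utc),
                 "Fix login crash (ticket 17)", ("auth.py",)),
    ChangeRecord("bob", datetime(2024, 1, 11, tzinfo=timezone.utc),
                 "Add export feature", ("export.py", "cli.py")),
    ChangeRecord("alice", datetime(2024, 1, 9, tzinfo=timezone.utc),
                 "Refactor session handling", ("session.py",)),
]

for change in changes_by(history, "alice"):
    print(change.timestamp.date(), change.author, change.message)
```

With records like these, the author, time and reason of every change can be looked up without asking anyone.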
So now we have a set of problem statements that explain why we need a tool for tracking development results, and we can shape some specific requirements for this tool:
- Allow multiple users to access the source code and change it
- Show the difference between code revisions (current and previous versions of files)
- Make it possible to accept or reject changes
- Restore the codebase to its state at a particular point in the past
- Control access rights: define who can do what
All these capabilities are provided by a specific type of software: Version Control Systems (VCS), part of the Software Configuration Management (SCM) class of systems. A VCS is also an integral building block of the Continuous Integration (CI) toolset, so any DevOps engineer needs to be familiar with this type of software and be able to configure and use it.
In fact, VCS has a broader area of application than tracking source code: files of any type can be tracked in such systems. From a DevOps perspective, it is also common to manage configuration files in such repositories, and the same features apply to any artifact whose versions you want to track, for example the documentation of a software project.
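One of the requirements listed above, showing the difference between two revisions of a file, can be sketched with Python's standard `difflib` module. This is only an illustration of the idea: the file name `greet.py`, the two sample revisions and the `revision_diff` helper are assumptions made for the example, and real VCS tools use their own diff engines.

```python
import difflib

# Two made-up revisions of the same file, as lists of lines.
old_revision = ["def greet(name):\n", "    print('Hello ' + name)\n"]
new_revision = ["def greet(name):\n", "    print(f'Hello, {name}!')\n"]


def revision_diff(old_lines, new_lines, path="greet.py"):
    """Return a unified diff between two revisions as a list of lines."""
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile=f"{path} (previous)",
        tofile=f"{path} (current)",
    ))


diff = revision_diff(old_revision, new_revision)
print("".join(diff))
```

Lines starting with `-` come from the previous revision and lines starting with `+` from the current one; identical revisions produce an empty diff.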
History and Classification
The first predecessors of VCS were introduced somewhere between the mid-1960s and the early 1970s by IBM and other large corporations, but at that time they were not even a separate type of program and had quite limited functionality. Nowadays there are many programs, both open-source and proprietary, to choose from for your project. They all differ from each other in some ways, and you can check their basic comparison here.
But let us see how they differ from each other and figure out which version control system you would want to use in your own project as a developer or DevOps engineer. Even though most of these systems solve similar problems and provide similar capabilities, there are significant differences in how they store and track files.
Local vs Centralized vs Distributed vs Combined
The simplest possible approach, when a developer wants to track codebase changes locally, is to use plain “copy & paste” commands to copy the whole project structure to a new folder with a timestamp and perhaps a comment. The developer then keeps a snapshot of the old codebase and has a fresh copy to continue working with. This method works fine when development is linear and straightforward: you add features one after another and just want to save the project at certain milestones before starting on a new feature.
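The manual copy approach described above can be automated in a few lines. The sketch below is a minimal illustration, assuming a hypothetical `snapshot` helper and a made-up note file name `SNAPSHOT_NOTE.txt`; the backup folder is assumed to live outside the project folder.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(project_dir, backup_root, comment=""):
    """Copy the whole project into a new timestamped folder and return its path.

    backup_root must be outside project_dir, otherwise the copy would
    try to include itself.
    """
    project_dir = Path(project_dir)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    target = Path(backup_root) / f"{project_dir.name}-{stamp}"
    shutil.copytree(project_dir, target)  # creates missing parent folders
    if comment:
        # Store the optional comment next to the copied files.
        (target / "SNAPSHOT_NOTE.txt").write_text(comment, encoding="utf-8")
    return target
```

Calling `snapshot("myproject", "backups", comment="before feature X")` before starting a new feature leaves an untouched copy to fall back on. It also shows the limits of the approach: every snapshot is a full copy, and nothing records who changed what between two snapshots.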
Local Version Control Systems (LVCS)
Local Version Control Systems (LVCS) work in a similar way: such a system is basically a database that records the changes to files for each revision of the code and adds a timestamp and a revision number. Most systems of this type were developed long ago, but some are still in use. Examples are RCS (Revision Control System), which is available as a package in most Linux distributions, and SCCS (Source Code Control System).
Fig. 2. Local Version Control System
This kind of VCS tool solves only part of the problems defined earlier in this post. Such tools do not help much with collaborative development, since they are local and are not designed for many users working concurrently. Another disadvantage is that it is difficult to work on different features or fixes at the same time, since local VCS tools offer at best limited, per-file support for branching, that is, creating parallel lines of development.
We can conclude that Local Version Control Systems are suitable mostly for small to medium-sized projects developed by a single person, and that they provide only basic functionality. Since this concept of local source code management is not scalable, you should avoid such systems in your projects unless you have a good reason to use them. There are much more advanced version control tools that you would prefer to use right from the beginning.
Centralized Version Control Systems (CVCS)
Centralized Version Control Systems (CVCS), introduced slightly later, were designed to facilitate collaborative development; they are also called “client-server” systems. They have a central server that stores the master version of the code (the master repository) and multiple clients that connect to the server to check out files (copy them to local machines) and check in changes (upload them into the repository). Well-known examples of such systems are SVN (Subversion) and CVS (Concurrent Versions System).
In this type of version control system, the master code and all the changes are stored on a central server. Developers take the current revision and download it to a local machine. One of the main benefits of centralized systems is that it is fairly easy to control and limit the subset of folders and files that developers can download, the so-called “selective checkout”.
At the same time, a centralized VCS by its nature depends on one main server; if that server is unavailable for some reason, developers cannot check in (upload) their work. Another drawback is that if the central server is lost and no backup exists, the repository and its history are lost with it. To prevent such situations, it is necessary to configure backups and HA (high availability) for such servers. So, to use such a system, a stable connection to the central server is crucial for development and operations continuity.
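The check-out and check-in cycle described above can be sketched with a toy in-memory server. This is a simplified illustration, not the API of SVN or CVS: the `CentralRepository` class, the `OutOfDateError` exception and the rule that a check-in must be based on the latest revision are assumptions made for the example.

```python
class OutOfDateError(Exception):
    """Raised when a check-in is based on an old revision."""


class CentralRepository:
    """Toy in-memory stand-in for a central server (hypothetical, not a real API)."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is an empty project

    @property
    def head(self):
        """Number of the latest revision stored on the server."""
        return len(self.revisions) - 1

    def checkout(self, paths=None):
        """Return (revision number, copy of files); `paths` limits the selection."""
        files = self.revisions[-1]
        if paths is not None:
            files = {p: c for p, c in files.items() if p in paths}
        return self.head, dict(files)

    def checkin(self, base_revision, changed_files):
        """Store changes as a new revision if the client is up to date."""
        if base_revision != self.head:
            raise OutOfDateError("update your working copy first")
        self.revisions.append({**self.revisions[-1], **changed_files})
        return self.head


server = CentralRepository()
rev, _ = server.checkout()
server.checkin(rev, {"app.py": "v1", "docs/readme.txt": "hello"})

rev_a, files_a = server.checkout(paths={"app.py"})  # selective checkout
rev_b, files_b = server.checkout()                  # full checkout
server.checkin(rev_a, {"app.py": "v2"})             # accepted as a new revision
```

A later `server.checkin(rev_b, ...)` would raise `OutOfDateError`, because that working copy is no longer based on the latest revision. Every operation in the sketch goes through the single `server` object, which is exactly why its availability matters so much.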
Fig. 3. Centralized Version Control System
Distributed Version Control Systems (DVCS)
Back in 2005 Linus Torvalds, the creator of Linux, released Git, which went on to become the best-known Distributed Version Control System (DVCS). Since then, the system has become extremely popular, and its name is sometimes even used as a synonym for version control system.
In this approach, a developer downloads not only the latest snapshot of the codebase but the whole history of changes, so the local machine basically becomes a full mirror of the source code server. Of course, this uses more space on the local machine, but the redundancy means that each developer’s machine holds a backup of the full repository, so by design there is no single point of failure in such an infrastructure. That is why it is called distributed.
Also, a distributed version control tool provides much better flexibility: each developer’s workplace becomes much more independent and autonomous, and work can continue even without a connection to other computers or servers.
At almost the same time, other distributed systems were developed using the same approach, such as Mercurial and Bazaar.
This kind of source code versioning tool also has some drawbacks: it is much more difficult to control access rights for a subset of files in the project than in centralized systems. Also, if a project is big and complex but a developer needs to work with only a small part of it, he or she by default still needs to download the whole repository.
DVCSs are suitable for distributed development teams in which each team member can work independently on a part of the project for quite a long time, and no centralized master server is required. Code is synchronized with other copies of the repository on demand, when a developer is ready to share the results of his or her work with the others.
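The idea that every copy holds the full history and that copies synchronize on demand can be sketched as follows. This is a deliberately simplified model: the `Repository` class and its `commit`, `clone` and `pull` methods are hypothetical names for this example, history is a plain linear list, and conflicts and merges, which real tools such as Git have to handle, are ignored.

```python
class Repository:
    """Toy repository that keeps its full history as an ordered list of commits."""

    def __init__(self, name):
        self.name = name
        self.history = []  # list of (commit_id, message) tuples

    def commit(self, message):
        """Record a change locally; no connection to any other copy is needed."""
        commit_id = f"{self.name}-{len(self.history) + 1}"
        self.history.append((commit_id, message))
        return commit_id

    def clone(self, name):
        """Create an independent copy that carries the whole history."""
        other = Repository(name)
        other.history = list(self.history)
        return other

    def pull(self, source):
        """Fetch commits from another copy that this one does not have yet."""
        known = {commit_id for commit_id, _ in self.history}
        new = [c for c in source.history if c[0] not in known]
        self.history.extend(new)
        return len(new)


origin = Repository("origin")
origin.commit("Initial import")

alice = origin.clone("alice")
bob = origin.clone("bob")

alice.commit("Add parser")  # done offline, on Alice's machine only
pulled = bob.pull(alice)    # Bob synchronizes directly with Alice's copy
```

Note that `bob.pull(alice)` never touches `origin`: any copy can serve as the source, which is what removes the single point of failure.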
Fig. 4. Distributed Version Control System
Some of the systems combine approaches and can work in distributed and centralized modes, such as Breezy (fork of Bazaar).
Which One to Choose?
Over the years, the popularity of distributed VCS in general, and of Git specifically, has grown, and Git has overtaken SVN, the dominant player of the 2000s and early 2010s. Now almost three out of four development teams (72%) choose Git or Git-based cloud repositories. SVN is still used, but only by 23% of developers. The remaining systems share the last 5% of the market.
The general recommendation: if you are joining a team or a company that has already chosen a VCS for its needs, it makes sense to learn and master that system. For your own project, however, it is better to use the most common choice, Git, or one of its cloud-hosted services (GitHub, GitLab or Bitbucket). In most cases this will be a well-known system for those who want to join you and contribute to your project. Nowadays most cloud service providers offer managed Git-based repositories, so with general knowledge of Git you will be able to switch between cloud repositories smoothly.
Fig. 5. Git-based Cloud Repositories and other VCS
If you want to know more about VCS and other DevOps and CI/CD software tools, stay tuned for updates in our blog. To get hands-on experience and learn how to work with Git, join our Project Based Learning courses, where you will implement real-life use cases with the help of modern DevOps tools and best practices.