There are very few books written for junior-to-mid level developers that answer the question “How do I run a real-world software project?”. Industry best practices often arise as the result of cross-pollination and institutional/tacit knowledge rather than explicitly prescribed rules that you can read about in a book.
For developers working alone or in small organizations such as startups, these norms may not be obvious. Knowing which tools you need to deploy a production-scale application is crucial.
In 2019 I published the article “Software Tools for Hobby-Scale Projects,” which is still one of my most popular blog entries. This post explores the same idea in a professional context and will hopefully give new or solo developers guidance on tools and practices for new projects at small to mid-sized organizations.
Objective: Provide a list of tools and practices that apply to a majority of real-world software projects.
Intended audience: Developers familiar with software authorship wishing to learn about real-world software deployments and practices.
The 12 Factor Methodology
If you run an application with real customers, you should use the 12-Factor Methodology. The 12-Factor Methodology is a set of practices that improve commercial software development and will make your software easier to manage over the long term. Many recommendations are simple yet overlooked (example: “Store config in the environment”). If your application runs on Docker or Heroku, you may already practice many of the 12 factors. If you have not yet read the entire document, I recommend giving it a look. It’s a straightforward read and offers a great set of practices for structuring new and existing projects.
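As a concrete illustration of “Store config in the environment” (Factor III), here is a minimal Python sketch. The setting names and defaults are hypothetical, not part of the 12-Factor document itself:

```python
import os

def load_config(env=None):
    """Build application settings from environment variables
    (12-Factor, Factor III: "Store config in the environment")."""
    env = os.environ if env is None else env
    return {
        # Hypothetical settings; use whatever your application actually needs.
        "database_url": env.get("DATABASE_URL", "sqlite:///dev.db"),
        "debug": env.get("DEBUG", "false").lower() == "true",
    }

# Production and development differ only in the environment, not the code:
config = load_config({"DATABASE_URL": "postgres://prod-db/app", "DEBUG": "false"})
```

Because nothing is hard-coded, the same build artifact can run unchanged in development, staging, and production.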
A Reliable Version Control System
Software development is costly. The result of that process is computer source code. As the organization and its related software projects grow in complexity, you will need a reliable way to:
- Create backups of the project source code.
- Revert problematic changes to the project.
- Figure out who last modified a particular part of the codebase.
- View historical versions of the project.
- Merge changes from two developers who created parallel versions of the project.
- Figure out what changed between different versions of a project.
Most (but unfortunately not all) developers in the software industry have adopted version control systems (VCS) to solve these issues. For all of my software projects, I use Git as my VCS; it is the current industry leader. Alternatives such as Pijul, Fossil, and Subversion are also available.
For some, choosing to use version control on a project may seem so obvious that it does not even require mention. Unfortunately, there are still companies out there that host zip files on FTP servers as a form of pseudo-version control. Often this situation is the result of a developer’s unwillingness to learn a version control system. Opting out of VCS is inappropriate in all cases.
Choosing to build an in-house version control system (or opting out of version control) will cost you and your organization time and money. You will not create a solution that is more robust than Git. You will lose parts of your source code and historical information about the project. A good VCS can fix severe problems like data loss in seconds. Conversely, without a good VCS in place, certain classes of issues may take days or weeks to resolve if they are fixable at all.
You should use Git on every professional software project. You can find alternatives to Git if you genuinely don’t like it, but you must use some turn-key version control system and not build your own. Failing to use a real VCS will result in a loss of time and data.
Although I’ve yet to find a version control system that is easy (or enjoyable) to grasp at first, you still need to eat your vegetables. The benefits of a version control system cannot be overstated, which is why it appears in the top half of this list. You need to invest the time to learn it, reap the benefits, and move on.
You can find a good starting point here.
Source Code Hosting
Even if you are the only developer on a project, it is still a good idea to host redundant copies of source code somewhere other than your local machine. In the early days of the internet, it was not uncommon to hear horror stories of shareware projects that ceased development for no reason other than the lead developer had lost the source code. The advent of free and cheap source code hosting has made this type of disaster less common.
Additionally, if you are working on the project with other developers, you will want to discuss and review changes to the Git repository as they happen. Many source code hosting tools offer features that make reviewing and discussing incoming changes easier.
The most popular options are:
- GitHub
- GitLab
- Gitea (self-hosted)
- Git’s built-in server tools (the DIY option)
The vast majority of developers will choose GitHub, the current industry leader. GitHub offers a broad feature set and decent free plans. The market for hosting solutions has widened in recent years, with several alternatives gaining traction among developers who wish to avoid Microsoft products.
TL;DR: You will need shared source code backups no matter your project or team size. Many source code hosting providers bundle extra features, such as webhooks and pull requests, that make their offerings extremely useful.
Automated Database Backup System
This one is obvious but overlooked. Your backup system needs to be automated: it’s too easy to forget, and the consequences are disastrous (see the previous section about version control backups). Once you have a backup system that works, perform periodic “fire drills” to ensure that you can restore from backups properly during an actual database failure. If your backup system is cloud-based, be sure to create redundant hard copies periodically. There is no guarantee that your data center will not burn to the ground.
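Whatever service or scheduler you use, the core of an automated backup job is small. Here is a hedged Python sketch that builds a timestamped `pg_dump` invocation; it assumes a PostgreSQL database and that `pg_dump` is on the PATH, so adapt it to your own stack:

```python
import datetime

def backup_command(database_url, backup_dir="/var/backups"):
    """Build a timestamped pg_dump command for a PostgreSQL database."""
    stamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return [
        "pg_dump",
        "--format=custom",  # compressed format, restorable with pg_restore
        f"--file={backup_dir}/db-{stamp}.dump",
        database_url,
    ]

# A scheduler (cron, a systemd timer, or your platform's equivalent) would run:
#   subprocess.run(backup_command(os.environ["DATABASE_URL"]), check=True)
```

The timestamped filename means old backups are never overwritten, which makes retention policies and fire drills easier.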
I currently manage my database backups with Heroku. There are too many options to name in this space. Your main concern when shopping for a backup system should always be reliability.
TL;DR: Database backups prevent disasters and are easy to set up, but you need to regularly make sure they will work in an emergency.
A Testing Framework
During a project’s lifetime, the scope and importance of features will only increase, and so will the need for stability. Project maintainers ensure the reliability of a codebase by testing it. In the early days of a software project, it is easy to ensure things work. You can manually click buttons and pass in data. Project maturity brings an inevitable challenge: there is no way your team can manually test every change before a release. The number of ways data can be passed and combined makes it infeasible to cover every possible use case within a single release cycle. How can you be sure that a recent update will not break other parts of the application?
For decades, software developers have solved this challenge by writing a second software project to go alongside the main project. This second codebase is known as a “test suite,” and it automates away the process of testing the main application. Sometimes, the test suite will click actual buttons on the screen. These sorts of tests are known as “integration tests.” A more common approach is to test functions and modules in isolation. This second approach is known as “unit testing.” Unit testing is more common in the wild because it produces consistent test output, is fast to run, and is easy to set up. Although unit testing is easier to manage, its main criticism is that it does not simulate real-world conditions as effectively as integration testing.
Philosophies abound on the most effective way to approach automated testing. There is also extensive jargon and countless books written on the subject. If you add a test suite to a project for the first time, do not be discouraged by philosophical debates. The most important aspect of a test suite is that it exists. Testing is an art that takes time to master, and the best way to increase your project’s reliability is to begin writing a large volume of automated tests as soon as possible. For a delightful satire article on the subject, see: “Expert Excuses for not Writing Unit Tests.”
Once you are committed to the idea of automated testing, you must pick a library to automate the boring parts. The options are language-specific, and every language ecosystem has a good set of options available. The ubiquity of high-quality testing frameworks means you should not attempt to create an in-house testing framework.
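To make this concrete, here is a minimal unit test sketch using Python’s built-in `unittest` module. The function under test, `apply_discount`, is hypothetical business logic; most ecosystems offer an equivalent framework (JUnit, RSpec, Jest, and so on):

```python
import unittest

def apply_discount(price, percent):
    """Hypothetical business logic under test."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

class ApplyDiscountTest(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(apply_discount(80.0, 25), 60.0)

    def test_no_discount(self):
        self.assertEqual(apply_discount(19.99, 0), 19.99)

    def test_rejects_invalid_percent(self):
        with self.assertRaises(ValueError):
            apply_discount(10.0, 150)

# Run the suite from the command line with: python -m unittest
```

Each test exercises one behavior, so a regression in the discount logic points directly at the broken case.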
TL;DR: Manually testing every change before each deployment is infeasible, especially for small teams. Startups and solo projects need a way to test the product for defects in an automated manner.
A Test Coverage Reporter
Once you have a large volume of tests, you will find that they detect many problems that humans would not have noticed. Getting a project to that point takes considerable effort. A codebase will receive new features and remove old ones as it matures. If left unchecked, a test suite’s usefulness will deteriorate over time until it is no longer a helpful tool for defect prevention. Developers need a way to ensure that every line of code added to a project has accompanying tests.
In the software industry, “test coverage” measures what percentage of application code is exercised by test code. Under ideal circumstances, your application’s test coverage numbers should only ever go up. For real-world applications with thousands of lines of code, enforcing test coverage is difficult, even on teams where everyone is motivated to write tests. Luckily, there are tools available to record test coverage trends. Running these tools before merging new code is an effective way to maintain a test suite’s usefulness.
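To demystify what coverage tools measure, here is a toy Python tracer built on `sys.settrace` that records which lines of a function actually execute during a call. Real tools such as coverage.py use the same underlying mechanism with far more polish:

```python
import sys

def executed_lines(func, *args):
    """Record which lines of `func` run during one call (a toy coverage tracer)."""
    target, hits = func.__code__, set()

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is target:
            # Store offsets relative to the `def` line for readability.
            hits.add(frame.f_lineno - target.co_firstlineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return hits

def clamp(x):      # offset 0
    if x < 0:      # offset 1
        return 0   # offset 2
    return x       # offset 3

# clamp(5) never takes the negative branch, so offset 2 stays uncovered.
```

A test suite that only ever calls `clamp` with positive numbers would report the `return 0` line as uncovered, which is exactly the signal a coverage reporter surfaces.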
I use Codecov and Coveralls to manage code coverage metrics. Both are excellent industry leaders, though I find that each product has strengths and weaknesses depending on the project’s language. I recommend trying both and deciding for yourself.
TL;DR: Even with the best intentions, a test suite will lose effectiveness over time due to neglect. Measuring test coverage will alert you to these situations and inform you of which tests need updates.
Containerization
“Works on My Machine” is a horrible state of affairs in software development and the punchline of many jokes. Running software consistently, reproducibly, and reliably in any environment is a mostly solved problem for small to mid-sized projects. Many issues of environment inconsistency (e.g., accidentally running the wrong version of a library) are avoidable by running applications in “containers.”
Below is a list of situations that containerization improves:
- Running a test suite in a clean environment, free of state from previous test runs.
- Setting up a development environment for new team members.
- Keeping multiple versions of the same application running in parallel (example: test, staging, and production environments).
- Creating temporary sandboxes, such as those required to run tests against proposed changes to a codebase.
- Re-creating an environment from backups after hardware failures.
In 2021, Docker and Docker Compose are the most widely used solutions. They solve these problems by hosting software in a “container” that starts from a clean slate on every run. When the application restarts, the container is destroyed. No state is saved for the next run unless you explicitly allow it through the use of “storage volumes.”
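As a sketch of what this looks like in practice, here is a minimal, hypothetical `docker-compose.yml` for a web application with a database. The image names, ports, and credentials are placeholders for your own stack:

```yaml
# docker-compose.yml -- hypothetical two-service setup
services:
  web:
    build: .                 # build the app image from the local Dockerfile
    ports:
      - "8000:8000"
    environment:
      DATABASE_URL: postgres://app:secret@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:14
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: app
    volumes:
      - db-data:/var/lib/postgresql/data   # explicit opt-in to persistent state

volumes:
  db-data:
```

Running `docker compose up` starts both containers with a clean slate; only the named volume survives restarts.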
Currently, I use Docker Compose to manage my containers. Although Docker-based tools still reign as the de facto industry standard for containerization, a cohort of competitors has taken root in recent years. I have not investigated these alternatives because I am still satisfied with Docker and Docker Compose. I welcome comments on this matter.
TL;DR: Running software in containers will prevent most (but not all) environment-specific software problems.
Automated Deployment System
A simple way to reduce the severity of bugs is to deploy as often as possible. Anyone who has worked on a large-scale enterprise project with 8-month release cycles will know what I am talking about: massive releases create huge error backlogs and upset customers. The need to batch significant changes into a single release is a relic of a bygone era when software lived on CD-ROMs and floppy disks. Publishing minor incremental releases regularly should be the norm for most projects. There is rarely anything to gain by batching multiple changes into a single release.
Once you have a reliable test suite that runs in a container with good version control practices, you should automate software releases as much as possible. There are exceptions to this rule, but they are just that- exceptions.
In the last decade, the industry has seen the rising use of “push to deploy” offerings. The premise of such tools is that you can release your software more frequently (multiple deployments per day) by pushing changes to a version control repository managed by a specialized tool. These tools handle the most common use cases, such as pushing changes, reverting changes, restarting applications, and managing configuration. All actions are completed in a single step and with minimal developer intervention. These tools take time to learn and set up, but it truly is a better way to ship software. High adoption rates are a testament to their effectiveness.
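One hypothetical shape of such a pipeline, expressed as a GitHub Actions workflow that gates a Heroku push-to-deploy on the test suite. The app name, secret name, and `make test` target are placeholders, not a prescription:

```yaml
# .github/workflows/deploy.yml -- hypothetical names throughout
name: deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0    # full history is needed to push the branch onward
      - run: make test      # gate the release on the test suite
      - run: git push https://heroku:${{ secrets.HEROKU_API_KEY }}@git.heroku.com/my-app.git HEAD:main
```

Every merge to `main` that passes the tests becomes a release, with no manual steps in between.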
The ecosystem for push-to-deploy tools is diverse. In the past, I’ve automated the deployment of micro-projects with Dokku. For projects where reliability is a more significant concern than cost, I use Heroku. Many language and platform-specific tools are out there. Research the tools offered by your ecosystem.
An alternative title for this section was “Continuous Delivery and Integration Systems.” I decided against it because those terms carry extra baggage and may distract from my main point: you need a sound system for publishing new versions quickly. CI/CD practices propose many things beyond increasing deployment frequency. Readers who are unfamiliar with CI/CD are encouraged to research the topic.
TL;DR: You should automate most aspects of your deployment process and deploy as frequently as possible to increase reliability.
Real-time Error Reporting
What’s true of Pokémon is also true of production error reports: you gotta catch ’em all. When faced with a software defect, very few users will take the time to report it to you, and those who do are already upset. Unreported errors are bad for business because these bugs translate to lower usage and overall satisfaction.
Real-time error reporting tools help keep track of errors as they happen. They also give you a head start on fixing the problem. Once captured, the tool immediately reports the error to the appropriate communication channels, such as a team Slack or staff email.
A good error reporting tool will capture the error and additional information, such as the user that triggered the error and any relevant stack trace(s). In my experience, error reporting tools are universally easy to install and come in a wide range of price points (with free plans available).
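Conceptually, these tools boil down to an exception hook that packages context and forwards it somewhere visible. Here is a simplified, self-contained Python sketch; a real service replaces the `notify` callback with delivery to Slack, email, or a dashboard:

```python
import traceback

def report_errors(notify, user=None):
    """Decorator: capture any exception, forward a report with context, then re-raise."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                notify({
                    "error": repr(exc),
                    "user": user,
                    "stack_trace": traceback.format_exc(),
                })
                raise  # reporting should never swallow the error
        return wrapper
    return decorator

reports = []  # stands in for a real delivery channel

@report_errors(reports.append, user="customer-42")
def checkout(total):
    if total < 0:
        raise ValueError("negative total")
    return total
```

When `checkout` fails, the team sees the error, the affected user, and the stack trace without waiting for anyone to file a report.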
(Sometimes) Performance Monitoring and Profiling
Identifying performance issues in production is hard. It is especially hard for mature, real-world applications servicing production workloads. In addition to being a complicated process, it is also essential to an application’s reliability. The solution is never as simple as `npm install performance`. Performance monitoring tools help you untangle the situation by recording statistics about your application’s runtime characteristics.
Typically, you install a library into your application code and deploy the app to production. Once deployed, the performance monitoring solution will store and upload runtime metrics. You can view graphs of the data in a dashboard, hopefully giving you a better picture of application hot spots. These tools are often language and platform-specific. Many Ruby on Rails performance monitoring tools can help identify numerous anti-patterns and mistakes, such as N+1 queries in ActiveRecord. Although the tools do an excellent job of isolating bottlenecks, it is ultimately the developer’s responsibility to devise solutions for each hot spot.
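Before paying for anything, note that most languages ship with a basic profiler you can run locally. A quick example with Python’s standard-library `cProfile`; the workload here is a stand-in for your own hot spot:

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Stand-in for an application hot spot."""
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
report = out.getvalue()  # per-function call counts and timings
```

A commercial monitoring tool collects the same kind of per-function data continuously in production and charts it over time, which is what you are really paying for.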
I can’t say I have a favorite performance monitoring tool at the moment. The landscape for performance profiling and monitoring tools falls into two main camps. On one hand, you have high-quality tools that are significantly overpriced, with pricing tiers intended for large enterprises. In my experience, performance monitoring is one of the highest-priced services a software startup will pay for, aside from server hosting. On the other hand, there are cheap or free tools that are only marginally useful and do not match their competitors’ offerings. I’ve yet to find a performance monitoring solution that offers a good feature set and affordable pricing for startups and small businesses.
Depending on your project’s scale and maturity level, it may be possible to delay the adoption of a performance monitoring tool. In the case of specialized software used by a small userbase, commercial performance monitoring tools might offer more cost than benefit.
The tools and practices above build a basis for most real-world applications. Did I miss anything? Feel free to post a comment if you would like to share real-world experiences about tools and practices that make your life easier as a developer.
Special thanks to Daniel Legut for proof-reading earlier versions of this article.