Beyond the Hype: Strengths, Struggles, and DevOps Practices that Really Work
- ikerdomingoperez
- Aug 19, 2025
- 7 min read
My wife is not technical, but she understands what I do better than most people in tech. When someone asks about my job, she says, “He makes things do things.” No mention of Python, automation, or CI/CD pipelines, just the core truth that makes sense to anyone. I have tried explaining DevOps to sysadmins and developers who look at me like I just made up a job title, and honestly, some days I feel the same way. Maybe that is the real problem with DevOps: we have wrapped simple concepts in so much complexity that even we forget what we are actually trying to accomplish.

DevOps is no longer the shiny new concept people toss around in strategy meetings. It is the foundation of how modern software gets built, tested, and shipped. But as the tech evolves, so do DevOps practices: quietly, steadily, and sometimes in ways that are easy to miss.
We will look at what it actually means today, how it plays out in real teams, and how it connects to newer DevOps practices like Site Reliability Engineering, DevSecOps, and GitOps. We will also explore the strengths and limitations, and why it still matters, even as the tools, titles, and acronyms keep changing.
Modern DevOps Practices: What They Actually Look Like
DevOps is not a job title or a toolchain. It is a way of working that emphasizes:
Fast, reliable delivery through automation
Shared responsibility across development and operations
Continuous feedback from systems and users
Smart integration of security and reliability from day one
In practice, DevOps teams:
Automate builds, tests, and deployments
Use infrastructure as code to manage environments
Monitor systems proactively with logs, metrics, and traces
Collaborate across teams and platforms to improve uptime and performance
Integrate security early with automated scans and policy enforcement
The Expanding DevOps Universe
As systems grow more complex, DevOps has evolved into specialized branches. These are not replacements, they are extensions built on the same core principles.
Site Reliability Engineering (SRE)
The term Site Reliability Engineering was coined at Google. It takes the core ideas of DevOps (collaboration, automation, shared ownership) and adds an engineering layer focused on reliability: yes, your automation is awesome, but is the app actually up and running?
SRE teams treat uptime and performance as measurable goals. They use metrics to define what “reliable” actually means, and they build systems that can meet those expectations without strangling development. It is DevOps with accountability: every feature, deployment, and incident is weighed against its impact on user experience and system health.
SRE works best when reliability is not some optional feature, but a design requirement. It encourages teams to automate wisely, respond to failures quickly, and make trade-offs between speed and stability based on facts.
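Those measurable goals are usually expressed as SLOs with error budgets: a 99.9% availability target implicitly grants you a fixed amount of downtime to "spend" on risky deployments. A minimal sketch of that arithmetic (the SLO values and window are illustrative, not from any particular team):

```python
# Sketch of SLO/error-budget arithmetic, the trade-off SRE teams use
# to balance release speed against stability. Numbers are illustrative.

def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed downtime for a given availability SLO."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo)

def budget_remaining(slo: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (can go negative)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget

# A 99.9% SLO over 30 days allows 43.2 minutes of downtime.
print(error_budget_minutes(0.999))   # 43.2
# After 20 minutes of outages, roughly 54% of the budget remains.
print(budget_remaining(0.999, 20))
```

When the remaining budget approaches zero, the team slows down feature releases and spends the time on reliability work instead; that is the "trade-offs based on facts" part in practice.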
DevSecOps
Security is no longer a final checkpoint. DevSecOps brings it into every phase of development, using automated scans, policy as code, and cross-functional collaboration to build secure systems without slowing down delivery. While security certifications are not mandatory, many DevSecOps-focused credentials (DevSecOps foundation, AWS Certified Security, etc.) are more accessible than traditional security certifications, which often require years of experience, formal mentorship, and frequent recertification (thanks K for sharing). These lightweight options allow engineers to build practical security skills without navigating the complexity and gatekeeping that can characterize legacy certification paths.
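"Policy as code" can be as simple as a check that runs on every pipeline execution and blocks the merge when a rule is violated. A toy sketch, where the resource shape and field names are made up for the example (real setups would evaluate a Terraform plan or Kubernetes manifests with a tool like OPA):

```python
# Illustrative policy-as-code check: fail the pipeline if any storage
# bucket allows public access. The dict schema here is hypothetical.

def violations(resources: list[dict]) -> list[str]:
    """Return names of bucket resources that allow public access."""
    return [
        r["name"]
        for r in resources
        if r.get("type") == "bucket" and r.get("public_access", False)
    ]

plan = [
    {"type": "bucket", "name": "app-logs", "public_access": False},
    {"type": "bucket", "name": "website-assets", "public_access": True},
]

bad = violations(plan)
# In CI, a non-empty result would exit non-zero and block the deployment.
print("violations:", bad)
```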
GitOps
Git becomes the source of truth for both code and infrastructure. Changes are made through pull requests, and tools continuously reconcile the desired state with production. It is especially powerful in Kubernetes environments. When done right, it is amazing.
But GitOps is not just a choice, it is a discipline. It requires that the entire team rows in the same direction, embracing declarative infrastructure, version control, and automation as core principles. Success depends on a shared mindset, clean practices, and powerful automation that enforces consistency without constant human intervention. Without that alignment, GitOps can quickly become a mess of half-declared states, broken pipelines, and manual overrides that defeat the purpose.
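The "continuously reconcile" part is the heart of it. Stripped of all Kubernetes machinery, the loop that tools like Argo CD or Flux run looks roughly like this toy sketch, where both states are reduced to plain dicts mapping deployment name to image tag:

```python
# Toy sketch of the GitOps reconciliation loop: compare the desired
# state (declared in git) with the observed state and emit the changes
# needed to converge. Real tools run this continuously per cluster.

def reconcile(desired: dict[str, str], actual: dict[str, str]) -> dict[str, str]:
    """Return the changes needed to make `actual` match `desired`."""
    changes = {}
    for name, image in desired.items():
        if actual.get(name) != image:
            changes[name] = image      # create or update to the git version
    for name in actual.keys() - desired.keys():
        changes[name] = None           # delete: no longer declared in git
    return changes

desired = {"api": "api:1.4.2", "worker": "worker:2.0.0"}
actual  = {"api": "api:1.4.1", "legacy": "legacy:0.9"}
print(reconcile(desired, actual))
# {'api': 'api:1.4.2', 'worker': 'worker:2.0.0', 'legacy': None}
```

Note the deletion case: anything not declared in git gets removed. That is exactly why half-declared states and manual overrides defeat the purpose, since the reconciler will happily undo them.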
DataOps, MLOps, FinOps
These variants (and a few others) apply DevOps principles to data pipelines, machine learning workflows, and cloud cost management. Each reflects the constant need for automation, collaboration, and transparency in specialized domains.
Classic DevOps Strengths and Weaknesses
What works:
Speed: Faster releases and feedback loops
Reliability: Automation reduces human error
Collaboration: Shared ownership improves outcomes
Early detection: Monitoring and testing catch issues early
What still hurts:
Cultural resistance: Not every team embraces shared responsibility
Tool overload: The ecosystem is vast and often overlapping
Skill gaps: DevOps demands a rare mix of software, infrastructure, and security expertise
Overautomation: Automating without guardrails can cause bigger problems

Many talk about the positive aspects of DevOps, but few talk about the negative points. So, let's review them.
Cultural Resistance
As mentioned earlier, DevOps is a mindset. You can't just hire a few DevOps engineers, throw them at a company that has been doing things its own way for decades, and expect a change in months. Developers may not want to own operational concerns. Operations teams may refuse to give up control. Security teams may reject every attempt to change anything. Without buy-in across departments, DevOps either becomes a surface-level rebrand, or it just fails miserably.
Tool Overload
The DevOps ecosystem is enormous. There are tools for CI/CD, monitoring, infrastructure as code, container orchestration, secrets management, policy enforcement, and more. Many tools overlap in functionality, and some are so flexible they can be used to do things they were never designed for.
Why should I use Terraform if I can create all resources with Ansible?
Why should I use Ansible if I can write a Python script to configure stuff?
Why should I even use any tool if I can do everything with Python?
This leads to Frankenstein stacks, duplicated effort, and fragile integrations. Choosing the right tools (and knowing when to say no) is a skill in itself.
Skill Gaps
DevOps demands a unique mix of software engineering, infrastructure management, and security awareness (and Linux, plenty of Linux). It is not enough to write code or configure servers; you need to understand how systems behave in production, how to automate safely, and how to design for resilience. Finding professionals who can span these domains is difficult. The term “DevOps engineer” is often used loosely, especially when the candidate does not have a clearly defined profile (not a developer, not an operator, not a security engineer: “I used Jenkins at my previous company, so... yeah, DevOps!”). This leads to mismatched expectations between candidates and employers, and sometimes to underqualified hires.
Overautomation
Automation is a core DevOps principle, but that does not mean it is always good or the best approach. Ever heard of “automation ROI”? Basically, before jumping into automating something, ask yourself:
Is this task repetitive enough?
Is it time-consuming enough?
Is it error-prone enough to justify the upfront work of automating it?
If at least one of the answers is yes, go ahead. Otherwise, think twice about it. What would you do in the following examples?
1. An engineer needs to create dozens of Windows VMs. For each of them, the engineer needs to launch it from a given image, join it to AD, apply some extra configuration, reboot, run a sanity check, and report the status. The engineer needs around 4 hours per VM.
2. The operations team, almost every sprint, needs to perform a change on a production application. The application has no REST API endpoints, so the changes must be done manually in the browser. Each time they receive a different runbook with a detailed step-by-step for the change. The developer needs 2 hours to write the runbook, and the operations team takes another 2 hours to complete the change.
3. The operations team, every week, needs to create new AMIs for the applications they support, if only to fix CVEs. They then need to update the infrastructure code with the new AMIs and deploy to lower environments. A regular operator needs an entire day to complete this task across all supported applications.
In my opinion, examples 1 and 3 are clear automation tasks.
The first one is trivial enough with Ansible. It may take more or less time depending on the engineer's Ansible skills, but those VMs can be created in bulk if needed. The Ansible code can also be reused or adapted for future requests, and the knowledge feeds into inventory management.
The third one is also an easy pick; in this case I would go for Python as the orchestrator. It may take a few hours or days to write the script, but it will save the operator a day a week. The script approach is also scalable, while the manual approach will eventually demand more time as the number of applications grows, or the task could be left behind when higher-priority requests arrive. Also think about the poor operator, repeating the same thing over and over again.
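A Python orchestrator for that weekly AMI refresh could be as simple as a loop over the applications. This is a hedged sketch: `build_ami`, `update_infra_code`, and `deploy_lower_env` are placeholder names standing in for real calls to, say, Packer, a git commit, and a pipeline trigger.

```python
# Hypothetical orchestrator for the weekly AMI refresh (example 3).
# The three helpers are placeholders for the real integrations.

def build_ami(app: str) -> str:
    """Placeholder: in reality this would invoke Packer or similar."""
    return f"ami-{app}-patched"

def update_infra_code(app: str, ami_id: str) -> None:
    """Placeholder: commit the new AMI id to the infrastructure repo."""

def deploy_lower_env(app: str) -> None:
    """Placeholder: trigger the pipeline for dev/test environments."""

def refresh_amis(applications: list[str]) -> dict[str, str]:
    """Refresh every app's AMI; report per-app results at the end."""
    results = {}
    for app in applications:
        try:
            ami_id = build_ami(app)            # bake the patched image
            update_infra_code(app, ami_id)     # pin the new AMI in IaC
            deploy_lower_env(app)              # roll out to dev/test
            results[app] = ami_id
        except Exception as exc:
            # One broken app should not block the rest of the batch.
            results[app] = f"FAILED: {exc}"
    return results

print(refresh_amis(["billing", "reports"]))
# {'billing': 'ami-billing-patched', 'reports': 'ami-reports-patched'}
```

The per-app try/except is the kind of guardrail that keeps one failure from aborting the whole batch, and the returned summary gives the operator the status report for free.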
Number 2 is up for debate, but for me it is definitely not worth it.
While Selenium could technically handle it, the changes are irregular and the runbooks vary each time. Automating this would require constant updates to the Selenium code or a significant upfront investment in building a complex, generic, reusable framework that can interpret instructions dynamically.
It’s doable, but not realistic. It would demand heavy maintenance, deep testing, and would likely break often. In short, it’s a fragile solution for a task that’s better left manual.
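The ROI trade-off above can be reduced to a break-even calculation: upfront build cost divided by hours saved per run. The effort estimates below are my own assumptions layered on the examples, not hard numbers.

```python
# Rough "automation ROI" arithmetic for the examples above; all values
# are in hours, and the effort estimates are assumptions.

def breakeven_runs(build_cost: float, manual: float, automated: float) -> float:
    """How many runs before automating pays for itself."""
    saved_per_run = manual - automated
    if saved_per_run <= 0:
        return float("inf")   # automation never pays off
    return build_cost / saved_per_run

# Example 3: assume ~16h to write the script, vs 8h manual work per
# week reduced to ~0.5h of supervision.
print(breakeven_runs(16, 8, 0.5))   # pays off after ~2 weekly runs

# Example 2: assume a generic Selenium framework costs ~80h and still
# needs ~1h of fixes per irregular change, vs 2h of manual work.
print(breakeven_runs(80, 2, 1))     # 80 changes before breaking even
```

With the assumed numbers, example 3 pays for itself within a month while example 2 needs dozens of sprints, and that is before counting the maintenance burden of a framework that breaks every time the runbook changes.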

Automating without clear boundaries, monitoring, or proper testing can cause cascading failures. It can wipe out environments or push broken code to production. In short, a flaky script can fail more often than it succeeds, undermining the credibility of DevOps engineers and making your company appear in the newspapers (not in a good way). Automation should reduce risk, not amplify it. Guardrails, rollback mechanisms, and observability are essential to keep automation safe and sustainable.
Why DevOps Still Matters
Going back to the wins:
Users expect frequent updates, zero downtime, and fast bug fixes. DevOps makes that possible.
Cloud-native systems are complex. DevOps provides the automation and observability needed to manage them.
Security, cost, and scale are no longer separate concerns. DevOps helps teams address them with confidence.
Final Thoughts
DevOps is not a trend, again, it is a mindset. It is about building better software by working together, automating wisely, and learning from failure. DevOps encourages teams to break down silos, take ownership of their work, and create systems that are resilient, scalable, and maintainable.
It’s not about avoiding mistakes, it’s about learning from them. It’s not about following a strict set of rules, it’s about evolving your practices to meet the needs of your team and your users. And it’s not about automating everything, it’s about making the right things do the right things.


