The history of our industry seems to move in circles, or better yet, in a spiral that always comes back to the same point but at a higher level. Back in the day, things were simple: the software we built was a lonely inhabitant, an island living and dying inside a single computer. Those were the times of local databases and physical file transfers, where control was absolute but collaboration was an artisanal struggle.
Soon after, we learned to weave networks. We moved from the isolated PC to the Local Area Network (LAN), where software started to breathe in groups within the office. We still held the keys to the kingdom: the server was right there, in a small air-conditioned room at the end of the hall. It was an era of physical trust, where we knew exactly which cables our bits were traveling through before hitting a coworker's terminal.
Then came the great migration: The Cloud. That massive "swarm" of invisible servers that promised us infinite scalability and access from anywhere on the planet. As developers, we got used to "uploading everything" as the new normal. We outsourced the infrastructure and, almost without realizing it, we also started outsourcing data sovereignty. If it was in the cloud, it was modern; if it was local, it felt old school.
However, with the AI explosion, we’re witnessing a fascinating twist. While most people interact with AI through massive cloud services, in the corporate and high-performance world, a quiet but powerful move back to local has begun. The reason is simple and critical: security and privacy. Companies have realized that sending their trade secrets, internal processes, and "know-how" to an external LLM is like leaving the keys to the vault at a hotel’s front desk.
I’ve been living this transition firsthand, installing RAG (Retrieval-Augmented Generation) systems in business environments. It’s an extraordinary experience to see a local LLM, running on the company’s own infrastructure, start interacting with internal data and processes without a single byte leaving the network. The way these models talk one-on-one with employees, advise managers, or analyze critical metrics for CEOs is nothing short of revolutionary.
To be clear, when I talk about installing local AI for businesses, I’m not talking about a generic chatbot answering basic questions. I’m talking about models with 32 to 70 billion parameters running on servers that a mid-sized company can actually afford and maintain without needing a fleet of data scientists. In independent benchmarks, these models are now going head-to-head with the big cloud services, something that wasn’t even possible two years ago but is now a tangible reality.
To put it in a concrete technical context for us: a server with two GPUs of 24 GB VRAM each (hardware well within reach for a company) can run a quantized 70B-parameter model smoothly. If we go up a notch, four professional GPUs like the NVIDIA L40S or RTX 6000 Ada can serve that same model to multiple simultaneous users within the organization with latencies that feel instant. This isn’t science fiction; it’s what I’m installing and seeing in action every day.
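For anyone who wants to sanity-check that sizing, here is a rough back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: the function names are my own, the 4-bit quantization level is assumed (the exact level depends on the format you pick), and the 20% reserve for KV cache and runtime overhead is an illustrative guess, not a measured figure.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GB (10^9 bytes).

    One billion parameters at 8 bits per weight take about 1 GB,
    so the estimate is simply params_billion * bits_per_weight / 8.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         gpus: int, vram_per_gpu_gb: float,
         reserve_fraction: float = 0.2) -> bool:
    """Return True if the weights fit in the combined VRAM of all GPUs.

    reserve_fraction is an ASSUMED share of VRAM kept free for the
    KV cache, activations, and runtime overhead. Real needs vary with
    context length and the number of concurrent users.
    """
    usable_gb = gpus * vram_per_gpu_gb * (1 - reserve_fraction)
    return weights_gb(params_billion, bits_per_weight) <= usable_gb


if __name__ == "__main__":
    # 70B model at an assumed 4 bits per weight on two 24 GB GPUs
    print(weights_gb(70, 4), "GB of weights")
    print("fits on 2 x 24 GB:", fits(70, 4, gpus=2, vram_per_gpu_gb=24))
    # Same model at 8 bits per weight on the same server
    print("8-bit fits on 2 x 24 GB:", fits(70, 8, gpus=2, vram_per_gpu_gb=24))
```

Under these assumptions, the 4-bit weights come to about 35 GB, which leaves some headroom inside 48 GB of combined VRAM, while the same model at 8 bits would not fit. Stepping up to four 48 GB cards mostly buys room for longer contexts and more simultaneous users rather than just a bigger model.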
While the AI-in-the-cloud hype is undeniable because of its ease of use, there’s a major trend toward going back to local to protect the most valuable asset: information. For us as a community of software developers, I think it’s an exciting challenge to integrate these capabilities into our development logic, proving that the most advanced tech today isn’t necessarily the one furthest away, but the one we manage to master and secure right at home.




