Skip to main content

Can Investors Trust AI Sales Figures? Asks Wall Street Journal Opinion Piece

6 days 8 hours ago
A Wall Street Journal opinion piece warns of "a troubling trend" in AI's growth. "Rather than selling software, some AI companies are paying their partners to use it." It cites OpenAI's $1.5 billion joint venture with private-equity firms, Anthropic's $200 million contribution to a private-equity firm joint venture, and Google's $750 million subsidization of Gemini's adoption by consulting firms. "These agreements muddy the distinction between a company's sound growth trajectory and artificial financial engineering." [T]he scale and structure of the recent AI deals go beyond standard incentive mechanisms... When a seller pays customers to buy its products, it is unclear if its revenue growth reflects vibrant demand or a willingness to accept subsidies. Slashdot reader destinyland writes: This warning comes from a prominent figure in the investing community. For six years Robert Pozen was chairman of America's oldest mutual fund company, after five years at Fidelity. An advocate for corporate governance, he's currently a lecturer at MIT's business school (and the author of the book Remote Inc.: How to Thrive at Work...Wherever You Are). "As AI companies prepare initial public offerings, investors should scrutinize their numbers closely," Pozner writes, warning about "time-limited financial support". "In evaluating AI sales figures, analysts should consider the distorted incentives that the recent financing deals create," writes Pozner: Private-equity firms, enticed by promised returns, might demand rapid rollouts of AI products, rather than ensuring their orderly and safe development. Portfolio companies of private-equity firms may embrace AI tools not because they are needed but because adoption is mandated by their owners. Consultants may favor one set of AI models based on the subsidy instead of the merits. If guarantees and subsidies are major factors in the rapid adoption of AI tools, investors should be skeptical of AI companies' revenue projections. Many of their customers enticed by consultants will stop paying full price when the financial incentives are gone. Many of the portfolio companies of private-equity firms could back away from selected AI tools once these joint ventures expire. The challenge with evaluating these AI financing deals is the lack of transparency. At present, AI vendors don't separate revenue driven by subsidies or joint ventures from standard sales. The lesson from the telecom debacle is that financial engineering can obscure, for years, the difference between real customer demand and demand driven by incentives. When AI companies begin to finance their own product distribution, guaranteeing returns to investors and subsidizing sales, it's a signal for investors to dig deeper. Investing in an AI company? Ask what percentage of enterprise revenue is coming from subsidized channels or joint ventures, Pozner suggests. And the renewal/retention rate for customers not supported by subsidies or joint ventures...

Read more of this story at Slashdot.

EditorDavid

Hope your holiday was horrid: You botched the last thing you did before leaving

6 days 8 hours ago
That box-full-of-old-tech-you-should-probably-have-thrown-out-but-kept-just-in-case got a techie in trouble

Who, Me?  Monday is upon us once again and The Register hopes that when you arrive at your desk, all is well. We offer that sentiment because we use the first day of the working week to bring you a fresh instalment of "Who, Me?" – the reader-contributed column in which you confess to making mistakes, and explain how you survived them.…

Simon Sharwood

Empty Pockets

6 days 9 hours ago

If you've seen one developer recounting how their AI agent deleted production, you've seen them all. They're mostly not interesting stories. It's like watching someone speeding through traffic on a motorcycle without a helmet: the eventual tragedy is sad, but it's unsurprising and not an interesting story to tell. It's not even interesting as a warning: the kind of person who speeds on a motorcycle without a helmet isn't doing so because they don't understand the danger. They've just decided it doesn't apply to them.

But the founder of PocketOS, Jer, recently shared how- whoopsie!- their AI agent deleted production. There's a lot of ingredients that go into this particular disaster, which I think makes it interesting, because the use of a poorly supervised AI agent is only one ingredient in this absolute trainwreck of a story.

PocketOS is a small company that makes software for rental companies to manage reservations. Car rentals are a big customer, but the tool is more general than that. They manage all of their infrastructure via a service called Railway. Railway is a pretty-looking GUI tool for automating your deployments and the target environments.

PocketOS also is heavily adopting Cursor wrapping around the Claude model. They've paid big bucks for the top-end model offered. Many of their components, like Railway, offer MCP services so that their LLM can do useful things. They're using the Claude LLM to automate as much as they can.

So far, this is all a pretty typical setup. They pointed Claude at their code and gave it a "routine" task, and sent it to work. It toddled through the problem and encountered a credential issue. It "decided" that the fix for this issue was to delete a storage volume and recreate it. It scanned through the code to find a file containing an API key, found it, and then sent a POST request via cURL to delete the volume in question.

Jer writes:

To execute the deletion, the agent went looking for an API token. It found one in a file completely unrelated to the task it was working on. That token had been created for one purpose: to add and remove custom domains via the Railway CLI for our services. We had no idea — and Railway's token-creation flow gave us no warning — that the same token had blanket authority across the entire Railway GraphQL API, including destructive operations like volumeDelete. Had we known a CLI token created for routine domain operations could also delete production volumes, we would never have stored it.

Wait, the tokens you create in Railway all have god-level privileges? That sounds like a terrible idea. And you were storing the token in your code? We'll come back to this in a moment, but sure, this is bad, but you can just restore from backup, right?

The volume was deleted. Because Railway stores volume-level backups in the same volume — a fact buried in their own documentation that says "wiping a volume deletes all backups" — those went with it. Our most recent recoverable backup was three months old.

Oh. Oh no.

Now, I don't think it's literally true that Railway is storing your backups literally in the same volume as the thing they're backing up. I certainly hope not. But they do apparently delete your backups when you delete the volume associated with them. Which is a choice, certainly. A bad one. And one that they documented, according to Jer. It was, in his words, "buried" in the docs.

But let's go back to the tokens for a moment. I am not a Railway user, but I checked out the tool and went through the process of creating a project token. And while no, Railway does not give you big red flags warning you "Hey, this token can do ABSOLUTELY ANYTHING", it also never gives you an opportunity to scope the token. Which, I don't know about you, but the first thing I do when I create an authentication entity is try and figure out how to control its authorizations, because I assume at the start it doesn't have any. That'd be sane.

The scoping happens when you create the token, depending on what context you're in when you do it. It's only a handful of scopes, and no fine grained permissions on API keys at all. The lowest level is "Project" which can do anything to a single environment- which does mean that even if you, like Jer's team, wanted to have a script that changed some DNS settings in production, that same key could be used to delete volumes in production. Which means you really really want to take care of that key, and you certainly don't want to leave it where some junior developer or bumbling AI agent can find it.

Jer also complains that Railway shouldn't allow an API call to take destructive actions without more protections, like forcing someone to type in the name of the thing being deleted or sending a confirmation email, or something. This, I'm more skeptical of. Most cloud providers don't offer anything like this in their APIs, at least that I've seen, because on a certain level, if you're invoking the API with the proper credentials, that's a big enough hill to climb that we can assume you've intended your action. The correct way to protect against this is properly scoped keys and keeping those keys secure and not just lying around in plain text. There's a certain aspect of understanding that you're using a potentially dangerous tool and need to take the responsibility for safety into your own hands; while a table saw can easily take some fingers off, it's perfectly safe when used correctly.

This is all bad, but how can we make it worse? Well, Jer demanded that Claude "explain itself". In a section called "The Agent's Confession", Jer highlights that the agent is able to identify the explict rules that it failed to follow.

Read that again. The agent itself enumerates the safety rules it was given and admits to violating every one. This is not me speculating about agent failure modes. This is the agent on the record, in writing.

No, it is not the agent on record. I see this kind of thing a lot when people talk about LLMs. An LLM cannot explain its reasoning. It cannot go on "the record". It cannot confess to anything. While what it plops out when asked might be interesting, it is not an explanation. The only explanation is that it's a powerful statistical model trying to create a plausible string of tokens! It's simply looking at its context window and your prompt and trying to predict what it should say. It can tell you what rules it violated not because it understands the rules or knows it violated any rules, but because those rules are in its context window. If you ask it right, it'll confess to killing JFK and framing Oswald for the crime.

Jer then tries to ensure that Cursor takes some of the blame, pointing to Cursor's "guardrails" documentation. Except, here, the documentation is actually quite explicit about what those guardrails guarantee. If you're using a first-party tool, it will prohibit unsafe operations. When using 3rd party MCPs, like Railway's, the only guardrail is that it requires human approval for every action- unless you update your allowlist for that MCP. If you put them in your allowlist, the guardrails go away. Jer argues that tools should enforce more protection against LLM behaviors, but the problem with that is people- like the PocketOS team- turn those protections off. And like a lot of safety mistakes, they can get away with it all the way up until the point where they can't.

Jer follows this by listing off a pile of other times using Cursor has caused disasters, which isn't making the argument he thinks it is: yes, Cursor is dangerous, but those dangers are well known. It makes the choice to turn Cursor loose without strict supervision seem even more foolish.

Jer writes:

For now I want this incident understood on its own terms: as a Cursor failure, a Railway failure, and a backup-architecture failure that all happened to one company in one Friday afternoon.

It's also a PocketOS failure. It's a failure to properly assess the tools and environments you chose to use for your product. A failure to read and understand the docs for vital features, like *backups*. A failure to employ even the most basic safeguards. A failure to put a second's thought into key management- even if that key was only for DNS entries, you still shouldn't chuck it in source control. A failure to have a competent backup strategy. It's worth noting that they did restore from a three month old backup, which means they were at one point taking backups outside of Railway's volume setup. That was a wise decision. That they stopped is a failure.

The first rule of disaster retrospectives is that it's never one piece that's the failure. It's never one person's fault, one tool's fault, one vendor's fault. It's a systemic failure. Railway's keys should be finer grained. But also, you shouldn't leave keys lying around. Deleting backups when you delete the volume is a terrible idea, but having only one service for backups (that's also your primary site) is a terrible idea. Claude's ability to enforce its own guardrails should be better, but LLMs are notoriously dangerous about this: you should know better, and by your own words you did.

This is not an anti-AI post, or even a "get a load of this asshole" post. It is a "understand the damn tools you're using" post. Be critical of them. Don't trust them. Ever. Especially LLMs, because the worst part of an LLM is that it takes away the one thing computers used to be good at: predictable, deterministic behavior. But not just LLMs: don't trust your cloud provider, don't trust your infrastructure manager. Dig into them and understand how they work, and if they seem to complicated to understand, than they may be too complicated to trust.

Update: As pointed out in the featured comment below, Railway did finally get a backup restored. So they got their data back. Yay? From the post, Jer remains committed to making this a Railway issue and not a PocketOS issue.

[Advertisement] Picking up NuGet is easy. Getting good at it takes time. Download our guide to learn the best practice of NuGet for the Enterprise.
Remy Porter