Developer and refugee from Reddit

  • 1 Post
  • 18 Comments
Joined 3 years ago
Cake day: July 2nd, 2023

  • Very serious. Your personal amount of usage means nothing at all in this conversation. It is entirely about tokens per watt. The amount of energy the memory operations involve scales incredibly well when people are accessing the same object in memory simultaneously. Last I looked it was around a 10x difference for the same model’s efficiency.

    Hold up. Are you talking about caching? Because if you are… yeah. That has nothing to do with the model and everything to do with the service layer around the model. The same service layers can be - and have been - implemented in tools like Lemonade Server, llama.cpp, Ollama, etc.

    And I really do want to know your sources.

    Mine say GPT 5.5 is probably using quite a lot more than 0.34 Wh per query (0.34 Wh is what Sam Altman claimed for the then-current version of GPT in June of 2025, but he hasn’t released numbers since then and no one has done an independent analysis). With Claude, an independent estimate from last year pegged Sonnet at 0.8 Wh for a short prompt, 2.8 Wh for a medium one, and 5.5 Wh for a long one. Current numbers are, again, almost certainly much higher. And just for fun, there’s DeepSeek (which I’ve never used and never would use), with the reasoning-tuned DeepSeek-R1 hitting a whopping 29 Wh for a complex query.

    Meanwhile, small, open models are probably in the 0.07 - 0.2 Wh per query range, depending on the model, the hardware it’s running on, and the nature of the query. Of course, there are much weightier open models too, with ones like Llama 3.1 405B using about 9 Wh for a medium-length prompt. On the other hand… who is going to run that on their local machine?
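
    To put those figures side by side, here is a rough back-of-the-envelope sketch. The per-query numbers are just the estimates quoted above (nothing here is measured), the small-model entry uses the top of the 0.07 - 0.2 Wh range, and the 50 queries per day is a usage level I made up purely for illustration.

```python
# Per-query energy estimates in Wh, copied from the figures quoted in this comment.
# They are third-party estimates or vendor claims, not measurements.
WH_PER_QUERY = {
    "small open model (top of range)": 0.2,
    "GPT (Altman's June 2025 claim)": 0.34,
    "Claude Sonnet, medium prompt": 2.8,
    "Llama 3.1 405B, medium prompt": 9.0,
    "DeepSeek-R1, complex query": 29.0,
}


def daily_energy_wh(wh_per_query: float, queries_per_day: int) -> float:
    """Total energy in Wh for a given number of queries per day."""
    return wh_per_query * queries_per_day


def ratio_to_baseline(wh_per_query: float, baseline_wh: float) -> float:
    """How many times more energy one query uses than the baseline query."""
    return wh_per_query / baseline_wh


if __name__ == "__main__":
    QUERIES_PER_DAY = 50  # hypothetical usage level, not from any source
    baseline = WH_PER_QUERY["small open model (top of range)"]
    for name, wh in WH_PER_QUERY.items():
        total = daily_energy_wh(wh, QUERIES_PER_DAY)
        ratio = ratio_to_baseline(wh, baseline)
        print(f"{name}: {total:.1f} Wh/day ({ratio:.1f}x the small model)")
```

    Even with the small model pinned at the top of its range, the quoted Sonnet figure for a medium prompt comes out an order of magnitude higher per query. Whether the real numbers look like this is exactly what I’d like sources on.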

    Look… If I’m wrong, and using local models the way I do - sparingly and infrequently - really does consume more electricity than using Claude Code, I want to know. I have no problem whatsoever with eschewing AI models entirely, since I despise all of them. But given how tight-lipped OpenAI and Anthropic are about energy consumption per average prompt, and what independent analyses have estimated, I am highly skeptical that they are acting as some sort of paragons of environmental stewardship.


  • You’re probably burning more energy turning it off and on again. It doesn’t really use any noticeable power sitting idle.

    I am absolutely not burning more energy than a frontier model by doing things like putting my laptop to sleep or shutting down unused services when I want to conserve battery power.

    Anyway, a direct comparison would be pretty difficult because your model is probably tens of billions of parameters, not over a trillion.

    True.

    Energy consumption per output token will probably be a bit higher for the frontier models but something that people have found is that higher quality models often need fewer tokens to achieve the same goal.

    That’s actually not true. In fact it’s much the opposite. Frontier models churn through tokens at a much higher rate, because of their higher complexity and higher number of parameters. Research is still new on this, but having a frontier model analyze your code files versus a small, local model for the same task seems to be enormously wasteful. If you must use a frontier model for something, have it do that work after receiving the output from an agent using a small model to read and summarize your code.
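
    Roughly what I mean by that last part, as a minimal sketch. The `small_model` and `frontier_model` callables are hypothetical stand-ins for whatever clients you actually use (they are not a real library API), and the prompt wording is invented for the example.

```python
from typing import Callable, Dict

# Hypothetical model interface: takes a prompt string, returns the reply text.
ModelFn = Callable[[str], str]


def summarize_then_escalate(
    files: Dict[str, str],
    task: str,
    small_model: ModelFn,
    frontier_model: ModelFn,
) -> str:
    """Let a small model read the code; send only its summaries to the frontier model."""
    summaries = []
    for path, source in files.items():
        # The small, local model is the only one that sees the raw file contents.
        summary = small_model(f"Summarize this file for another engineer:\n{source}")
        summaries.append(f"{path}: {summary}")

    # The frontier model gets the task plus the short summaries, not the source.
    prompt = f"Task: {task}\n\nFile summaries:\n" + "\n".join(summaries)
    return frontier_model(prompt)


if __name__ == "__main__":
    # Stub models so the sketch runs without any network access or GPU.
    def fake_small(prompt: str) -> str:
        return "defines one helper function"

    def fake_frontier(prompt: str) -> str:
        return f"(frontier model received {len(prompt)} characters)"

    demo_files = {"math_utils.py": "def add(x, y):\n    return x + y\n"}
    print(summarize_then_escalate(demo_files, "suggest unit tests", fake_small, fake_frontier))
```

    The design choice is simply that the expensive model never touches the raw files, so its input size scales with the summaries rather than with the codebase.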

    Plus how many times do you re-prompt your local model vs Claude Fable or Opus for example to get the desired result?

    …Almost never? I’m not a fan of letting AI do much of ANY of my coding, because it will inevitably bloat my codebase with garbage regardless of which model I use. So I severely restrict my model usage to simple, clearly-defined, narrow-scoped tasks that can save me a bit of time, and that’s it. With guardrails and discipline like that, I barely ever have the need to re-prompt.







  • Hey, before I say anything else, I just wanted to tell you I’ve been enjoying this conversation. It’s nice to be able to disagree with someone on something without it becoming a religious war. :)

    I’d say they’re the ones most likely to get screwed on this but I just don’t see how a tech titan drawdown causes the whole world to go into a great depression

    I don’t think it will cause a great depression, but I do think it will cause a massive recession. If you look at the S&P 500, 35% of it is tied up in AI-related stocks. If AI crashes out, that’s a truly massive hit to the domestic economy, and there are certainly going to be ripple effects throughout the world.

    Honestly, the best thing the world could do right now is something a lot of countries are already scrambling to do because of Trump: Decouple their economies from the United States.

    Maybe I’m just too optimistic but if captain dickfingers giving the world a tariff hit, a global pandemic and “the worst oil crisis ever” can’t put a dent in things, I just don’t think a pullback on AI spend is going to do it (unless there’s a ton of other structural problems in the global economy that bloggers will point out how obvious it was only once it fails)

    Dickfingers! Excellent nickname for the orange sack of shit. But I’ll remind you of the old saying: The market can remain irrational for longer than you can remain solvent. And right now, the market is unbelievably irrational.

    The valuations and market caps of these companies are completely disconnected from their profitability (or extreme lack thereof). Among all of them, only NVIDIA seems to be making actual money, and even with them there are some indications of very esoteric math being involved, too. They’re “investing” money into the AI model providers and then having the AI model providers use that money to buy NVIDIA GPUs. Then they book those sales as profit, even though it’s the same money they just invested.

    It’s not something that can continue indefinitely. Either the model providers have to show that their unit economics work - by putting actual profit into actual bank accounts - or they will eventually hit a point where no matter what the funny math on their books says, they don’t have the cold, hard cash to pay their bills.

    Remember how Uber was about to go broke? Not going to lie, this literally feels like just more circlejerking about companies that are spending a lot and will go broke any day now…

    Maybe. But there’s a material difference between Uber / Spotify, and the AI companies. Did you ever look at the SEC filings to go public for either of them? I actually did. They had detailed plans for how to eventually achieve profitability. Uber went public in 2019, and wasn’t profitable until 2023, but they had a solid roadmap for getting to that point, and now they’re consistently profitable.

    Our ability to look at profit plans is limited at the moment. The only publicly traded AI company right now is SpaceX, and their filing is… hilarious. Grok will never be profitable. Not remotely. And their filing is full of WeWork-style insanity. They don’t have anything remotely like a roadmap to profitability. Their stock price is 100% speculation. Yet it keeps managing to tick up over $170, at least for the time being.

    Speaking of WeWork, I think they’re the model SpaceX, OpenAI, and Anthropic are following. The company raised $12.8 billion in financing, and ultimately reached a valuation of $47 billion, mostly from investment by SoftBank - the same investor funding a lot of AI companies now, and which owns 11% of OpenAI.

    But it never made profit. It never had a roadmap for profit. It never had any means of bringing in income higher than its operating costs. WeWork declared bankruptcy a few years ago.

    Reddit and Lemmy were right about WeWork. So the question now is if the AI model providers are as economically unviable as WeWork always was, or if there’s somehow a path to profitability like Uber or Spotify. SpaceX’s filing doesn’t fill me with hope on that, and the fact that both OpenAI and Anthropic are delaying their own moves to go public doesn’t fill me with hope, either. It’s not the behavior of a company with unit economics that work.

    Side note: None of this is to say that Claude Code or Codex or any of their coding tools are bad. I just don’t see how they can operate them profitably. If you’ve got a good product, but the only way to get companies to adopt it is to sell it at a loss, you will eventually fold.

    Sorry if I hold a grudge and you are not as dim-witted and not good with computers as the average lemmy user, it’s hard to shake hearing the same prophecies I’ve been hearing about other high spending companies for decades

    That’s fine. It’s actually why I mentioned WeWork as a counter-example. Because you’re right, the group-think on both sites (Reddit and Lemmy) can blind people.

    On inference alone, both companies project profitability

    https://aiafterhours.substack.com/p/openai-vs-anthropic-the-121-billion

    https://www.tradingkey.com/analysis/stocks/us-stocks/261756528-anthropic-openai-ipo-tradingkey

    That’s true. They project it. Using non-GAAP accounting, and without letting anyone know in detail how much computing for inference costs. Their claims resemble WeWork’s claims, pre-bankruptcy. Bluntly, I don’t believe them.

    And I really have to emphasize this: Focusing on the cost of computing for inference all by itself and excluding all the other costs of the business is just crazy, even if inference when looked at by itself can be theoretically profitable.

    They are very clearly in the business segment, I’ve heard nothing like this for Qwen or GLM or local hosted models, in fact when self hosting was brought up the devs mentioned “??? you can’t self host claude ??”

    No argument on Claude being in the business segment, that is absolutely true. But at my company at least (again, this is admittedly an anecdote), the skyrocketing cost of tokens has us working on implementing local models and models on the network edge. We’re also severely restricting token budgets and having devs do as much as they can by hand.

    Maybe your company hasn’t reached the point where tokenmaxxing with Claude is frowned on, but the costs are enormous. And the thing is, they have to be enormous for Anthropic to have any hope of ever recouping their losses. It’s not like Claude is a loss-leader, it’s their only product.

    This is like saying I work with actual system admins, they say that Windows is terrible and that it’ll be the year of the linux desktop any day now

    You need to vibe check the office and devs vs the engineers

    The vibe in the office (and what I’m reading) is that claude is gold

    I do. I am one of the devs in the office. My boss currently loves me, because my token cost is $0 and I still get my code written. I do most of it by hand, and some of it with Qwen running on my local system (I have a workstation with a good enough GPU to run it). I have it wired up so GitHub Copilot uses it. And I’ve been teaching other developers how to do it. Once our current hardware refresh cycle is complete, our token budget is predicted to drop to almost nothing. We’ll probably still use Claude here and there, but the bulk of our work will be doable without it.

    In terms of quality, Qwen 3.6 is at about the same level Claude Opus was a few months ago. I don’t see how Anthropic can compete with that over the long term.

    That said I spoke to a normal person the other day and they hadn’t heard of claude at all, blew my mind

    That doesn’t actually surprise me. Claude is still a niche product when it comes to general consumers. As far as the public is concerned, AI = ChatGPT (and the annoying Google AI summary).



  • We need to start with voting in such overwhelming numbers this November that Democrats get the House and Senate - ideally with a huge proportion of actual progressive leftists. It needs to be a large enough majority to completely stop any further actions on Trump’s part.

    Next, there need to be congressional hearings that expose everything - fucking everything - the orange sack of shit has done to corrupt the government. Make it stay in the news 24 hours a day. Put federal officials under constant congressional subpoena. Make it impossible for the remaining Republicans in the House and Senate to avoid hearing about it.

    Finally, when those Republicans have had enough, move to impeach and remove both Trump and Vance. Get both those motherfuckers out of there. Let the new Speaker of the House become acting president, and start cleaning up the mess.

    At that point, start packing the court. Boost the number of justices to 15. Implement term limits for SCOTUS as well, and make them retroactive. Alito, Thomas, and Roberts: Gone.

    If this happens, maybe in 2028 we can start really repairing the damage.



  • nvidia, microsoft, google, oracle, etc are all profitable

    • NVIDIA is a hugely profitable hardware company. Not an AI model provider. They’re selling the shovels for the gold rush. And their profits will tank as soon as one of the model providers folds, though pivoting back to consumer hardware is definitely an option for them.
    • Microsoft is a hugely profitable business and consumer software company. Again, not an AI model provider. They have trained a couple models, but have made no profit on serving them. Their add-on “Copilot” services aren’t profitable (hence the recent enormous price hikes for GitHub Copilot, which have resulted in a bunch of companies scaling back their AI usage). All of Microsoft’s profit comes from software sales.
    • Google is a hugely profitable software company as well - but again, not with AI. Their model, Gemini, isn’t remotely profitable, and has similar costs to maintain as Anthropic and OpenAI see for their models. Heck, they just did a fundraising round for the first time in years to support it, which is NOT a sign of a healthy AI business.
    • Oracle is also profitable in software, but they’ve staked their company on hardware roll-outs supporting data centers for OpenAI. They’ve booked future profit from OpenAI owing them almost a trillion dollars. If OpenAI can’t pay that bill when the time comes, Oracle is completely fucked.

    inference is profitable, training is not

    When I don’t include the costs of doing business, my business is profitable! That’s silly. Inference might be very slightly in the green now when viewed by itself (although that’s deeply questionable; no actual GAAP accounting has shown it to be so). But since training is an ongoing expense that frontier model providers have to constantly engage in, their companies are - and will remain - very deeply in the red.

    And without seeing GAAP accounting showing where all the money goes in support of inference, I am highly doubtful that it’s profitable.
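
    Here’s the same point as arithmetic, in a tiny sketch. Every number in it is a made-up placeholder in arbitrary units; none of them come from any company’s books, which is kind of the whole problem.

```python
def inference_margin(revenue: float, inference_cost: float) -> float:
    """Profit if inference compute is the only cost you count."""
    return revenue - inference_cost


def net_income(
    revenue: float,
    inference_cost: float,
    training_cost: float,
    other_costs: float = 0.0,
) -> float:
    """Profit once ongoing training and everything else is counted too."""
    return revenue - inference_cost - training_cost - other_costs


if __name__ == "__main__":
    # Hypothetical placeholder figures, arbitrary units.
    revenue, inference, training = 100.0, 80.0, 60.0
    print("inference only:", inference_margin(revenue, inference))
    print("with training: ", net_income(revenue, inference, training))
```

    A positive first number and a negative second number can both be true at once. Only the second one tells you whether the business is profitable.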

    training is what the majority of data centre spend is going towards

    if they want to be profitable pull back the training but right now they are competing for market share

    They can’t. Ever. Pulling back on training means allowing model drift. You need to understand that models are obsolete the moment they’re released. Their training data is set in stone. New version of TypeScript ships? Some celebrity dies? Big election happens? The model not only doesn’t know about any of it, it can’t be updated. The best you can manage is throwing MCP and RAG at it in the hopes that the model will pay attention to it, but the point of diminishing returns on that arrives almost instantaneously. You have to train. Constantly.

    feel free to look back at all the times lemmy predicted the end of spotify because it wasn’t profitable, now they turn around and cry it’s making money

    Bad comparison. Spotify has already been a profitable, publicly-traded company for years.

    And - this part’s important - I’m not Lemmy. The platform we’re having this conversation on has nothing to do with whether or not the AI model providers are profitable.

    At work nobody is talking like this, everyone is talking about claude and it makes sense, it’s the best thing since vscode

    Anecdotes aren’t data. But as long as we’re swapping anecdotes, here’s mine: I work with actual machine-learning engineers. They’re the ones who bag on Anthropic and OpenAI the most. And they use Qwen, Gemma, and a few other small, open-source, open-weight models. Have you looked at Hugging Face? Its community is huge, and growing daily. No one wants to be locked in to Claude Code or any other proprietary development tool when the service has been unstable and the pricing has become ridiculous in their desperate attempts to become profitable.

    The cost for using Qwen tokens is $0, no matter how many tokens they use.

    You say no one talks like this… Are you sure you’re listening?


  • Anthropic is already profitable if you take out the enormous spend they have on training, which if the bubble bursts would leave them as the number 1 ai provider, it’s also insanely in demand and has trouble keeping up with its current product, they also have several products mythos etc lined up

    First off… Why in the world would you take out the enormous spend they have on training? Training is an ongoing expense, not a startup expense. If your expenses exceed your income, then you’re not making a profit.

    Secondly, they had one quarter in which they reported (using non-GAAP accounting) a very slight amount of profit. That same quarter, SpaceX gave them a massive - and temporary - discount on rented compute.

    We don’t have any reason to think they’re actually profitable.

    I doubt it, I think it’d be closer to liberation day tariffs or the oil crisis, it’ll go down for a bit, many articles will be written about how this is the worst thing ever then 6 months later it’ll be back up again

    You’re way more optimistic than I am. If OpenAI and Anthropic crash, there are a huge number of businesses that have built themselves around their products, and those will crash, too. And I think you’re downplaying the damage the tariffs have already done.

    As said all the major players in this game are super profitable major companies, that won’t change

    Again, not true. OpenAI is not profitable. Anthropic is almost certainly not profitable. Grok from SpaceX is not profitable. Google is profitable, but not from Gemini. Microsoft is profitable, but not from Copilot.

    No business that is built entirely on AI is profitable. Not one.

    Look… No one’s arguing that the coding tools built around AI are entirely useless. They’re not (although their capabilities are way, waaaaay over-hyped). The problem is an economic one: Serving up AI models cannot be profitable. There’s just no way, especially now that we have small AI models that can be run on local workstations, and offer similar performance to the frontier models.

    Qwen, running in a well-designed harness such as OpenCode, with a carefully written AGENTS.md file, is of comparable performance to at least Claude Sonnet, and possibly Claude Opus. All without the massive, ludicrous infrastructure requirements.

    How is Anthropic supposed to compete with that? Sure, you can probably get something useful out of Opus faster, but at the cost of thousands of dollars. Using Qwen and similar local models is free.