One of Spez’s answers in the infamous Reddit AMA struck me:

Two things happened at the same time: the LLM explosion put all Reddit data use at the forefront, and our continuing efforts to reign in costs…

I am beginning to think all they wanted was to get their share of the AI pie, since we know Reddit’s data is one of the major datasets for training conversational models. But they are such a bunch of bumbling fools, as well as being chronically understaffed, that the whole thing exploded in their faces. At this stage their only chance of survival may well be to be bought out by OpenAI…

  • @spoonful@beehaw.org
    7 · 1 year ago

    Reddit data is public and can be easily web scraped. Reddit doesn’t own it. Spez is just throwing random memes in to distract people.

    • @gotofritz@beehaw.org (OP)
      1 · 1 year ago

      I am sorry, but you don’t know what you are talking about. These things are regulated by legal documents; you don’t just wake up one morning and say “trust me bro, their data is public”.

      If you go and read their T&Cs, they explicitly state that scraping is forbidden without prior written consent. They only allow access to their data via APIs, which of course they charge for.

      The fact that it can be easily scraped is neither here nor there; if they catch you, they can sue you.

      • @spoonful@beehaw.org
        2 · 1 year ago (edited)

        Nah, Terms of Service are not enforceable through a browsewrap agreement in the US and most of the EU. You can’t implicitly agree to a legal document just by looking at something.

        Check out the hiQ Labs v. LinkedIn case, which went to the 9th Circuit and set the precedent for this. LinkedIn lost.

  • @Crotaro@beehaw.org
    6 · 1 year ago

    Surprisingly tough question. On one hand, I don’t think every ex-Reddit user should go “Nah, it’s too late, fam” because then it wouldn’t even make sense for the devs to make any changes if they had no chance of regaining their userbase. On the other hand, I feel like even if they made really good changes, I would still always be on edge waiting for the bad thing to happen (pretty much what I imagine an abusive relationship to be like).

  • @schmurian@beehaw.org
    7 · 1 year ago

    Honestly, I think so. It looks like all big tech collected enough data from us, so that they now can create AI models from it. Like a snapshot of humanity for some years

  • @EvilColeslaw@beehaw.org
    4 · 1 year ago (edited)

    I think this is the main reason for the insane prices, but it could have easily been avoided. They don’t need to have one price class for every type of use of their Data API. They could have easily had one rate for LLM and other AI training uses and another for third-party client applications. I feel like at some point they realized they’d rather just kill the third parties while they were at it, and this seemed like the logical moment.

    • @gotofritz@beehaw.org (OP)
      5 · 1 year ago

      Yeah, one of the other answers to the AMA was “we are not profitable yet, unlike the 3rd party app devs…” - that is something that wouldn’t sit well with any investor I know.

  • @Klinkertinlegs@beehaw.org
    4 · 1 year ago

    I think the LLM wave hit, they saw dollar signs, and they made a change without thinking it through. Then they were backed into a corner between money and avoiding outrage, and greed won out.

  • @whofearsthenight@beehaw.org
    8 · 1 year ago

    Could they have something to do with it? Yes, for sure. But the thing is that they didn’t have to do any of this the way they did. They could have made an API plan that allowed third-party apps to still exist/thrive, and also charged big companies that just want to use reddit to train LLMs. Change the pricing/terms based around this idea. They deliberately went after third-party apps, and then doubled and tripled down on it in the face of massive backlash. If spez was competent, he would have been able to better pivot this conversation and make it about training LLMs for megacorps, but he didn’t, and even then it would have still been bullshit that is easily seen past.

  • @nob0dy@beehaw.org
    4 · 1 year ago

    They could have created better licensing models. It does rely on people honoring the agreements, but besides countries that disregard IP I think it’s a viable model. Their business is social media, not curating datasets.

    • @gotofritz@beehaw.org (OP)
      2 · 1 year ago

      They could have, probably / maybe, but they are quite inept. What is social media if not a giant dataset?!?

    • @gotofritz@beehaw.org (OP)
      1 · 1 year ago

      It’s 13 minutes of rehashing the same points everyone has been making to death. And it doesn’t even mention LLMs.

  • @SkyNTP@lemmy.ml
    8 · 1 year ago

    Reddit’s business model was not founded on selling LLM data. Reddit got greedy and decided to change their business model to cash in on an unexpected revenue stream. What was also unexpected (to Reddit) is that you cannot cater to social media users and monetize their data for LLM training effectively at the same time. And now Reddit will have neither, and will die just like all other businesses that adopt enshittification as a core operating procedure.

    Let this be a lesson to them and all that follow: do not let your greed make you blind to the consequences of your actions.

    • @gotofritz@beehaw.org (OP)
      1 · 1 year ago

      Does it matter what Reddit’s business model was founded on? Businesses respond to changing conditions all the time and pivot.

      “They got greedy” seems a really naive way of looking at it. They are a business; that’s what businesses are all about. Additionally, they are a business which is NOT profitable, and they need to change things to survive now that the era of low interest rates has come to an end. The real issue is that they are so inept, IMHO.

      I find the word “enshittification” so cringe.

  • @rubythulhu@beehaw.org
    7 · 1 year ago

    Yup. AI consumers are more profitable than 3rd party apps. Why focus on tiered pricing when you can just name a price point that everyone has to pay but only huge AI companies are willing to?

    Reddit gets their content for free. Reselling it at a high price to AI/ML consumers is an easy way to turn free content into profit with almost no effort.

  • @IggyTheSmidge@lemmy.blahaj.zone
    6 · 1 year ago

    I think that was definitely the impetus - I first read about the changes in this article back in April: https://www.theregister.com/2023/04/18/reddit_charging_ai_api/

    The closing statement is interesting:

    The spokesperson we talked to also wanted to make clear the Data API was still freely accessible for appropriate use cases through the Reddit developer platform; hopefully app developers and other small-scale operators won’t have any surprises ahead this summer.

    I suspect they ran the numbers and started seeing dollar signs - they don’t care about the third-party apps (which don’t make them any money directly), they’re just trying to cash in on Microsoft etc.

    I have a sneaking suspicion they’re going to end up back-pedalling, but it will be too little, too late.

  • @Senseibull@lemmy.ml
    6 · 1 year ago

    It is, but reddit don’t own the content on their site according to their TOS; posters merely grant them a license to redistribute it. So it’s not really their call to shut off ChatGPT scraping; it should be a community decision.

    • @gotofritz@beehaw.org (OP)
      5 · 1 year ago

      “Merely” - the TOS basically grant Reddit the ability to do what the hell they want with it, LOL

      When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit.

      And furthermore

      You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.