Correct me if I’m wrong. I read the ActivityPub standard and dug a little into the Lemmy source to understand how federation works, and I’m a bit disappointed. Every server just has a cache and the ability to fetch something from another known server. So if you start your own instance, the network as a whole gains nothing until that instance has a significant audience (private instances or servers with no users, for example, add nothing). Are there any “balancers” that could put these empty instances to use? Should we promote (or create in the first place) a way to passively help Lemmy cope with such fast growth?

  • TinfoilBeanieTech@lemmy.world
    English · 4 upvotes · 3 years ago

    You are right. On the one hand, it’s a fairly naive distributed architecture (distributed systems are my day job), and it could have been done much better. On the other hand, the more important point is that it demonstrates an alternative to centralized platforms. We’ll learn a lot about usage patterns here, get new ideas, and either improve Lemmy or build something better from the ground up. Big thanks to Reddit for driving users this way to test scalability and give us much better knowledge of real usage.

    • Senator Bum Cuckets@sh.itjust.works
      English · 1 upvote · 3 years ago

      What makes a distributed system good that Lemmy hasn’t done? It seems like a pretty robust system to me, and the scaling issues seem to sit with the instance hosts themselves. Given Reddit’s experience, I don’t see where the issues are.

      • Terrasque@infosec.pub
        English · 3 upvotes · 3 years ago

        Disclaimer: I’ve only looked a bit at the protocols and high-level descriptions of how it works, and this is just my understanding of it. But it seems to track.

        Let’s take … Selfhosted@lemmy.world for example. Right now lemmy.world is the source of truth for it, which means that if you subscribe to it from a different host, say myawersomeinstance.com, that host first contacts lemmy.world, copies over the posts, and then subscribes to new posts. I’m actually not 100% sure whether lemmy.world contacts myawersomeinstance.com when there’s a new post or myawersomeinstance.com polls lemmy.world, but the point is that lemmy.world is the authority. myawersomeinstance.com also has the Selfhosted@lemmy.world data, but only as a copy; lemmy.world is the only authority. So if you post something, your server sends it to lemmy.world and waits for a reply. Then lemmy.world contacts every instance that has at least one user following the community to tell it about the new post. And that new post now exists in a few hundred databases.

        The problem is that the scaling is whack. Okay, you can have 5000 federated servers with users subscribing to Selfhosted@lemmy.world, but that means lemmy.world needs to update 5000 servers per post, there’ll be 5000x the storage used for that post, and ALL 5000 servers contact lemmy.world to get the new good stuff.
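
To make the numbers above concrete, here is a rough Python sketch of the single-authority fan-out as described in this comment. It is a toy model, not Lemmy's actual code: the `Instance` and `Community` classes, the `simulate` helper, and the `instanceN.example` domains are all made up for illustration, and it assumes the home instance pushes every post to every subscribing instance.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A hypothetical federated server holding copies of posts."""
    domain: str
    cache: list[str] = field(default_factory=list)


@dataclass
class Community:
    """A community whose home instance is the only authority for its posts."""
    home: Instance
    followers: list[Instance] = field(default_factory=list)
    deliveries: int = 0  # server-to-server pushes made by the home instance

    def publish(self, post: str) -> None:
        # The home instance stores the authoritative copy...
        self.home.cache.append(post)
        # ...and then pushes one copy to every subscribing instance.
        for follower in self.followers:
            follower.cache.append(post)
            self.deliveries += 1


def simulate(n_followers: int, n_posts: int) -> tuple[int, int]:
    """Return (pushes made by the home instance, total stored copies)."""
    home = Instance("lemmy.world")
    followers = [Instance(f"instance{i}.example") for i in range(n_followers)]
    community = Community(home, followers)
    for p in range(n_posts):
        community.publish(f"post {p}")
    copies = len(home.cache) + sum(len(f.cache) for f in followers)
    return community.deliveries, copies


if __name__ == "__main__":
    deliveries, copies = simulate(n_followers=5000, n_posts=1)
    print(f"{deliveries} deliveries, {copies} stored copies")
```

Both numbers grow linearly with the number of subscribing instances, and every delivery originates from the one home instance, which is the bottleneck being described.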

        Frankly, it’s a scaling nightmare. As for a different approach, you could use private/public keys: lemmy.world signs its updates, and the other instances are allowed to fetch the new data from each other. That would also allow more relaxed caching, since it would generally be cheaper to re-fetch the data. As it stands, you need aggressive caching because you don’t want lemmy.world to keel over and die from every server on the planet wanting to hear the latest and greatest posts all the time.
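
The signed-update idea can be sketched in a few lines of Python. This is only an illustration of the principle that a copy fetched from any peer can be checked against the origin's public key; it is not how Lemmy or ActivityPub work today. To stay within the standard library it uses a toy Lamport one-time signature built on SHA-256, which is an assumption made for this example: each key pair may safely sign only one message, so a real deployment would use a reusable scheme such as Ed25519 from a proper cryptography library. All function names here are hypothetical.

```python
import hashlib
import secrets


def _digest_bits(message: bytes) -> list[int]:
    """Return the 256 bits of SHA-256(message), most significant bit first."""
    digest = hashlib.sha256(message).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def generate_keypair():
    """Lamport keys: 256 pairs of random secrets and the hashes of those secrets."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(hashlib.sha256(a).digest(), hashlib.sha256(b).digest()) for a, b in private]
    return private, public


def sign(private, message: bytes) -> list[bytes]:
    """Reveal one secret per digest bit. A key pair must sign only ONE message."""
    return [private[i][bit] for i, bit in enumerate(_digest_bits(message))]


def verify(public, message: bytes, signature: list[bytes]) -> bool:
    """Check every revealed secret against the public hash chosen by the digest bit."""
    if len(signature) != 256:
        return False
    return all(
        hashlib.sha256(part).digest() == public[i][bit]
        for i, (part, bit) in enumerate(zip(signature, _digest_bits(message)))
    )


# The origin (lemmy.world in the example above) signs the update once.
origin_private, origin_public = generate_keypair()
post = b"New post in Selfhosted@lemmy.world"
signature = sign(origin_private, post)

# Some other instance later receives (post, signature) from a peer, not the origin.
relayed_post, relayed_signature = post, signature
print(verify(origin_public, relayed_post, relayed_signature))
print(verify(origin_public, b"tampered " + relayed_post, relayed_signature))
```

Because verification needs only the origin's public key, it does not matter which instance served the copy, so the origin would not have to answer every fetch itself.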

        • ultraHQ@beehaw.org
          English · 1 upvote · edited 3 years ago

          Thanks for the in-depth write-up! I haven’t looked too far into the docs or the subscription model, but is this a fault on Lemmy’s end, or is it a function of how ActivityPub handles federated communication? (I’m very new to ActivityPub/federation and am just now reading through the ActivityPub docs.)

          I do like your idea of distributed replication via keys, much better than what I had brainstormed.

          Edit: yeah, it does look like it’s a function of ActivityPub. I wonder if there’s a more scalable federation protocol out there.

  • KelsonV@lemmy.world
    English · 0 upvotes · 3 years ago

    I just commented on this in another thread: https://lemmy.world/comment/76011

    TL;DR: The server-to-client interactions on Lemmy are a lot heavier than the server-to-server interactions, so even if you’re just using your own server to interact with communities on other servers, it should still take load off of the servers you would have been using directly.

  • Bizzle@lemmy.world
    English · 0 upvotes · edited 3 years ago

    I have my own Lemmy instance running on my home server, but I’m here. “But Bizzle,” you may be asking yourself, “why go through all the trouble of configuring your own instance just to wind up on Lemmy.World anyway?”

    I’m glad you asked! And the answer is that federation only fetches parent comments. I’m glad Lemmy exists, and I’m going to keep using it, but we need federated sibling comments for this to actually be good, in my opinion.

    EDIT: I actually couldn’t have been more wrong.

  • ComplexLotus@lemmy.world
    English · 0 upvotes · 3 years ago

    Since Lemmy instances are not backed by commercial interests, but rather by nice volunteers and donors who have money and time to spare, they will be heavily affected by economic downturns (though, as Reddit shows, commercial interests can still affect users negatively). Here are my thoughts on the matter:

    • as far as I understand the owner of the domain: https://lemmy.world even has to pay for this fancy domain name in the DNS system … every month subscription service style
      • (and tbh I hate the Domain name system) why should I fund it with my own money?
      • if you hosted it as an onion site over Tor, that expenditure would not exist, but how would users discover your site then? Let me know if you know something about this
    • in times of deflation (meaning money becomes worth more), spending money on a self-hosted Lemmy instance becomes nonsensical
    • tbh if I hosted a lemmy instance and the users of my instance posted high quality content in quantity I would use it to train my own LLM, that would at least create some economic incentive for me to host such a page … but managing spam and bots will be HARD

    That is why you should always back up your comments on your personal device. It would be nice if Lemmy had an automated way of doing this (I should look into this more).

  • goldenarchmage@lemmy.world
    English · 0 upvotes · 1 downvote · 3 years ago

    It’s a bit worse than that actually. I’m now seeing several communities with exactly the same name that originate on different servers - so clearly Lemmy doesn’t have a rule about duplication once you cross a server boundary. That’s going to get unwieldy quite fast particularly if, I dunno, “Aww” gets popular on two separate servers at the same time - I guess I’ll have to subscribe to both…

    • ChrisostomeStrip@lemmy.world
      English · 2 upvotes · 3 years ago

      I don’t get the argument about duplicates. It was the same situation on Reddit: you had a few subs, sometimes more, about the same topic, and you could subscribe to whichever you wanted. Why is this suddenly a problem on Lemmy?

    • Ataraxia@lemmy.world
      English · 0 upvotes · 3 years ago

      Well, one instance shouldn’t monopolize a community. If it takes a dump on one instance, at least it exists elsewhere. If I want to start up my own cat community, I don’t see why that’s an issue.