Is anyone actually surprised by this?

      • Bleys@lemmy.world
        arrow-up
        72
        arrow-down
        7
        ·
        11 months ago

        Realistically what is the worst thing China is doing with your private data? Selling it? If you’re not a Chinese National, at least you don’t fall under their jurisdiction.

        If you’re a U.S. citizen, with all the tech oligarchs cozying up to the current administration, I’d be a lot more concerned with Facebook/Twitter/Etc collecting your data.

        • frozenspinach@lemmy.ml
          English
          arrow-up
          22
          arrow-down
          30
          ·
          11 months ago

          Realistically what is the worst thing China is doing with your private data?

          Probably mapping out the extended support networks of democratic activists in Taiwan to prepare to throw them in jail after a forcible military takeover.

          • Grapho@lemmy.ml
            arrow-up
            17
            arrow-down
            5
            ·
            11 months ago

            So democratic activists in Taiwan have extensive networks in the US?

            I mean, you said it.

              • Grapho@lemmy.ml
                arrow-up
                4
                arrow-down
                3
                ·
                11 months ago

                Networks with a foreign actor undermining national sovereignty, which financed several massacres in your country

                • catsarebadpeople@sh.itjust.works
                  arrow-up
                  4
                  arrow-down
                  3
                  ·
                  11 months ago

                  My country? Not sure what you’re talking about but I know that Taiwan deserves sovereignty. You don’t? Surely you’re not pro imperialism…

      • mspencer712@programming.dev
        arrow-up
        1
        arrow-down
        24
        ·
        edit-2
        11 months ago

        As a US citizen, I prefer services that US consumer protections could apply to. (While we still have them, ahem.) I know that Chinese laws will not protect me from things a Chinese business does in China.

        (What’s with the rude replies? Did I fail to notice what instance I’m on or something?)

          • mspencer712@programming.dev
            arrow-up
            1
            arrow-down
            6
            ·
            11 months ago

            This makes me sad, that we can’t engage in civil discussion about this. Why did you assume and not ask questions? Be curious, not judgmental.

            To me it’s a question of laws. The laws of the U.S. at least somewhat constrain the people of my own country, and can prevent them from working against their own citizens. Like me.

            Please be kind when replying.

    • frozenspinach@lemmy.ml
      English
      arrow-up
      9
      arrow-down
      37
      ·
      edit-2
      11 months ago

      but it’s a foreign actor so OOooooOOWwwwooOOOO sCaRrRey!

      I love that people think this is a solid own. Lest we forget Hong Kong, or an impending hot war in Taiwan or building out extradition systems with an expanding network of countries to forcibly repatriate and torture dissidents and human rights lawyers.

      You used to not have to explain why authoritarianism was bad.

      Edit: I would love to know the Pro side of what happened in Hong Kong, or the forced extradition regime, since evidently I’m clearly in the wrong in thinking those were bad. What am I missing?

      • Foni@lemm.ee
        arrow-up
        27
        ·
        11 months ago

        It used to not be necessary because democracies used to have moral authority, but since the revelations of Manning and Snowden, non-Americans see no difference between giving our data to the USA, to China, or to anyone else. We also know from the reactions to the wars in Ukraine and Gaza that human rights claims are invoked only selectively.

      • Grapho@lemmy.ml
        arrow-up
        8
        arrow-down
        4
        ·
        edit-2
        11 months ago

        Anti-terrorism is good, actually. I don’t support people kicking seniors for speaking Mandarin to try to bully a government into not prosecuting murderers in the mainland, which was the reason the protests happened (that and Washington money).

      • BrainInABox@lemmy.ml
        English
        arrow-up
        6
        arrow-down
        2
        ·
        11 months ago

        or an impending hot war in Taiwan

        When you can’t even find things that China actually has done to complain about, so you have to start complaining about things they haven’t done.

  • Treczoks@lemmy.world
    arrow-up
    57
    ·
    11 months ago

    “We store the information we collect in secure servers located in the People’s Republic of China”

    Now you Americans know how we Europeans feel when Google, Amazon and Facebook store our information on American servers. Hint: the protective wall between Chinese servers and their government is about as good as the one between American servers and their government, at least for non-US citizens. The last thin veil of privacy for Europeans was ripped to shreds by Trump last week.

    • Ferk@lemmy.ml
      arrow-up
      1
      ·
      edit-2
      11 months ago

      The last thin veil of privacy for Europeans was ripped to shreds by Trump last week.

      What did he do? I know Trump does not like the GDPR, but did he sign something affecting it last week?

      • Treczoks@lemmy.world
        arrow-up
        2
        ·
        11 months ago

        He killed the EU-US Data Privacy Framework. Theoretically, no company is allowed to transfer data of European citizens to US-based servers anymore. Sadly, Ursula von der Leyen is lacking the balls to act on this.

        • Ferk@lemmy.ml
          arrow-up
          2
          ·
          edit-2
          11 months ago

          Thanks, I did not know. I think you are referring to this: https://www.freevacy.com/news/noyb/trumps-actions-to-dismantle-pclob-threatens-eu-us-data-transfers/6088

          To be completely honest… as a European I would be happy if they actually did make it so that no EU-US data transfers were allowed… we need to stop depending on all these US-based services… but like you said, they probably don’t have the balls to pull the plug. Which makes me wonder whether that board was ever really any protection for privacy at all, or whether it had always been an empty shell used as an excuse on both sides just to keep up appearances and keep the plug in.

          I honestly think this could be a win for us. Worst case scenario, nothing really changes but some masks fall off and at least some people would stop acting under false pretense (which could open the doors for change). So I’m actually glad he did that.

  • AbouBenAdhem@lemmy.world
    English
    arrow-up
    58
    arrow-down
    1
    ·
    edit-2
    11 months ago

    Anyone using DeepSeek as a service the same way proprietary LLMs like ChatGPT are used is missing the point. The game-changer isn’t that a Chinese company like DeepSeek can compete with OpenAI and its ilk—it’s that, thanks to DeepSeek, any organization with a few million dollars to train and host their own model can now compete with OpenAI.

  • grey_maniac@lemmy.ca
    arrow-up
    54
    ·
    11 months ago

    I’m confused. Isn’t “collecting keystroke data” just an alarmist way to describe text entry?

    • noisefree@lemmy.world
      arrow-up
      13
      arrow-down
      1
      ·
      11 months ago

      Maybe. They could also be tracking things like input cadence and typos (including corrections made before sending) and using them as part of a fingerprint tied to the identifying information a user provides at signup. That fingerprint could then be used to try to detect the same user elsewhere on the web, whether or not they are logged in to an identifying account.

    • uis@lemm.ee
      arrow-up
      7
      arrow-down
      3
      ·
      11 months ago

      Not exactly. Timing between key presses can be used to identify people.
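To make the timing point concrete, here is a minimal sketch of how key-press timing could be turned into a comparable profile. It is an illustration only, not a description of what DeepSeek or anyone else actually does: the function names, the two features (dwell time and flight time), and the sample data are all made up for this example.

```python
from statistics import mean

def timing_features(events):
    """Summarise a typing sample as (mean dwell time, mean flight time).

    events: list of (key, press_time, release_time) tuples in typing order,
    times in seconds. Needs at least two events so a flight time exists.
    Dwell time  = how long a key is held down.
    Flight time = gap between releasing one key and pressing the next.
    """
    dwell = [release - press for _, press, release in events]
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return mean(dwell), mean(flight)

def profile_distance(a, b):
    """Euclidean distance between two (dwell, flight) feature pairs."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

# Hypothetical samples: two sessions from one typist, one from another.
alice_1 = [("h", 0.00, 0.09), ("i", 0.20, 0.28), ("!", 0.41, 0.50)]
alice_2 = [("h", 0.00, 0.08), ("i", 0.19, 0.28), ("!", 0.40, 0.48)]
bob = [("h", 0.00, 0.15), ("i", 0.45, 0.61), ("!", 0.95, 1.10)]

same_person = profile_distance(timing_features(alice_1), timing_features(alice_2))
different_person = profile_distance(timing_features(alice_1), timing_features(bob))
print(same_person < different_person)
```

Real systems would presumably use far more features and far more data; the point is only that timing alone carries a usable signal.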

      • grey_maniac@lemmy.ca
        arrow-up
        2
        ·
        11 months ago

        I am literally so paranoid I regularly vary my keystroke rhythms and explore polyrhythmic techniques to create variations. Not even joking.

    • vfreire85@lemmy.ml
      arrow-up
      3
      ·
      11 months ago

      this. i mean, the session logs for the prompt are kept at least for your user, right?

    • tux@lemmy.world
      arrow-up
      2
      arrow-down
      1
      ·
      11 months ago

      Not usually. Keystroke info is different from text input: if you typed without clicking into any field, that would only be captured if all keystrokes are being grabbed. It’s especially scary if you keep the app running in the background, type something, and it still captures it. Not saying they’re doing that, but the privacy policy says they might.

      The rhythm part is annoying; it’s commonly used to ID people even through things like ad blockers and DNS blocking. It could also (in theory) be used to work out what people are typing just by hearing how they type.

    • Ferk@lemmy.ml
      arrow-up
      1
      ·
      11 months ago

      This is the full paragraph:

      We collect certain device and network connection information when you access the Service. This information includes your device model, operating system, keystroke patterns or rhythms, IP address, and system language. We also collect service-related, diagnostic, and performance information, including crash reports and performance logs. We automatically assign you a device ID and user ID. Where you log-in from multiple devices, we use information such as your device ID and user ID to identify your activity across devices to give you a seamless log-in experience and for security purposes.

      It looks to me like they are using it to identify the user uniquely, maybe also in connection with captchas to prevent bots (it’s common practice to capture mouse and keyboard input while a captcha is being solved to see if the movement is human-like).
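The "human-like" check mentioned above could, in its simplest form, look something like the sketch below. This is a guess at the general idea, not how any particular captcha works: the function names and the 0.05 threshold are assumptions picked for illustration. The heuristic is that scripted input tends to arrive at near-constant intervals, while human typing is irregular.

```python
from statistics import mean, pstdev

def interval_cv(key_times):
    """Coefficient of variation of the gaps between consecutive key presses.

    key_times: ascending key-press timestamps in seconds (at least two).
    """
    gaps = [later - earlier for earlier, later in zip(key_times, key_times[1:])]
    return pstdev(gaps) / mean(gaps)

def looks_scripted(key_times, min_cv=0.05):
    """Flag input whose rhythm is too regular to be plausibly human.

    min_cv is an arbitrary illustrative threshold, not a tuned value.
    """
    return interval_cv(key_times) < min_cv

scripted = [0.0, 0.1, 0.2, 0.3, 0.4]      # perfectly even spacing
human = [0.0, 0.13, 0.21, 0.42, 0.50]     # uneven spacing

print(looks_scripted(scripted), looks_scripted(human))
```

Note that the same measurements that separate bots from humans are also the raw material for telling one human from another, which is why the two uses are hard to disentangle from the policy text alone.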

  • Subverb@lemmy.world
    arrow-up
    51
    arrow-down
    3
    ·
    11 months ago

    If you think the American companies do anything different you’re not paying attention and simply believing the propaganda.

  • ozoned@lemmy.world
    arrow-up
    48
    ·
    11 months ago

    Chinese company does what American companies have done for 25+ years now!

    Is it time for REAL data privacy laws or are we just gonna keep playing whack-a-mole with Chinese tech companies that get us nowhere?

    • Someonelol@lemmy.dbzer0.com
      English
      arrow-up
      4
      ·
      11 months ago

      Our data’s just too valuable for these parasites. Data privacy laws may eventually pass to compel software companies to store everything on US servers only.

      • ozoned@lemmy.world
        arrow-up
        4
        ·
        11 months ago

        Excellent point. If that’s the case though, then wouldn’t other countries follow suit, which would still limit big tech’s reach and make them less profitable and less powerful? Idk. Guess we’ll see how it plays out. Either way, I’m staying as far from those ecosystems as possible to at least try to mitigate some of what they do. I’ll never be totally successful, the genie is out of the bottle, but we can at least attempt.

    • smb@lemmy.ml
      arrow-up
      5
      ·
      11 months ago

      I think it’s called a data lake, so they don’t “store” it, it’s rather floating around there 🤪

      • howrar@lemmy.ca
        arrow-up
        9
        ·
        11 months ago

        These lakes are formed when the cloud is saturated and gives us data precipitation.

        • smb@lemmy.ml
          arrow-up
          1
          ·
          edit-2
          11 months ago

          thanks for the great picture 👍

          so here is the current cloud climate forecast:

          The saturated clouds will rain into the data lakes, which are already spilling over here and there into the ransomstreams that carry off all the soil in their way. During the day there will be security clouds that prevent only the visible rain, while during the night those same security clouds rain all their collected data into their homelake, whose own security is already corrupted and regularly spills over.

          As soon as the fort-cisc-pal-ocstricken-redm-ondams breach, there are gonna be floods with multi-exabyte wave heights. The ripples of the release will be felt as far away as far-east China, and the currents will circle the world multiple times, causing damage and devastation in their wake and eventually even reaching connected orbit.

          The floods will also have the potential to wash away, drown, or choke all the big tech dinosaurs. Only small FOSS mammals and deep-sea amphibians will survive this historic event.

          … you kinda asked for it 😉 same as “they” kinda asked for it too. 🤔

  • ZeroOne@lemmy.world
    arrow-up
    42
    arrow-down
    1
    ·
    11 months ago

    Nope, at least we can check DeepSeek’s source code

    Unlike OpenAI… oops I meant ClosedAI

  • JOMusic@lemmy.ml
    English
    arrow-up
    42
    arrow-down
    1
    ·
    11 months ago

    This article is what US propaganda looks like folks. Mashable should be ashamed.

    Literally all AI companies do this to run their services. Except you can actually download DeepSeek and run it completely securely on your own devices. You know who doesn’t allow that security? OpenAI and the other US companies currently being screwed.

    • Preston Maness ☭@lemmygrad.ml
      English
      arrow-up
      14
      ·
      11 months ago

      Billions of folks’ keyboards are connected to the internet and the vast majority of them have no idea. It’s absolutely ludicrous that we’ve gotten to this stage with surveillance capitalism. Internet-connected keyboards are malware, plain and simple.

  • ArchRecord@lemm.ee
    English
    arrow-up
    35
    arrow-down
    2
    ·
    11 months ago

    the company states that it may share user information to “comply with applicable law, legal process, or government requests.”

    Literally every company’s privacy policy here in the US basically just says that too.

    Not only does DeepSeek collect “text or audio input, prompt, uploaded files, feedback, chat history, or other content that [the user] provide[s] to our model and Services,” but it also collects information from your device, including “device model, operating system, keystroke patterns or rhythms, IP address, and system language.”

    Breaking news, company with chatbot you send messages to uses and stores the messages you send, and also does what practically every other app does for demographic statistics gathering and optimizations.

    Companies with AI models like Google, Meta, and OpenAI collect similar troves of information, but their privacy policies do not mention collecting keystrokes. There’s also the added issue that DeepSeek sends your user data straight to Chinese servers.

    They didn’t use the word keystrokes, therefore they don’t collect them? Of course they collect keystrokes, how else would you type anything into these apps?

    In DeepSeek’s privacy policy, there’s no mention of the security of its servers. There’s nothing about whether data is encrypted, either stored or in transmission, and zero information about safeguards to prevent unauthorized access.

    This is the only thing that seems disturbing to me, compared to what we’d like to expect based on the context of what DeepSeek is. Of course, this was proven recently in practice to be terrible policy, so I assume they might shore up their defenses a bit.

    All the articles that talk about this as if it’s some big revelation just boil down to “company does exactly what every other big tech company does in America, except in China”

    • tux@lemmy.world
      arrow-up
      8
      arrow-down
      10
      ·
      11 months ago

      Collecting keystrokes is very different from collecting text entered into fields. Keystroke rhythms are even more alarming, as they are often used to identify users despite their privacy settings, or to work out what’s typed via audio collection.

      Your argument that this is no different than other apps is complete crap. Don’t trust any app that collects that information

      • Ferk@lemmy.ml
        arrow-up
        4
        arrow-down
        2
        ·
        edit-2
        11 months ago

        The argument stands, though.

        Yes, not ALL other apps do that, but the comment was specifically talking about companies like Google and Meta… they definitely do collect incomplete strings from search forms (down to individual characters) when they display search suggestions, for example. They might not mention “keystrokes” in the legal text, but I don’t see why they wouldn’t be able to extrapolate your typing pattern since they do have the timing information which should be enough data to, at some level, profile it.
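As a rough illustration of the search-suggestion point: if a server receives one request per typed character, the arrival times alone give it approximate inter-key gaps. The sketch below assumes a made-up log format of `(timestamp, partial_query)` pairs; it is not based on any real service's logging, and in practice network jitter and client-side debouncing would blur the numbers.

```python
def gaps_from_suggest_log(requests):
    """Recover per-key timing from a log of autocomplete requests.

    requests: list of (timestamp_seconds, partial_query) in arrival order.
    Only steps where the query grew by exactly one character are counted;
    deletions, pastes and batched updates are skipped.
    Returns a list of (typed_character, seconds_since_previous_request).
    """
    gaps = []
    for (t0, q0), (t1, q1) in zip(requests, requests[1:]):
        if len(q1) == len(q0) + 1 and q1.startswith(q0):
            gaps.append((q1[-1], round(t1 - t0, 3)))
    return gaps

# Hypothetical log of someone typing "deep" into a search box.
log = [(10.00, "d"), (10.12, "de"), (10.31, "dee"), (10.40, "deep")]
print(gaps_from_suggest_log(log))
```

So a policy does not need to say "keystrokes" for the timing data to exist on the server side; whether anyone actually profiles it is a separate question.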

        • tux@lemmy.world
          arrow-up
          2
          ·
          11 months ago

          Keystrokes don’t have to be in a text field or input. That’s my point.

          If I’m on, say, Google and I type anything into the field, it’s definitely capturing it. You know this if for no other reason than that it has to in order to offer autocomplete.

          Keystroke capturing is the same as keylogging, aka anything typed even if it’s not into a place where you would assume it’s being seen by the app. Aka, if I had an app open in the background and was typing in my password, it would see and capture that.

          They’re completely different things. While the privacy issues of US large tech companies are abundant and awful, there is a large difference between keystroke capturing and capturing input via fields. Especially when you’re agreeing to allow them to process and transfer or even sell that information.
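The distinction being drawn here can be shown with a toy model. Nothing below is a real OS or browser API: `KeyEvent`, its fields, and both capture functions are hypothetical, and whether an app can see keystrokes while in the background at all depends on the platform and the permissions it has been granted.

```python
from dataclasses import dataclass

@dataclass
class KeyEvent:
    key: str
    target: str        # hypothetical name of whatever had input focus
    app_focused: bool  # True if the collecting app was in the foreground

def field_scoped_capture(events, field):
    """What an app sees if it only records input typed into one of its own fields."""
    return "".join(e.key for e in events if e.app_focused and e.target == field)

def global_capture(events):
    """What a keylogger-style collector sees: every key, whatever the target."""
    return "".join(e.key for e in events)

# The user types "hi" into the chat box, then "pw" into some other app.
events = [
    KeyEvent("h", "prompt_box", True),
    KeyEvent("i", "prompt_box", True),
    KeyEvent("p", "other_app_password", False),
    KeyEvent("w", "other_app_password", False),
]

print(field_scoped_capture(events, "prompt_box"))  # only the chat input
print(global_capture(events))                      # everything, password included
```

The policy wording quoted in this thread does not say which of these two models applies, which is the gap both sides of the argument are filling with assumptions.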

          • Ferk@lemmy.ml
            arrow-up
            1
            arrow-down
            1
            ·
            edit-2
            11 months ago

            But that’s not what the terms on both Google/Meta and Deepseek say.

            There’s no term in their ToS saying Google/Meta restricts the data collection to forms, which means that if the ToS allowed them to collect them from forms (and as you admitted, we do know for a fact that they do), then it also allows them to collect it outside of forms. The reason I put the search suggestions as example is because it’s one we CAN know (and thank you for agreeing on that), but that doesn’t mean they don’t do other captures at times we DON’T know… and also it’s not the only place, Google owns several captcha mechanisms and capturing input patterns is common on those too (and captchas capture outside forms too!). Another obvious example is Google docs, another is Google translate… and again, those are only the obvious ones, we don’t know if there are non-obvious ones.

            In the other direction too, Deepseek terms don’t say it does it outside of forms either. You are jumping into assumptions by saying it acts the same as a traditional keylogger and that the keystrokes are captured for “anything typed”. For all we know the only place they might be capturing is when the user is in very specific steps of the login process, maybe for captcha purposes too, or specific forms for preloading results, etc. There’s no reason you should trust they do it any less/more than Google/Meta does, the ToS in both have the same lack of information in that respect.

            You can only make assumptions one way or the other, since the terms are not specific on what exactly they allow themselves to do, in the case of Google/Meta they’re so sneaky that they avoid saying they do capture them (even though they do, as you yourself admitted), while in the case of Deepseek, even though they are a bit more specific by using the word “keystrokes”, they also don’t specify where/when/why (other than “to give you a seamless log-in experience and for security purposes” …but that’s also unclear wording).

  • Naia@lemmy.blahaj.zone
    English
    arrow-up
    34
    arrow-down
    1
    ·
    11 months ago

    I swear people do not understand how the internet works.

    Anything you use on a remote server is going to be seen to some degree. They may or may not keep track of you, but you can’t be surprised if they are. If you run the model locally, there is no indication it is sending anything anywhere. It runs using the same open source LLM tools that run all the other models you can run locally.

    This is very much like someone doing surprised pikachu when they find out that facebook saves all the photos they upload to facebook or that gmail can read your email.

  • uberstar@lemmy.ml
    arrow-up
    32
    arrow-down
    2
    ·
    11 months ago

    detective conan sure had a hard time cracking the case!

    “The personal information we collect from you may be stored on a server located outside of the country where you live. We store the information we collect in secure servers located in the People’s Republic of China,” the privacy policy reads.

    Oh the horror! Let’s look at what our glorious spawns-of-techbro heroism has for us in store:

    ChatGPT:

    spoiler

    OpenAI processes your Personal Data for the purposes described in this Privacy Policy on servers located in various jurisdictions, including processing and storing your Personal Data in our facilities and servers in the United States. While data protection law varies by country, we apply the protections described in this policy to your Personal Data regardless of where it is processed, and only transfer that data pursuant to legally valid transfer mechanisms.

    Claude:

    spoiler

    When you access our website or Services, your personal data may be transferred to our servers in the US, or to other countries outside the European Economic Area (“EEA”) and the UK. This may be a direct provision of your personal data to us, or a transfer that we or a third party make.

    So not only is your data “possibly” stored in one country, now there’s a possibility of it being stored in many different countries. Where’s the outcry for that?

    Ok, so maybe your data being under the jurisdiction of another country is sus, right?

    In another section about how DeepSeek shares user data, the company states that it may share user information to “comply with applicable law, legal process, or government requests.”

    OH MY GOD SOUND THE ALARM!

    ChatGPT:

    spoiler

    We may use Personal Data for the following purposes: […] To comply with legal obligations and to protect the rights, privacy, safety, or property of our users, OpenAI, or third parties.

    Claude:

    spoiler

    Pursuant to regulatory or legal requirements, safety, rights of others, and to enforce our rights or our terms. We may disclose personal data to governmental regulatory authorities as required by law, including for legal, tax or accounting purposes, in response to their requests for such information or to assist in investigations. We may also disclose personal data to third parties in connection with claims, disputes or litigation, when otherwise permitted or required by law, or if we determine its disclosure is necessary to protect the health and safety of you or any other person, to protect against fraud or credit risk, to enforce our legal rights or the legal rights of others, to enforce contractual commitments that you have made, or as otherwise permitted or required by applicable law.

    So not only can your data be subject to the authorities, but it’s also handed out to 3rd parties (mind you, DeepSeek does the exact same, so why is it any surprise?).

    Not only does DeepSeek collect “text or audio input, prompt, uploaded files, feedback, chat history, or other content that [the user] provide[s] to our model and Services,” …

    🤦… You get the idea now, bother yourself with the privacy policies of the respective contemporaries and CTRL + F to “User Content” or “User Input”… Same fucking shit.

    Companies with AI models like Google, Meta, and OpenAI collect similar troves of information, but their privacy policies do not mention collecting keystrokes.

    Yes, collecting keystrokes is probably the oddest thing here. To compare data farming giants with a decade and a half’s worth of data collection to a startup in terms of data collection is so astronomically dumb.

    I could go on but I’m bored now. Do your own research.
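For anyone who wants to do the CTRL + F comparison across several policies at once, a small script can do the searching. This is a minimal sketch: `find_terms` is a name made up here, and the sample text is just the DeepSeek excerpt quoted elsewhere in this thread, so you would need to paste in the full policy text yourself.

```python
import re

def find_terms(policy_text, terms):
    """Return, for each term, the character offsets where it appears (case-insensitive)."""
    hits = {}
    for term in terms:
        pattern = re.escape(term)  # treat the term as literal text, not a regex
        hits[term] = [m.start() for m in re.finditer(pattern, policy_text, flags=re.IGNORECASE)]
    return hits

# Excerpt quoted in this thread; replace with the full policy text to compare services.
excerpt = (
    "This information includes your device model, operating system, "
    "keystroke patterns or rhythms, IP address, and system language."
)

for term, offsets in find_terms(excerpt, ["keystroke", "user content"]).items():
    print(term, "found" if offsets else "not found", offsets)
```

A term not appearing in a policy only tells you the word is absent, not that the practice is.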

    • JackAttack@lemmy.dbzer0.com
      arrow-up
      9
      ·
      11 months ago

      Not quite on topic but semi related… It’s reasons like this that I started reading privacy policies many times before signing up for a service.

      People would be surprised at some of the extremely concerning things that are listed in there. Some of it is there for good reason, but some of it is absolutely unnecessary and should be an issue for some people.

      • uberstar@lemmy.ml
        arrow-up
        3
        ·
        11 months ago

        off-topic here as well, why stop at privacy policies? EULAs can get wilder, best such example of which is Apple:

        • JackAttack@lemmy.dbzer0.com
          arrow-up
          1
          ·
          11 months ago

          Lmaooooo great find. I wonder why exactly they had to clarify that? Maybe a semi Easter egg? Or a genuine concern? Thanks for sharing.