• Psaldorn
    link
    fedilink
    156
    2 months ago

    From the same group that doesn’t understand joins and thinks nobody uses SQL, this is hardly surprising.

    Probably got an LLM running locally and is asking it to get the data, which is then running 10-level-deep subqueries to achieve what 2 inner joins would in a fraction of the time.
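For anyone who hasn’t seen this failure mode in the wild, here’s a minimal sketch (toy three-table schema and data, purely hypothetical — not the actual government database) of the same question answered with nested subqueries versus two inner joins:

```python
import sqlite3

# Toy schema, purely illustrative.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE agencies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE programs (id INTEGER PRIMARY KEY, agency_id INTEGER, name TEXT);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, program_id INTEGER, amount REAL);
    INSERT INTO agencies VALUES (1, 'SSA');
    INSERT INTO programs VALUES (1, 1, 'OASI');
    INSERT INTO payments VALUES (1, 1, 1200.0), (2, 1, 800.0);
""")

# The LLM-flavoured version: subqueries nested inside subqueries.
nested = con.execute("""
    SELECT amount FROM payments
    WHERE program_id IN (
        SELECT id FROM programs
        WHERE agency_id IN (SELECT id FROM agencies WHERE name = 'SSA')
    )
""").fetchall()

# The same answer with two inner joins.
joined = con.execute("""
    SELECT p.amount
    FROM payments p
    JOIN programs pr ON pr.id = p.program_id
    JOIN agencies a ON a.id = pr.agency_id
    WHERE a.name = 'SSA'
""").fetchall()

assert nested == joined  # same rows either way
```

(To be fair, a decent optimiser flattens simple IN-subqueries anyway; the real pain starts when the generated queries are correlated and actually nested ten levels deep.)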

    • @_stranger_@lemmy.world
      link
      fedilink
      64
      edit-2
      2 months ago

      You’re giving this person a lot of credit. It’s probably all in the same table, and this idiot is probably doing something like a for-loop over an integer range (the length of the table) where it pulls the entire table down every iteration of the loop, dumps it to a local file, and then uses plain-text search or some really bad regexes to find the data they’re looking for.
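A sketch of that anti-pattern, with a made-up one-table setup: the loop re-downloads the whole table once per row and greps the dump, where a single WHERE clause would have done the job.

```python
import sqlite3

# Hypothetical tiny table standing in for the real data.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE spending (id INTEGER PRIMARY KEY, payee TEXT)")
con.executemany("INSERT INTO spending (payee) VALUES (?)",
                [("acme",), ("globex",), ("acme",)])

n_rows = con.execute("SELECT COUNT(*) FROM spending").fetchone()[0]

# The anti-pattern: one full-table pull per loop iteration,
# then plain-text search over the dump.
hits = []
for _ in range(n_rows):                                      # O(n) pulls...
    dump = con.execute("SELECT * FROM spending").fetchall()  # ...O(n) rows each
    hits = [row for row in dump if "acme" in row[1]]         # grep the dump

# What the database would have done in one pass:
hits_sane = con.execute(
    "SELECT * FROM spending WHERE payee LIKE ?", ("%acme%",)
).fetchall()

assert hits == hits_sane
```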

      • @indepndnt@lemmy.world
        link
        fedilink
        10
        2 months ago

        I think you’re still giving them too much credit with the for loop and regex and everything. I’m thinking they exported something to Excel, got 60k rows, then tried to add a lookup formula to them. Since, you know, they don’t use SQL. I’ve done ridiculous things like that in Excel, and it can get so busy that it slows down your whole computer, which I can imagine someone interpreting as their “hard drive overheating”.

      • @morbidcactus@lemmy.ca
        link
        fedilink
        24
        2 months ago

        Considering that’s nearly exactly some of the answers I’ve received during the technical part of interviews for jr data eng roles, you’re probably not far off.

        Shit, I’ve seen solutions done up that look like that, fighting the optimiser every step of the way (amongst other things).

      • @makingStuffForFun@lemmy.ml
        link
        fedilink
        4
        edit-2
        2 months ago

        I have to admit I still have some legacy code that does that.

        Then I found pandas. Life changed for the better.

        Now I have lots of old code that I’ll update, “one day”.

        However, even my old code, terrible as it is, does not overheat anything, and can process massively larger sets of data than 60,000 rows without any issue except poor efficiency.
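For anyone in the same boat, the before/after looks roughly like this (toy data; the legacy style iterates row by row, the pandas style does one vectorised pass):

```python
import pandas as pd

# Made-up data standing in for the real exports.
df = pd.DataFrame({"amount": [100.0, 250.0, 80.0],
                   "agency": ["ssa", "SSA", "irs"]})

# Legacy style: walk every row by hand.
total_slow = 0.0
for _, row in df.iterrows():
    if row["agency"].upper() == "SSA":
        total_slow += row["amount"]

# pandas style: one vectorised filter-and-sum.
total_fast = df.loc[df["agency"].str.upper() == "SSA", "amount"].sum()

assert total_slow == total_fast
```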

  • Tiefling IRL
    link
    fedilink
    117
    2 months ago

    60k isn’t that much, I frequently run scripts against multiple hundreds of thousands at work. Wtf is he doing? Did he duplicate the government database onto his 2015 MacBook Air?

    • @4am@lemm.ee
      link
      fedilink
      54
      2 months ago

      A TI-86 can query 60k rows without breaking a sweat.

      If his hard drive overheated from that, he is doing something very wrong, very unhygienic, or both.

    • @arotrios@lemmy.world
      link
      fedilink
      English
      7
      2 months ago

      Seriously - I can parse multiple tables of 5+ million row each… in EXCEL… on a 10 year old desktop and not have the fan even speed up. Even the legacy Access database I work with handles multiple million+ row tables better than that.

      Sounds like the kid was running his AI hamsters too hard and they died of exhaustion.

  • @zalgotext@sh.itjust.works
    link
    fedilink
    85
    edit-2
    2 months ago

    my hard drive overheated

    So, this means they either have a local copy on disk of whatever database they’re querying, or they’re dumping a remote db to disk at some point before/during/after their query, right?

    Either way, I have just one question - why?

    Edit: found the thread with a more in-depth explanation elsewhere in the thread: https://xcancel.com/DataRepublican/status/1900593377370087648#m

    So yeah, she’s apparently toting around an external hard drive with a copy of the “multiple terabytes” large US spending database, running queries against it, then dumping the 60k-row result set to CSV for further processing.

    I’m still confused at what point the external drive overheats, even if she is doing all this in a “hot humid” hotel room that she can’t run any fans I guess because her kids were asleep?

    But like, all of that just adds more questions, and doesn’t really answer the first one - why?

    • @GoodEye8@lemm.ee
      link
      fedilink
      English
      21
      2 months ago

      My one question would be “How?”

      What the hell are you doing that your hard drives are overheating? And how do you even know it’s overheating? I’m like 90% certain hard drives (except NVMe, if we’re being liberal with the meaning of “hard drive”) don’t even have temperature sensors.

      The only conclusion I can come to is that everything he’s saying is just bullshit.

          • @Mniot@programming.dev
            link
            fedilink
            English
            4
            2 months ago

            Can we think of any device someone might have that would struggle with 60k? Certainly an ESP32 chip could handle it fine, so most IoT devices would work…

            • @zenpocalypse@lemm.ee
              link
              fedilink
              English
              3
              2 months ago

              Right? There’s no part of that xeet that makes any real sense coming from a “data engineer.”

              Terrifying, really.

            • @T156@lemmy.world
              link
              fedilink
              English
              2
              2 months ago

              Unless the database was designed by someone who only knows of data as that robot from Star Trek, most would be absolutely fine with 60k rows. I wouldn’t be surprised if the machine they’re using caches that much in RAM alone.

      • @Auli@lemmy.ca
        link
        fedilink
        English
        18
        2 months ago

        They have temp sensors. But I’ve never heard of an overheating drive.

      • @spooky2092@lemmy.blahaj.zone
        link
        fedilink
        English
        45
        2 months ago

        Plus, 60k is nothing. One of our customers had a database that was over 3M records before it got some maintenance. No issue with overheating lol

        • @surph_ninja@lemmy.world
          link
          fedilink
          26
          edit-2
          2 months ago

          I run queries throughout the day that can return 8 million+ rows easily. Granted, it takes a few minutes to run, but it has never caused a single issue with overheating, even on slim PCs.

          This makes no fucking sense. 60k rows would return in a flash even on shitty hardware. And if it taxes anything, it’s gonna be the RAM or CPU, not the hard drive.

          • @T156@lemmy.world
            link
            fedilink
            English
            2
            edit-2
            2 months ago

            In my experience, the only time that I’ve taxed a drive when doing a database query is either when dumping it, or with SQLite’s vacuum, which copies the whole thing.

            For a pretty simple search like OP seems to be doing, the indices should have taken care of basically all the heavy lifting.
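That’s easy to see even in SQLite: the same equality lookup goes from a full table scan to a B-tree search the moment an index exists (hypothetical table, 60k throwaway rows):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, payee TEXT, amount REAL)")
con.executemany("INSERT INTO payments (payee, amount) VALUES (?, ?)",
                [(f"payee_{i % 1000}", float(i)) for i in range(60_000)])

def plan(sql, *args):
    """Flatten the EXPLAIN QUERY PLAN output into one string."""
    return " ".join(str(r) for r in con.execute("EXPLAIN QUERY PLAN " + sql, args))

# Without an index: full table scan.
before = plan("SELECT * FROM payments WHERE payee = ?", "payee_42")

con.execute("CREATE INDEX idx_payee ON payments (payee)")

# With the index: SQLite searches the B-tree instead of scanning 60k rows.
after = plan("SELECT * FROM payments WHERE payee = ?", "payee_42")

assert "SCAN" in before and "USING INDEX" in after
```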

        • @AThing4String@sh.itjust.works
          link
          fedilink
          14
          2 months ago

          I literally work with ~750,000 line exports on the daily on my little Lenovo workbook. It gets a little cranky, especially if I have a few of those big ones open, but I have yet to witness my hard drive melting down over it. I’m not doing anything special, and I have the exact same business-economy tier setup 95% of our business uses. While I’m doing this, that little champion is also driving 4 large monitors because I’m actual scum like that. Still no hardware meltdowns after 3 years, but I’ll admit the cat likes how warm it gets.

          750k lines is just for the branch specific item preferences table for one of our smaller business streams, too - FORGET what our sales record tables would look like, let alone the whole database! And when we’re talking about the entirety of the social security database, which should contain at least one line each in a table somewhere for most of the hundreds of millions of people currently living in the US, PLUS any historical records for dead people??

          Your hard drive melting after 60k lines, plus the attitude that 60k lines is a lot for a major database, speaks to GLARING IT incompetence.

      • Fuck spez
        link
        fedilink
        English
        4
        2 months ago

        I don’t think I’ve seen a brand new computer in the past decade that even had a mechanical hard drive at all unless it was purpose-built for storing multiple terabytes, and 60K rows wouldn’t even take multiple gigabytes.

      • @baldingpudenda@lemmy.world
        link
        fedilink
        4
        2 months ago

        Reminds me of those 90s ads about hackers making your pc explode.

        Musk gonna roll up in a wheelchair, “the attempt on my life has left me ketamine addicted and all knowing and powerful.”

      • @wise_pancake@lemmy.ca
        link
        fedilink
        1
        2 months ago

        I have when a misconfigured spark job I was debugging was filling hard drives with tb of error logs and killing the drives.

        That was a pretty weird edge case though, and I don’t think the drives were melting, plus this was closer to 10 years ago when SSD write lifetimes were crappy and we bought a bad batch of drives.

    • @zenpocalypse@lemm.ee
      link
      fedilink
      English
      17
      edit-2
      2 months ago

      Even if it was local, a raspberry pi can handle a query that size.

      Edit - honestly, it reeks of a knowledge level that calls the entire PC a “hard drive”.

      • @T156@lemmy.world
        link
        fedilink
        English
        0
        edit-2
        2 months ago

        Unless they actually mean the hard drive, and not the computer. I’ve definitely had a cheap enclosure overheat and drop out on me before when trying to seek the drive a bunch, although it’s more likely the enclosure’s own electronics overheating. Unless their query was rubbish, a simple database scan/search like that should be fast, and not demanding in the slightest. Doubly so if it’s a dedicated database server, and not some embedded thing like SQLite. A few tens of thousands of rows should be basically nothing.

        • @zenpocalypse@lemm.ee
          link
          fedilink
          English
          2
          2 months ago

          Yeah, no matter what way you disorganize 60,000 rows, the data is still only going to be read into memory once.

    • @Bosht@lemmy.world
      link
      fedilink
      English
      36
      2 months ago

      I’d much sooner assume that they’re just fucking stupid and talking out of their ass tbh.

      • @kautau@lemmy.world
        link
        fedilink
        16
        edit-2
        2 months ago

        Same as Elon when he confidently told off engineers during his takeover of Twitter, or… gestures broadly at the Mr. Dunning-Kruger himself

        Wonder if it’s an SQL DB

        Elon probably hired confident right wingers whose parents bought and paid their way through prestigious schools. If he hired anyone truly skilled and knowledgeable, they’d call him out on his bullshit. So the people gutting government programs and passing around private data like candy are just confidently incorrect

    • @Adalast@lemmy.world
      link
      fedilink
      4
      2 months ago

      Why? Because they feel the need to have local copies of sensitive financial information because… You know… They are computer security experts.

  • @darkpanda@lemmy.ca
    link
    fedilink
    38
    2 months ago

    What is this, a table for ants? Because that’s the average number of ants in an ant colony and it’s nowhere near an impressive amount of rows to be doing any sort of processing on. It wouldn’t be an impressive amount of rows if your rig was an i386DX-33 running off a 5” floppy.

    • @1rre@discuss.tchncs.de
      link
      fedilink
      14
      2 months ago

      Exactly, 60k rows is negligible enough in most cases that you can just treat it as free unless you’re doing a cross join on it or something, or unless he’s doing something like using an unordered text file as his database with no RAM or cache.
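Rough numbers on why the cross join is the exception:

```python
# Back-of-envelope arithmetic: where 60k rows stops being "free".
rows = 60_000

# Keyed join / filtered scan: work is on the order of the table size.
print(f"keyed join: ~{rows:,} rows touched")

# Unconstrained cross join: every row pairs with every row.
print(f"cross join: {rows * rows:,} intermediate rows")  # 3,600,000,000
```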

  • @hypeerror@sh.itjust.works
    link
    fedilink
    23
    2 months ago

    No matter what actually happened here we can confidently state this clown is full of shit and has no idea what he’s doing.

  • @ryedaft@sh.itjust.works
    link
    fedilink
    English
    79
    2 months ago

    This sounds like trying to do stuff in Excel? The computer isn’t overheating but the amount of memory needed is very high which would make it run poorly. They might interpret that as overheating?

    • @monkeyman512@lemmy.world
      link
      fedilink
      English
      74
      2 months ago

      It also makes sense if they are calling the entire computer “the hard drive” like grandma, and the fans kicked on.

    • @jonjuan@programming.dev
      link
      fedilink
      English
      33
      edit-2
      2 months ago

      Yeah, everyone is commenting about being able to handle billions of rows easily, which is obviously very true if you are working with SQL or similar.

      But this is probably some finance kid, an investment banking analyst type, who only knows how to use Excel.

      60,000 rows in Excel with formulas, if not done efficiently, can for sure make your computer a little toasty.
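The rough shape of that problem, sketched in pandas with made-up tables: a VLOOKUP-style formula re-scans the lookup range once per row (O(n·m)), where a single merge does one hash join.

```python
import pandas as pd

# Hypothetical data: 3 rows stand in for the 60k.
orders = pd.DataFrame({"dept_id": [1, 2, 1], "amount": [10.0, 20.0, 5.0]})
depts = pd.DataFrame({"dept_id": [1, 2], "dept": ["SSA", "IRS"]})

# VLOOKUP-ish: scan the lookup table once per order row.
names_slow = [
    depts.loc[depts["dept_id"] == d, "dept"].iloc[0]
    for d in orders["dept_id"]
]

# The database-ish way: one merge, preserving the left frame's order.
names_fast = orders.merge(depts, on="dept_id", how="left")["dept"].tolist()

assert names_slow == names_fast
```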

  • @adarza@lemmy.ca
    link
    fedilink
    English
    27
    2 months ago

    just a lame-ass excuse for not finding whatever evidence they were looking for.

    elsewhere, some seeding was done.

    now they’ll do the ‘full’ data grab and ‘find’ what they were looking for.

  • @nick@midwest.social
    link
    fedilink
    42
    2 months ago

    60k lol.

    I regularly work with data in the 16tb range and weirdly my computer is fine. Git gud, doge scrubs.

  • @jkercher@programming.dev
    link
    fedilink
    English
    32
    2 months ago

    60k rows of anything will be pulled into the file cache and do very little work on the drive. Possibly none after the first read.

  • @MystikIncarnate@lemmy.ca
    link
    fedilink
    English
    13
    edit-2
    2 months ago

    IT guy checking in.

    The only time I’ve ever seen drive temp sensor alarms is on server RAID arrays and similar enterprise hard drives/SSDs… Never in my life have I seen one available on a consumer device, nor have I seen any alarm for a drive temp go off. It just doesn’t happen.

    IMO, this is one of those language barriers where people call their computer chassis (and everything in it) the “hard drive”.

    Applying that assumption, their updated statement is: his computer overheated.

    Idk what kind of shit system he’s running on that 60k rows would cause overheating, but ok.

    • @theparadox@lemmy.world
      link
      fedilink
      English
      8
      edit-2
      2 months ago

      As another IT guy here, it could also be a shitty method of analysis that he got from ChatGPT. As an amateur coder/script writer, the kinds of code I’ve seen people use from these bots is disturbing. One of my coworkers asked me for help after trying to cobble together something from bots. There were variables declared and never used, variables that were never assigned values but were used in expressions… it was like it attempted to make that ransom note out of magazine letters, but couldn’t spell coherently.

  • @zqwzzle@lemmy.ca
    link
    fedilink
    English
    21
    2 months ago

    What’s the bet the software they downloaded is malware and it’s crypto mining?

  • will
    link
    fedilink
    English
    37
    2 months ago

    Skill issue, as the kids like to say.