From the same group that doesn’t understand joins and thinks nobody uses SQL, this is hardly surprising.
Probably got an LLM running locally and is asking it to get data, and it’s then running 10-level-deep subqueries to achieve what 2 inner joins would in a fraction of the time.
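For anyone wondering what "2 inner joins" looks like in practice, here is a rough sketch using Python's built-in sqlite3 module. The tables (agencies, recipients, awards) and their columns are made up for illustration and are not the real spending schema.

```python
import sqlite3

def build_demo_db(n_rows=60_000):
    """Build an in-memory database with three hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE agencies   (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE recipients (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE awards     (id INTEGER PRIMARY KEY,
                                 agency_id INTEGER,
                                 recipient_id INTEGER,
                                 amount REAL);
    """)
    conn.executemany("INSERT INTO agencies VALUES (?, ?)",
                     [(i, f"agency-{i}") for i in range(10)])
    conn.executemany("INSERT INTO recipients VALUES (?, ?)",
                     [(i, f"recipient-{i}") for i in range(100)])
    conn.executemany("INSERT INTO awards VALUES (?, ?, ?, ?)",
                     [(i, i % 10, i % 100, float(i)) for i in range(n_rows)])
    return conn

def joined_rows(conn):
    """Resolve both foreign keys with two inner joins, no nested subqueries."""
    return conn.execute("""
        SELECT a.name, r.name, w.amount
        FROM awards AS w
        INNER JOIN agencies   AS a ON a.id = w.agency_id
        INNER JOIN recipients AS r ON r.id = w.recipient_id
    """).fetchall()

conn = build_demo_db()
rows = joined_rows(conn)
print(len(rows))
```

Every award row matches exactly one agency and one recipient here, so the join returns one row per award.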
You’re giving this person a lot of credit. It’s probably all in the same table and this idiot is probably doing something like a for-loop over an integer range (the length of the table) where it pulls the entire table down every iteration of the loop, dumps it to a local file, and then uses plain text search or some really bad regexes to find the data they’re looking for.
I think you’re still giving them too much credit with the for loop and regex and everything. I’m thinking they exported something to Excel, got 60k rows, then tried to add a lookup formula to them. Since you know, they don’t use SQL. I’ve done ridiculous things like that in Excel, and it can get so busy that it slows down your whole computer, which I can imagine someone could interpret as their “hard drive overheating”.
Considering that is nearly exactly some of the answers I’ve received during the technical part of interviews for jr data eng, you’re probably not far off.
Shit I’ve seen solutions done up that look like that, fighting the optimiser every step (amongst other things)
I have to admit I still have some legacy code that does that.
Then I found pandas. Life changed for the better.
Now I have lots of old code that I’ll update, “one day”.
However, even my old code, terrible as it is, does not overheat anything, and can process massively larger sets of data than 60,000 rows without any issue except poor efficiency.
60k isn’t that much, I frequently run scripts against multiple hundreds of thousands at work. Wtf is he doing? Did he duplicate the government database onto his 2015 MacBook Air?
A TI-86 can query 60k rows without breaking a sweat.
If his hard drive overheated from that, he is doing something very wrong, very unhygienic, or both.
He’s probably mining crypto on top of running his SQL queries.
There must be more join statements than column names
Seriously - I can parse multiple tables of 5+ million rows each… in EXCEL… on a 10 year old desktop and not have the fan even speed up. Even the legacy Access database I work with handles multiple million+ row tables better than that.
Sounds like the kid was running his AI hamsters too hard and they died of exhaustion.
I’d do that if I was given so much stupid access
No, it’s an external drive, apparently.
my hard drive overheated
So, this means they either have a local copy on disk of whatever database they’re querying, or they’re dumping a remote db to disk at some point before/during/after their query, right?
Either way, I have just one question - why?
Edit: found a more in-depth explanation elsewhere in the thread: https://xcancel.com/DataRepublican/status/1900593377370087648#m
So yeah, she’s apparently toting around an external hard drive with a copy of the “multiple terabytes” large US spending database, running queries against it, then dumping the 60k-row result set to CSV for further processing.
I’m still confused about at what point the external drive overheats, even if she is doing all this in a “hot humid” hotel room where she can’t run any fans, I guess because her kids were asleep?
But like, all of that just adds more questions, and doesn’t really answer the first one - why?
My one question would be “How?”
What the hell are you doing that your hard drives are overheating? How do you even know it’s overheating as I’m like 90% certain hard drives (except NVMe if we’re being liberal with the meaning of hard drive) don’t even have temperature sensors?
The only conclusion I can come to is that everything he’s saying is just bullshit.
Hard drives do get hot and need some cooling but not at 60k rows. It’s either made up or their computer case is made of thermal cladding
You could query 60,000 rows on a low tier smart phone. Makes no sense at all.
Can we think of any device someone might have that would struggle with 60k? Certainly an ESP32 chip could handle it fine, so most IoT devices would work…
Right? There’s no part of that xeet that makes any real sense coming from a “data engineer.”
Terrifying, really.
Unless the database was designed by someone who only knows of data as that robot from Star Trek, most would be absolutely fine with 60k rows. I wouldn’t be surprised if the machine they’re using caches that much in RAM alone.
They have temp sensors. But I have never heard of an overheating drive.
dude is 100% talking about ssds. NVME ones at that, he’s just stupid.
Have you ever heard of a case of overheating hard drives within the last decade?
Plus, 60k is nothing. One of our customers had a database that was over 3M records before it got some maintenance. No issue with overheating lol
I run queries throughout the day that can return 8 million+ rows easily. Granted, it takes a few minutes to run, but it has never caused a single issue with overheating even on slim PCs.
This makes no fucking sense. 60k rows would return in a flash even on shitty hardware. And if it taxes anything, it’s gonna be the ram or cpu- not the hard drive.
In my experience, the only time that I’ve taxed a drive when doing a database query is either when dumping it, or with SQLite’s vacuum, which copies the whole thing.
For a pretty simple search like OP seems to be doing, the indices should have taken care of basically all the heavy lifting.
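To make the point about indices concrete, here is a small sketch with Python's built-in sqlite3 module. The payments table, its columns, and the index name are hypothetical; the idea is just that with an index on the searched column, SQLite reports an index search in its query plan rather than a full table scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (id INTEGER PRIMARY KEY, recipient TEXT, amount REAL)"
)
# 60,000 made-up rows spread evenly across 500 made-up recipients.
conn.executemany(
    "INSERT INTO payments (recipient, amount) VALUES (?, ?)",
    [(f"recipient-{i % 500}", float(i)) for i in range(60_000)],
)
conn.execute("CREATE INDEX idx_payments_recipient ON payments (recipient)")

# Ask SQLite how it intends to run the lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT amount FROM payments WHERE recipient = ?",
    ("recipient-42",),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column is the plan detail

matches = conn.execute(
    "SELECT COUNT(*) FROM payments WHERE recipient = ?", ("recipient-42",)
).fetchone()[0]

print(plan_text)
print(matches)
```

The exact wording of the plan text varies between SQLite versions, but it should name the index instead of describing a scan of the whole table.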
I literally work with ~750,000 line exports on the daily on my little Lenovo workbook. It gets a little cranky, especially if I have a few of those big ones open, but I have yet to witness my hard drive melting down over it. I’m not doing anything special, and I have the exact same business-economy tier setup 95% of our business uses. While I’m doing this, that little champion is also driving 4 large monitors because I’m actual scum like that. Still no hardware meltdowns after 3 years, but I’ll admit the cat likes how warm it gets.
750k lines is just for the branch specific item preferences table for one of our smaller business streams, too - FORGET what our sales record tables would look like, let alone the whole database! And when we’re talking about the entirety of the social security database, which should contain at least one line each in a table somewhere for most of the hundreds of millions of people currently living in the US, PLUS any historical records for dead people??
Your hard drive melting after 60k lines, plus the attitude that 60k lines is a lot for a major database, speaks to GLARING IT incompetence.
I don’t think I’ve seen a brand new computer in the past decade that even had a mechanical hard drive at all unless it was purpose-built for storing multiple terabytes, and 60K rows wouldn’t even take multiple gigabytes.
Reminds me of those 90s ads about hackers making your pc explode.
Musk gonna roll up in a wheelchair, “the attempt on my life has left me ketamine addicted and all knowing and powerful.”
I have when a misconfigured spark job I was debugging was filling hard drives with tb of error logs and killing the drives.
That was a pretty weird edge case though, and I don’t think the drives were melting, plus this was closer to 10 years ago when SSD write lifetimes were crappy and we bought a bad batch of drives.
Even if it was local, a raspberry pi can handle a query that size.
Edit - honestly, it reeks of a knowledge level that calls the entire PC a “hard drive”.
Unless they actually mean the hard drive, and not the computer. I’ve definitely had a cheap enclosure overheat and drop out on me before when trying to seek the drive a bunch, although it’s more likely the enclosure’s own electronics overheating. Unless their query was rubbish, a simple database scan/search like that should be fast, and not demanding in the slightest. Doubly so if it’s dedicated, and not using some embedded thing like SQLite. A few dozen thousand queries should be basically nothing.
Yeah, no matter what way you disorganize 60,000 rows, the data is still going to read into memory once.
I’d much sooner assume that they’re just fucking stupid and talking out of their ass tbh.
Same as Elon when he confidently told off engineers during his takeover of Twitter, or gestures broadly at Mr. Dunning-Kruger himself
Wonder if it’s an SQL DB
Elon probably hired confident right wingers whose parents bought and paid their way through prestigious schools. If he hired anyone truly skilled and knowledgeable, they’d call him out on his bullshit. So the people gutting government programs and passing around private data like candy are just confidently incorrect
Why? Because they feel the need to have local copies of sensitive financial information because… You know… They are computer security experts.
What is this, a table for ants? Because that’s the average number of ants in an ant colony and it’s nowhere near an impressive amount of rows to be doing any sort of processing on. It wouldn’t be an impressive amount of rows if your rig was an i386DX-33 running off a 5” floppy.
Exactly, 60k rows is negligible enough in most cases that you can just treat it as free unless you’re doing a cross join on it or something. Or unless he’s doing something like using an unordered text file as his database with no RAM or cache.
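Back-of-envelope numbers for why 60k rows is basically free and why a cross join is the exception. The 1,000 bytes per row is purely an assumed figure for illustration, real rows could be smaller or larger.

```python
rows = 60_000
assumed_bytes_per_row = 1_000  # assumption, not a measured value

# Size of the whole result set under that assumption, in megabytes.
table_mb = rows * assumed_bytes_per_row / 1e6

# A cross join of the table with itself pairs every row with every row.
cross_join_rows = rows * rows

print(f"{table_mb:.0f} MB for the plain result set")
print(f"{cross_join_rows:,} rows for a self cross join")
```

Tens of megabytes fits in RAM on pretty much anything, while a self cross join blows up to billions of rows, which is where things could actually start to hurt.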
Buddy’s probably running code he got from GitHub Copilot that is used to do a visualization of a bubble sort for learning purposes.
Even then, I can crank through orders of magnitude more data on my craptop
No matter what actually happened here we can confidently state this clown is full of shit and has no idea what he’s doing.
“YOU’RE JUST JEALOUS” is such a fucking pussy-ass response, too.
Literally every time someone dismisses Wikipedia, it’s because they believe something crazy that Wikipedia told them is wrong.
I checked Conservapedia once, and it’s actually unhinged. If someone tells you to look at that, or recommends it, they’re crazy.
Did they ever finish their own bible translation? The one they started because King James was too woke.
YES. It’s so unhinged. They have an entire page discussing if Obama is actually a Muslim
Flashback to 2014.
heh
God they picked out the ONE possible thing they could criticize her for, there’s like 3 other things RIGHT NEXT TO THAT
“Software Engineer” was literally right next to it.
She is vehemently against crypto. She runs the newsletter “web3 is going great” - https://www.web3isgoinggreat.com/
This sounds like trying to do stuff in Excel? The computer isn’t overheating but the amount of memory needed is very high which would make it run poorly. They might interpret that as overheating?
It also makes sense if they are calling the entire computer “the hard drive” like grandma and the fans kicked on.
Yeah, everyone is commenting about being able to handle billions of rows easily, which is obviously very true if you are working with SQL or similar.
But this is probably some finance kid, investment banker analyst, and only knows how to use Excel.
60,000 rows in Excel with formulas, if not done efficiently, can for sure make your computer a little toasty.
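A rough Python analogy for "not done efficiently": an exact-match lookup that rescans the whole table for every row does work proportional to rows times table size, while building an index once makes each lookup roughly constant time. The function names and data here are made up, and this is only an analogy, not how Excel works internally.

```python
def lookup_by_scan(keys, table):
    """For each key, scan the table from the top until a match is found."""
    results = []
    for key in keys:
        found = None
        for table_key, table_value in table:
            if table_key == key:
                found = table_value
                break
        results.append(found)
    return results

def lookup_by_index(keys, table):
    """Build a dict once, then answer each lookup with a single hash probe."""
    index = dict(table)  # assumes keys in the table are unique
    return [index.get(key) for key in keys]

# 60,000 hypothetical (key, value) rows and a small sample of keys to look up.
table = [(i, f"value-{i}") for i in range(60_000)]
keys = list(range(0, 60_000, 1000))

scan_results = lookup_by_scan(keys, table)
index_results = lookup_by_index(keys, table)
print(scan_results == index_results)
```

Both give the same answers; the scan version just does far more comparisons, and looking up all 60,000 keys that way would mean on the order of a billion or more comparisons.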
Could be a gen 5 nvme drive without adequate cooling. Them bastards can run hot. Especially the early gen 5 drives.
But with only 60k rows would one even have enough time to overheat?
just a lame-ass excuse for not finding whatever evidence they were looking for.
elsewhere, some seeding was done.
now they’ll do the ‘full’ data grab and ‘find’ what they were looking for.
Yeah. Hiring inexperienced children into government isn’t fraud, by itself. But I bet it makes fraud way easier.
especially if some of the children have a background in cyber crime. this is somehow not a joke, see this video (youtube.com)
60k lol.
I regularly work with data in the 16tb range and weirdly my computer is fine. Git gud, doge scrubs.
60k rows of anything will be pulled into the file cache and do very little work on the drive. Possibly none after the first read.
Not if each row is pi!
You can put 60k rows in Excel 97.
IT guy checking in.
The only time I’ve even seen drive temp sensor alarms is on server RAID arrays and other similar hard drives/SSDs… Never in my life have I seen one available on a consumer device, nor have I seen any alarm for any drive temp go off. It just doesn’t happen.
IMO, this is one of those language barriers where people call their computer chassis (and everything in it) the “hard drive”.
Applying that assumption, their updated statement is: His computer overheated.
Idk what kind of shit system he’s running on that 60k rows would cause overheating, but ok.
As another IT guy here, it could also be a shitty method of analysis that he got from ChatGPT. As an amateur coder/script writer, the kinds of code I’ve seen people use from these bots is disturbing. One of my coworkers asked me for help after trying to cobble together something from bots. There were variables declared and never used, variables that were never assigned values but that were used in expressions… it was like it attempted to do that ransom note made from magazine letters but they couldn’t spell coherently.
Maybe
Maybe
They are just making shit up and doing jack shit
I really hope so.
What’s the bet the software they downloaded is malware and it’s crypto mining?
That’s pretty plausible. Inexperienced engineer plus overheating…good chance that adds up to a malware infection.
Skill issue, as the kids like to say.
They even doubled down: https://xcancel.com/DataRepublican/status/1900565922831618202#m
Well thanks for trying to keep us from catching it I guess
thank god, i am going to be ok, i was worried it might be contagious (it’s not, he just has brain damage)
Can you screenshot or something? I can’t load that link
Does this one work? https://nitter.net/DataRepublican/status/1900565922831618202#m It’s a bit too long to screenshot.
That works, thanks
So it was probably a shitty external HD… that’s a whole other can of worms