froztbyte

joined 2 years ago
[–] froztbyte@awful.systems 2 points 2 hours ago (1 children)

that list undercounts far more than I expected it to

[–] froztbyte@awful.systems 2 points 3 hours ago

when digging around I happened to find this thread which has some benchmarks for a diff model

it's apples to square fenceposts, of course, since one llm is not another. but it gives something to presume from. if g4dn.2xl gave them 214 tok/s, and if we make the extremely generous presumption that tok==word (which, well, no; cf. strawberry), then any Use Deserving Of o3 (let's say 5~15k words) would mean you need a tok-rate of 1000~3000 tok/s for a "reasonable" response latency ("5-ish seconds")

so you'd need something like 5x g4dn.2xl just to shit out 5000 words with dolphin-llama3 in "quick" time. which, again, isn't even whatever the fuck people are doing with openai's garbage.
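a minimal sketch of that back-of-envelope, for anyone who wants to poke at the inputs. the 214 tok/s figure and the 5-second target come from the comment above; the function names are made up for illustration, and it keeps the same generous tok==word and linear-scaling presumptions.

```python
import math

# figures from the comment above; tok == word is the same generous presumption
TOK_PER_SEC_PER_INSTANCE = 214  # g4dn.2xl benchmark quoted from the thread
TARGET_LATENCY_S = 5            # "5-ish seconds"


def required_rate(words: int, latency_s: float = TARGET_LATENCY_S) -> float:
    """tok/s needed to emit `words` tokens within `latency_s` seconds."""
    return words / latency_s


def instances_needed(words: int, per_instance: float = TOK_PER_SEC_PER_INSTANCE) -> int:
    """whole instances needed, pretending throughput scales linearly (it won't)."""
    return math.ceil(required_rate(words) / per_instance)


if __name__ == "__main__":
    for words in (5_000, 15_000):
        print(f"{words} words: {required_rate(words):.0f} tok/s, "
              f"{instances_needed(words)}x g4dn.2xl")
```

swap in a real tokens-per-word ratio and the instance count only goes up.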

utter, complete, comprehensive clownery. era-redefining clownery.

but some dumb motherfucker in a bar will keep telling me it's the future. and I get to not boop 'em on the nose. le sigh.

[–] froztbyte@awful.systems 5 points 4 hours ago (1 children)

following on from this comment, it is possible to get it turned off for a Workspace Suite Account

  1. contact support (? button from admin view)
  2. ask the first person to connect you to Workspace Support (otherwise you'll get some made-up bullshit from a person trying to buy time or Case Success or whatever, simply because they don't have the privileges to do what you're asking)
  3. tell the referred-to person that you want to enable controls for "Gemini for Google Workspace" (optionally adding that you have already disabled "Gemini App")

hopefully you spend less time on this than the 40-something minutes I had to (a lot of which was spent watching some poor support bastard start-stop typing for minutes at a time because they didn't know how to respond to my request)

[–] froztbyte@awful.systems 5 points 4 hours ago

also, my inbox earlier:

24661 N + Jan 21 Apple Developer ( 42K) Explore the possibilities of Apple Intelligence.

[–] froztbyte@awful.systems 3 points 4 hours ago* (last edited 4 hours ago) (1 children)

so, for an extremely unscientific demonstration, here (warning: AWS may try hard to get you to engage with Explainer[0]) is an instance of an aws pricing estimate for big handwave "some gpu compute"

and when I say "extremely unscientific", I mean "I largely pulled the numbers out of my ass". even so, they're not entirely baseless, nor just picking absolute maxvals and laughing

~~parameters~~ assumptions made:

  • "somewhat beefy" gpu instances (g4dn.4xlarge, selected through the tried and tested "squint until it looks right" method)
  • 6-day traffic pattern, excluding sunday[1]
  • daily "4h peak" total peak load profile[2]
  • 50 instances minimum, 150 maximum (let's pretend we're not openai but are instead some random fuckwit flybynight modelfuckery startup)
  • us west coast
  • spot instances, convertible spot reserves, 3y full prepay commit (yeah I know full vs partial is a big diff; once again, snore)

(and before we get any fucking ruleslawyering dumb motherfuckers rolling in here about accuracy or whatever: get fucked kthx. this is just a very loosely demonstrative example)

so you'd have a variable buffer of 50..150 instances, featuring 3200..9600GiB (about 3.1..9.4TiB) of RAM for working set size, 800..2400 vCPU, 50..150 nvidia t4 gpus, and 800..2400GiB gpu vram
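the fleet totals are just per-instance figures times the instance count. a rough sketch, assuming the per-instance numbers implied by the totals above (16 vCPU, 64GiB RAM, one T4 with 16GiB vram per g4dn.4xlarge); the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSpec:
    vcpu: int
    ram_gib: int
    gpus: int
    vram_gib_per_gpu: int


# assumption: per-instance figures backed out of the totals in the comment
G4DN_4XLARGE = InstanceSpec(vcpu=16, ram_gib=64, gpus=1, vram_gib_per_gpu=16)


def fleet_totals(spec: InstanceSpec, count: int) -> dict:
    """aggregate resources for `count` identical instances."""
    return {
        "vcpu": spec.vcpu * count,
        "ram_gib": spec.ram_gib * count,
        "gpus": spec.gpus * count,
        "vram_gib": spec.gpus * spec.vram_gib_per_gpu * count,
    }


if __name__ == "__main__":
    for count in (50, 150):
        print(count, fleet_totals(G4DN_4XLARGE, count))
```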

let's presume a perfectly spherical ops team of uniform capability[3] and imagine that we have some lovely and capable active instance prewarming and correct host caching and whatnot. y'know, things to reduce user latency. let's pretend we're fully dynamic[4]

so, by the numbers, then

1y times 4h daily gives us 1460h (in seconds, that's 5256000). this extremely inaccurate full-of-presumptions number gives us "service-capable life time". the times your concierge is at the desk, the times you can get pizza delivered.

x3 to get to lifetime matching our spot commit, x50..x150 to get to "total possible instance hours". which is the top end of our sunshine and rainbows pretend compute budget. which, of course, we still have exactly no idea how to spend. because we don't know the real cost of servicing a query!

but let's work backwards from some made-up shit, using numbers The Poor Public gets (vs numbers Free Microsoft Credits will imbue unto you), and see where we end up!

so that means our baseline:

  • upfront cost: $4,527,400.00
  • monthly: $1460.00 (x3 x12 = $52560)
  • whatever the hell else is incurred (s3, bandwidth, ....)
  • >=200k/y per ops/whatever person we have

3y of 4h-daily at 50 instances = 788400000 seconds. at 150 instances, 2365200000 seconds.
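the seconds figures fall out of a couple of multiplications. a sketch under the same presumptions as the text (365 days a year at 4h a day, quietly ignoring the "excluding sunday" bit); the names are illustrative only.

```python
HOURS_PER_DAY = 4        # the "4h peak" window
DAYS_PER_YEAR = 365      # same simplification as the 1460h figure
COMMIT_YEARS = 3         # matches the 3y prepay commit
SECONDS_PER_HOUR = 3600


def service_seconds(years: int = 1) -> int:
    """seconds of "service-capable life time" over `years` years."""
    return years * DAYS_PER_YEAR * HOURS_PER_DAY * SECONDS_PER_HOUR


def instance_seconds(instances: int, years: int = COMMIT_YEARS) -> int:
    """total possible instance-seconds for a fleet of `instances`."""
    return instances * service_seconds(years)


if __name__ == "__main__":
    print(service_seconds(1))     # one year of 4h days
    print(instance_seconds(50))   # low end of the fleet
    print(instance_seconds(150))  # high end of the fleet
```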

so we can say that, for our deeply Whiffs Ever So Slightly values (and counting only the upfront cost), a second's compute on the low instance-count end is $0.00574252 and $0.00191417 at the higher instance-count end! which gives us a bit of a handle!

this, of course, entirely ignores parallelism, n-instance job/load/whatever distribution, database lookups, network traffic, allllllll kinds of shit. which we can't really have good information on without some insider infrastructure leaks anyway. so we pretend to look at the compute alone.

so what does $1000/query mean, in the sense of our very ridiculous and fantastical numbers? since the units are now The Same, we can simply divide things!

at the 50 instance mark, we'd need to hypothetically spend 174139.68 instance-seconds. that's 2.0155 days of linear compute!

at the 150 instance mark, 522419.05 instance-seconds! 6.0465 days of linear compute!
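for anyone wanting to redo the division: a sketch that spreads the upfront cost over the 3y instance-seconds and then asks how many instance-seconds $1000 buys. assumption, labelled loudly: only the $4,527,400 upfront figure is counted (that is what reproduces the instance-second numbers above), and the function names are invented for illustration.

```python
UPFRONT_USD = 4_527_400.00              # upfront cost from the estimate
SERVICE_SECONDS_3Y = 3 * 365 * 4 * 3600  # 3y of 4h days, per instance
SECONDS_PER_DAY = 86_400


def cost_per_instance_second(instances: int) -> float:
    """upfront cost spread over every instance-second in the 3y window."""
    return UPFRONT_USD / (instances * SERVICE_SECONDS_3Y)


def query_budget_seconds(query_usd: float, instances: int) -> float:
    """instance-seconds that `query_usd` buys at that per-second cost."""
    return query_usd / cost_per_instance_second(instances)


if __name__ == "__main__":
    for instances in (50, 150):
        secs = query_budget_seconds(1000.0, instances)
        print(f"{instances} instances: "
              f"${cost_per_instance_second(instances):.8f}/s, "
              f"{secs:.2f} instance-seconds, "
              f"{secs / SECONDS_PER_DAY:.4f} days of linear compute")
```

folding in the monthly charge or the ops salaries would raise the per-second cost and shrink the budget, so treat these as upper bounds on the silly numbers.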

so! what have we learned? well, we've learned that we couldn't deliver responses to prompts in Reasonable Time at these hardware presumptions! which, again, are linear presumptions. and there's gonna be a fair chunk of parallelism and other parts involved here. but even so, turns out it'd be a bit of a sizable chunk of compute allocated. to even a single prompt response.

[0] - a product/service whose very existence I find hilarious; the entire suite of aws products is designed to extract as much money from every possible function whatsoever, leading to complexity, which they then respond to by..... producing a chatbot to "guide users"

[1] - yes yes I know, the world is not uniform and the fucking promptfans come from everywhere. I'm presuming amerocentric design thinking (which imo is probably not wrong)

[2] - let's pretend that the calculators' presumption of 4h persistent peak load and our presumption of short-duration load approaching 4h cumulative are the same

[3] - oh, who am I kidding, you know it's gonna be some dumb motherfuckers with ansible and k8s and terraform and chucklefuckery

[–] froztbyte@awful.systems 5 points 10 hours ago (1 children)

This must be mentioned in the acknowledgements

wat

[–] froztbyte@awful.systems 5 points 11 hours ago

in that spirit: Loserus Inamericus

(I don't know if that scans, I have no latin skills and I don't feel like breaking out information to check)

[–] froztbyte@awful.systems 7 points 11 hours ago (3 children)

there's probably a fair couple more. tracing anything de beers or a good couple of other industries will probably indicate a couple more

(my hypothesis is: the kinds of people that flourished under apartheid, the effect that had on local-developed industry, and then the "wider world" of ~~opportunities~~ prey they got to sink their teeth into after apartheid went away; doubly so because staying ZA-only is extremely limiting for ghouls of their sort - it's a fixed-size pool, and the still-standing apartheid-vintage capital controls are Limiting for the kinds of bullshit they want to pull)

[–] froztbyte@awful.systems 6 points 11 hours ago (1 children)

just to be clear: not providing excuses for felon. just think it's unlikely that that was the avenue

[–] froztbyte@awful.systems 8 points 11 hours ago (2 children)

opinion: the AWB is too afrikaans for it to be likely that that is where he picked up his nazi shit. then-era ZA still had a lot of AF/EN animosity, and in a couple of biographies of the loon you hear things like "he hated life in ZA as a kid because .... {bullying}", and a non-zero amount of that may have stemmed from AF bullying EN

(icbw, the history of the loon's tendencies is definitely not something I've studied, but I can speak to (at least part[0] of) the ZA attitude)

(([0] - I wasn't alive at the time it would've mattered to him, but other bits of the cultural attitudes lasted well into my youth))

[–] froztbyte@awful.systems 5 points 17 hours ago

some of those replies, oof

(especially that last one, which hits me with the same vibe as many of the coiner-era non-understanding hype-amplifiers did)

[–] froztbyte@awful.systems 11 points 1 day ago

I recall seeing something of this sort happening on goog for about 12~18mo - every so often a researcher post does the rounds where someone finds Yet Another way goog is fucking it up

the advertising dept has completely captured all mindshare and it is (demonstrably) the only part that goog-the-business cares about

 

the precision and clarity are astounding

by the time the hilbert curves got there my mouth was hanging open, and it still gets better

 

Need to let loose a primal scream without collecting footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post Xitter web has spawned soo many “esoteric” right wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be)

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

Last week's thread

(Semi-obligatory thanks to @dgerard for starting this)

 

'cuz I definitely do

 

“stepping away because I want to create the time and space to do my own exploration.”

yeah I completely believe you, weird lady. I too want to vaguely step into the warm embrace of my piles of ill-gotten gold, forgetting about the stressors of how to sell something that doesn’t exist and that you helped claim would be here really soon now. ahhh, bliss..

one’s gotta wonder about the timing of this announcement, right? like come the fuck on

 

I will say I'm not much of a fan of mysqls but uhhhhh this seems bad

13
submitted 4 months ago* (last edited 4 months ago) by froztbyte@awful.systems to c/techtakes@awful.systems
 

saw this kicking around on the lobsters frontpage

IBM also includes a DPU for accelerating IO, along with an on-board AI accelerator

ah yes. an AI accelerator. for a chip that goes into a system that'll quite possibly have a lifespan measured in decade-partials. in environments so extremely up to date with the bleeding edge of technology that they are absolutely not losing their programmers to retirement. an AI accelerator for that. makes total sense.

imagine being the poor engineers who had to spec that out, design it, and get it actually existing. nevermind even the awe-inspiringly stunning disregard to reality that it takes for some management fuckhead(s) to have "steered" this

Funny enough, I asked why IBM had different terminology compared to the rest of the industry. They said IBM came up with the terminology first, and later on the industry adopted different terminology. It was all lighthearted and funny.

ah yes! funny! haha. not at all some weird insular shit from the same company that runs a worlds-apart platform with a control grip so strong it makes larry ellison check if they're infringing on anything actionable...

 

found this kicking around on one of the feeder sites a few days ago and only got to read it now

kinda neat. it's the sort of thing that you used to find quite a lot with keygens and other things prone to easter eggs, and that I don't really know of being as prevalent in more recent gaming and such

 

Need to make a primal scream without gathering footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh facts of Awful you'll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post Xitter web has spawned soo many “esoteric” right wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be)

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

 

there are some remarkable instances of bad behaviour in there already, but imagine being the sort of product team that thinks users being gaslit by a chatbot that they couldn’t even consent to choose to use is totes something to deliver without any modification or remark

 

this time in open letter format! that'll sure do it!

there are "risks", which they are definite about - the risks are not hypothetical, the risks are real! it's totes even had some acknowledgement in other places! totes real defs for sure this time guize
