this post was submitted on 18 Jul 2024
802 points (99.5% liked)

Technology


Companies are going all-in on artificial intelligence right now, investing millions or even billions into the area while slapping the AI initialism on their products, even when doing so seems strange and pointless.

Heavy investment and increasingly powerful hardware tend to mean more expensive products. To discover if people would be willing to pay extra for hardware with AI capabilities, the question was asked on the TechPowerUp forums.

The results show that over 22,000 people, a massive 84% of the overall vote, said no, they would not pay more. More than 2,200 participants said they didn't know, while just under 2,000 voters said yes.

[–] fuckwit_mcbumcrumble@lemmy.dbzer0.com 2 points 4 months ago (1 children)

But instead of relying on the GPU to power it, the dedicated AI chip did the work. Like, it had its own distinct chip on the graphics card that would handle the upscaling.

I forget who demoed it, and searching for anything related to "AI" and "upscaling" just gets buried under results about what they're already doing.

[–] barsoap@lemm.ee 4 points 4 months ago* (last edited 4 months ago) (2 children)

That's already the Nvidia approach: upscaling runs on the tensor cores.

And no, it's not something magical, it's just matrix math. AI workloads are lots of convolutions on gigantic, low-precision floating-point matrices. Low-precision because neural networks are robust against random perturbation, and more rounding is exactly that: random perturbation. There's no point in spending electricity and heat on high precision if it doesn't make the output any better.
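To make the rounding point concrete, here's a minimal sketch in plain Python. Everything in it is an assumption for illustration: `round_to_bits` is a crude model of a low-precision float (it only limits significant bits and ignores exponent range), the matrix is random Gaussian noise rather than real network weights, and `relative_error` is just a hypothetical helper name.

```python
import math
import random

def round_to_bits(x: float, bits: int) -> float:
    """Keep roughly `bits` significant binary digits of x (crude low-precision model)."""
    if x == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(x)  # x == mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    scale = 2 ** bits
    return math.ldexp(round(mantissa * scale) / scale, exponent)

def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def relative_error(bits: int, size: int = 64, seed: int = 0) -> float:
    """Relative L2 error of a matvec after rounding weights and inputs to `bits` bits."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(size)] for _ in range(size)]
    inputs = [rng.gauss(0, 1) for _ in range(size)]
    exact = matvec(weights, inputs)
    low_weights = [[round_to_bits(w, bits) for w in row] for row in weights]
    low_inputs = [round_to_bits(v, bits) for v in inputs]
    approx = matvec(low_weights, low_inputs)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(exact, approx)))
    norm = math.sqrt(sum(a * a for a in exact))
    return diff / norm

if __name__ == "__main__":
    for bits in (4, 10, 23):
        print(bits, "significant bits ->", relative_error(bits))
```

The rounding errors on individual values are fairly large at 4 bits, but they have random signs and largely cancel in the sums, so the output only moves by a few percent. Whether that is tolerable depends on the model, which this toy says nothing about.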

The kicker? Those tensor cores are less complicated than ordinary GPU cores. For general-purpose hardware, and that also includes consumer-grade GPUs, it's way more sensible to make sure the ALUs can deal with 8-bit floats and leave everything else the same. That stuff is going to be standard by the next generation of even potatoes: every SoC with an included GPU has enough oomph to sensibly run reasonable inference loads. And by "reasonable" I mean actually quite big; as far as I'm aware, Firefox's built-in translation, for example, runs on the CPU because the models are small enough.

Nvidia, OTOH, is very much in the market for AI accelerators and figured it could corner the upscaling market and sell another new generation of cards by making its software rely on those cores, even though it could run on the other cores. As AMD demonstrated, its upscaling also runs on Nvidia hardware.

What's actually special sauce in that area are the RT cores, that is, accelerators for ray casting through BVH trees. That's indeed specialised hardware, but those things are nowhere near fast enough to compute enough rays for even remotely tolerable outputs, which is where all that upscaling/denoising comes into play.

[–] fuckwit_mcbumcrumble@lemmy.dbzer0.com 2 points 4 months ago (1 children)

Found it.

https://www.neowin.net/news/powercolor-uses-npus-to-lower-gpu-power-consumption-and-improve-frame-rates-in-games/

I can't find a picture of the PCB though. That might have been a pre-reveal leak, and now that it's revealed, good luck finding it.

[–] AdrianTheFrog@lemmy.world 2 points 4 months ago (1 children)

Having to send full frames off of the GPU for extra processing has got to come with some extra latency/problems compared to just doing it on the GPU itself... and I'd be shocked if they have the motion vectors and other engine data that DLSS uses, since that would require games to be specifically modified for this. IDK, I don't think we have enough details about this to really judge whether it's useful or not, although I'm leaning towards 'not' for this particular implementation. They never showed any actual comparisons to DLSS either.

As a side note, I found this other article on the same topic where they obviously didn't know what they were talking about and mixed up frame rates and power consumption. It's very entertaining to read:

The NPU was able to lower the frame rate in Cyberpunk from 263.2 to 205.3, saving 22% on power consumption, and probably making fan noise less noticeable. In Final Fantasy, frame rates dropped from 338.6 to 262.9, resulting in a power saving of 22.4% according to PowerColor's display. Power consumption also dropped considerably, as it shows Final Fantasy consuming 338W without the NPU, and 261W with it enabled.
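A quick sanity check on the numbers in that quote, as a sketch. The figures are copied from the quoted passage; `percent_drop` is just a hypothetical helper name, and reading the "frame rates" as watts is my interpretation, not something the article states.

```python
def percent_drop(before: float, after: float) -> float:
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100.0

# Figures the quoted article labels as frame rates.
cyberpunk = percent_drop(263.2, 205.3)
final_fantasy = percent_drop(338.6, 262.9)

# Figures the quoted article labels as watts (Final Fantasy).
final_fantasy_watts = percent_drop(338.0, 261.0)

if __name__ == "__main__":
    print(f"Cyberpunk 'fps' drop:     {cyberpunk:.1f}%")
    print(f"Final Fantasy 'fps' drop: {final_fantasy:.1f}%")
    print(f"Final Fantasy watt drop:  {final_fantasy_watts:.1f}%")
```

The two "frame rate" drops come out at about 22.0% and 22.4%, which are exactly the "power savings" the article quotes, and 338.6 and 262.9 sit right next to the 338W and 261W it gives for Final Fantasy. So those "frame rates" look like power readings.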

[–] nekusoul@lemmy.nekusoul.de 1 points 4 months ago* (last edited 4 months ago) (1 children)

I've been trying to find some better/original sources [1] [2] [3], and from what I can gather it's even worse. It's not even an upscaler of any kind; it apparently uses an NPU just to control clocks and fan speeds to reduce power draw, dropping FPS by ~10% in the process.

So yeah, I'm not really sure why they needed an NPU to figure out that running a GPU at its limit has always been wildly inefficient. Outside of getting that investor money, of course.

[–] AdrianTheFrog@lemmy.world 2 points 4 months ago

Ok, I guess it's just kinda similar to dynamic overclocking/underclocking with a dedicated NPU. I don't really see why a tiny $2 microcontroller or just the CPU can't accomplish the same task though.
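For a sense of how little logic that kind of task needs, here's a rough sketch of a clock governor. It's all assumption: the cubic power model (power roughly proportional to clock times voltage squared, with voltage scaling with clock) is a common rule of thumb and not anything PowerColor has published, and the 2500 MHz / 340 W figures and function names are made up for illustration.

```python
def toy_power_watts(clock_mhz: float,
                    max_clock_mhz: float = 2500.0,
                    max_power_watts: float = 340.0) -> float:
    """Toy model (assumption): power grows with the cube of the clock."""
    return max_power_watts * (clock_mhz / max_clock_mhz) ** 3

def govern_clock(power_budget_watts: float,
                 max_clock_mhz: float = 2500.0,
                 min_clock_mhz: float = 500.0,
                 step_mhz: float = 25.0) -> float:
    """Step the clock down from maximum until the toy model fits the power budget."""
    clock = max_clock_mhz
    while (clock > min_clock_mhz
           and toy_power_watts(clock, max_clock_mhz) > power_budget_watts):
        clock -= step_mhz
    return clock

if __name__ == "__main__":
    clock = govern_clock(power_budget_watts=265.0)
    print("clock (MHz):", clock)
    print("modelled power (W):", toy_power_watts(clock))
```

Under this toy model, cutting power by a bit over 20% costs less than 10% of clock speed, which is at least in the same ballpark as the numbers reported in this thread. A real governor would react to measured power and temperature rather than a formula, but the loop itself is still a few lines.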

[–] fuckwit_mcbumcrumble@lemmy.dbzer0.com 1 points 4 months ago (1 children)

Nvidia's tensor cores are inside the GPU; this was outside the GPU but on the same card (the PCB looked like an abomination). If I remember right, in total it used slightly less power but performed about 30% faster than normal DLSS.

[–] AdrianTheFrog@lemmy.world 1 points 4 months ago

From the articles I've found, it sounds like they're comparing it to native...