dav2d is the fastest AV2 decoder on all platforms :)
Targeted to be small, portable and very fast.
If you're out of the loop like me: AV2 is the next-generation video coding specification from the Alliance for Open Media (AOMedia). Building on the foundation of AV1, AV2 is engineered to provide superior compression efficiency, enabling high-quality video delivery at significantly lower bitrates. It is optimized for the evolving demands of streaming, broadcasting, and real-time video conferencing.
- from https://av2.aomedia.org/ and https://www.sisvel.com/insights/av2-is-coming-sisvel-is-prep...
yep
The big question is if AOMedia is going to make good on their Mutually Assured Destruction promise of using their patent and financial war chest to countersue into oblivion anyone trying to go after AV1 adopters.
Which is why they'd never sue, only threaten and try to settle.
It’s not an easy problem.
Eventually all the money and power will converge into fewer than 500, or fewer than 50, companies, and nothing will change.
Aesthetics over function; style over substance. If that's their web design policy it's likely their policy in all other aspects.
I'm also not sure that they're aware that intellectual property rights no longer exist in the US. If AV2 were vibe coded, there would be no case.
…for copyright. Not for anything else. Patents would still apply.
Oh no. Not another one. I presume this one makes lossy compression better, or faster, or both.
Otherwise it would be under a constant DDoS by the AI bots.
For instance: MCP, static sites that are easy to scale, or a cache in front of a dynamic site engine.
Our documentation and the main website are not fronted by this protection, so they're still accessible to the scrapers.
What am I missing that explains the gap between this and “constant DDoS” of the site?
Even when the volume of AI requests isn't that high - generally it's in the hundreds per second, tops, for our services combined - that's still a load that causes issues for legitimate users/developers. We've seen it grow from somewhat reasonable to pretty much 99% of the responses we serve.
Can it be solved by throwing more hardware at the problem? Sure. But it's not sustainable, and the reasonable approach in our case is to filter off the parasitic traffic.
Software written in PHP is in most cases frankly still abysmally slow and inefficient. WordPress runs something like 70% of the web and you can really feel it from the 1500ms+ TTFB most sites have. phpBB is not much better. Pathetic throughput at best, and it has not gotten better in decades now.
I don't know how GitLab became so disgustingly slow. But yeah, I'm not surprised bots can easily bring it to its knees.
The bizarre thing is that pretty much no CMS, even the "new" ones, seems to automate all of that by default. None of those steps are that difficult to implement, and provide a serious speed boost to everything from WordPress to MediaWiki in my experience, and yet the only service that seems to get close to offering it is Cloudflare.
Even then, Cloudflare's tooling only works at its best if you're already emitting minified and compressed files, with custom-written preload headers, on the origin side. The hit from decompressing all the origin traffic to make those adjustments and analyses is worse for performance than just forwarding your compressed responses directly, which is why they removed Auto Minify[1] and encourage sending pre-compressed Brotli level 11 responses from the origin[2], so people on recent browsers get pass-through compression without extra cycles being spent on Cloudflare's servers.
The solution seems pretty clear: aim to get as much stuff served statically, preferably pre-compressed, as you can. But it's still weird that actually implementing that is a manual process on most CMSes, when it shouldn't be that hard to make it a standard feature.
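To make that concrete, here's a minimal sketch of the pre-compression step in Python (the third-party brotli package, the "public" directory, and the suffix list are all illustrative assumptions, not any particular CMS's behavior):

```python
# Walk a static output directory and write a pre-compressed .br sibling
# next to each text asset, so the origin can serve it with
# Content-Encoding: br and a CDN can pass it through untouched.
import brotli  # pip install brotli
from pathlib import Path

TEXT_SUFFIXES = {".html", ".css", ".js", ".svg", ".json", ".xml"}

def precompress(root: Path) -> None:
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        data = path.read_bytes()
        # quality=11 is Brotli's maximum: slow to compress once, cheap to serve forever.
        compressed = brotli.compress(data, quality=11)
        # Only keep the .br sibling if it actually saves space.
        if len(compressed) < len(data):
            path.with_name(path.name + ".br").write_bytes(compressed)

precompress(Path("public"))  # placeholder for your static output directory
```

A server module that does static Brotli lookup (ngx_brotli's brotli_static, for example) can then hand the .br file straight to clients with zero per-request compression cost.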
And as for Git web interfaces, the correct solution is to require logins to view the complete history. Nobody likes saying it, nobody likes hearing it. But Git is not efficient enough on its own to handle the constant bombardment of random history paginations and diffs that AI crawlers seem to love. It wasn't an issue before, because old crawlers for things like search engines were smart enough to ignore those types of pages, or at least to accept it when the sysadmin says they should be ignored. AI crawlers have no limits, ignore signals from site operators, make no attempt to skip redundant content, and in general are very dumb about how they send requests. This is a large part of why Anubis works so well: it's not a particularly complex or hard-to-bypass proof-of-work system[3], but AI bots genuinely don't care about anything but consuming as many HTTP 200s as a server can return, and give up at the slightest hint of pushback (though they do at least try randomizing IPs and User-Agents, since those are effectively zero-cost to attempt).
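For anyone who hasn't looked at how these challenge walls operate, here's a toy sketch of the general proof-of-work idea in Python (the concept only; this is not Anubis's actual protocol or parameters):

```python
# Server hands out a random challenge; the client must find a nonce whose
# SHA-256 digest has a few leading zero hex digits before it gets the page.
# Trivial for one human's browser, but it adds up at crawler scale, and
# bots that never execute the challenge script fail outright.
import hashlib
import os

DIFFICULTY = 4  # leading zero hex digits; illustrative, not Anubis's setting

def issue_challenge() -> str:
    return os.urandom(16).hex()

def solve(challenge: str) -> int:
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * DIFFICULTY):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int) -> bool:
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * DIFFICULTY)

challenge = issue_challenge()
nonce = solve(challenge)         # the "work" a real client does in JS
assert verify(challenge, nonce)  # the cheap check the server does
```

Nothing here is hard for a motivated scraper to bypass, which is exactly the point: the wall works because the bots don't bother.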
[1]: https://community.cloudflare.com/t/deprecating-auto-minify/6...
[2]: https://blog.cloudflare.com/this-is-brotli-from-origin/
[3]: https://lock.cmpxchg8b.com/anubis.html but see also https://news.ycombinator.com/item?id=45787775 and then https://news.ycombinator.com/item?id=43668433 and https://news.ycombinator.com/item?id=43864108 for how it's working in the real world. Clearly Anubis actually does work, given testimonials from admins and wide deployment numbers, but that can only mean that AI scrapers aren't actually implementing effective bypass measures. Which does seem pretty in line with what I've heard about AI scrapers, summarized well in https://news.ycombinator.com/item?id=43397361: they are basically making no attempt to optimize how they crawl. The general consensus seems to be that if they were going to crawl optimally, they'd just pull down a copy of Common Crawl like every other major data-analysis project has done for the last two decades, but all the AI companies are so desperate to get just slightly more training data than their competitors that they're repeatedly crawling near-identical Git diffs just on the off-chance they reveal some slightly different permutation of text to use. This is also why open-source models have been able to almost keep pace with the state-of-the-art models coming out of the big firms: they're designing much more efficient training processes, while the big guys are throwing hardware and crawlers at the problem in the desperate hope that they can will it into an Amazon model instead of a Ben and Jerry's model[4].
[4]: https://www.joelonsoftware.com/2000/05/12/strategy-letter-i-... - still probably the single greatest blog post ever written, 26 years later.
Why logins, exactly? Who would have such logins; developers only, or anyone who signs up? I'm not sure if this is an effective long-term mitigation, or simply a “wall of minimal height” like you point out that Anubis is.
- AI scrapers will pull a bunch of docs from many sites in parallel (so instead of a human request where someone picks a single Google result, it hits a bunch of sites)
- AI will crawl the site looking for the correct answer which may hit a handful of pages
- AI sends requests in quick succession (big bursts instead of small trickle over longer time)
- Personal assistants may crawl the site repeatedly scraping everything (we saw a fair bit of this at work, they announced themselves with user agents)
- At work (b2b SaaS webapp) we also found that the personal assistant variety tended to hammer really computationally expensive data export and reporting endpoints generally without filters. While our app technically supported it, it was very inorganic traffic
That said, I don't think the solution is blanket blocks. Really it's exposing that sites are poorly optimized for emerging technology.
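As one hedged illustration of handling the burst pattern described above without blanket blocks, here's a per-client token bucket in Python (the rate, burst size, and IP keying are made up for illustration; keying on IP alone is weak against crawlers that rotate addresses, so treat this as a first layer only):

```python
# Token bucket: tolerates a short, human-sized burst of requests but
# throttles the sustained hammering typical of scraper traffic.
import time

class TokenBucket:
    def __init__(self, rate: float = 2.0, burst: float = 20.0):
        self.rate = rate        # tokens refilled per second
        self.capacity = burst   # maximum burst size
        self.tokens = burst
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

buckets: dict[str, TokenBucket] = {}

def check(client_ip: str) -> bool:
    return buckets.setdefault(client_ip, TokenBucket()).allow()
```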
I think the world gains more if the VideoLAN team focuses on their amazing, free contribution to the world than if they spend the same time trying to figure out how to save you two clicks.
We all hate that this is happening, but you don't need to attack everyone that is unfortunately caught up in it.
If you have discovered such an option, you could get very wealthy: minimizing friction for humans in e-commerce is valuable. If you're a drive-by critic not vested in the project, then yours is an instance of talk being cheap.
Keep in mind that those kinds of services:
- should not be MITMed by CDNs
- are generally run by volunteers with zero budget, money- and time-wise
I've seen several posts on HN and elsewhere showing many bots can be fingerprinted and blocked based on HTTP headers and TLS.
For the bots that perfectly match the fingerprint of an interactive browser and don't trigger rate limits, use hidden links to tarpits and zip bombs. Many of these have been discussed on HN. Here's the first one that came to mind: https://news.ycombinator.com/item?id=42725147
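As a rough sketch of the header-fingerprinting side (a toy heuristic of my own, not any particular project's rules): requests that claim a browser User-Agent but lack the companion headers real browsers always send are a cheap first-pass signal. TLS-level fingerprints like JA3/JA4 then catch what header checks can't see.

```python
# Flag requests that spoof a browser User-Agent but are missing headers
# an interactive browser would normally send alongside it.
BROWSER_COMPANION_HEADERS = ("accept-language", "accept-encoding", "sec-fetch-mode")

def looks_like_headless_bot(headers: dict[str, str]) -> bool:
    h = {k.lower(): v for k, v in headers.items()}
    claims_browser = "Mozilla/" in h.get("user-agent", "")
    missing = [name for name in BROWSER_COMPANION_HEADERS if name not in h]
    return claims_browser and bool(missing)

# A curl-style request that spoofs only the User-Agent gets flagged:
print(looks_like_headless_bot({"User-Agent": "Mozilla/5.0", "Accept": "*/*"}))  # True
```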
It is incredibly annoying, but what can you do? AI scrapers ruined the web.
That being said, so many of the plebs suck. Like 2% will ruin everything for everyone.
But whether you agree with me or not, most paradigm-shifting changes come from billionaires/corps because they are the only ones with the money to pull off massive shifts. Most innovation is not grassroots; it is heavily funded by the "elites". This is how most successful countries have been for at least the last 100 years. So billionaires add a lot of value even as they cause a lot of pain.
The solution in my mind is we absolutely need uncapped billionaires but they need to be effectively taxed (not like 90% but closer to 50%) and they have to have absolutely no influence on the government.
It's rarely been the citizens that have been the problem, but the governments and companies that seek to use the network connection for their overwhelming benefit.
Re (above):
> Not on topic, but wow the internet has very quickly devolved into: click -> "making sure you're not a bot", click -> "making sure you're a human", click -> "COOKIES COOKIES COOKIES", click -> "cloudflare something something"
Then I press the X to close the all-caps banner commanding me to install the app, upon which I get sent to the app store. Users of the website refer to it as an app.
High hardware prices, locked information sources, plenty of AI slop etc.
I hate that I can't do a curl, or automate my curls to retrieve data from the web, because I either see some Cloudflare protection or some captcha.
Information is blocked in walled gardens.
Wow, this GitLab instance looked so much cleaner/simpler and less clunky than my past experiences! It also loaded really fast, on the first page load as well as subsequent actions.
https://www.deb-multimedia.org/dists/unstable/main/binary-am...
... it says "fast and small AV1 video stream decoder"
... should probably be "AV2" ?
What's the current state of Dolby trying to attack the AV1 ecosystem (Snapchat more specifically)? I hope there is an organized fight back by AOM against these trolls.
Happy to see AV2 decoding already here.
:)
Since dav2d is newer it has a higher fraction of C, but not enough for it to be the main language in the codebase :)
Decoders are one of the best places for Rust because they are both performance-critical and security-critical.
JPEG-XL couldn't get off the ground until they recreated the decoder in Rust, since none of the browsers wanted to touch it. And the apps that did utilise the C-based libjxl ended up getting hit with vulnerabilities in it.
This is extremely misleading. Before they even started work on the Rust-based decoder, experimental JPEG XL support was added to Chrome and Firefox using the reference C++ implementation. Chrome later removed this support with very dubious claims of lack of interest and of insufficient improvement over the previous generation of codecs.
Around that time, Safari shipped JPEG XL support in production, still without the Rust implementation. So the assertion no one wanted to touch it is plain false.
It was actually Mozilla who, a long time after stating they were ambivalent on JPEG XL, brought up memory safety as a major consideration, for the very first time. That’s when the work on the Rust implementation started.
Since the format continued to be more and more supported and talked about, it’s hard to say what exactly were the factors which made Google reconsider their stance. The notion that somehow everyone was super worried about memory safety from the very beginning, and once the JXL team fixed that, everyone was happy to embrace it, seems to come up a lot lately, but it’s terribly distorted and simply not true.
Not necessarily. What’s annoying is these low-effort posts that bring nothing. In some contexts the discussion is worth having, but we can do better than "it’s bad because it’s not in my pet language" groupthink.
can't people coping about rust come up with something fresh? always the same dance:
- fake annoyance about <thing> not being written in rust (bonus points if nonsensical, like here)
- if merely insinuated, fake question about what do they mean exactly
- fake surprise about omg why are people like this, the rust community is so bad, wah wah wah
yawn
oh yeah, let's not forget the other classic:
- the rust community is so dumb for thinking <shit they don't think made up for an easy beatdown>
- yeah ikr haha so stupid
every fucking rust thread is like this, and it's unreadable. by intention of course, obviously.
but it's ai / corporations / the government ruining the internet guys! totally...
There's literally a DSL designed for this purpose (Wuffs) so it would be interesting to hear why they didn't use it.
They say it is 5% slower. That's close enough. I know they say it isn't, but in reality that speed difference is just going to be used as an excuse by the anti-Rust crowd.
Having said that I do think you could write a DSL to generate safe performant asm for a video codec. Just not a platform-independent one. It would still have to be asm.
Did you know the US constitutes about 4% of humanity? When we look at the adult age ranges likely to have ever heard of D4vd, we are probably talking considerably less than 1%.
The rest of humanity has no negative association with these four letters.
It's a recurring headline on the rolling news channels on broadcast TV right now - and it's on the front-page of Reddit for me as well.
[1] https://www.empireonline.com/movies/features/book-movie-titl...
Potentially... supposing the criminal investigation into this uncovers a hitherto unknown organ-harvesting scheme operating within the global music records industry; the subsequent police dragnet implicates a significant proportion of the world's music stars and record labels and generates continual major headlines and criminal convictions, with all their lurid details, for multiple decades from now on.
It's quite ridiculous when I put it that way, but this is basically the same thing as Epstein's network, just with a different crime; and Epstein was already in the news almost 20 years ago, with his first conviction.
...so back in 2009, when everyone was building their own social-network websites and online dating services, and supposing your real name was also Epstein and you called yours "EpsteinLoveIsland.com" - would you have changed the name back then?
So no one below the age of 60 is aware of this.
[0] highest-reaching English-language news site in March 2026 - https://pressgazette.co.uk/media-audience-and-business-data/...
[1] >400M visits weekly - https://www.bbc.co.uk/mediacentre/2025/bbc-response-to-globa...
Why did you feel the need to explicitly specify that you're white as one of the reasons you didn't hear the news?
I'm not American either, but the news is all over social media platforms like Reddit and Twitter; it's hard to turn a blind eye to it.
dav1d - started in 2018
d4vd - started composing in 2021
One day in the mysterious future the AV3 decoder will be dav3d.
https://en.wikipedia.org/wiki/Dangerous_Dave_in_the_Haunted_...
I wonder, if Rust had an effects system, whether a Jasmin MIR transform (i.e. what SPIR-V is for shaders) would be useful?
C compilers, Rust compilers, and assemblers are all deterministic.
Modern compilers are extremely clever and will produce machine code that takes full advantage of modern CPU branch predictors, and reorder instructions to better take advantage of pipelining. This in itself will make the same code run at different speeds depending on the input data.
Then there is the whole issue of compiler version roulette. As a developer you have no idea which compiler versions your users and distros will use, and what new and wonderful optimisations they will bring.
Within a version, yes, but not cross version. Different versions of GCC/Clang etc can give you completely different code.
Compare the number of CVEs against x264 (included decoders don't count!) and FFmpeg's H.264 decoder.
There's other memory-safe languages, and there's formal verification.
e.g. seL4 favors Pancake.
Really? Just curious: how many codecs have your neighbors contributed money toward the development of?
However, for the container/extractor... those should absolutely be in a memory-safe language, and that is where a lot of the exploits/crashes are, too, as metadata is fuzzier.
As a practical example of this, see something like CrabbyAVIF. All the parser code is Rust, but it delegates to dav1d for the actual codec portion.