To me the small web is any little website that was created to be interesting rather than to sell me something. That includes stuff like neocities, "shrine" type sites, single purpose sites, fandom portals, web experiments, etc.
Unfortunately Kagi's definition of "small web" is: blog or webcomic. You must have an RSS feed and it must have recent posts. That rules out so much interesting stuff that I don't see the point.
I'm a heavy Kagi user and the idea behind the small web was appealing, but how it's implemented doesn't click with me.
Their rules exclude an absolute gem like https://www.sheldonbrown.com/ which is, to me, the essence of what we could call the "small web".
Each time the topic pops up, I try a few random ones and have never found anything interesting.
There are also novelties like https://www.howmanypeopleareinspacerightnow.com/; this probably hasn't been updated in a decade, but that makes it no less interesting.
Then there's exceptionally cool demos like https://thelongestyard.link/q3a-demo/. This sort of thing just doesn't fit in a "blog" format unless you're writing a blog about how you built it and linking out to it.
If anyone knows of a directory of sites like these (preferably with a shuffle option) I'd love to hear about it (and contribute)!
Tl;dr: I feel like the long-tail web (90s) was better, but economics pushed high-update-frequency more-centralized results.
The criteria are simple: human-written (as much as I can validate myself), in English (for now), with a valid RSS feed, and not a micro-blog (so, more than just a feed of links or short tweet-like messages).
Similar to Kagi's Small Web viewer, or StumbleUpon-style viewer: you can get a random listing of blogs [1] or a random listing of posts from all blogs [2]. Feeds and posts are indexed, so full-text search works across all blogs. When possible and permitted by robots.txt, text is scraped for searching, so even if some text is omitted in the RSS feed by the author, search should work.
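The mechanical part of those criteria is easy to picture. A minimal sketch of such a feed filter (the function name and the 280-character micro-blog threshold are invented for illustration; this is not minifeed's actual code):

```python
import xml.etree.ElementTree as ET

def looks_like_a_blog(rss_xml: str, min_post_chars: int = 280) -> bool:
    """Rough filter: a valid RSS feed whose entries are longer than a
    tweet-sized micro-post. Thresholds here are hypothetical."""
    try:
        root = ET.fromstring(rss_xml)
    except ET.ParseError:
        return False  # not a valid feed at all
    items = root.findall("./channel/item")
    if not items:
        return False
    # Reject feeds that are just links or tweet-length blurbs.
    bodies = [(item.findtext("description") or "") for item in items]
    return any(len(body) >= min_post_chars for body in bodies)
```

The "human-written" and "English" checks are the hard part, of course; this only covers the structural rules.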
Though I do plan to implement a similar "view one random post at source" kind of view, soon.
UPD: Feel free to submit a blog, including your own! [3]
[1] https://minifeed.net/blogs/by/random
This topic has come up on here before, with others doing the complaining. Here's a list I liked for this reason, since it wasn't top-loaded with tech: https://news.ycombinator.com/item?id=47015676
https://github.com/kagisearch/smallweb/blob/main/smallweb.tx...
There is also Small Comic:
https://kagi.com/smallweb/?comic
https://github.com/kagisearch/smallweb/blob/main/smallcomic....
And Small YouTube:
https://github.com/kagisearch/smallweb/blob/main/smallyt.txt
https://hcker.news/?smallweb=true
https://news.ycombinator.com/item?id=46618714 (Ask HN: Share your personal website, 2414 comments)
Jokes aside, it's really nice and I can totally see it becoming addictive. Kudos to the Kagi team for another user-oriented product. (As a side note, I use Kagi daily and I didn't know about this tool.)
I remember when you could half-remember a comment from a website, type that into Google, and get taken to the article you were looking for. That was back in like 2010. To me that's the old, and useful, search engine that I want.
Is Kagi still better than Google? Probably, I don't really know because I don't use Google anymore. But at this point I feel like I'm with them out of inertia more than being an avid supporter. One of these days I'll re-evaluate Google and decide whether to switch back or not.
It does occasionally surface interesting results from small sites that you wouldn't get on Google. I do find that to be useful.
Kagi definitely isn't a bad search engine by any means. Honestly, if you haven't used it, try the 100-search free trial on one device. Maybe you'll like it. This feels more like a general decline of the open web.
"Better than Google" and the fact that I can choose websites to exclude from my search results are two features that I remain willing to pay for, however.
The exact same search on Kagi ('best llm for coding') nets Reddit, Hacker News, and some other forum results right at the top, followed by a long, dense list of links to various sites (including some of the same blogspam, of course), but overall the results are hugely more rich and varied, and also not at all the same.
How can you possibly say that a site that gives you 50% ads and a bunch of low quality links is remotely "only a little better" than a site that gives you zero ads and a huge number of better quality links?
I honestly would love to be able to give my Kagi key to the ChatGPT or Claude clients (or more realistically, configure a proxy) just to have it be their primary tool for searches—respecting my site rankings/lists
On that note, Kagi research is legit amazing. There have been times I've spent 30 minutes searching for something without success. As a last resort I asked Kagi research and it found why I couldn't. More than one option, even. Now I intend to use it almost more than normal search.
I'm thinking of trying out Kagi, but adding another monthly commitment is what's holding me back.
A single credit top-up and occasional usage until the credits run out sounds good to me.
Also, from the Kagi privacy pass FAQ at https://blog.kagi.com/kagi-privacy-pass#faq:
*Do you plan to allow purchasing privacy pass tokens without having an account?*
Yes, this makes sense. This is possible because technically the extension does not care if you have an account or not. It just needs to be 'loaded' with valid tokens. And you can imagine a mechanism where you could also anonymously purchase them, eg. with monero, without ever creating an account at Kagi. Let us know *here* ( https://kagifeedback.org/d/6163-kagi-privacy-pass ) if you are excited about this, as it will help prioritize it.
Personally I don't like being signed in during searches; this seems like a good solution.

Feels like your comment saying it was too much effort to cancel Kagi took more effort than cancelling Kagi.
If you don't use the service in a month, they just refund you. This has kept me from unsubscribing for years now. Some months I use it, some I don't.
It's more of a hassle to unsub, and re-sub again when I want.
Querying for something like "snowflake json from variant?" in both engines: in Google I get a sort-of-right-but-not-really-helpful AI summary about the "parse_json" function. In Kagi I get an actually useful summary with code examples of parse_json, plus the colon-based syntax for accessing values inside nested objects without needing to parse anything.
I very rarely need to go into a page, I use Kagi quick search summary with the "?" suffix and it almost always gives me a useful answer in one-shot.
Second, if you want this kind of LLM-digested search result, Google AI Studio blows everything out of the water (including Google search, obviously).
[0] I've never bought into the idea that old Google was so much better. But it seems to be a very popular opinion on HN. ymmv.
I see a problem with this.
Were the models underlying those features trained on all available web content, or are they unlike any other enterprise models out there?
At any rate, you should see a bigger problem in what Google does, which you don't seem to.
"I remember when you could half-remember a comment from a website, type that into Google, and get taken to the article you were looking for"
It's funny to me that (to my knowledge) no mainstream browser implements this functionality yet. Seems like a no-brainer to index what the user has actually seen... (It could even be restricted based on viewport; I don't think it's that crazy of an idea.)
I know there are a number of third-party programs that do this, though. Of course, multi-device being the norm complicates things.
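The core of such a feature is genuinely small. A toy sketch of a local "pages I've seen" index using SQLite's built-in FTS5 module (the function names are hypothetical; a real browser would hook the render pipeline and deal with multi-device sync):

```python
import sqlite3

# Store the text a page rendered, then find the page again later from a
# half-remembered phrase. SQLite's FTS5 full-text module does the heavy
# lifting in a few lines.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE seen USING fts5(url, body)")

def record_page(url: str, visible_text: str) -> None:
    """Index the text that was actually rendered for a visited page."""
    db.execute("INSERT INTO seen VALUES (?, ?)", (url, visible_text))

def recall(phrase: str) -> list[str]:
    """Return visited URLs whose body contained the exact phrase."""
    rows = db.execute(
        "SELECT url FROM seen WHERE seen MATCH ? ORDER BY rank",
        (f'"{phrase}"',),  # double quotes = phrase query in FTS5
    )
    return [url for (url,) in rows]
```

Restricting to viewport-visible text, as suggested above, would just change what gets passed to `record_page`.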
The answer to this is complicated.
Both Google Chrome and Microsoft Edge actually implement this. Behind the scenes, both will upload your browser history to the cloud. You can see it in network packet captures. It's implemented in the browser for the vendor, but not for the user.
The choice to not implement this for the user is very deliberate. It's contrary to the vendor's interests if the browser provides this capability directly to users. If a user's browser can take you to a website directly, then you are not using the vendor's search engine, meaning you are not looking at their ads, paid search results, algorithm, etc. It would severely impact their business model.
This is also the reason why browsers have:
- Adopted Google Chrome's "Omnibar" instead of a separate address bar and search bar.
- Implemented only basic hierarchical organization for browser Favorites.
Directly and indirectly, Google is the central nexus of all modern browsers. Aside from Google Chrome, they also:
- Fund the vast majority of Firefox.
- Pay Apple for preferential treatment.
- Provide the same mechanisms to vendors who base their browsers on Chromium (i.e., Microsoft Edge, Brave).
I would love for this to not be the case. There is hope to be found in small independent browser and search companies/projects.
On the other hand, the additional tools in the Omnibar (the calculator is the example most should be familiar with) make the bar incredibly useful for random daily tasks. Also, it seems that there is an "omnibox" API that extensions can use, which allows them to add their own tools to the omnibar/omnibox. It would be interesting as a form of "assistant" in a way.
I'm fairly certain I've caught Firefox doing something similar (regularly sending multiple tens of MB to Google servers in the background.)
And, of course, Firefox is open source and this wouldn’t be kept a secret.
I've read all the Mozilla help pages about what automatic connections Firefox makes and it wasn't accounted for there (unless possibly something to do with SafeBrowsing.)
I wonder if the EU could fine them a couple weeks of revenue for this. Seems illegal.
Citation needed... (I'm talking about the page *content*, not the metadata like url and title)
Did even Microsoft try something like this? It's of course something you'd only want running locally
Which company would you trust with this kind of deep surveillance information on you though?
I guess because it isn't then trivial for a web browser to do, indexing every text ever rendered?
One of the reasons I love Kagi is that it respects double-quotes for exact matches. This might seem trivial except I remember being frustrated with both Google and DDG years ago for throwing irrelevant results at me even when I'm querying for an exact match. When Kagi was in beta and I got invited as an early adopter, my feedback to them was that I want a search engine that won't throw crap at me when I'm looking for an exact string match. They've honored that feedback! Even though Kagi doesn't necessarily have the most results, I want double-quotes and things like intitle to actually work as expected.
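The behaviour being asked for here is simple to state precisely. A rough sketch of what "respecting double-quotes" means as a post-filter over candidate results (illustrative only, not Kagi's implementation):

```python
import re

def split_query(query: str) -> tuple[list[str], list[str]]:
    """Split a query into exact phrases ("...") and loose terms."""
    phrases = re.findall(r'"([^"]+)"', query)
    loose = re.sub(r'"[^"]*"', " ", query).split()
    return phrases, loose

def respects_quotes(doc: str, query: str) -> bool:
    """Keep a result only if every quoted phrase literally appears in it.
    Loose terms may still be fuzzy-matched; quoted ones may not."""
    phrases, _ = split_query(query)
    text = doc.lower()
    return all(phrase.lower() in text for phrase in phrases)
```

The complaint about Google and DDG amounts to them treating quoted phrases as loose terms anyway, i.e. skipping the `respects_quotes` step.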
Another awesome thing about Kagi is how it lets you prioritize certain domain names. Likewise, it's great for blocking domains completely. All of this has made my search results very clean.
To each their own. I'm not saying you're wrong, but to me there's no comparison between Kagi's results and every alternative I've tried.
Oh, another thing I like about Kagi is that it's less censored than Google, Bing, and DDG these days. I used to be a fan of DDG until I noticed that results were sparse or nonexistent for anything even remotely controversial I queried. It became too PG-rated.
The assistant is a nice addition, but its search is superior for me.
Kagi made search feel just "right": it was simple, got the job done, and had some really simple but cool search features.
But over time they started doing way too much, and I kept seeing more and more features that I really didn't want. It felt like I was paying for all this while I just wanted to type something on to a text box and click search and see a bunch of results organized according to my filters.
I wish they would just dump all the other nonsense projects like ai and just focus on search only. Or give me an option to pay for search only without any limits.
Are you really saying that a company specializing in search - natural language oriented at its core - should not make use of the biggest technological revolution for processing natural language?
Perhaps the best we can do is this "small web" thing, which can be seen as a sort of retrofuturistic solution. But of course the siloed internet is a black hole of content and effort, and of course if the small web gets enough traction, astroturfed generative AI content will target it.
I've been using Kagi and it's fine. Better than DDG for sure. But sometimes I still go back to Google to find something Kagi is struggling with.
Also on Kagi if you see bad results, you can flag the website to ignore it.
Kagi value proposition for me is not the $5 search but the $10 search plus whatever AI chat model you want (I originally did ultimate when I used it for coding). Controllable search and chat satisfies all my one-shot needs.
I can't really blame Kagi for the web getting bad or for the weak market for secondary search. Part of me wonders if they could use the AI search tools now on the market (now getting lots of investment) instead of the human indexes (subject to monopoly control).
It really feels either intentional or egregiously incompetent.
[1]: [The Man Who Killed Google Search](https://www.wheresyoured.at/the-men-who-killed-google/)
The reason that Google is not like it was back in the day is that they are fighting a massive, antagonistic industry designed to game Google. The reason that ChatGPT et al. improve on search is that there's an effective but very expensive compute layer on top, not that they are better at the Google game. (This extra layer works out fine because our time is more valuable, and Google always came at an insane discount, thanks in part to ads.)
I've had a few experiences now where someone is standing over my shoulder asking me to look something up, and I search kagi, find nothing, then search google and find what they asked me to look up. Then when they ask "what was that other search engine you used first?" I don't feel compelled to vouch for kagi :(.
Is that even possible today, considering there is so much more information and so many more pages around than in 2010? Old Google worked with the old Internet. The old Internet does not exist.
I'm using Qwant now and I feel it's better.
And yes, Google's founders were right that web ads would kill that experience you want.
The main usecase for Kagi is the fact that you can personally uprank/downrank/pin/block sites. And it has a bunch of creature comforts built in like:
- Attempting to detect AI slop, concatenating listicles ("10 best ...") under one search result header
- Attempting to block translated Reddit results
- Custom lenses that search only coding resources or recipes or whatnot
- Redirects (so x.com > xcancel.com), although I feel this should be a browser feature
- Better translate than Google
There's probably a few things I'm forgetting.
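The uprank/downrank/block and redirect features described above can be pictured as a simple re-ranking pass over results. A hedged sketch (the weights, domains, and function names are invented for illustration, not Kagi's actual mechanism):

```python
from urllib.parse import urlparse

# Hypothetical per-domain preferences: >1.0 upranks, <1.0 downranks,
# 0.0 blocks the domain entirely. A pinned site would just get a very
# large weight.
DOMAIN_WEIGHTS = {"en.wikipedia.org": 2.0, "blogspam.example": 0.0}
# Host rewrites in the spirit of the x.com -> xcancel.com redirect.
REDIRECTS = {"x.com": "xcancel.com"}

def rerank(results: list[tuple[str, float]]) -> list[str]:
    """Apply per-domain boosts/blocks to (url, score) pairs, rewrite
    redirected hosts, and return URLs in the new order."""
    kept = []
    for url, score in results:
        host = urlparse(url).netloc
        weight = DOMAIN_WEIGHTS.get(host, 1.0)
        if weight == 0.0:  # blocked domain: drop the result entirely
            continue
        new_host = REDIRECTS.get(host, host)
        kept.append((url.replace(host, new_host, 1), score * weight))
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [url for url, _ in kept]
```

Doing the redirect at this layer (rather than in the browser, as suggested above) means it only affects search results, not links followed from elsewhere.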
Kagi is abysmal at image search though. Just assume you will have to use Google for that.
It's interesting to hear that you can't find what you wanted easily on Kagi.
I don't even use the AI assistant much, only when there are a lot of disjointed search results and I want a quick summary.
Could just be that I’m familiar enough with google to always be able to make it work for me, could be a frog in boiling water type situation, but… as much as Kagi gets talked up on HN, I was pretty disappointed when I tried it. I was ready to get blown away, and instead I was underwhelmed.
It refreshes every 5 hours and shows you the most recent blogs published on Kagi. Check it out!
https://kagi.com/smallweb/?url=https://pliutau.com/reading-l...
> This page is auto-generated from Github Actions workflow that runs every day at night and fetches the 5 latest articles from each of my favorite blogs.
Also, if they are clever, they could somehow use this for the translation system they are using (but please let us select our own language without forcing automatic translation like YouTube does).
For me it says I'm blocked due to hitting a "secondary" rate limit (don't understand what that means). I don't think I've opened a page on github yet today so clearly it's a lie. Is it the referer that triggers this?
In general, freeloading the "small web" on a Microsoft service is kind of ironic. Being blocked by algorithms that try to detect if you're really human is precisely one of the things one would hope to get away from by using small, personal websites
No scrapers running on my IP address btw, at least not since it was assigned to me ~10 hours ago (I'm in one of those countries where ISPs seem to have agreed amongst each other that IP addresses must change daily so you can't reliably host things)
No browser prevents that by default, but this tip is found in pretty much every "best practices" hosting tutorial, so it's very common to stumble upon that browser error in the wild.
Previous posts: https://news.ycombinator.com/item?id=37420281 (7 Sep 2023, 185 comments) and https://news.ycombinator.com/item?id=39476015 (23 Feb 2024, 36 comments).
There are a surprising number out there: https://blog.woblick.dev/en/2025/best-stumbleupon-alternativ...
Newsletter version if you prefer: https://randomdailyurls.com
Weird times. People are training their LLMs on my content yet people are still interested in technical content written by a human being. So I guess you just keep writing, right? I find it disheartening to know I'm training LLMs but I think I'm more encouraged knowing there are still humans reading it.
Curious what goes on behind the Next Post and Show Similar buttons.
Perhaps I'm yelling into the void here, but what would be great is when first landing at kagi.com/smallweb, the url query parameter would be somehow set, as it is when "Next Post" is clicked.
In any case, my Kagi search for the article containing the memorable phrase "rare as rocking-horse s*t" came up empty. Perhaps it's not yet been indexed.
Curious if there are any statistics on which topics people are writing about.
The worst case scenario is that AI runs everything, we have no skills, and are completely dependent on it...and it shows us crummy commercials and subtly steers us to paid placement with no recourse whatsoever. I hate this possible future, but this is where the money will lead.
> "It would hurt consumers, and we'd have to think about what we'd have to do about that, but that's really the last of our concerns right now."
I don't think it's Kagi's fault, but I guess it's depressing in a way. A lot of "small web" bloggers dream of being a part of the "big web", and when they get a cheat button, they have no second thoughts about mashing it.
:-)
You can choose similar sites by index.
But what the criteria are to have your site listed here, how it will prevent this from just becoming a massive gamified advertising index, or anything else about "why these?" is not obvious to me.
Can anyone explain what is special about these sites specifically, or where this project is going?
Quite possible that people will come up with a solution eventually. Like Samizdat was a solution to censorship and a broken publishing system in USSR.