the wildest part is algolia just not responding. you email them saying "hey 39 of your customers have admin keys in their frontend" and they ghost you? that's way worse than the keys themselves imo. like the whole point of docsearch is they manage the crawling FOR you, but then the "run your own crawler" docs basically hand you a footgun with zero guardrails. they could just... not issue admin-scoped keys through that flow
If this happens so often, perhaps Algolia should improve their stuff to prevent this? For example, by implementing a dedicated search endpoint that doesn't accept normal API keys, but only dedicated read-only keys.
Thanks for this. I was maybe using one of these keys until this morning. When I logged in at dashboard.algolia.com and went to Settings -> API Keys, I found that none of the keys (Search, Analytics, Usage, Monitoring) matched the key I was using on a frontend. I made a decent attempt looking for that old key anywhere in their admin panels and could not find it. poof!
So perhaps at some point, they were only giving admin keys (because I don't remember there being a choice; and I would think given the choice I'd make the right one) and when called out (or sometime prior) realized the problem and made a new Settings -> API Keys page. Currently on the page the first one listed is the Search Key, with the subtext "This is the public API key which can be safely used in your frontend code. This key is usable for search queries and it's also able to list the indices you've got access to."
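For anyone who wants to sanity-check a key they have already shipped, here is a rough sketch. It assumes you have the key's ACL list in hand (however you obtain it), and it treats only `search` and `listIndexes` as frontend-safe, based on the dashboard subtext quoted above. The helper names `risky_acls` and `is_frontend_safe` are made up for illustration, not part of any Algolia client.

```python
# Assumption: a key is fine for frontend use only if its ACL is limited to
# searching and listing indices, per the Search Key description quoted above.
FRONTEND_SAFE_ACLS = {"search", "listIndexes"}


def risky_acls(acl):
    """Return the sorted ACL entries that go beyond the frontend-safe set."""
    return sorted(set(acl) - FRONTEND_SAFE_ACLS)


def is_frontend_safe(acl):
    """True if the ACL is non-empty and contains only frontend-safe entries."""
    return bool(acl) and not risky_acls(acl)


if __name__ == "__main__":
    # Hypothetical ACL lists, for illustration only.
    print(is_frontend_safe(["search", "listIndexes"]))
    print(risky_acls(["search", "addObject", "deleteIndex"]))
```

An empty ACL is treated as unsafe here on purpose: if you can't tell what a key does, don't ship it.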
Having a search and having a functional search are two very different things though. To this day, the search on many sites is so bad that it's actually better to use a search engine and scope by site rather than use the site search.
Algolia really needs to make using the admin key less easy. I’ve almost copied it before when setting up a frontend. It should be tucked away and require auth to view.
This highlights a systemic problem: developers often prioritize speed of integration over security hygiene, especially when dealing with third-party services. The tradeoff is acceptable until it isn't. We need better tooling to automatically detect and flag these types of exposures before they make it to production.
Man, talk about unnecessary graphs... ok graph 2 is maybe tolerable, although it's showing the popularity of the projects, not a metric of how many errors/vulnerabilities found in those projects.
I'm not a newspaper editor, but I think if this was an article for one, they'd also say the graphs are unnecessary. It smells of "I need some visual stuff to make this text interesting"...
Dude there’s only three graphs in there. Do they really bother you that much? The third may be a bit unnecessary but I think the visuals add to the post.
If you’re “helping a kid” then I guess I can help you. Help is criticism delivered with a constructive tone. Criticism can be helpful if you look past the tone.
Fully agreed; this is something that always baffles me when it's misunderstood so often. Regardless of whether it's logical or not, tone and attitude in practice do influence whether people are convinced by something, so if your goal is to actually change how someone else acts, you will not be as effective if you don't care about how you come across. Being right is not always enough, so even if the style of communicating doesn't seem like it "should" matter, in practice it genuinely does if success is measured by whether the change happens or not.
Of course, if the goal is just to be right rather than to convince someone else about what's right, how you're saying something doesn't matter, but at that point you've already reached the goal before you started talking to them, so it's worth reexamining what you're actually looking to get out of a conversation at that point.
I liked the graphs. When skimming posts I often stop on graphical elements and decide if I want to understand the context or continue skimming. In this context, all three graphs were useful for me.
Posts with just text are dense and just not nice to read. That's why even text-only blog posts have a tendency to include a loosely related image at the top, to catch the reader's eye.
It's Friday night / Saturday morning. Who wants to be reading text?
Especially on night mode themes.
Besides, can we read anymore? In the age of 'GPT, summarise it for me' attention spans, when glib commentary that isn't about the content of the article is all many people have to add, perhaps liberal application of visualisations adds digestive value.
Have to agree with _pdp_ on this one. I just don't see the need for an LLM agent to do a recursive grep for API keys in public repos.
Not saying people shouldn't build these tools, but the use case is lost on me.
It feels like the industry is in this weird phase of trying to replace 30-year-old, perfectly optimized shell utilities with multi-shot agent workflows that literally cost money to run. A basic Python script with a regex matcher and the GitHub API will find these keys faster, cheaper, and more reliably.
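To make that concrete, a minimal sketch of the regex half, with no GitHub API calls. It assumes the keys of interest look like 32-character lowercase hex strings, and it only reports a match when an Algolia-ish word appears nearby, so that random MD5 hashes don't flood the results. The pattern, the context words, and the `find_candidate_keys` name are all my own assumptions, not anything from the article.

```python
import re

# Assumption: candidate keys are 32 lowercase hex characters.
KEY_RE = re.compile(r"\b[a-f0-9]{32}\b")
# Assumption: these nearby words suggest the hex string is an Algolia key.
CONTEXT_RE = re.compile(r"algolia|apiKey|api_key|docsearch", re.IGNORECASE)


def find_candidate_keys(text, window=80):
    """Return hex strings that look like keys and sit near an Algolia-ish word."""
    hits = []
    for match in KEY_RE.finditer(text):
        # Look at a slice of text around the match for a context hint.
        start = max(0, match.start() - window)
        context = text[start:match.end() + window]
        if CONTEXT_RE.search(context):
            hits.append(match.group(0))
    return hits


if __name__ == "__main__":
    # Made-up snippet with a dummy key, for illustration only.
    sample = 'docsearch({ apiKey: "0123456789abcdef0123456789abcdef" })'
    print(find_candidate_keys(sample))
```

A hit only tells you the string looks like a key. Whether it's a search-only key or an admin key still needs a separate check.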
Automating these sweeps works fine until you need to escalate beyond public misconfig and start hitting rate limits or WAF traps. At that point, blending in gets harder than it looks. If you focus on fast key discovery, expect a lot of false positives unless you build context awareness for the apps those keys unlock; otherwise you just end up chasing useless tokens all day.
Great write up. Reminder that if you commit these to a GitHub Gist and the provider partners with GitHub for secrets scanning, they’ll rapidly be invalidated.
"If the secrets issuer partners with X-corp for secret scanning so that secrets get invalidated when you X them, then when you X them the secrets will be invalidated".
? Yes? Toomuchtodo is reminding the author (and other commenters), that github gists are one way to make sure secrets are secured / remediated before making a public post like this. Maybe not the most responsible whitehat action, but I can see it being useful in some cases where outreach is impractical / has failed.
Unfortunately, it doesn't look like Algolia has implemented this
I'm not following this at all. It seems like OP is saying if you share a secret in your (private?) gist and give Algolia permission to read the gist, they will invalidate it. But why would the secret be in a gist and not a repo? Also if you're aware enough to add that partner it seems you're aware to not do dumb things like that in the first place.
If you find an exposed token in the wild, for a service supported by GitHub Secret Scanning, uploading it to a Gist will either immediately revoke it or notify the owner.
Yes, and in the real world where Grice's Maxim of Relevance is in force, then when the secrets issuer that is the subject of the discussion isn't one of those partners, then an informative "reminder" that GitHub "has a secret scanning program" with a bunch of other partners is not actually informative. It's as superfluous and unhelpful as calling to let someone know you're not interested in the item they've posted for sale on Craigslist (<https://www.youtube.com/watch?v=xWG3jKzKcm8>).
Yes it is. Reminding somebody of this feature is useful to somebody, even if it's not completely relevant to the topic being discussed. Calling out a supposed tautology is the opposite of useful: it helps nobody and just clutters things up.