According to the many-worlds interpretation of quantum mechanics, there's bound to be one branch of universe where every UUID is the same. Can you imagine what those guys are thinking?
I'm using 16b55183-1697-496e-bc8a-854eb9aae0f3 and probably some more too.
I suppose if we all post our list here, then we can all check for duplicates?
We should all send our already-generated UUIDs to a shared database; we could just put it on Supabase with a shared username/password posted on HN, so we can all verify, after generating a UUIDv4 locally, that it's not used by anyone else. If it's in the database, we know it's taken.
It's a super simple mechanism: check the common worldwide UUID database, and if it's not in there, you can use it. Perhaps if we use a START TRANSACTION, we could ensure it's not taken as we insert. But that's all easy; I'll ask Claude to wire it up, no problem.
Funny story no one will believe, but it’s true. A good friend of mine joined a startup as CTO 10 years ago, high growth phase, maybe 200 devs… In his first week he discovered the company had a microservice for generating new UUIDs. One endpoint with its own dedicated team of 3 engineers …including a database guy (the plot thickens). Other teams were instructed to call this service every time they needed a new ‘safe’ UUID. My pal asked wtf. It turned out this service had its own DB to store every previously issued UUID. Requests were handled as follows: it would generate a UUID, then ‘validate’ it by checking its own database to ensure the newly generated UUID didn’t match any previously generated UUIDs, then insert it, then return it to the client. Peace of mind I guess. The team had its own kanban board and sprints.
Although incredibly rare, it's not impossible, so it's probably best to just plan for collisions. A simple retry should suffice. But I agree, I feel like something is going on somewhere else ...
The math says no. UUID v4 has 122 bits of randomness, so collision probability for 15K records is N²/(2·2^122) ≈ 2·10^-29. That's somewhere around "fewer collisions per universe lifetime than atoms in your liver." Whatever you're seeing, the culprit is overwhelmingly somewhere else.
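That back-of-the-envelope figure is easy to sanity-check with the standard birthday-bound approximation, in a few lines of plain Node:

```javascript
// Birthday-bound approximation: probability of at least one collision
// among n values drawn uniformly from a space of 2^bits possibilities.
// P(collision) ≈ n² / (2 · 2^bits), valid while the result is << 1.
function collisionProbability(n, bits) {
  return (n * n) / (2 * 2 ** bits);
}

// UUIDv4 has 122 random bits; 15K records:
console.log(collisionProbability(15_000, 122)); // ≈ 2.1e-29
```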
Things to check, in descending order of how likely they actually are:
1. Data import / migration / backup restore, perhaps? Did anyone load a CSV, run a seed script, restore a snapshot, or copy rows between environments at any point in the last year? This is what "duplicate UUID" is in 99% of cases. Check git on migrations, ops history on the DB, and ask anyone who might have been moving data around.
2. Application retry / rollback bug maybe? Code path that generates a UUID, attempts insert, fails on constraint violation, retries with the same UUID variable still in scope. Check whether UUID generation lives inside or outside the retry boundary.
3. Older versions of the uuid package in certain bundler environments would fall back to Math.random() instead of crypto.getRandomValues(). What version are you on? Anything <4.x is suspect; modern v8+/v9+ uses crypto everywhere correctly.
4. Could also be a process fork bug. If a UUID generator runs in a child process spawned from a parent that already used the PRNG, the entropy state can get copied. Rare in Node specifically, more historical in old Python/Ruby setups.
If you've ruled all of those out and the row really was generated independently a year apart via crypto.getRandomValues, go buy a lottery ticket. But it's almost certainly cause #1.
Statistically speaking, does extremely unlikely mean impossible? If it were replicable I'd raise my eyebrow, otherwise it's fair game, no?
As someone who enjoys the interminable complaints about RNG in the video game scene, I would never trust any human's rationalization of random outcomes.
> Statistically speaking, does extremely unlikely mean impossible?
No, it means extremely unlikely. Collisions can occur, as op just found out, but the chances are so abysmally small that most people don't care.
Any application I have worked on, I always had a pre-save check to see if the UUID was already present and generate a new one if it was. Don't think it ever triggered unless a bug was introduced somewhere but good practice anyway.
I did not. Post-conditioned by your comment and the other one, I can see some signs, such as the attempt to be unusually comprehensive. The 'atoms in your liver' could be an awkward human trying to be poetic about scale.
I still don't see idiomatic markers of AI so that's scary if your claim is correct.
Interestingly enough, I skipped it when scrolling through the comments the first time. I think I instinctively do that with most karma-whoring comments, whether manual or LLM-generated.
Only noticed it because I did another pass and saw the replies talking about "AI".
Reminds me of some code I saw running in production. Every time we added a new entry, we were pulling all the UUIDs from this table, generating a new UUID, and checking for collisions up to 10 times.
The only guess I have is that we originally generated UUIDv4s on a user's phone before sending them to the database, and the UUID generated this morning that collided was created on an Ubuntu server.
I don't fully know how UUIDv4s are generated and what (if anything) about the machine they're generated on is part of the algorithm, but that's really the only change I can think of: it used to be generated on-device by users, and for many months now it has been generated on the server.
user-generated (as in: on the user's phone) was only at the very early stages of this product, and we've since moved to on-server. It's a cash-register type of app, where the same invoice must not be stored twice. So we used to generate a fresh invoice_id (uuidv4) on the user's device for each new invoice, and a double-send of that would automatically be flagged server-side (same id twice). This has since moved on to a server-only mechanism.
The database flagged it simply by having a UNIQUE key on the invoice_id column. First entry was from 2025, second entry from today.
A UUIDv4 collision is statistically extremely unlikely. What is more likely is that both systems used the same seed. A seed might be just a handful of bytes, which would shrink the effective space and raise the chance of collision to one in billions or even millions.
I've always looked at it the other way: being that lucky would mean you have even less chance of something else lucky happening, so it's a good time to save your money.
Would UUIDv7 be more collision-proof? Hard to say. It takes time into account, but that reduces the number of entropy bits, so UUIDs generated at exactly the same instant draw from a much smaller random space and could therefore collide more easily.
Just a stupid question, but why not append the date, even just seconds as hex? It's only a few bytes and would guarantee that anything unique now stays unique in the future.
You can just use a different UUID version which includes timestamp data instead (e.g. v1 or v7); there are also versions which include the MAC address.
yeah, any sort of additional semi-random data could've helped prevent this, I'm sure. That, however, is also kind of the idea of UUIDv4: it already has lots of randomness built in (though no time component — that's v1/v7 territory).
The chance of a UUIDv4 collision is very low, but it is never zero.
If everything is done properly, then this is very likely the one and only time anyone involved in the telling or reading of this account will ever experience this.