I don't think the deal described here is even that egregious. It's basically a labeled data scrape. Any entity capable of training these LLMs is able to do this.
That said, I do believe there ought to be more restrictions on private use of these technologies.
A private company can 100% do this in many ways. They already do this by putting up and using their technology in minority areas, for example.
We should ban the government from accessing data gathered by private companies by default, perhaps. I need to mull on it.
The government shouldn't be able to buy data that would be unconstitutional or unlawful for them to gather themselves.
On the other hand if a company is just aggregating something benign like weather data, there's no need to bar the government from buying that instead of building it themselves.
Now that sounds like a good argument to make in court! How do we do it?
That is trickier to decide on and surely there's room to debate.
What specific legal measures you'd use to enforce this, I don't know; there's some room for debate there.
I don't think tying the hands of the government is a viable solution. The sensitive data needs to not be collected in the first place via technical and social solutions, as well as legislation to impose costs on data collection.
- Teaching that "the cloud is just someone else's computer"
- E2EE cloud
- Some way of sharing things that don't involve pushing them to the whole internet, like Signal's stories.
- GDPR type legislation which allows deleting, opting out, etc
This isn't actually true (it varies by type of "cloud data", like content vs metadata, and by the circuit you're in), and there are multiple recent carveouts (e.g. geofence warrants) which suggest that, when the Supreme Court bothers to look at it again, they don't feel it's as clear as it was decades ago. Congress can also just go ahead at any time and make it clear they don't like it (see the Stored Communications Act).
It's also, just to be clear, an invented doctrine, and absolutely not in the constitution like the fourth amendment is. Don't cede the principle just because it has a name. Technical and social solutions are good, but we should not tolerate our government acting as it does.
Neither is there an expectation that automation would slurp it up and build a database on you and everyone else. Maybe the HN crowd is one thing, but most normies would probably say it shouldn't be allowed.
> Even the government doing the scraping directly I believe would not violate the 4th amendment.
Every time I see someone make a statement like this I think of the Iraq war era when a Berkeley law professor said torture is legal. Simply saying something that clearly violates the spirit of our rights is ok based on a technicality, I would not call that a moral high ground.
> The sensitive data needs to not be collected in the first place via technical and social solutions,
At this point and points forward I think your comment is much more on the mark.
> normies would probably say it shouldn't be allowed
Despite knowing about this, most continue supporting the various companies doing exactly that, like Facebook and Google.
> Neither is there an expectation [...]
Expectation is not law, and it cuts both ways. The authors of the 4th and 5th amendments likely did not anticipate the existence of encryption - in their view, the flip side of the 4th amendment is that with a warrant, the government could search anything except your mind, which can't store that much information. We now get to enjoy an almost absolute right to privacy due to the letter of the law. You might feel that we should have that right anyway, but many other governments with a more recent/flexible constitution do not guarantee that, and in fact require key disclosure.
> Expectation is not law.
It is in this case.
Expectation of privacy is a legal test based literally on what "normies would probably say". If, as a society, we're moving more and more of our private effects to the cloud, there is a point where there's an expectation of privacy from the government there, regardless of the shadiness of the company we trusted for it, and regardless of what's convenient for the government.
https://www.law.cornell.edu/wex/expectation_of_privacy
Carpenter v. United States is a great example of this, where a thing once thought as obviously falling under the third party doctrine (cell tower location information) was put definitively within protection by the fourth amendment because of ongoing changes in how society used and considered cell phones.
And I forgot about this but just saw it referenced in the wikipedia article: it's notable that Gorsuch's dissent on the case argued for dropping the third party doctrine completely:
> There is another way. From the founding until the 1960s, the right to assert a Fourth Amendment claim didn’t depend on your ability to appeal to a judge’s personal sensibilities about the “reasonableness” of your expectations or privacy. It was tied to the law. The Fourth Amendment protects “the right of the people to be secure in their persons, houses, papers and effects, against unreasonable searches and seizures.” True to those words and their original understanding, the traditional approach asked if a house, paper or effect was yours under law. No more was needed to trigger the Fourth Amendment....
> Under this more traditional approach, Fourth Amendment protections for your papers and effects do not automatically disappear just because you share them with third parties.
The company doesn't have that power, but the government can compel companies to provide them with the same data as long as it exists, and then abuse it in the same way as if they had collected it themselves.
The government should be held to higher standards in terms of being able to appeal its actions, fairness, evidentiary standards. But the government shouldn't necessarily be prevented from acquiring and using information (which is otherwise legally obtained).
I don't disagree that we should perhaps have more restrictions on private processing of data though -- GDPR style legislation that imposes a cost on data collection is probably sufficient.
I really don't understand why people treat it with such sacrosanct reverence.
It reminds me of a cup and ball street scam. Opportunistic people move things around and there's a choir of true believers who think there's some sacred principles of separation to uphold as they defend the ornamental labels as if they're some divine decree.
I mean come on. Know when you're getting played.
We put higher standards on the government because companies have the biggest propaganda coffers.
It’s not some rational principle. Money goes in, beliefs come out.
What's worse, is that third party doctrine kills your rights worse than direct police surveillance.
Imagine if you will, back in the day of film cameras: The company developing your film will tell the police if you give them literal child porn but otherwise they don't. But imagine if they kept a copy of every picture you ever took, just stuffed it into a room in the back, and your receipt included a TOS about you giving them a license to own a copy "for necessary processing". Now, a year after you stopped using film cameras, the cops ask the company for your photos.
The company hands it over. You don't get to say no. The cops don't need a warrant, even though they 100% need a warrant to walk into your home and grab your stash of photos.
Why is this at all okay? How did the supreme court not recognize how outright stupid this is?
We made an explicit rule for video rental stores to not be able to do this! Congress at one time recognized the stupidity and illegal nature of this! Except they only did that because a politician's video rental history was published during his attempt at confirmation.
That law is direct and clear precedent that service providers should not be able to give your data to the cops without your consent, but this is America so precedent is only allowed to help businesses and cops.
A private company can surely link its own cameras and data to create a private use database of undesirables. I’m certain that Walmart and friends do exactly this already. It’s the large scale version of the Polaroids behind the counter.
[^1]: https://arstechnica.com/tech-policy/2025/09/court-rejects-ve...
The government could legally create its own facial recognition technology if it wanted to. They're not avoiding the law, facial recognition isn't illegal.
This is not an example of the government sidestepping laws through a third party. You just don't like the existing laws, and would prefer to make certain things illegal that are presently legal.
That is, the US banking laws force private actors, under color of law, to systematically inspect the papers of those opening an account, which conveniently sidesteps the 4th amendment implications of the government itself searching the papers of everyone opening an account at the bank. And then they allow the government to act on the information from that forced search, even without a warrant.
---------- re: below due to throttling -------
I'm referring to this:
>The government cannot have a third party take action on its behalf to do something that would be illegal for the government to do itself.
It is illegal for the government to violate the 4th amendment, whether or not a 'law' beyond what is written in the constitution is present.
Clearly the government would love to just take all your information directly when you open an account, as that would be even better for them, but due to the 4th amendment they can't do that. Requiring the bank, by law and without a warrant, to search your papers and then act on or reveal what it finds is almost as easy, so that's how they sidestep it. It's effectively a government imposed search but carried out by a 3rd party.
--------------------
>This is just factually wrong. The Bank Secrecy Act specifically requires banks to provide this info. The 4th amendment does not prohibit this. If a bank refused to provide this required information, the government would go in and get that information directly.
>Again, no law is being avoided. You just don't like the
This is not 'just factually wrong.' The bank is doing the search instead of the government. It is a blanket search of everyone, even without a subpoena, even without an individualized notice, even without any sort of event that would require reporting to the government under the BSA. Banks are still required to search the information even in the instances where it doesn't end up having to be transmitted to the government.
This is just factually wrong. The Bank Secrecy Act specifically requires banks to provide this info. The 4th amendment does not prohibit this. If a bank refused to provide this required information, the government would go in and get that information directly.
Again, no law is being avoided. You just don't like the law.
> Always easier when you can avoid the law and just buy it off the shelf. (Emphasis mine)
No law is being avoided, neither in your banking example nor in the situation with Clearview. To be sure, people can have whatever opinion on the law that they want. But I do want to make it clear that the government is not "avoiding" any law here.
I 'member people who warned about something like this having the potential to be abused for/by the government, we were ridiculed at best, and look where we are now, a couple of years later.
Well, if that's true then employees of the companies that build the tools for all this to happen can also be held responsible, no?
I'm actually an optimist and believe there will come a time when a whole lot of people will deny ever working for Palantir, for Clearview, and so on.
What you, as a software engineer, help build has an impact on the world. These things couldn't exist if people didn't create and maintain them. I really hope people who work at these companies consider what they're helping to accomplish.
What do you mean by this? If a government conscripts "average citizens" into its military then they become valid military targets, sure.
I'm not sure why you think this implies that developers working for Palantir or Clearview would become military targets. Palantir builds software for the military. But the people actually using that software are military personnel, not Palantir employees.
Yeah we typically call those people terrorists or war criminals.
I have friends who are otherwise extremely progressive people, who I think are genuinely good people, who worked for Palantir for many years. The cognitive dissonance they must've dealt with...
Hannah Arendt coined the term “the banality of evil”. Many people think they are just following orders without reflecting on their actions.
Only if an anonymous person or their property is caught in a criminal act may the respective identity be investigated. This should be sufficient to ensure justice. Moreover, the evidence corresponding to the criminal act must be subject to a post-hoc judicial review for the justifiability of the conducted investigation.
Unfortunately for us, the day we stopped updating the Constitution is the day it all started going downhill.
The problem is when the government changes the definition of 'bad actor'.
That is a myth spread by control freaks and power seekers. Yes, bad actors prefer anonymity, but the quoted statement is intended to mislead and deceive because good actors can also prefer strong anonymity. These good actors probably even outnumber bad ones by 10:1. To turn it around, deanonymization is where the bad actors play.
Also, anonymity can be nuanced. For example, vehicles can still have license plates, but the government would be banned from tracking them in any way until a crime has been committed by a vehicle.
Both good and bad actors benefit in the current system from anonymity. If bad actors had their identities revealed, they'd have a lot harder time being a bad actor. Good actors need anonymity because of those bad actors.
That said, recent events (waves vaguely in the direction of the US government) have demonstrated the weakness of legal restrictions on the government. It's good to have something you can point to when they violate it, but it's too easily ignored. There's no substitute for good governance.
If my memory serves me, we had PCA and LDA based ones in the 90s, and then in the 2000s we had a lot of hand-woven AdaBoosts and (non-AI) SIFTs. This is where 3D sensors proved useful, and is the basis for all sci-fi portrayals of facial recognition (a surface depth map drawn on the face).
In the 2010s, when deep learning became feasible, facial recognition as well as all other AI started using an end to end neural network. This is what is used to this day. It is the first iteration pretty much to work flawlessly regardless of lighting, angle and what not. [1]
Note about the terms AI, ML, Signal processing:
In any given era:
- whatever data-fitting/function approximation method is the latest one is typically called AI.
- the previous generation one is called ML
- the really old now boring ones are called signal processing
Sometimes the calling-it-ML stage is skipped.
[1] All data fitting methods are only as good as the data. Most of these were trained on Caucasian people initially, so many of them were not as good for other people. These days the ones deployed by Google Photos and the like of course work for other races as well, but many models don't.
Frankly, I never imagined when I read that decades ago, that it could be underselling the horror.
Most Americans don’t pay for news and don’t think they need to - https://news.ycombinator.com/item?id=46982633 - February 2026
(ProPublica, 404media, APM Marketplace, Associated Press, Vox, Block Club Chicago, Climate Town, Tampa Bay Times, etc get my journalism dollars as well)
I subbed to Wired last year during a sale and uh... I was never given a premium account linked to my email and support would never answer me. I signed up for the print edition as well and never received any of those. I was getting their newsletter though and that was new. Then I emailed to cancel when I got a billing notification to my email and they were able to cancel it just fine so apparently I did have an account? And then like two weeks ago I received the latest print edition.
Truly have no idea what that was about, but anyway glad to see someone else out here supporting almost all the same news orgs as me (404media is amazing!)
They've sold out for years already, maybe decades. Why fund them now when the corruption is out in the open?
AP is really one of the few places I'd even consider donating to at this point.
There's a vast gulf between "Clearview AI was founded by white supremacists" and "Smartcheckr, which later merged with Clearview AI, employed for 3 weeks someone who posted white supremacist content under a pseudonym, unbeknownst to the Clearview AI founders".
In fact, neither the Buzzfeed article nor the NYTimes piece accuse anyone of white supremacy.
Other notable white supremacists with material ties in the article:
Chuck Johnson [1] collaborated with Ton-That and "in contact about scraping social media platforms for the facial recognition business." Ran a white supremacist site (GotNews) and white supremacist crowd funding sites.
Douglass Mackey [2] a white supremacist who consulted for the company.
Tyler Bass [3] an employee and member of multiple white supremacist groups and Unite the Right attendee.
Marko Jukic [4], employee and syndicated author in a publication by white supremacist Richard Spencer.
The article also goes into the much larger ecosystem of AI and facial recognition tech and its ties to white supremacists and the far-right. So there are not just direct ties to Clearview AI itself, but a network of surveillance companies who are ideologically and financially tied to the founders and associates.
[0] https://en.wikipedia.org/wiki/Clearview_AI
[1] https://en.wikipedia.org/wiki/Charles_C._Johnson
[2] https://en.wikipedia.org/wiki/Douglass_Mackey
[3] https://gizmodo.com/creepy-face-recognition-firm-clearview-a...
[4] https://www.motherjones.com/politics/2025/04/clearview-ai-im...
But you wrote that "Clearview AI was founded by white supremacists". Even after your new set of links, this remains unsubstantiated. None of your links allege that the Clearview founders are white supremacists, they make an attempt at guilt by association.
[1] https://img.huffingtonpost.com/asset/5e8cc7922300005600169bd...
For example:
- every technology has false positives. False positives here will mean 4th amendment violations and will add an undue burden on people who share physical characteristics with those in the training data. (This is the updated "fits the description.")
- this technology will predictably be used to enable dragnets in particular areas. Those areas will not necessarily be chosen on any rational basis.
- this is all predictable because we have watched the War on Drugs for 3 generations. We have all seen how it was a tactical militaristic problem in cities and became a health concern/addiction issues problem when enforced in rural areas. There is approximately zero chance this technology becomes the first use of law enforcement that applies laws evenly.
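To put rough numbers on the false positive point, here is a minimal sketch of the base-rate arithmetic. Every figure in it (crowd size, watchlist size, error rates) is a hypothetical I picked for illustration, not a measured property of Clearview or any real system, and `positive_predictive_value` is just a helper name I made up.

```python
def positive_predictive_value(false_positive_rate, true_positive_rate, prevalence):
    """Share of flagged people who are actual matches, via Bayes' rule."""
    true_hits = true_positive_rate * prevalence
    false_hits = false_positive_rate * (1.0 - prevalence)
    return true_hits / (true_hits + false_hits)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
scanned = 1_000_000   # faces scanned in a dragnet
on_watchlist = 100    # of those, people who really are on the watchlist
fpr = 0.001           # assumed: 0.1% of non-listed people wrongly flagged
tpr = 0.99            # assumed: 99% of listed people correctly flagged

prevalence = on_watchlist / scanned
ppv = positive_predictive_value(fpr, tpr, prevalence)
expected_false_flags = fpr * (scanned - on_watchlist)
expected_true_flags = tpr * on_watchlist

print(f"expected true flags:  {expected_true_flags:.0f}")
print(f"expected false flags: {expected_false_flags:.0f}")
print(f"chance a flagged person is a real match: {ppv:.1%}")
```

Under these assumed numbers, roughly ten innocent people get flagged for every real match, even though the error rate sounds tiny. The ratio gets worse the rarer the people being searched for are relative to the crowd being scanned.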
I think that is pretty unlikely
Crimes aren't solved, despite having a literal panopticon. This view is just false.
Cops are choosing to not do their job. Giving them free access to all private information hasn't fixed that.
The thing you're missing is our system is working exactly like it's supposed to for rich people.
For example, Deepseek won't give you critical information about the communist party and Grok won't criticise Elon Musk
The main problem with the law not being applied evenly is structural - how do you get the people tasked with enforcing the law to enforce the law against their own ingroup? "AI" and the surveillance society will not solve this, rather they are making it ten times worse.
>people inevitably respond to one part of your broken framing, and then they're off to the races arguing about nonsense.
I agree that this is unproductive. When people have two very different viewpoints it is hard for that gap to be bridged. I don't want to lay out my entire world view and argument from first principles because it would take too much time and I doubt anyone would read it. Call it low effort if you want, but at least discussions don't turn into a collection of a single belief.
>how do you get the people tasked with enforcing the law to enforce the law against their own ingroup?
Ultimately law enforcement is responsible to the people, so if the people don't want it then it will be hard to change. As for avoiding ingroup preference, it would be worth coming up with ways of auditing cases that are not being looked into and having AI try to find patterns in what is causing it. The summaries of these patterns could be made public to allow voters and other officials to react to such information and apply needed changes to the system.
You answered your own question - it's straight up bait.
Go lick boots elsewhere.