“Any commercial LLM that is out there, that is learning from the internet, is poisoned today,” Jennifer Swanson said, “but our main concern [is] those algorithms that are going to be informing battlefield decisions.”

Even as the Pentagon makes big bets on big data and artificial intelligence, the Army’s software acquisition chief is raising a new warning that adversaries could “poison” the well of data from which AI drinks, subtly sabotaging algorithms the US will use in future conflicts.

The fundamental problem is that every machine-learning algorithm has to be trained on data — lots and lots of data. The Pentagon is making a tremendous effort to collect, collate, curate, and clean its data so analytic algorithms and infant AIs can make sense of it. In particular, the prep team needs to throw out any erroneous datapoints before the algorithm can learn the wrong thing.
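
To make that scrubbing step concrete, here is a minimal sketch of one common way to flag and drop suspicious datapoints before training, using scikit-learn's IsolationForest. The dataset and the contamination rate here are invented purely for illustration; it's a sketch of the general idea, not anyone's actual pipeline.

```python
# Sketch: flag and drop anomalous datapoints before training,
# so the model never learns from them. Synthetic data, for illustration.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
clean = rng.normal(loc=0.0, scale=1.0, size=(1000, 4))  # plausible data
bad = rng.normal(loc=8.0, scale=0.5, size=(20, 4))      # erroneous points
X = np.vstack([clean, bad])

# IsolationForest labels inliers +1 and outliers -1.
detector = IsolationForest(contamination=0.02, random_state=0)
labels = detector.fit_predict(X)

X_clean = X[labels == 1]
print(f"kept {len(X_clean)} of {len(X)} datapoints")
# X_clean would then feed the actual training step.
```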

Commercial chatbots from 2016’s Microsoft Tay to 2023’s ChatGPT are notorious for sucking up misinformation and racism along with all the other internet content they consume. But what’s worse, Swanson argued, is that the military’s own training data might be deliberately targeted by an adversary – a technique known as “data poisoning.”
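
For anyone wondering what "data poisoning" looks like in practice, here is a hedged sketch of the simplest variant, a label-flipping attack: an adversary quietly mislabels a slice of the training set, and the resulting model gets measurably worse while looking normal from the outside. The dataset, model, and 15% flip rate are all made up for illustration.

```python
# Sketch of a label-flipping poisoning attack on a toy classifier.
# All numbers and the dataset are synthetic, for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Baseline: train on clean labels.
clean_model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Adversary silently flips the labels of 15% of the training set.
rng = np.random.default_rng(1)
poisoned = y_tr.copy()
idx = rng.choice(len(poisoned), size=int(0.15 * len(poisoned)), replace=False)
poisoned[idx] = 1 - poisoned[idx]

poisoned_model = LogisticRegression(max_iter=1000).fit(X_tr, poisoned)

print(f"clean-data accuracy:    {clean_model.score(X_te, y_te):.3f}")
print(f"poisoned-data accuracy: {poisoned_model.score(X_te, y_te):.3f}")
```

The point of the sketch is that nothing about the poisoned model looks broken; the sabotage only shows up as quietly degraded decisions, which is exactly the battlefield concern the article raises.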

Breaking Defense - More Here

Think of it this way: if AI is learning from Campfire Forum posts, it's going to be pretty messed up, huh?
