From the Washington Post news section:
The company is overhauling its algorithms that detect hate speech and deprioritizing hateful comments against Whites, men and Americans.
By Elizabeth Dwoskin, Nitasha Tiku and Heather Kelly
Dec. 3, 2020 at 8:00 a.m. EST
Facebook is embarking on a major overhaul of its algorithms that detect hate speech, according to internal documents, reversing years of so-called “race-blind” practices.
Those practices resulted in the company being more vigilant about removing slurs lobbed against White users while flagging and deleting innocuous posts by people of color on the platform.
The overhaul, which is known as the WoW Project and is in its early stages, involves re-engineering Facebook’s automated moderation systems to get better at detecting and automatically deleting hateful language that is considered “the worst of the worst,” according to internal documents describing the project obtained by The Washington Post. The “worst of the worst” includes slurs directed at Blacks, Muslims, people of more than one race, the LGBTQ community and Jews, according to the documents.
As one way to assess severity, Facebook assigned different types of attacks numerical scores weighted based on their perceived harm. For example, the company’s systems would now place a higher priority on automatically removing statements such as “Gay people are disgusting” than “Men are pigs.”
Facebook has long banned hate speech (defined as violent or dehumanizing speech) based on race, gender, sexuality and other protected characteristics. It owns Instagram and has the same hate speech policies there. But before the overhaul, the company’s algorithms and policies did not distinguish between groups that were more likely to be targets of hate speech and those that have not been historically marginalized. Comments like “White people are stupid” were treated the same as anti-Semitic or racist slurs.
In the first phase of the project, which was announced internally to a small group in October, engineers said they had changed the company’s systems to deprioritize policing contemptuous comments about “Whites,” “men” and “Americans.” Facebook still considers such attacks to be hate speech, and users can still report them to the company. However, the company’s technology now treats them as “low-sensitivity,” or less likely to be harmful, so that they are no longer automatically deleted by the company’s algorithms. That means roughly 10,000 fewer posts are now being deleted each day, according to the documents.
The shift is a response to a racial reckoning within the company as well as years of criticism from civil rights advocates that content from Black users is disproportionately removed, particularly when they use the platform to describe experiences of discrimination.
Some civil rights advocates said the change was overdue.
“To me this is confirmation of what we’ve been demanding for years, an enforcement regime that takes power and historical dynamics into account,” said Arisha Hatch, vice president at the civil rights group Color of Change, who reviewed the documents on behalf of The Post but said she did not know about the changes.
“We know that hate speech targeted towards underrepresented groups can be the most harmful, which is why we have focused our technology on finding the hate speech that users and experts tell us is the most serious,” said Facebook spokeswoman Sally Aldous. “Over the past year, we’ve also updated our policies to catch more implicit hate speech, such as content depicting Blackface, stereotypes about Jewish people controlling the world, and banned Holocaust denial.”
Because describing experiences of discrimination can involve critiquing White people, Facebook’s algorithms often automatically removed that content, demonstrating the ways in which even advanced artificial intelligence can be overzealous in tackling nuanced topics.
“We can’t combat systemic racism if we can’t talk about it, and challenging white supremacy and White men is an important part of having dialogue about racism,” said Danielle Citron, a law professor specializing in free speech at Boston University Law School, who also reviewed the documents. “But you can’t have the conversation if it is being filtered out, bizarrely, by overly blunt hate speech algorithms.”
In addition to deleting comments protesting racism, Facebook’s approach has at times resulted in a stark contrast between its automated takedowns and users’ actual reports about hate speech. At the height of the nationwide protests in June over the killing of George Floyd, an unarmed Black man, for example, the top three derogatory terms Facebook’s automated systems removed were “white trash,” a gay slur and “cracker,” according to an internal chart obtained by The Post and first reported by NBC News in July.