Meghan Finnerty | February 01, 2022
Student Uses Skills in Hopes of Reducing Hate on Social Media
Throughout 2020, users on mainstream social media saw more content moderation than ever before: warnings about misinformation and flagged or removed posts, especially related to the U.S. Presidential Election and the COVID-19 pandemic.
“It is increasingly observable that social media presents enormous risks for individuals and communities as it is used as a medium for cyberbullying, trolling, spreading fake news, and privacy abuse,” said Adita Kulkarni, assistant professor in the Department of Computing Sciences.
Despite increased moderation, much still went undetected – spurring a student research project at SUNY Brockport.
For about eight months, Matthew Morgan ’23, an honors student majoring in computing sciences, has been working under the direction of Kulkarni to design platform-agnostic machine learning models that detect Sinophobia on social media.
“Although there has been other research on discrimination and hate on social media, this one is more particular to Sinophobia, which is discrimination towards Chinese people,” Morgan explained. “Ever since the pandemic began, they have experienced a lot more hate, for no reason. This is a problem that should be looked into, considered, and hopefully (we) find some sort of way to make it better.”
Morgan’s research focus includes posts on Reddit, a mainstream site that promotes content that is voted on by users, and Parler, an alternative platform that has gained popularity by promoting itself as a free speech platform with few restrictions.
“We are considering current machine learning models and methods, fine-tuning them, tweaking them, and seeing how we can get them to work better for our data,” Morgan explained. The project’s data set includes a couple hundred million posts gathered from both platforms between January 2020 and April 2020.
Using text analysis tools, Morgan wrote code to sift through the entire data set for Sinophobic keywords and Chinese-related topics. That narrowed the corpus to about 10,000 posts, which he then had to process manually.
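A minimal sketch of the kind of keyword filter described above. The keyword list, post fields, and matching rule here are illustrative assumptions, not the actual terms or schema used in the research.

```python
# Illustrative keyword filter: keep only posts whose text contains
# any term from a watch list, for later manual review.
# The keyword set below is a hypothetical placeholder.
KEYWORDS = {"china", "chinese", "wuhan"}

def matches_keywords(text: str, keywords: set) -> bool:
    """Return True if any keyword appears as a whole word in the post."""
    words = set(text.lower().split())
    return bool(words & keywords)

# hypothetical posts standing in for the scraped Reddit/Parler data
posts = [
    {"id": 1, "text": "Travel restrictions announced in Wuhan today"},
    {"id": 2, "text": "My sourdough starter finally worked"},
]

flagged = [p for p in posts if matches_keywords(p["text"], KEYWORDS)]
print([p["id"] for p in flagged])  # → [1]
```

In practice a filter like this would run over millions of records (e.g. streamed from JSON dumps) and use a far larger term list, but the core operation is the same set-intersection check.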
While some posts were not racist — instead benefiting the model by providing context — Morgan found others were blatant bullying or harmful. “Going through them manually you’re seeing terrible things that you would never believe people would even say,” Morgan said. “That manual process is like a big wake up to what the internet really is. Bad stuff can happen, and you can’t let it go undetected.”
Kulkarni agrees, seeing a clear benefit to using machine learning given the popularity of social media and the volume of online bullying and disinformation.
“Online abuse is not always clear and explicit. Abusers are subtle and they intentionally use coded language, symbols, and euphemisms to slip past generic filters. Internet slang and acronyms are constantly changing. Context, sarcasm, and socioemotional cues in electronic communications are difficult to detect,” Kulkarni explained. “But with machine learning, our goal is to come up with a solution that would benefit society.”
With each post evaluated, the model uses that labeled data to better predict whether future posts are appropriate. Morgan plans to continue working on the model for his honors thesis over the next semester or two and would like to publish his research. He plans to present its outcomes at the 2022 National Conference of Undergraduate Research in April.
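The supervised-learning step described above, where manually labeled posts train a classifier that predicts labels for unseen posts, can be sketched as follows. This uses scikit-learn as a stand-in; the actual models and features in the study may differ, and the tiny training set here is purely illustrative.

```python
# Sketch: train a text classifier on manually labeled posts,
# then predict a label for a new, unseen post.
# scikit-learn is assumed; labels: 1 = abusive, 0 = benign.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# hypothetical hand-labeled examples from the manual review step
texts = [
    "you people are a disease",
    "great dumpling recipe, thanks for sharing",
    "go back where you came from",
    "lunar new year photos from the parade",
]
labels = [1, 0, 1, 0]

# TF-IDF features feeding a logistic regression classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# classify a new post; output is 0 or 1
prediction = model.predict(["sharing my favorite noodle recipe"])[0]
print(prediction)
```

As more posts are labeled and fed back in, the classifier's estimates improve, which is the feedback loop the article describes.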
“At the end of the day AI (artificial intelligence) still needs to improve,” Morgan said. “Even if the machine learning models aren’t perfect, we should still research them. Because a lot of stuff is going undetected, and people are getting away with saying terrible things to one another.”