When was the last time you actually saw the Facebook log-in page? Other than those few times you might be accessing the social network on another computer, there's a good chance that you don't really see this page all that often. I know that for me, between mobile apps and just staying logged in on … Continue reading Social Media Log-in Pages
In early December, the micro-blogging website and popular social media platform Tumblr announced that it was making a significant change to its website content policies. In a blog post titled "A better, more positive Tumblr" the company explained that effective December 17, posts containing adult content would no longer be allowed on the Tumblr platform. … Continue reading The Tumblr Porn Ban
"We don't use the expression 'IRL.' We say 'AFK.' But that's another issue. We think that the internet is for real."
I've been passively gathering data from 4chan's /pol/ board to keep tabs on the "Qanon" conspiracy and the communities that were promoting it. I had started out with a "set it and forget it" sort of deal for my 4chan /pol/ scraper. The problem, though, is that I set it and then I forgot it.
Read on about some of the challenges of studying online communities and ephemeral content...
It’s difficult to talk about 4chan without addressing the idea of Anonymous. Though the imageboard includes a name field for individual users, filling it in is not required, and the feature is rarely used. This is a fundamental quality of the 4chan platform; you don’t have to offer any form of identification whatsoever. Virtually every users posts … Continue reading Breaking Through 4chan’s Anonymity
Of course, researching and studying online communities can be incredibly difficult. Contrary to popular belief, once something is posted on the Internet, it isn't necessarily "there forever." When I was in elementary school, I was constantly told that once something was online, it was impossible for it to ever be removed. The reasoning behind this is sound: encouraging young people to be cognizant of what information they share is important. However, the truth is that plenty of online content has simply disappeared. People stop paying their web hosting bills, links fail to get updated, or old content simply gets forgotten among countless petabytes of data. And in the case of 4chan, threads are regularly pruned and "content is usually available for only a few hours or days before it is removed." This ephemerality, combined with the anonymity afforded by the website, challenges traditional conventions of research. A later visitor to the same URL won't necessarily find the same content.
Given these challenges, I decided to work on creating an automated system to scrape 4chan content and save a local copy.
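To make the idea concrete, here is a minimal sketch of what a scraper like this could look like. It is not necessarily the system described in this post. It assumes 4chan's read-only JSON API is served from `a.4cdn.org` with a per-board `catalog.json` and per-thread `thread/<number>.json`; the names `fetch_json` and `archive_board` are hypothetical and exist only for illustration.

```python
import json
import time
from pathlib import Path
from urllib.request import urlopen

# Assumption: host of 4chan's read-only JSON API.
API_ROOT = "https://a.4cdn.org"


def fetch_json(url):
    """Download a URL and parse the response body as JSON."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def archive_board(board, out_dir, fetch=fetch_json, pause=1.0):
    """Save one JSON file per live thread on `board` under `out_dir`.

    `fetch` is injectable so the function can be exercised offline.
    Returns the list of file paths that were written.
    """
    out = Path(out_dir) / board
    out.mkdir(parents=True, exist_ok=True)

    # The catalog is assumed to be a list of pages, each with a "threads" list.
    catalog = fetch(f"{API_ROOT}/{board}/catalog.json")

    saved = []
    for page in catalog:
        for thread in page.get("threads", []):
            number = thread["no"]
            try:
                data = fetch(f"{API_ROOT}/{board}/thread/{number}.json")
            except OSError:
                # The thread may have been pruned since the catalog was read.
                continue
            path = out / f"{number}.json"
            path.write_text(json.dumps(data), encoding="utf-8")
            saved.append(path)
            time.sleep(pause)  # be polite: space out requests
    return saved
```

Calling something like `archive_board("pol", "archive")` on a schedule (a cron job, for instance) would build up a local copy over time. Because threads disappear within hours or days, how often the job runs determines how much content gets captured before it is pruned.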
If you've ever had to call a large company about seemingly anything, you are almost certainly aware of automated phone tree systems. Rather than hiring a real person to answer and direct phone calls, a computerized system presents a menu of options to direct the caller to an appropriate department or individual that … Continue reading Everyone But You is a Robot