It’s difficult to talk about 4chan without addressing the idea of Anonymous. Though the imageboard includes a name field for individual users, filling it in is not required, and the feature is rarely used. This is a fundamental quality of the 4chan platform; you don’t have to offer any form of identification whatsoever. Virtually every user posts …
Of course, researching online communities can be incredibly difficult. Contrary to popular belief, once something is posted on the Internet, it isn't necessarily "there forever." When I was in elementary school, I was constantly told that once something was online, it could never be removed. The reasoning behind this is sound: encouraging young people to be cognizant of what information they share is incredibly important. However, the truth is that plenty of online content has simply disappeared. People stop paying their web hosting bills, links fail to get updated, or old content simply gets forgotten among countless petabytes of data. And in the case of 4chan, threads are regularly pruned and "content is usually available for only a few hours or days before it is removed." This ephemerality, combined with the anonymity afforded by the website, challenges traditional conventions of research. A researcher cannot count on visiting the same URL later and finding the same content. Given these challenges, I decided to work on creating an automated system to scrape 4chan content and save a local copy.
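To make the idea of saving a local copy concrete, here is a minimal sketch of what such a scraper might look like. It is an illustration only, not the system described in this post. It assumes 4chan's public read-only JSON API is served from `a.4cdn.org` with a `threads.json` listing per board and one JSON file per thread. The function names (`fetch_json`, `archive_thread`, `archive_board`), the output layout, and the one-second pause are all hypothetical choices made for the example.

```python
import json
import time
import urllib.error
import urllib.request
from pathlib import Path

# Assumption: base URL of 4chan's public read-only JSON API.
API_ROOT = "https://a.4cdn.org"


def fetch_json(url):
    """Download a URL and decode the response body as JSON."""
    request = urllib.request.Request(url, headers={"User-Agent": "archive-sketch/0.1"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def archive_thread(board, thread_no, out_dir, fetch=fetch_json):
    """Save one thread's JSON to out_dir/<board>/<thread_no>.json and return the path."""
    data = fetch(f"{API_ROOT}/{board}/thread/{thread_no}.json")
    path = Path(out_dir) / board / f"{thread_no}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return path


def archive_board(board, out_dir, fetch=fetch_json, delay=1.0):
    """Save every thread currently listed on a board, skipping threads that vanish mid-run."""
    saved = []
    # Assumed shape: a list of pages, each holding a "threads" list of {"no": ...} entries.
    pages = fetch(f"{API_ROOT}/{board}/threads.json")
    for page in pages:
        for thread in page["threads"]:
            try:
                saved.append(archive_thread(board, thread["no"], out_dir, fetch))
            except urllib.error.URLError:
                # HTTPError is a subclass of URLError; a failure here often means
                # the thread was pruned between the listing and the download.
                continue
            time.sleep(delay)  # pause between requests to stay polite
    return saved
```

Running something like `archive_board("g", "archive")` on a schedule would build up a folder of thread snapshots that outlive the originals. The `fetch` parameter exists so the download step can be swapped for a stub when testing without network access. The pause length is a guess; the API's own usage rules should be checked before running anything like this for real.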