Mustafa Yavas, Graduate student, Sociology
This position involves web scraping op-eds from 15-20 Turkish newspapers, published between April 2012 and September 2017. The aim is to create a database of op-eds with five fields: author name, date, article URL, op-ed title, and op-ed text. The person hired will be expected to provide documentation to accompany the web-scraping code.
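As a rough illustration of the five-field database described above, one record per op-ed could be stored as a row in a CSV file. This is only a sketch under stated assumptions: the `OpEd` class, the `write_opeds` helper, the column names, and the choice of CSV as the storage format are all hypothetical and are not taken from the existing project code.

```python
import csv
import datetime
from dataclasses import dataclass, asdict, fields


@dataclass
class OpEd:
    """One op-ed record with the five fields named in the posting (names are assumed)."""
    author: str
    date: datetime.date
    url: str
    title: str
    text: str


def write_opeds(records, handle):
    """Write OpEd records to an open text handle as CSV, one row per op-ed."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(OpEd)])
    writer.writeheader()
    for record in records:
        row = asdict(record)
        # Store the date as ISO 8601 text so it sorts and parses predictably.
        row["date"] = record.date.isoformat()
        writer.writerow(row)
```

When writing to a real file, opening it with `newline=""` and `encoding="utf-8"` would keep the CSV module's line endings and Turkish characters intact.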
A Python script already exists that was used to successfully scrape one newspaper. This code will serve as a benchmark script that must be adjusted for each newspaper’s website, particularly the HTML-parsing portion of the task. Each newspaper’s website will be scraped separately.
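One possible way to keep the per-newspaper adjustments small is to isolate the HTML parsing behind a table of site-specific rules, so that only the rules change from one website to the next. The sketch below is not the existing benchmark script; the site key `example-gazette`, the tag and class names in `SITE_RULES`, and the `FieldExtractor` and `parse_oped` names are all hypothetical. It uses only the standard library's `html.parser` and extracts three of the five fields; the URL is already known when the page is requested, and the date would need its own site-specific rule.

```python
from html.parser import HTMLParser

# Hypothetical per-site rules: field name -> (tag, CSS class) holding that field.
SITE_RULES = {
    "example-gazette": {
        "title": ("h1", "headline"),
        "author": ("span", "byline"),
        "text": ("div", "article-body"),
    },
}


class FieldExtractor(HTMLParser):
    """Collect the text found inside the elements named by a site's rules."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.found = {name: [] for name in rules}
        self._active = None  # field currently being captured, if any
        self._depth = 0      # nesting depth of same-named tags inside it

    def handle_starttag(self, tag, attrs):
        if self._active is not None:
            # Track nested tags of the same name so the matching close is found.
            if tag == self.rules[self._active][0]:
                self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for name, (want_tag, want_class) in self.rules.items():
            if tag == want_tag and want_class in classes:
                self._active, self._depth = name, 1
                return

    def handle_endtag(self, tag):
        if self._active is not None and tag == self.rules[self._active][0]:
            self._depth -= 1
            if self._depth == 0:
                self._active = None

    def handle_data(self, data):
        if self._active is not None:
            self.found[self._active].append(data)


def parse_oped(html, site):
    """Return a dict of whitespace-normalized field text for one article page."""
    extractor = FieldExtractor(SITE_RULES[site])
    extractor.feed(html)
    extractor.close()
    return {
        name: " ".join(" ".join(parts).split())
        for name, parts in extractor.found.items()
    }
```

With this layout, supporting another newspaper would mostly mean adding an entry to `SITE_RULES` after inspecting that site's article markup. A third-party parser could replace `html.parser` if the benchmark script already depends on one.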
Here is an example of the URLs to be scraped:
Applicants should ideally have a familiarity with:
Closing date: June 15, 2018
Contact: Mustafa Yavas
This project is funded with a Digital Humanities Lab Seed Grant.