DATA SCIENCE FOR LINGUISTS
Data science is a fast-growing, highly interdisciplinary professional and academic discipline whose practice centers on domain expertise. This course introduces linguistics majors to core methods and practices in data science as they pertain to linguistic inquiry. Students will first learn the fundamentals of structuring, manipulating, and sharing various forms of linguistic data. They will then receive hands-on training in practical aspects of data processing, including handling large quantities of text data ('big data') and creating statistical language models through machine learning, and will become acquainted with the emerging field of knowledge engineering and ontology. They will also apply data-intensive methods to a term project of their choice. Upon successful completion of this course, students will be able to: identify the best methods for representing and analyzing linguistic data for a given purpose; transform and process linguistic data in large volumes; and understand how statistics-driven text analytics and machine learning methods operate.