Section outline

  • Course Overview

    On this course you will learn how to use R to manage data in a wide variety of formats, in a reproducible manner, and at scale.

    Learning Outcomes

    By the end of this course you will have gained:

    1. An understanding of basic R commands and data structures for manipulating data
    2. The ability to read data from multiple formats in and out of R
    3. Proficiency using loops, conditional statements, and functions to automate common data management tasks
    4. Familiarity with R’s package system for extending its functionality
    5. The skills to clean and manage multiple complex datasets
    6. The ability to clean and manipulate textual data
    7. An understanding of basic web scraping techniques, for both standard web pages and the Twitter API
    8. An overview of the techniques and hardware necessary to manage large datasets efficiently

    • Course Instructor: Matt Denny

        Matt Denny is a PhD Candidate in Political Science and Social Data Analytics, and an NSF Big Data Social Science IGERT Fellow at Penn State University. Matt spends most of his time developing methods for analysing text and network data, which he applies to a wide range of projects on U.S. congressional and bureaucratic politics, organizational dynamics, international finance, and, more recently, neuroscience. As part of his research, Matt works with a variety of large and complex datasets on a daily basis. Matt has also taught dozens of workshops on data management, high-performance computing, and “big data” analytics in R. To learn more about Matt Denny, check out his website: www.mjdenny.com

      • Course Resources

        You will need to access certain files and resources throughout the course to get the most out of the activities. You can find them all here. 

      • Video Transcripts

        You can access all video transcripts here. 

      • Pre-Course Self Assessment

        Before you dive into this course, spend a few moments reflecting on your familiarity with the topic and your current level of skills confidence. 

        You will then re-visit the same questions in our Post-Course Self Assessment and reflect on how the course has helped you develop in confidence and grow your skills.

        • Module One: Introduction to R and RStudio

          In this module you will learn about:

          1. Installing R and RStudio
          2. R Programming
          3. Five Basic Data Structures
        • Module Two: R Programming Fundamentals

          In this module you will learn about:

          1. Data I/O and Packages
          2. Looping and Conditional Statements
          3. Functions
        • Module Three: Data Management

          In this module you will learn about:

          1. Managing Multiple Data Sets by Example
          2. Working with Text Data
          3. Converting Long- and Wide-Format Data
          4. Dealing with Messy Data
        • Module Four: Automated Data Collection

          In this module you will learn about:

          1. Overview of Web/Text Scraping and Legal Considerations
          2. Basic Web Scraping
          3. Scraping Twitter
        • Module Five: Performance and Scalability

          In this module you will learn about:

          1. High Performance Computing (HPC) and Big Data
          2. Performant Programming
        • Post-Course Self Assessment

          Now that you’ve completed the course, spend a few moments reflecting on your familiarity with the topics and your current level of skills confidence.

          Has the course helped you develop new skills and grow your confidence?

          You'll need to complete the Post-Course Self Assessment in order to download your certificate. If you didn't do the Pre-Course Self Assessment before starting the course, please go to the top of the page and reflect on your familiarity with the topic and your level of skills confidence before you started the course.

          • Completion: Certificate

            Completing all modules (plus the pre- and post-course assessments) will unlock the course certificate, which you can then download here. Your course certificate will only be made available once you have completed all these sections.

            If you have difficulty accessing your certificate, please contact the Sage support team at: onlinesupport@sagepub.co.uk. You can also check out this FAQ page which may be helpful.

            • Give Feedback About This Course

              Did you enjoy the course? Please take two minutes to share your feedback. We use learner feedback in future course updates and developments to provide an excellent learning experience.

            • Accessibility

              We have high standards of accessibility on Sage Campus and as of May/June 2024 all activities within this course are keyboard and screen reader compatible. For more details on accessibility standards, please see the Sage Campus Accessibility Guide.

              For those using assistive technology, please note that within this course:

              • Tab components: JAWS and NVDA behave slightly differently. For NVDA to keep reading, it is best to exit focus mode and go back to browse mode. 
              • Matching: JAWS does not read out the question label on dropdown focus.