class: center, middle, inverse, title-slide

# Tools for collaborating in teams

---
class: inverse

## A test case

This is a two-part exercise:

**Part 1:** Analyze + document

**Part 2:** Swap + discuss

---

## Part 1: Analyze + document

Introduce yourself to your neighbor.

Solve the following problem as a team. Assume you are a team within a larger project, i.e., keep your collaborators in mind.

Starting with the original dataset ([`data/gapminder-5060.csv`](data/gapminder-5060.csv) or [`data/gapminder-5060.xlsx`](data/gapminder-5060.xlsx)), complete the following tasks and **write instructions / documentation** for your collaborator to reproduce your work.

---
class: yourturn

.center[
# Your turn: analyze + document (20 mins)
]

1. Visualize life expectancy over time for Canada in the 1950s and 1960s using a line plot.

2. Something is clearly wrong with this plot! It turns out there's an error in the data file: life expectancy for Canada in the year 1957 is coded as `999999`, but it should be `69.96`. Make this correction.

<br/>

**Pro goal**: Add lines for Mexico and the United States.

---
class: yourturn

.center[
# Part 2: swap + discuss (20 mins)
]

Introduce yourself to the team closest to you.

1. Swap instructions / documentation with the other team, and try to reproduce their work, first **without talking to each other**.

> If your collaborator does not have/know the software they need to reproduce your work, walk them through it on your computer in a way that would emulate the experience. (Remember, this could be part of the irreproducibility problem!)

2. Then, talk to each other about the challenges you faced (or didn't face) and why you were or weren't able to reproduce their work.

???

.footnote[Thanks to Jenny Bryan and Mine Cetinkaya-Rundel for the idea for the original example!]

---

## How did that go?

<img src="images/desperation-0.jpg" height=300> or <img src="images/exaltation.jpeg" height=300> ?

<br/><br/>

Where did things get problematic?
---

## Reflection

- Were you successful in reproducing each other's work?
- What tools did you use?
- What would happen if your collaborator were no longer available to walk you through their analysis?
- What made it easy / hard to reproduce your partner's work?
- What would have to happen if:
  - you had to swap out the dataset or extend the analysis further?
  - you caught further data errors and had to re-create the analysis with corrections?
  - you had to revert to the original dataset?

---

## Summary

Everyone struggles with reproducibility, and it is a hindrance to moving science forward.

Even with a fairly simple analysis, challenges arise in four main areas:

- organization
- documentation
- automation
- dissemination

---

## Reproducibility checklist

Near-term goals:

- Are the tables and figures reproducible from the code and data?
- Does the code actually do what you think it does?
- In addition to what was done, is it clear **why** it was done? (e.g., how were parameter settings chosen?)

Long-term goals:

- Can the code be used for other data?
- Can you extend the code to do other things?

.footnote[source: datasciencebox.org]
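---

## Sketch: scripting the correction

One way to address the "automation" and "documentation" points is to make the 1957 fix from Part 1 in code rather than by hand-editing the file, so the raw data stays untouched and the change can be re-run. Below is a minimal Python sketch, not the only way to do it. It assumes the CSV has columns named `country`, `year`, and `lifeExp`; the function name `correct_life_expectancy` is hypothetical, and the sample rows are illustrative (only the `999999` and `69.96` values come from the exercise).

```python
import csv
import io


def correct_life_expectancy(rows, country, year, bad_value, good_value):
    """Return a corrected copy of `rows` and the number of rows changed.

    Column names (country, year, lifeExp) are an assumption about the file.
    """
    corrected, n_changed = [], 0
    for row in rows:
        row = dict(row)  # copy, so the raw records are never modified
        if (
            row["country"] == country
            and int(row["year"]) == year
            and float(row["lifeExp"]) == bad_value
        ):
            row["lifeExp"] = str(good_value)
            n_changed += 1
        corrected.append(row)
    return corrected, n_changed


# Illustrative in-memory sample standing in for data/gapminder-5060.csv
SAMPLE = "country,year,lifeExp\nCanada,1952,68.75\nCanada,1957,999999\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
fixed, n_changed = correct_life_expectancy(rows, "Canada", 1957, 999999, 69.96)

# Report what changed so a collaborator can see why the data differ
print(f"rows corrected: {n_changed}")
```

To run it on the real file, replace the `io.StringIO(SAMPLE)` source with an opened `data/gapminder-5060.csv`, and write `fixed` to a new file rather than overwriting the original, so reverting is just a matter of re-reading the raw data.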