This README_Code.txt file was generated on 2020-11-09 by Elsa Loissel


GENERAL INFORMATION

1. Title of Dataset: Code associated with the 2020 report “The experiences of those who support researchers struggling with their mental health”

2. Author Information
   A. Principal Investigator Contact Information
      Name: Dr Lucy Cheke
      Institution: University of Cambridge
      Address:
      Email: lgc23@cam.ac.uk

   B. Associate or Co-investigator Contact Information
      Name: Elsa Loissel
      Institution: eLife
      Address:
      Email: e.loissel@elifesciences.org

   C. Alternate Contact Information
      Name:
      Institution:
      Address:
      Email:

3. Date of data collection (single date, range, approximate date): 23rd October 2019 to 5th January 2020

4. Geographic location of data collection: Worldwide

5. Information about funding sources that supported the collection of the data: NA


SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data:

2. Links to publications that cite or use the data: Loissel E*, Tsang E*, Müller S*, Deathridge J, Pérez Valle H, Yehudi Y, Cheke L, The experiences of those who support researchers struggling with their mental health, 2020

3. Links to other publicly accessible locations of the data: NA

4. Links/relationships to ancillary data sets: NA

5. Was data derived from another source? No
   A. If yes, list source(s):

6. Recommended citation for this dataset: Loissel E*, Tsang E*, Müller S*, Deathridge J, Pérez Valle H, Yehudi Y, Cheke L, The experiences of those who support researchers struggling with their mental health, 2020


DATA & FILE OVERVIEW

1. File List:
   All basic stats and graphs.R: Code for all descriptive statistics
   Statistical Analyses.Rmd: Code for all statistical analyses
   Statistical-Analyses.html: Results of the code for all statistical analyses
   Reference: How to navigate the “Analyses_Graph_Rcode Cheat Sheet” and “Labels and Questions Cheat Sheet”
   Analyses_Graph_Rcode Cheat Sheet: Where to find, in the code files, the analyses described in the report (by section and panel)
   Labels and Questions Cheat Sheet: Explains the names of the variables in the dataset, how they are connected to the questions in the survey, and how they were obtained (what type of question)
   README.md: What to do to re-run the code
   sessionInfo.txt: Information on the session used to run the code
   Licence: Licensing for this work

2. Relationship between files, if important: See above

3. Additional related data collected that was not included in the current data package: NA

4. Are there multiple versions of the dataset? No
   A. If yes, name of file(s) that was updated:
      i. Why was the file updated?
      ii. When was the file updated?


METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data:

The survey was created in collaboration with, and reviewed by, academics and university officials before being released. It received ethical approval from the University of Cambridge Psychology department (2018-19/35), and was created on the Survey Monkey platform. It was open between the 23rd October 2019 and 5th January 2020, but only actively advertised until early December 2019.

eLife is an open journal in the life sciences. Recruitment took place through eLife’s social media channels (Twitter, Facebook, LinkedIn); eLife’s newsletters to authors and early career community (voluntary subscriptions); directly emailing corresponding authors who have published with eLife; and by contacting groups of interest and institutions that may be willing to share the survey amongst their networks (e.g. New PI Slack, Mid Career PI Slack, Women in Academia Support Network, SMaRteN, Graduate and Postdoc unions in the US and the UK). Certain tweets advertising the survey were boosted (paid promotion) to reach an academic audience, in particular early career researchers. Overall, the most successful recruitment channel was direct contact with the eLife community (emails and newsletters: 944 responses collected), followed by Twitter (650 responses).

The advertising around the survey focused on whether the potential respondents knew researchers who had struggled with their mental health. It did not explicitly call for academics (because support can also come from non-academics) or for respondents who had provided help (so that people who have never helped a researcher could take the survey and be compared to supporters). Still, as with most surveys, it is possible that respondents were more likely than the rest of their community to be aware of mental health problems, and to have provided help themselves. This bias could limit the conclusions drawn from the survey.

The survey did not require identifiable information from the respondents, such as email addresses or institutions. However, it comprised a number of open questions where respondents could share their thoughts and experiences. As the data are to be released openly, individuals who did not consent to their open answers being made public have had their open answers removed from the public dataset. In addition, answers to open questions that could have allowed individuals or institutions to be identified were redacted.

2. Methods for processing the data:

Data were extracted from the SurveyMonkey software, and then cleaned manually according to the following instructions:

No upper or lower limits were placed on the number of questions answered or the time taken to complete the survey for data to be included. No outliers were identified in terms of the number of questions answered or of finishing the survey too fast.
Overall, 94 individuals completed the survey unusually slowly (taking over an hour compared with the group mean of 31 minutes); these individuals were retained, as this was not deemed to indicate poor data quality.

We did, however, exclude a small number of individuals (N = 56) who indicated that they did not understand the instructions properly. For example, the survey aimed to collect information on the supporter’s most recent experience helping someone else. Answers that indicated the respondents were relying on several experiences (as mentioned in open answers) were removed so it would be possible to draw conclusions based on the career stage of the supporters at the time of support.

Certain multiple-choice questions allowed respondents to contradict themselves, for example by checking “prefer not to answer” or “I am not sure” on questions they had actually answered. These answers were kept, as we could not be sure that these options were not checked by accident, or because respondents wanted to add that, in addition to their answers, there was something they were unsure about or preferred not to share.

Certain forced-choice or multiple-choice questions allowed respondents to add an open answer. In those cases, some respondents gave responses that matched existing options they could have checked. Such answers were recoded into existing categories within the demographic questions (e.g. ‘Research assistant’ was recoded into the following category: “A technician, research assistant, lab manager, field coordinator or similar lab staff”). In total, there were 21 instances of recoding across questions 4 (career stage during supporting relationship), 9 (career stage of the person supported), 28 (gender) and 30 (current career stage of the respondent).
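The recoding of open answers into existing options can be pictured with a small lookup table. The sketch below is illustrative only: the cleaning described above was done manually, the released analysis code is written in R rather than Python, and every name in it (LAB_STAFF, RECODE_MAP, recode_open_answer, recode_responses) is hypothetical. Only the ‘Research assistant’ example and its target category come from the text above.

```python
# Illustrative sketch only: the authors cleaned the data manually, and the
# released analysis code is in R. All names here are hypothetical.

# Existing survey option that 'Research assistant' was recoded into.
LAB_STAFF = ("A technician, research assistant, lab manager, "
             "field coordinator or similar lab staff")

# Hypothetical lookup from normalised free-text answers to existing options.
RECODE_MAP = {
    "research assistant": LAB_STAFF,
}


def recode_open_answer(open_answer):
    """Return the matching existing category, or None if nothing matches."""
    # Lower-case and collapse whitespace so trivial variants still match.
    key = " ".join(open_answer.lower().split())
    return RECODE_MAP.get(key)


def recode_responses(open_answers):
    """Recode a list of open answers and count how many were recoded."""
    recoded = [recode_open_answer(answer) for answer in open_answers]
    n_recoded = sum(category is not None for category in recoded)
    return recoded, n_recoded
```

Answers with no match come back as None, so they can be reviewed by hand instead of being forced into a category.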
Where open responses could not be recoded into existing categories, new categories were created for demographic-related questions (Q4, Q9 and Q30): “Other academic staff (research & teaching)” for lecturers or assistant professors who do not oversee a research group, and “Academic admin staff” for deans, directors of graduate studies etc. Only a very limited number of respondents belong to these new categories.

Finally, a new variable was added to capture the proportion of respondents who identify as a minority for at least one reason besides gender (that is, ethnicity, sexual orientation, socio-economic status, disability status, or a reason specified by the individual).

Answers outside of demographics, however, were not recoded, as this would risk wrongly interpreting a personal experience.

3. Instrument- or software-specific information needed to interpret the data: Analyses were conducted using R. See the “README.md” file for precise instructions on how to run the code.

4. Standards and calibration information, if appropriate: See the “README.md” file for precise instructions on how to run the code.

5. Environmental/experimental conditions: N/A

6. Describe any quality-assurance procedures performed on the data: See “Methods for processing the data”

7. People involved with sample collection, processing, analysis and/or submission: Loissel Elsa, Tsang Emmy, Müller Sandrine, Deathridge Julia, Pérez Valle Helena, Yehudi Yo