A talk from the OSS & Tooling 1 session at RSECon23 at Swansea University, on 2023-09-07.
Programme link: https://virtual.oxfordabstracts.com/#/event/4430/submission/51
Slides: https://docs.google.com/presentation/d/1EG-rzLm6_Y5A-xs3mkvkJvjWuOuls9-m/edit?usp=drive_link&ouid=114557667953982071875&rtpof=true&sd=true
Abstract:
As RSEs, we aim to develop research software in a way that makes it reusable by others. Sometimes, the community of users and contributors around our reusable software grows quickly, while the number of people maintaining the software stays the same, and is often very small to begin with. In this situation, our decisions about the software and the project may affect many more people than anticipated. This calls for a formalization and communication of how we make and enforce decisions within the project, i.e., governance. When we devise governance, there may be many open questions, some of them rather philosophical: What is the actual mission of the project, what are its aims and scope? What are the foundational principles of our project, especially for interacting with different roles in our community? What requirements do we need our model of governance to support? Practical questions follow: What organizational structures do we need? How should these structures interact with each other? What are the processes we need, and how do we implement them? Who gets to decide what? What are the risks, and what do we do if something goes wrong? How can we ensure that governance does not add too much to the already heavy workloads of the people that are involved? And finally: How do we document and enforce all of this? In this talk, we share experiences from recent work to design and implement governance for the Citation File Format project, and some challenges we faced.
I'd like to start off by thanking the two reviewers for this talk, who I think should actually be co-authors on the slides, because I think their comments have made this talk much better, and hopefully more interesting and applicable to you. And I'd like to thank Sarah,
because I could just stand here and say: do what she said, but also write down the way you make decisions in your project. I'm not going to do that, though. I'd like to take you on a wee journey that I've been on, and that we've been on, recently with a
little project of mine. I'll try to address questions such as: What is governance in the first place? When and why may RSEs need governance? I'll try to come up with a wee list of ingredients if you want to set up governance for your own project, give
you an example as well, and share some lessons learned at the end. These were all questions that we were asking ourselves over the last year and a half or so. The example project I'll mainly be talking about is not really a research software project. It's a thing called the
Citation File Format, which is basically a file format for software citation metadata. It's implemented in YAML, and if you want to make your research software citable, you can create a file called CITATION.cff in that format, put in the correct metadata, and leave it in
your repository, so other people know how to cite your software and give you credit. I'm not going to talk about the Citation File Format and the tooling it has to offer, but two more details are, I think, important in the context of this talk, to give you some
background info. The Citation File Format project has so far been a loose collection of sub-projects in a GitHub organization, full stop. Also, the project originally started out with exactly one hobby developer, myself, until I met this beautiful individual depicted here, Jurriaan Spaaks from the
Netherlands eScience Center, who for some reason was very interested in the project and agreed to join me, well, was drafted in to join me, in leading the project. He immediately set out to make the format and the whole project much better. And then what happened to
the project is that GitHub announced in 2021, based on work done in the FORCE11 Software Citation Implementation Working Group, that they would use the format to provide their repositories with citation information. Then Zenodo followed by saying that they would take that information from GitHub and
use it in their records, and some reference managers followed suit and said they would reuse that information, so that you can point and click at a specific publication and import the metadata into your reference manager. So that was great. Everything exploded, though, because suddenly we had users.
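To make the kind of file these services read more concrete, here is a minimal sketch in Python, assuming version 1.2.0 of the format, which requires the top-level keys `cff-version`, `message`, `title` and `authors`. The title, author name, version and release date are hypothetical placeholders, and the helper functions are illustrative names, not part of any Citation File Format tooling. The line scan is a deliberately naive stand-in for a real YAML parser and schema validation.

```python
# A minimal CITATION.cff, embedded as a string.
# All values below are hypothetical placeholders, not real project metadata.
MINIMAL_CFF = """\
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "My Research Software"
authors:
  - family-names: "Doe"
    given-names: "Jane"
version: "1.0.0"
date-released: "2023-09-07"
"""

# Top-level keys required by version 1.2.0 of the format.
REQUIRED_KEYS = {"cff-version", "message", "title", "authors"}


def top_level_keys(cff_text: str) -> set[str]:
    """Collect top-level keys with a naive line scan (not a real YAML parser)."""
    keys = set()
    for line in cff_text.splitlines():
        # Top-level keys start in column 0; skip blanks, indented lines,
        # comments and list items.
        if not line or line[0] in " #-":
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.add(key.strip())
    return keys


def missing_required_keys(cff_text: str) -> set[str]:
    """Return the required keys that do not appear at the top level."""
    return REQUIRED_KEYS - top_level_keys(cff_text)


if __name__ == "__main__":
    missing = missing_required_keys(MINIMAL_CFF)
    print(sorted(missing) if missing else "all required keys present")
```

A check like this only shows the shape of the file. Actual validation would use a YAML parser together with the schema published by the project.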
This is why this talk is called "And then there were users". Of course, in theory the project was originally built to have as many users as possible, because we thought it would be a great idea if people provided the correct and complete citation information for their software. But
honestly, we weren't really prepared for such relatively spontaneous growth, and because of how it happened, we were also taken by surprise by it. Having users is really fantastic, but I find it is also really scary, especially if you lack governance. And that's what I was trying to do:
set out to find out what governance actually is. So what is governance? I didn't know. It's an English word, and English is not my first language, so I had to go on Wikipedia and find out. And Wikipedia says something very interesting and very
helpful. What I made of this is that governance is a decision process. For the purpose of this talk, I take it as the process of making and enforcing decisions within an open research software project, and more precisely as the process of interactions
through a set of rules, social norms, visions, principles, social and political power, and communication within that system, so within the project, and over the system. Now, words like rules and norms and power and communication may not really make your heart sing.
They sure didn't make my heart sing, but I think we can all agree that these are things that are real, and they're essential in our work, especially if we lead a project or have a project. One of the main questions that came out of the review was: when
and why do RSEs actually need governance for research software projects? After all, it may just be a very small project for your own purposes, a data analysis script, et cetera. In my opinion, governance becomes important as soon as more than one individual interacts within the project, and the
reason for this is that in every software project we have to make decisions. For example, starting off: what technology do we use, how do we implement something, what changes will we merge into our project and what changes will we leave alone, how and what
to test, et cetera. In solo projects we may be able to make these decisions just by ourselves and for ourselves, because there's nobody else involved. But as soon as we interact with even one other person within the project, there's the question of who has the power to make
the decision, what the rules and norms are on which a decision is based, and who has the power to make and enforce these rules. These are really hard questions to answer. The good thing is that governance can answer these questions, and there is much to
gain by making this kind of governance explicit. What I was hoping governance could do for us was basically to formalize, and make explicit and transparent, if, how and when the project can be interacted with in the first place. It sets expectations, so people will know from the get-go
whether it will be worth their time to invest in the project. And if this is something that you want for your project, having such explicit governance documented somewhere may also open up the project to a larger community, which may have an impact on
the sustainability of the project, and it will help you safeguard the mission, reach the goals of the project, et cetera. Finally, it defines responsibilities, and that was a big part of the work we put into setting up governance, because it may lift burdens from many a
shoulder. Some words about how to get there. The big question for me was: how do I actually get from having nothing but a very fuzzy idea of what we want to do to having a formalized, implemented,
well-designed governance for a specific project? I think this depends on basically four things, the four things that are listed here, and these are really the things that Sarah talked about: the project aims, what you want the project to do; the project
scope, what you want the project to do or implement, and who you want to do it with; a vision or a mission, if you have such a thing, which is closely related to the aims, but which also includes
the community you are expecting to work with, hope to work with, or are working with; and your own community as well. This is all pretty abstract. I will give you some pointers and examples in just a second, but just to mention that this is not all my own
thinking: I've been lucky enough to be part of the Code for Science & Society Digital Infrastructure Incubator cohort. This is a project that was started in 2021, an initiative to build sustainable and equitable practice in digital infrastructure in the commons.
It really gave me the opportunity to take some time out from my day job, think about these topics, dive into reading and learning about governance, and, most importantly, exchange with other people in the cohort who had similar problems trying to set up governance for their projects.
That was really good. If you want to know more about what the incubator has to offer, there's a great page with resources, and these are basically the resources we used to learn what to do. So follow the link here.
These are DOIs, so you can get to the links at the end. Examples: the aims and scope may be relatively simple to express, but they are a really important framework that provides the baseline thinking about your project. Just writing down these
two questions and sitting down for a couple of minutes, thinking about what the actual aims of the project are, what you want to achieve, and what's not in the scope of the project, what you do not want to do: that was really helpful for us to
establish that baseline. Taking the Citation File Format as an example, the aim can be described as written down here: the Citation File Format makes it easier for research software engineers and researchers to make their software citable, and for their users to cite it. The scope follows along by saying that the
Citation File Format specifically focuses on citation metadata for software. I've put in this little sentence here as well to say that we're aware that defining what we actually need in order to cite software is still an ongoing process, so it is within the scope of the project to adapt
the project to the needs and to the knowledge that is still under development, but we will stick to the original scope as well. So that's basically the baseline. Governance really comes in because it's important to
organize collaboration and community, to build community, and to be inclusive but exclusive at the same time if you have things that are definitely out of scope for your project. This is where it really helps to think about the mission of the project,
or maybe a vision that you have for the project. It doesn't have to be something very esoteric or very grand; it can just be one sentence. During the incubator we were given an exercise, which actually comes from the Mozilla Open Leadership series, that
really helped me pin down what the mission and the vision of the Citation File Format project were. It's a very simple exercise that I encourage you to use if you want to do the same thing, if you want to set up governance for your project.
Basically, what you do is brainstorm answers to these four questions: Who are you working with, or hoping to work with? What are you doing? Who are you doing it for? And why are you doing it, what's the impact or change that you hope to make? Then you take the
answers you come up with and put them in the blanks, really: "I'm working with [community or contributors] to [build something] so that [my users, my community members, the audience] can [do something different and achieve their own goal]." For the Citation File Format it looks like this: "We're working
with software citation enthusiasts, research software engineers and researchers to create a software citation metadata format and respective tooling so that researchers and research software engineers can make software citable and cited more easily." The very useful thing I found about this relatively simple exercise is that it becomes clear
what the roles are that you have to consider for your governance and for your project. In the case of CFF, these would be software citation enthusiasts, research software engineers and researchers, and then, as the audience, researchers and research software engineers: a large overlap. All of these people
can contribute to the Citation File Format sub-projects, which would be the schema, the format, the documentation and the tooling. Some of them may be, or may become, maintainers of one or more of the sub-projects, and all of them can be users of the project.
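The fill-in-the-blanks step of the exercise can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`collaborators`, `activity`, `audience`, `outcome`) and the function name are hypothetical labels chosen here, while the template wording and the Citation File Format answers are the ones quoted in the talk.

```python
# Sentence template from the exercise; the field names are hypothetical labels.
MISSION_TEMPLATE = (
    "We're working with {collaborators} to {activity} "
    "so that {audience} can {outcome}."
)

# The four brainstormed answers, as given for the Citation File Format.
CFF_ANSWERS = {
    "collaborators": (
        "software citation enthusiasts, research software engineers "
        "and researchers"
    ),
    "activity": "create a software citation metadata format and respective tooling",
    "audience": "researchers and research software engineers",
    "outcome": "make software citable and cited more easily",
}


def mission_statement(answers: dict[str, str]) -> str:
    """Fill the template; raises KeyError if one of the four answers is missing."""
    return MISSION_TEMPLATE.format(**answers)


if __name__ == "__main__":
    print(mission_statement(CFF_ANSWERS))
```

Reading the filled-in sentence back, the groups named under `collaborators` and `audience` are the candidates for roles that a governance document has to consider.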
This gives us at least three roles that we thought our governance document, or governance design, must consider: contributors, maintainers and users. So now, having the baseline aims and scope and having the mission gives us a really good idea of what our project is all about in the first
place, and what the roles are that make up that social system, from the original definition, that we are creating governance for. But there are other important aspects to consider when you set up governance, or there were for us at least. In our work as RSEs we may be subject
to a number of legal requirements, for example, and that may depend on the institution you work in. There may be export regulations you have to adhere to, there may be specific sign-off steps you have to go through, et cetera. So we may not be as free as we
want to be in setting up the governance and the processes for our project. There may be cases where your line manager will have to sign off on any changes or any publications you make, if you want to make your software public, for example. All these things
influence the governance model that we want to choose. But there are, I think, probably more important principles as well, and these may be the values your project has, or the values you have personally. They could be social, could be ethical, could be political, and, you
know, this also has a great impact on the kind of governance model you choose in the end. It could be that you're fundamentally opposed to having a dictatorship, even a benevolent dictatorship. It could be that you think you have to exert some sort of control and you want to have a
hierarchy of delegation. These are really things that you need to think about. In the case of the Citation File Format, there weren't just principles that we had in place; we also had very concrete issues that we wanted to
solve. Governance is a tool that may help you solve these issues for your own project. In the case of CFF, this was the very fuzzy and complicated structure of what the project actually was: a project of sub-projects with different maintainers that
may or may not be part of the core team of CFF, so to speak. We wanted to balance control over the original vision and mission of the project (thank you) with maximum autonomy and inclusivity in terms of the sub-projects. We wanted to give maintainers who
came to the project, set up tooling and contributed their project to the organization, so to speak, freedom in their choices. But we also found that responsibility was becoming a major issue, because there were really just two people, and
there was the number of users, which is growing: there are maybe about 15,000 files on GitHub, as far as we can tell, as far as the GitHub API tells us. So this really was about solving concrete issues for the project. Just to take a step back: at this point we
have established that we have the baseline for what we want our governance to do, we have defined the roles that we want to embed in our governance, and we have made clear some of the requirements that were important to us. So we had all the ingredients; now what we wanted was a
recipe or template to get started. This was actually an outcome of the incubator as well: it's easier to start with something that's already there, take a template, adapt it, et cetera. There are very many governance models, and it can be overwhelming if you just see a list
of them. There's a short list on the right-hand side, and that's from a project called CommunityRule, who actually provide a toolkit to help you set up governance. But still, just going through this list and thinking of all the implications that all these
models have can be difficult. So really, there are three things I would suggest you can do to help you set this up. Make use of the resources: have a look at the literature that's out there, for example from the open source community,
about governance and governance models. Use websites such as CommunityRule, which is a kind of point-and-click thing. And you can always go out there and talk to people in your domain, in adjacent projects, projects you know of, projects
you look up to, and look at what they've done for their governance. In our case we talked to an open project called openCARP, which is a German research software project, and it was really helpful that they took some time out to explain what they've done, why
they've done it, and the principles behind their governance. The most important thing with this is that you remain free in what you do and keep the vision of what you want to do in the back of your head. So take things from
different places and adapt them to your situation. So how does this work for the Citation File Format? We've come up with roles: we will set up a steering committee and we'll have sub-project maintainers. We have a very mixed model of things, and that's just because of the project structure.
So we will have this overarching steering committee that looks at the project as a whole, but we'll make sure that the sub-projects keep their maximum autonomy. The steering committee will be made up of software citation experts; one of them is sitting
in the back (thank you, D, for joining us in this effort). We'll have an elected community representative, or more than one; it's not quite clear yet, because we want to bring the steering committee together first to discuss things such as the actual decision-making process. This is a
complicated thing in our case, because we're not one single project that can talk about lazy consensus and the number of hours to wait until you click the merge button. It's a bit more complicated than that, so we have to put some thought into this point to make
sure we have a good decision-making process. So what we've done is we've gone back and taken the template created about 10 years back by an attendee of the conference, Gabriel Hanganu, who put up the meritocratic governance model. But our project isn't really a meritocracy, so we
painstakingly went through the whole document and adapted it, came up with the preliminary adoption of the existing code of conduct that we had, which is going to apply to the whole project, and invited candidates for the steering committee, who thankfully agreed. And then you have to document all that kind of stuff
somewhere. So we ended up with a governance model that we think will hopefully work for the Citation File Format project. There isn't one single place where you can document your governance: it could be in a CONTRIBUTING.md, it could be just in
your head, and you pass it on in issue comments, for example, for smaller projects. But what we're going to do is try to really set it in stone, although we think of it as a living document, and put it on the
website, to make sure that everyone can read through it, ask questions about it, and also give input on how to change it. Because we think of it as a living document, it'll go through iterations, and it'll have some impact on the project infrastructure itself, so we
will have to think about how to communicate governance within the structure, how to communicate with the sub-projects, et cetera. So, as some sort of conclusion and lessons learned: I think that formalizing governance is useful for projects of different sizes. It depends on your project what
form this will take. It'll definitely help to increase the transparency and openness of your project. You can manage expectations for you, your users and your team members, and exert control if you feel you need to. But the lesson to take away for me was that there's no one
size fits all: you can't just take a template, adopt it, and that's it. There's also a danger of developing against the worst-case scenario when going through this. And, as usual, ask a friend: there are so many people and so much experience out there.
And just ask them; they'll help you, hopefully. Thank you.
So I'll get started with the first question, which is: lack of respect and recognition for research software, for example in the REF or in job applications, is an oft-discussed issue. Do you think this will help change that? Assuming
that "this" is the Citation File Format proper.
I'm going to be quick about answering this: I hope so. I mean, this is why we did this, right? So that people can provide the correct and complete citation metadata, and users of the software will use that metadata to cite
the software, including the version, that is, a specific version or an identifier. But I think CFF is really just a small cog in a very large machine, and of course it's helping the process to adapt to an existing system
which may not be the best system for giving credit. It seems to work out fairly okay, I think, for understanding the parts of research that have been used, but I'm sure that there must be better systems out there as well. So I think, yeah, it's a small
step on the long road to making this situation better in terms of credit, but also in terms of reproducibility and understanding what parts play a role in research.
Thank you. And do you have any advice about how governance can be designed to
facilitate the transition from a user to a contributor?
I think in the specific case of the Citation File Format we'll just have to see; we're still in the process of setting up governance. But what we've made sure of is that the first role we describe is that of the user, with a clear
pathway to becoming a contributor, with a very low threshold, because users are what this project is all about, and they know what they want and what they need. So making a tiny contribution, something like just telling us about a problem they've seen or
asking a question, is something that will definitely help on that pathway. In terms of how to implement this governance better, we hope to set up some sort of communication channel that's
better than GitHub issues, where people can go and just ask a question, become part of the community and help us deliver, in a way. So making clear what we expect from the different roles in the governance model may help in that,
but trying to assess what people think the project will be all about is also really important. We've had many, many comments that are going to deeply inform the next version of the Citation File Format, specifically in terms of contribution
roles, which is a big thing that's unsolved in the format and still in discussion in the greater community. So yes, I think that governance should hopefully help in that way.
And so what are the trade-offs between formal governance and informal, fluid organization? How do you balance them
for small and rapidly developing projects?
So the Citation File Format definitely is not a rapidly developing agile project. We're pretty much slow-mo, and that's just because we've been two people in the past and we have day jobs. But I think there are some trade-offs. Well, in my case I wasn't
really happy with having this kind of internal, implicit, fluid governance. We knew about our principles and we knew what we thought would work well for the project, but it was a real pain point for me, so I wanted to sit down and write this out so that other people can have their say as
well, and I don't have to be the person, together with Jurriaan, who decides things just on the go. There is overhead in formalizing governance, but again, that depends on the way you go. I think there'll be quite some overhead in establishing the governance for the Citation File
Format project, because we'll have at least a steering committee, with probably a meeting a year, we'll have votes, and we'll hopefully have more communication with the community. So that's a trade-off that you have to be prepared for. But I think it's
definitely worth it in terms of our project. But again, that's something that your project needs to decide for itself. We've done fairly well, I think, for the first couple of years by just winging it, really, and trying to trust our gut feelings and what we learned about
the community and the needs of the community along the way.
Thanks. The next question is kind of similar: do you have a sense of how much governance is needed for different project sizes?
My sense, and it really is a gut feeling, is that you can start
very small. Start by writing down the aims and scope, or the mission, somewhere in your notepad, and you can already act on them then. Especially if you have a team, talk to your team about this, have a discussion, and try to come to
just a paragraph of text that you keep stuck to your screen or somewhere. You can take it in steps, I think. There is a good practice of establishing a CONTRIBUTING.md in your repository, for example. This is, I
think, a very good place to start with governance and manage expectations: you tell people when you will and when you will not, for example, consider merging their changes. And you can grow it from there. We just had to go from zero to this now, because it was just such
a surprising amount of growth in usage.
I see the next one is more of a comment.
But yes, I talked to him at the bar last night and thanked him for his work on OSS Watch and the governance model.
Yeah, so as a last
question: was there funding associated with your project?
There is funding associated with the project, and that's me being a freelancer for the Netherlands eScience Center, working on the CFF, and Jurriaan being employed by the eScience Center. In the past we've had different smaller
grants, from Code for Science & Society and from the SSI. My employer, the German Aerospace Center, paid me, or agreed to look away, when I was working on the Citation File Format project in my work time. So maybe I'll try to come at
this from a different angle: there are a number of stakeholders in the project. GitHub is a big one; Zenodo, Zotero and many different infrastructure providers are others. We're also thinking about setting up another body in the governance, as an
advisory board, because we don't think that GitHub should be the ones to decide what will happen to the Citation File Format, but we also cherish especially Arfon Smith's knowledge and experience in that respect, and he's helped us greatly in the past. So it'd be great
to bring in people that are interested and are potentially also in a position to grant us funding. But then we would have to take the next step and think about how to funnel this funding into the project and what to do with it. So, yeah.