Collaboration - digital infrastructure

We are one month and three meetings into the project.

Having had several discussions at SDAM on the subject of digital infrastructure (understood as research data storage, a system for communicating across different Google and Outlook identities, and the need to keep each other efficiently in the loop while also keeping our workplace informed), our team often struggles with differing opinions and priorities when it comes to deciding on it. Decision making is difficult because we are facing multiple unknowns at once. We do not know local (Danish) resources well, we don’t know each other well, and we are not certain of each other’s expertise in this murky digital realm. We are basically still stocktaking in many aspects of our eresearch. My push towards decision making is perhaps counterproductive as we are still in the information gathering and trialling stage of the innovation adoption process. It may be too early to commit as we don’t understand all the costs and benefits of the different available infrastructures. We are circling around the basic principles (be exemplary, promote open source, use collaboration-friendly online data storage systems that are modular and let us connect data and analytical scripts) but have not committed fully to the principles of openness (i.e. GitHub seems too open for the scripts/datasets), nor have we found cloud data storage in DK that would tick all the boxes of functionality (plugin tools), responsiveness (answering email queries) and sufficient data storage shareable with international collaborators. Perhaps data storage is tricky and needs a bit more reflection and testing. Other matters, such as team management and coordination, are split across several different tools, such as Trello and Slack. While I can live with Slack, I am less enthused about Trello, mostly because it is yet another system to keep an eye on.
As project director, I am happy to go along with any new app, provided that someone installs it on my computer, sets up the notifications and provides instructions and support when asked. As project director, I am most anxious about getting a group email address set up so that I type one address and everyone gets the memo. The same applies to calendars: I would like to know who is available when, and to make my calendar visible so that team members can see where I am and what I am doing without me constantly sending lots and lots of invites. This makes sense as at the moment I sit on a lot of logistical and event information while feeling time-strapped. Basic project management and day-to-day operation sit on Google at the moment. While I had suggested we purchase membership for business apps and have our own institutional space, this suggestion was voted down at the initial meeting in favor of a better Danish cloud platform (e.g. data DeiC or sciencedata.dk). I suspect that if I had purchased the business apps before asking everyone, we would be using that space now instead of private personal space. I see no problem with using business Google apps as a temporary placeholder before we move to something else, as we are currently using all kinds of different storage (my fedarch Gdrive, MQ Gdrive, Cloudstor, Vojtech’s Gdrive, GitHub, et al.) and it’s going to be hell to streamline this down the line. I suppose the source of tension for me here is that we as a project should not be using personal digital resources but institutional ones (to be exemplary). Another source of tension is my preference to commit to an imperfect tool with the understanding that we switch as soon as we identify a better tool for the job. I think it poor practice to resort to a non-committal limbo stage where we sort of play with datasets from different parts without having, or at least trying to build, any containing structures.
Maybe I exaggerate, but it sure feels more straightforward to manage provenance and move resources from one space to another when they are organized to begin with. All in all, I perceive a distinct lack of engagement with the digital infrastructure part of our work compared to research proper. If I had put my foot down regarding GDrive three weeks ago, we’d probably be using it now despite concerns over new unnecessary emails, etc. Despite reservations, the team mostly happily delegates decision making and implementation to whoever volunteers for the task. It’s low-hanging fruit for whoever is just a bit more comfortable with the tools, or more knowledgeable, or feels more strongly about the subject, or simply has a more definite idea of the desired outcome. I wonder whether this is because of the differences in technical expertise or because for researchers digital infrastructure is, after all, secondary. There is a reason for this situation. Most postdocs today have been trained as individual researchers who by dissertation stage have developed effective survival working habits and strategies and are justified in believing that the infrastructure they have operated within is sufficient. For individual tasks, it most probably is. For collaborative group work, however, it most probably isn’t.

Collaborative projects are more complex than individual projects as the needs and ideas of the different participants are more differentiated. Successful operation of a collaborative project requires frequent expression and juxtaposition of everyone’s needs, constant testing of these needs (to combat ‘language games’) and their prioritisation. The decision-making process is thus not a linear, informed judgement that follows directly from the information gathering and trialling stage, but a trade-off resulting from the juxtaposition of individual opinions and the balancing of the project’s governing principles. Following the motto of The Cathedral and the Bazaar, this more complex process has the advantage of producing higher quality, better informed and less buggy solutions. The disadvantage is that it is slowed down by the complexity of decision making amidst language games. For me, the complexity of communicating and operating in a collaborative project is precisely the reason why we need to emphasise and prioritise the development of proper digital infrastructure. In contrast to individual projects, the potential for technical debt increases dramatically in collaborations. Technical debt, as defined by Wikipedia, is the “implied cost of additional rework caused by choosing an easy (limited) solution now instead of using a better approach that would take longer”. This concept, coined by Ward Cunningham, applies both to software development and collaboration management (e.g. Stewart-Lowndes). Tolerating poor data storage now commits the team to time-consuming data transfer and documentation later. Poor or weak team coordination may cause a bit of frustration here and there today, but leads to missed opportunities in a week or a month, exacerbated by new pressures or additional stress. Tackling technical debt requires refactoring systems for more clarity and streamlined performance as these systems (be they programs or infrastructures) grow more complex.
Inevitably, we are human and therefore imperfect, and so are our decisions. This is doubly true of digital infrastructure, which is a constantly moving landscape or, better, the constantly changing tip of an iceberg. Mindful of the imperfection, we can nonetheless mitigate it and ward off technical debt whenever possible. One such strategy is to invest considerable effort into scoping, selecting and implementing infrastructure solutions that will make the future work of the entire team easier. Settling for a limited temporary solution or regressing into individual solutions is tantamount to shooting ourselves in the foot.

Decision making - Am I turning into a Dane?

I suspect my team think we discuss digital infrastructure too much. Wouldn’t it be better to play with data first? It might be more fun. It would certainly relieve some of the stress from deliverables. Yet we need to build the project slowly. Perhaps I should have instituted the infrastructure by fiat and started playing. However, I wish the decisions to be the consensual product of joint deliberations over individual and project needs and use cases. Likewise, I hope the implementation will consist of divisible parts and present a shareable load for us all.

“I wish to play with the data first”

The wish to jump into research is justified by the need to know what it is we need stored and contained. It is perhaps also good to relieve some of the pressure and novelty of the project by retreating into individual comfort zones, with familiar data analysis, visualisation or similar. It lets each of us bring something to the table, establish value and carry on developing as a project and a team. Imagining what we as a project need now and might need in the future is a trying exercise, and it is good to have some safety valves and comfort-generating activities.

Digital research and the solitary historian

This is what I actually sat down to write about today, inspired by a discussion with a colleague. The topic needed such a long introduction about the pains of collaboration that I may need to finish it in another reflection. Given the language games and complex workings of team needs and priorities, it seems that digital projects should be much more straightforward for an individual researcher. In a team we struggle with how to work together, how to share authority and how to decide which technical solution to adopt. In a team, however, we have four pairs of eyes for gathering information and testing the available technical solutions. This is a tremendous advantage. Unless the single researcher has gotten a degree in history after a long career in IT, the time saved by not figuring out a team is lost in searching for the right digital tool. If we as a team of intelligent, IT-savvy people struggle to decide on shared digital infrastructure, how is a single researcher supposed to navigate this terrain?