Participant Perspectives | Kyle Demes from OurResearch
Kyle Demes, OurResearch
Please explain a little about your background and why you’re interested in persistent identifier (PID) metadata and its enrichment.
My academic background is in marine ecology, but a decade ago I transitioned to a career in research management. I brought the same robust analytical approach to my professional practice that I used in my academic practice but was shocked to learn that Universities can't even get a simple list of THEIR research outputs. PIDs have promised to provide researchers, governments, funders, and research organizations with the information they need to manage their research, but the previous approach of each PID working in isolation has kept that promise from materializing (for instance, when an ORCID isn't verified when it is attached to a publication, the community loses the ability to accurately match research outputs to an author). Through open collaborations of PID organizations, we can work to collectively connect, clean, and enrich PID records to finally have reliable open information about the world's research ecosystem.
What excites you most about the potential for collaborative enrichment of PID metadata (e.g. to improve research discoverability, impact tracking, better reflect the global nature of scholarly communications)? What do you think will be the most challenging aspect to address?
My career is in this field so lots of things excite me about enriched PID metadata! As a research manager though, I'm particularly excited by how much of the time researchers and research organizations currently spend on administrative tasks we can save through an open scientific knowledge graph (finding the right grant opportunities for the right researcher at the right time; finding collaborators who have never met but who together are positioned to solve grand challenges; outputs reporting and tracking; etc.). Multiplied across the world, I see massive potential to catalyze more and better research just by reducing administrative burden. I think the most challenging aspect here is the same as with all initiatives in the research space: change management within the research community. Academia as an institution predates many current government structures, and new faculty learn practices from their supervisors, so the pace of change is glacial while the field of research metadata advances every day. Solutions already exist for most problems Universities face in the research metadata space, but they struggle to implement them because of a lack of awareness, capacity, and internal political will to change. Our approach is to keep developing and refining solutions through active engagement on explicit pain points to make them as simple as possible to implement.
What successful examples of community collaboration in scholarly infrastructure have you witnessed that could inform the proposed COMET model’s development?
OpenAlex exposes a lot of open research metadata, including errors that we introduce as well as upstream errors from other open infrastructures and PID agencies. One of my favourite examples is a user letting us know that thousands of works had the same, wrong publication date. All of the publications had Crossref DOIs so we wrote to Crossref. They noticed that all of the records came from the same instance of OJS, so we reached out to OJS as well. Everyone was committed to fixing the issues, so OJS made sure the encoding bug was fixed, Crossref fixed the date issues, and the records got updated in OpenAlex automatically. It was brilliant. I also see Universities around the world invest significant resources (in terms of cash and in-kind person hours) in curating their research metadata in proprietary databases like Scopus, Web of Science, Dimensions, etc. So there is already expertise at Universities in curating metadata, and they currently have to replicate their work for each closed data source. A model where universities divert the resources they spend on curating proprietary databases to an open infrastructure that curates the metadata openly for all downstream providers to use (including the proprietary systems!) is particularly exciting. Another example that I think has been very successful is Thoth. They provide DOI minting services to ensure that books have accurate and high-quality metadata to make them easier to find. They're pros at making sure works are discoverable and the metadata is reusable and high quality. Services like this could be scaled for institutions and journals to ensure they get high-quality metadata when they lack the internal capacity to ensure it themselves.
How could better and more complete PID metadata, derived from the proposed COMET model, help to advance your goals, those of your organization, or your communities?
There is definitely fatigue within Universities when it comes to research metadata. Universities invest time and resources in PID adoption through membership models, consortia, committees, working groups, dedicated staff, and distributed administration pushed down to Faculty. They expect that those investments will result in high-quality research metadata throughout the connected open infrastructures, but historically that hasn't always been the case. By working together to collectively enrich research metadata within and across PID agencies, we can rebuild that trust with the research community and incentivize even broader adoption of PIDs.
What benefits do you envision enriched PID metadata, such as that being aimed for through COMET, will have on the broader research ecosystem?
Without doubt, it will increase the use of open research information throughout the global research ecosystem, because metadata quality is currently one of the major impediments to organizations using open research information. But we will also see broader adoption of PIDs as various stakeholders see their value multiplied by the increased use of open research metadata.
Why do you think organizations interested in PID metadata enrichment should consider contributing resources to fund the first phase of development for the proposed COMET model?
Sustainability should be a core consideration in all conversations around open infrastructure, including the COMET initiative. Each stakeholder will derive value through the enrichment of open research information and, for the sake of sustainability, conversations with each should focus on that value. Referencing the examples I already mentioned: organizations and publishers are willing to pay Thoth to ensure their PID metadata is accurate and enriched, and institutions are willing to pay proprietary databases for custom curation. A clear value case is already established in these situations, so expanding on those models initially makes the most sense to me. If adoption of a PID service is held back by certain metadata standards not being met, there is a clear value to those PID agencies in enriching their records. It doesn't take much creativity to imagine other examples. Let's elucidate what the value of enriched metadata is with each stakeholder and they'll be excited to contribute to get that value.