Commit 2440a289 authored by Jonathan Minz

upload DMP

parent 4dbf663d

Aims and Objectives:

The key aims and objectives of the Research Data Management (RDM) team,
within the broader context of the LAFI project, are as follows:

- prepare harmonized, standardized, and easily accessible datasets to
facilitate effective scientific collaboration within LAFI and beyond.

- publish these datasets in accordance with the FAIR principles
(Findable, Accessible, Interoperable, and Reusable), thereby supporting
Earth System Science research, education, and evidence-based
environmental policy-making.

- document RDM workflows and research software, archive datasets, and
develop tutorials---establishing benchmarks for FAIR data practices
within the wider Earth System Science (ESS) community.

Tasks:

The aims and objectives of the LAFI RDM team can be organized into the
following high-level focus areas:

- Data harmonization and standardization

- FAIR data storage, archiving, and publication

- Development of a web-based user interface for the LAFO server

- Documentation

![](media/image1.png){width="6.6929757217847765in"
height="3.230184820647419in"}

These categories are broad focus areas that must be addressed to achieve
the intended data management aims. Each
category is further divided into specific tasks designed to ensure that
all data generated within the LAFI project aligns with the FAIR
principles
([go-fair.org/fair-principles](https://www.go-fair.org/fair-principles/)).

TIMELINES, MILESTONES & DELIVERABLES:

The Gantt chart above outlines the timeline for RDM tasks over the
2025--2027 period, which corresponds to the remaining duration of the
LAFI project. It is anticipated that a proposal for a potential
extension of LAFI will need to be finalized by the end of Q2 2027. This
represents a key deadline for RDM-related activities (Milestone 5 --
M5). Other significant milestones include the proposed Fall Schools in
2025 (M1) and 2026 (M3), the planned commencement of converting LAFI
data from all contributing groups into obs4MIPs-compliant netCDF files
(M2), and the launch of the FLAIR project (M4), with LAFI serving as the
Living Use Case (LUC) for NFDI4Earth.

During 2025 and early 2026, the primary focus will be on preparatory
work required to enable the conversion of LAFI datasets into
obs4MIPs-compliant netCDF formats. These efforts will involve developing
and testing Python scripts to generate CF-compliant netCDF files from
native machine outputs, incorporating error estimates, implementing
quality flags, and aligning metadata with CF and obs4MIPs
standards. The subsequent project phase will emphasize data conversion,
the development of a RESTful API or web portal for direct access to LAFI
data, and the publication of datasets across multiple platforms,
including [NFDI4Earth's
OneStop4All](https://www.nfdi4earth.de/2facilitate/onestop4all) (a
centralized web portal providing unified access to NFDI4Earth data,
tools, services, and training resources),
[obs4MIPs](https://pcmdi.github.io/obs4MIPs/), [World Data Center for
Climate (WDCC)](https://www.wdc-climate.de/ui/), and
[PANGAEA](https://pangaea.de/).
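
The quality-flag step mentioned above can be illustrated with a minimal sketch. It assumes a hypothetical three-level flag scheme and a hypothetical uncertainty threshold; neither is defined in this plan, and the actual LAFI flag definitions may differ. The `flag_values` and `flag_meanings` attribute names follow the CF conventions for flag variables.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical flag scheme (assumption, not taken from the LAFI plan).
FLAG_VALUES = (0, 1, 2)
FLAG_MEANINGS = ("good", "suspect", "missing")


@dataclass
class Sample:
    value: Optional[float]        # measured quantity, None if nothing was reported
    uncertainty: Optional[float]  # error estimate in the same units as value


def quality_flag(sample: Sample, max_uncertainty: float = 0.5) -> int:
    """Return 2 for missing data, 1 if the error estimate exceeds the threshold, else 0."""
    if sample.value is None or sample.uncertainty is None:
        return 2
    if sample.uncertainty > max_uncertainty:
        return 1
    return 0


def flag_attributes() -> dict:
    """Attributes to attach to the flag variable in the output file."""
    return {
        "long_name": "quality flag",
        "flag_values": list(FLAG_VALUES),
        # CF expects flag_meanings as a single blank-separated string.
        "flag_meanings": " ".join(FLAG_MEANINGS),
    }
```

Keeping the flag definitions in one place means the values written to the data and the attributes describing them cannot drift apart.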

While the authoring of user-facing documentation and tutorials is
expected to begin in earnest following the start of the FLAIR project,
the internal documentation of Python scripts will be an ongoing activity
throughout the 2025--2027 period to facilitate effective code sharing
within the LAFI consortium.

Each milestone is associated with a concrete deliverable. For Milestone
1 (M1), a real-world use case involving Doppler Lidar (DL) data
conversion and its documentation is planned. This will allow testing of
the script with data generated by other groups at the 2025 Fall School.
A working script and associated documentation for DL data conversion
have been created and successfully tested, and are already available in
a working GitLab repository. By Milestone 2 (M2), it is expected that the
DL datasets will have been successfully processed using an updated CMOR
script, enabling the historical LAFO Doppler Lidar data to be converted
into the obs4MIPs format. By Milestones 3 and 4 (M3 and M4), a dedicated
LAFI GitLab repository, along with script documentation and initial
tutorials, should be available. Furthermore, a web interface will be
developed. By the final milestone (M5), selected LAFI datasets should be
published via the NFDI4Earth service portfolio and obs4MIPs, accompanied
by comprehensive documentation and user tutorials as well as web access.

TOOLS AND STANDARDS:

A successful outcome of the tasks described above depends on the
consistent use of specific tools and standards. In general, data
analysis, processing, and conversion scripts developed by the LAFI RDM
team will be written in Python. These scripts will be stored and
version-controlled using GitLab, which will also serve as a platform for
documenting their functionality and usage. Additionally, GitLab is
expected to support software issue tracking throughout the project.

Data within the LAFI project is generated by a wide range of
instruments, each producing output in various plain text formats. This
heterogeneity necessitates the conversion of these datasets into a
standardized format, ensuring broader usability within the Earth System
Science (ESS) community. The ultimate objective of LAFI RDM is to
convert all relevant datasets from their original text formats into
netCDF files, which are widely supported and easily manipulated using a
variety of programming tools and languages.
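
As a rough illustration of the first conversion step, the sketch below reads a plain-text instrument file into named columns. The file layout (one header line, whitespace-separated fields, `#` comments, `-999` as the missing-value marker) and the column names are assumptions made for this example; real instrument outputs in LAFI will differ and need their own readers.

```python
import io


def read_instrument_text(stream, missing="-999"):
    """Parse whitespace-separated text with one header line into lists per column."""
    header = None
    columns = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if header is None:
            header = fields
            columns = {name: [] for name in header}
            continue
        if len(fields) != len(header):
            raise ValueError(
                f"expected {len(header)} fields, got {len(fields)}: {line!r}"
            )
        for name, field in zip(header, fields):
            # Missing values become None so they can be masked later.
            columns[name].append(None if field == missing else float(field))
    return columns


# Hypothetical sample input, not real LAFI data.
sample = io.StringIO(
    "# hypothetical Doppler lidar export\n"
    "time_s range_m w_ms\n"
    "0.0 100.0 0.35\n"
    "1.0 100.0 -999\n"
)
data = read_instrument_text(sample)
```

The resulting dictionary of columns is the kind of intermediate structure that a netCDF writer could then turn into variables with CF metadata.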

To facilitate broad access, the converted datasets will be published
online, ensuring availability to researchers, policymakers, educators,
and the general public. We will adopt the latest versions of key data
standards---Climate and Forecast (CF) metadata conventions v1.13 and
obs4MIPs (Observations for Model Intercomparison Projects) Data
Specifications ODS 2.5---both of which provide detailed guidelines for
metadata, variable naming, and file structure. Compliance with these
standards is required by most data publishing and archiving platforms,
including obs4MIPs, OneStop4All, WDCC, DOKU, and PANGAEA.

To verify adherence to the CF conventions, we will use the CEDA
CF-checker tool
(<https://help.ceda.ac.uk/article/4160-cf-checker-command-line-tool>).
Final conversion into obs4MIPs-compliant netCDF files will be performed
using the CMOR tool (<https://pcmdi.github.io/obs4MIPs/cmor.html>).
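
Before a file is handed to the CF-checker, a lightweight metadata pre-check in the conversion script can catch obvious omissions early. The sketch below is a simplified illustration and not a substitute for the CF-checker or CMOR. The function name and the rules it applies are assumptions for this example, and dimensionless or flag variables may legitimately lack `units`.

```python
def preflight_issues(global_attrs: dict, variables: dict) -> list:
    """Return readable problems found in the metadata; an empty list means none found."""
    issues = []
    if not str(global_attrs.get("Conventions", "")).startswith("CF-"):
        issues.append("global attribute 'Conventions' should start with 'CF-'")
    for name, attrs in variables.items():
        if "units" not in attrs:
            issues.append(f"{name}: missing 'units'")
        if "standard_name" not in attrs and "long_name" not in attrs:
            issues.append(f"{name}: needs 'standard_name' or 'long_name'")
    return issues


# Hypothetical metadata for two variables; 'snr' deliberately lacks units.
issues = preflight_issues(
    {"Conventions": "CF-1.13"},
    {
        "w": {"units": "m s-1", "standard_name": "upward_air_velocity"},
        "snr": {"long_name": "signal-to-noise ratio"},
    },
)
```

Returning a list of messages rather than raising on the first problem lets a group see every gap in one pass.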

All finalized LAFI datasets will be temporarily stored on the LAFO/I
server at the University of Hohenheim before being published online.
However, due to storage limitations on this server, individual LAFI
research groups are responsible for managing the storage of their raw
and intermediate data. Groups based outside the University of Hohenheim
may access the LAFO/I server via the university VPN, which requires a
guest account, a university-affiliated email address, and two-factor
authentication. The complete access procedure has been
documented separately (see Protocol for LAFI RDM Meeting -- 27.05.2025).

Each research group is responsible for the conversion and publication of
its own datasets, with the LAFI RDM team providing standardized scripts
and guidance on the application of CF and obs4MIPs conventions. While
the RDM team can offer support and advisory services, it bears direct
responsibility only for the data produced by the Institute of Physics
and Meteorology at the University of Hohenheim.

KEY STAKEHOLDERS - ROLES AND ENGAGEMENT:

In addition to employing the right tools and standards, achieving the
objectives of LAFI RDM---and contributing to the establishment of
research data management best practices within the broader Earth System
Science (ESS) community---requires active collaboration with key
stakeholders, both within and beyond the LAFI project.

Decision-making authority regarding data management within LAFI resides
with the LAFI Speaker and Principal Investigators (PIs). The LAFI RDM
team, in collaboration with NFDI4Earth technical support, is responsible
for developing data conversion scripts, clarifying the application of
CF, obs4MIPs, and FAIR standards, maintaining the LAFI GitLab, and
authoring documentation and tutorials. The primary users of these
outputs are the various LAFI research groups, who will apply them to
convert and publish their own datasets.

To ensure alignment and continuity across all stakeholder groups,
regular communication will be maintained. This includes progress updates
through Data Management (DM) meetings, tutorials during the Fall
Schools, and up-to-date online documentation for LAFI participants. For
external audiences---including the broader ESS research community,
educators, and policy-makers---key findings and updates will be
disseminated via conference presentations and posters (e.g., at EGU,
AGU, PyData, or PyCon) and publications in both peer-reviewed data
science journals and general scientific publications.

It is envisioned that open access to LAFI datasets will be facilitated
through a web server or RESTful API, in addition to publication on
established online data repositories such as obs4MIPs, WDCC, PANGAEA,
and OneStop4All. This will enable a broad spectrum of users---from
researchers to the public---to access and work with the data.
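
One possible shape for such a read-only API is sketched below as a plain routing function. The endpoint paths, the catalogue contents, and the function name are all hypothetical; no API design has been decided in this plan.

```python
import json

# Hypothetical catalogue; in practice it would be built from files on the server.
CATALOGUE = {
    "doppler-lidar": {"title": "Doppler lidar vertical velocity", "format": "netCDF"},
}


def handle_get(path: str):
    """Map a request path to a tuple of (HTTP status code, JSON body)."""
    parts = [p for p in path.split("/") if p]
    if parts == ["datasets"]:
        # List the identifiers of all available datasets.
        return 200, json.dumps(sorted(CATALOGUE))
    if len(parts) == 2 and parts[0] == "datasets" and parts[1] in CATALOGUE:
        # Return the metadata record of one dataset.
        return 200, json.dumps(CATALOGUE[parts[1]])
    return 404, json.dumps({"error": "not found"})
```

Because the routing logic is a pure function, it can be tested without starting a server and later called from the `do_GET` method of a `BaseHTTPRequestHandler` in Python's `http.server` module or from a web framework.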

Maintaining close contact with the CF and obs4MIPs working groups is
essential---not only to ensure that LAFI datasets remain compliant with
evolving standards, but also to contribute meaningful feedback toward
improving those standards. This engagement may take the form of direct
communication via email or GitHub discussions, as well as presentations
at steering committee meetings of these respective working groups.

LAFI is expected to generate numerous best practices in research data
management throughout its duration. To ensure that these insights are
captured and contribute to future standards, they should be shared
regularly with international organizations such as GLASS, GLAFO, and
ESMO, as well as local initiatives like AI & Data Science Certificate
Hohenheim (AIDAHO) at the University of Hohenheim. AIDAHO provides a
low-effort opportunity to leverage excellent AI/ML and Data Science
expertise within the University of Hohenheim to develop applications using
the finalized LAFI datasets.

Engagement may include presentations at key organizational meetings,
conference participation, and publications in both peer-reviewed
journals and science communication outlets.

A summary of the key stakeholders, their roles, and modes of engagement
is provided below.

  -----------------------------------------------------------------------------
  **Stakeholder**   **Role**          **Responsibilities /   **Type of
                                      Interests**            Engagement**
  ----------------- ----------------- ---------------------- ------------------
  **LAFI Speaker &  Project           Oversee data           Participation in
  PIs**             Leadership /      management strategy,   Data Management
                    Decision-makers   approve standards,     (DM) meetings,
                                      guide overall RDM      strategic planning
                                      direction              discussions,
                                                             feedback loops

  **LAFI RDM Team** Implementation &  Develop data           Internal
                    Coordination      conversion scripts,    collaboration,
                                      ensure                 GitLab management,
                                      CF/obs4MIPs/FAIR       DM meetings, Fall
                                      compliance, maintain   School tutorials
                                      GitLab, produce        
                                      documentation and      
                                      tutorials              

  **NFDI4Earth      Technical         Provide expertise on   Coordination
  Technical         Guidance &        FAIR principles,       meetings, feedback
  Support**         Infrastructure    metadata standards,    on tools and
                    Support           and web-based data     documentation
                                      publication platforms  

  **LAFI Research   Primary Data      Apply RDM scripts and  Use of GitLab
  Groups**          Producers & Users standards to process   resources, Fall
                                      and publish their      School
                                      datasets               participation, DM
                                                             meetings, access
                                                             to online
                                                             documentation

  **CF & obs4MIPs   Standards         Define and update      GitHub
  Working Groups**  Authorities       metadata and data      discussions,
                                      formatting standards;  direct email
                                      receive feedback from  communication,
                                      data users             presentations at
                                                             steering group
                                                             meetings

  **External ESS    Broader           Use LAFI data for      Access through
  Community (e.g.,  Scientific Users  Earth system science   data repositories
  researchers)**                      applications,          and APIs, uptake
                                      reproducibility, and   of published
                                      meta-analysis          datasets,
                                                             conference
                                                             sessions

  **Educators &     Indirect Users /  Use LAFI data for      Open-access
  Policy Makers**   Beneficiaries     teaching, public       platforms,
                                      communication, and     simplified
                                      policy decisions       documentation,
                                                             presentations at
                                                             broader science
                                                             forums

  **International   Global Knowledge  Promote adoption of    Presentations,
  Organisations     Exchange Networks data management best   white papers,
  (GLASS, GLAFO,                      practices and          participation in
  ESMO)**                             incorporate feedback   working groups and
                                      into global RDM        meetings
                                      frameworks             

  **Local           Institutional     Share learnings        Workshops,
  Initiatives       Collaboration &   locally, integrate     internal seminars,
  (e.g., AIDAHO)**  Outreach          LAFI RDM practices     collaboration
                                      into institutional     through
                                      policies               institutional
                                                             forums
  -----------------------------------------------------------------------------

  : Table 1: Stakeholder Roles, Responsibilities and Engagement
