
Office of Research Initiatives and Facilities

University of Southern California

Department of Energy DE-FOA-0002725: Management and Storage of Scientific Data

Slots: 2

Deadlines

Internal Deadline: Contact ORIF.

LOI: May 5, 2022

External Deadline: June 13, 2022, 5pm PT

Award Information

Award Type: Grant / Cooperative Agreement

Estimated Number of Awards: Depends on number of meritorious applications

Anticipated Award Amount: $300,000 per year

Who May Serve as PI:  Individuals with the skills, knowledge, and resources necessary to carry out the proposed research as a Principal Investigator (PI) are invited to work with their organizations to develop an application. Individuals from underrepresented groups as well as individuals with disabilities are always encouraged to apply.

Link to Award: https://science.osti.gov/grants/FOAs/-/media/grants/pdf/foas/2022/SC_FOA_0002725.pdf

Process for Limited Submissions

PIs must submit their application as a Limited Submission through the Office of Research Application Portal: https://orif.usc.edu/oor-portal/.

Materials to submit include:

  • (1) Single Page Proposal Summary (0.5” margins; single-spaced; font type: Arial, Helvetica, or Georgia typeface; font size: 11 pt). Page limit includes references and illustrations. Pages that exceed the 1-page limit will be excluded from review.
  • (2) CV (5 pages maximum)

Note: The portal requires information about the PIs and Co-PIs in addition to department and contact information, including the 10-digit USC ID#, Gender, and Ethnicity. Please have this material prepared before beginning this application.

Purpose

Modern scientific computing relies on processing a deluge of data coming from both experiments and simulations, with even relatively modest scientific activities generating petabytes of data. Planned upgrades of experimental facilities in the foreseeable future, combined with the increased computing capabilities of DOE’s exascale supercomputers and other state-of-the-art computing capabilities coming online over the next few years, promise to compound the many challenges in storing and managing data such that it can be effectively used to fuel scientific discovery [2-12].

Traditional large-scale scientific data management has relied on the use of file formats optimized for simple access patterns on parallel, distributed file systems. These files have tended to be metadata poor and complicated to access, lacking flexible indexing for efficient searching, where enabling new kinds of analysis often requires writing new, low-level code [2-5]. Scientific workflows have also become increasingly complicated, integrating both simulation and the analysis of data from experiments, exploiting advanced machine-learning techniques [4,8-10], and requiring distributed, multi-stage processing [5-7]. Additionally, significant opportunities exist to enhance trust and aid scientific reproducibility by enhancing our ability to record data provenance and verify data integrity.

Fortunately, through a combination of past scientific-data-management investments and leveraging the growing ecosystem of big-data and database technologies, scientific endeavors have made significant improvements in their data management and use. While the ever-increasing scale of scientific data threatens that progress, new “smart” storage and networking technologies that provide embedded computational capabilities; novel methods for indexing, representing, and distributing data; and advanced techniques for interfacing with data management systems and integrating into programming environments promise significant breakthroughs. Moreover, new techniques for scientific data management can help integrate data into large scientific-data and computational ecosystems that embody the FAIR principles of Findability, Accessibility, Interoperability, and Reuse, thereby enabling collaborative, responsive science at yet-unprecedented scales [2-5].

Priority Research Directions

As highlighted by the recent ASCR Workshop on the Management and Storage of Scientific Data [2,3], building on the outcomes of prior community activities, including Storage Systems and I/O: Organizing, Storing, and Accessing Data for Scientific Discovery [5] and the Office of Science Roundtable on Data for AI [4], and aligned with needs highlighted by interagency planning [11,12], important priority research directions are:

  1. “High-productivity interfaces for accessing scientific data efficiently” – Innovative interfaces to data-management capabilities allowing for flexible, high-performance access to large data sets, potentially federated across different kinds of memory, edge devices, and repositories, capturing relevant usage statistics, provenance, and other metadata.
  2. “Understanding the behavior of complex data management systems in DOE science” – Understanding how the behavior of users, application and system algorithms, and hardware can be combined and exploited to improve performance and resilience of scientific-data-management systems, recognizing that the relevant behaviors can change over time.
  3. “Rich metadata and provenance collection, management, search, and access” – Innovative methods for collecting and managing provenance and other metadata to support FAIR principles, resilience, and scientific reproducibility and discovery.
  4. “Reinventing data services for new applications, devices, and architectures” – Innovative methods to design scientific-data-management services for state-of-the-art storage and networking devices, including those providing computational capabilities.

Each pre-application and application must address, as its primary focus, one or more of these priority research directions. As specified in Section IV B and Section IV D, each pre-application and application must explicitly list the priority research direction(s) primarily motivating the proposed work.

Note that this FOA places requirements on the Data-Management Plan (DMP) appendix in Section IV D that supplement the standard requirements found in Section VIII.

Visit our Institutionally Limited Submission webpage for more updates and other announcements.


Office of Research Initiatives and Facilities
orif@usc.edu
