This repository contains data for the experiments run in the paper "Understanding Generative AI Content with Embedding Models" (https://arxiv.org/abs/2408.10437). DataBase POC: Max Vargas (max.vargas@pnnl.gov). The data is separated by experiment: A. The `stack_exchange` dataset contains a...
Extreme weather events, including fires, heatwaves (HWs), and droughts, have significant impacts on earth, environmental, and power energy systems. Mechanistic and predictive understanding, as well as probabilistic risk assessment of these extreme weather events, are crucial for detecting, planning...
HDF5 file containing 10,000 hydraulic transmissivity inputs and the corresponding hydraulic pressure field outputs for a two-dimensional saturated flow model of the Hanford Site. The inputs are generated by sampling a 1,000-dimensional Kosambi-Karhunen-Loève (KKL) model of the transmissivity field...
ProxyTSPRD profiles are collected using NVIDIA Nsight Systems version 2020.3.2.6-87e152c and capture computational patterns from training deep learning-based time-series proxy-applications on four different levels: models (Long Short-Term Memory and Convolutional Neural Network), DL frameworks...
This year’s VAST Challenge focuses on visual analytics applications for both large scale situation analysis and cyber security. We have two mini-challenges to test your analytical skills and confound your visual analytics applications. In the first mini-challenge, (the imaginary) BankWorld's largest...
The VAST 2010 Challenge consisted of three mini-challenges (MC) and one Grand Challenge (GC). Each MC had a data set, instructions and a number of questions to be answered. The GC required participants to pull together information from all three data sets and write a debrief summarizing the...
The VAST 2009 Challenge scenario concerned a fictitious cyber security event. An employee leaked important information from an embassy to a criminal organization. Participants were asked to discover the identity of the employee and the structure of the criminal organization. Participants were...
Mini Challenge 1: Wiki Editors. The Paraiso movement is controversial and is having considerable social impact in a specific area of the world. We have extracted a segment of the Paraiso (the movement) Wikipedia edits page. Please note this is not the Paraiso Manifesto Wiki page which is part of the...
It is Fall of 2004 and one of your analyst colleagues has been called away from her current tasks to an emergency. The boss has given you the assignment of picking up her investigation and completing her task. She has been asked to pursue a line of investigation into some unexpected activities...
Dataset: The dataset will consist of about 1200 news stories from the Alderwood Daily News plus a few other items collected by the previous investigators; a few photos; a few maps of Alderwood and vicinity (in bitmap image form); and a few files with other mixed materials, e.g. a spreadsheet with voter...
This data was generated by the organization IvySys. Activities can be phone calls, transactions, or any other type of communications. Most of the files are of the type .edges, .rdf, or .csv; but all can be opened in a text editor. A good introduction to this data can be found in \Tutorial1\MAA...
A template to document AI prompts. There are four files associated with this DOI. There is a version of the template with and without examples. There is a PDF and Word copy of both versions. Please cite as: Sheridan, S. 2025. "AI Prompt Documentation Template." https://doi.org/10.17605/OSF.IO/K7FUZ...
The Human Islet Research Network (HIRN) is a large consortium with many research projects focused on understanding how beta cells are lost in type 1 diabetes (T1D), with the goal of finding how to protect against or replace the loss of functional beta cells. The consortium has multiple branches of...