Thursday, April 17, 2025
ONLINE SESSION – all times CDT
10:00—11:15 | Brigitte Raumann, Globus. A high-level survey of the research capabilities available on the Globus platform, aimed at researchers. We will describe common use cases and demonstrate how to get started with data transfer and sharing using Globus Connect Personal on your laptop, as well as remote computation using Globus Compute. |
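The getting-started steps described above look roughly like this with the Globus Python SDK (`pip install globus-sdk`); a minimal sketch, in which the client ID, collection UUIDs, and paths are all placeholders:

```python
# Minimal sketch of a Globus transfer using the Globus Python SDK.
# CLIENT_ID and the collection UUIDs below are placeholders.
import posixpath

CLIENT_ID = "YOUR-NATIVE-APP-CLIENT-ID"           # placeholder
SRC_COLLECTION = "<source-collection-uuid>"       # placeholder
DST_COLLECTION = "<destination-collection-uuid>"  # placeholder


def pair_paths(src_dir, dst_dir, names):
    """Build (source, destination) path pairs for a batch of files."""
    return [(posixpath.join(src_dir, n), posixpath.join(dst_dir, n)) for n in names]


def submit_example_transfer():
    import globus_sdk

    # Interactive native-app login: prints a URL, then waits for the auth code.
    auth = globus_sdk.NativeAppAuthClient(CLIENT_ID)
    auth.oauth2_start_flow()
    print("Log in at:", auth.oauth2_get_authorize_url())
    tokens = auth.oauth2_exchange_code_for_tokens(input("Auth code: "))
    transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

    tc = globus_sdk.TransferClient(
        authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token)
    )
    tdata = globus_sdk.TransferData(
        tc, SRC_COLLECTION, DST_COLLECTION, label="GlobusWorld example"
    )
    for src, dst in pair_paths("/home/me/results", "/shared/results",
                               ["run1.csv", "run2.csv"]):
        tdata.add_item(src, dst)
    task = tc.submit_transfer(tdata)
    print("Submitted transfer task:", task["task_id"])

# Calling submit_example_transfer() starts the login flow and submits the task.
```

Once submitted, the task runs server-side: you can close your laptop and Globus retries and completes the transfer on your behalf.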
11:15—11:30 | break YOUR SPACE |
11:30—12:30 | Greg Nawrocki, Globus. We will introduce use cases, run flows, and cover Globus Search and portals in the context of end-to-end solutions. |
12:30—13:30 | break YOUR SPACE |
13:30—14:15 | Lev Gorenstein, Globus. We will provide an overview of installing and configuring Globus Connect Server to make your storage system(s) accessible via Globus. This session is aimed at system administrators who will be responsible for their institution's Globus deployment. |
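The setup process described above, on a Debian or Ubuntu host, looks roughly like the following. This is a sketch only: command and package names follow the Globus installation docs, but the endpoint name, organization, identities, domain, and gateway ID are placeholders, and your OS and options will differ.

```shell
# Sketch of a Globus Connect Server v5 install (Debian/Ubuntu host).
# Names, addresses, and IDs are placeholders.
curl -LOs https://downloads.globus.org/globus-connect-server/stable/installers/repo/deb/globus-repo_latest_all.deb
sudo dpkg -i globus-repo_latest_all.deb
sudo apt-get update
sudo apt-get install -y globus-connect-server54

# Register the endpoint (once per endpoint) ...
globus-connect-server endpoint setup "Example University Storage" \
    --organization "Example University" \
    --owner admin@example.edu \
    --contact-email support@example.edu

# ... then configure this host as a data transfer node.
sudo globus-connect-server node setup

# Expose a POSIX filesystem and create a collection on it.
globus-connect-server storage-gateway create posix "Campus POSIX" \
    --domain example.edu
globus-connect-server collection create <STORAGE_GATEWAY_ID> / "Campus Data"
```

Additional data transfer nodes join the same endpoint by repeating the `node setup` step on each host.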
14:15—14:30 | break YOUR SPACE |
14:30—15:45 | Lev Gorenstein, Vas Vasiliadis, Globus. We will provide an overview of installing and configuring Globus Compute multi-user endpoints to make your compute resource(s) available for remote function execution. This session is aimed at system administrators who will be responsible for their institution's Globus deployment. |
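For context, the end result of such a deployment is that users can execute functions remotely with a few lines of Python; a minimal sketch with the Globus Compute SDK (`pip install globus-compute-sdk`), assuming a placeholder endpoint UUID:

```python
# Sketch of remote function execution with the Globus Compute SDK.
# The endpoint UUID is a placeholder for a real (multi-user) endpoint.
ENDPOINT_ID = "<your-compute-endpoint-uuid>"  # placeholder


def matrix_trace(rows):
    """Toy function to run remotely: sum of the diagonal of a square matrix."""
    return sum(rows[i][i] for i in range(len(rows)))


def run_remotely():
    from globus_compute_sdk import Executor

    # The Executor serializes the function and its arguments, ships them to
    # the endpoint, and returns a concurrent.futures-style future.
    with Executor(endpoint_id=ENDPOINT_ID) as ex:
        future = ex.submit(matrix_trace, [[1, 2], [3, 4]])
        print("Remote result:", future.result())

# Calling run_remotely() prompts for Globus login on first use.
```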
Agenda and times subject to change.
Monday, April 21, 2025
Gleacher Center, 450 N. Cityfront Plaza Drive, Lower Level, Room 40
Gleacher Center Parking Information | |
14:00—17:30 | We will present advanced topics targeted at administrators of Globus Connect Server (GCS) deployments. Topics will include managing multi-node deployments, troubleshooting GCS, deployment best practices, containerizing GCS deployments, managing roles on endpoints and collections, custom identity mapping, managing storage gateway and collection options, and performance tuning. Time will be reserved at the end of the session to address questions and provide guidance tailored to your specific GCS deployment environment and requirements. |
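One of the topics above, custom identity mapping, lends itself to a short sketch: GCS can call an external program that reads a JSON document of federated identities on stdin and writes local-account mappings to stdout. The `DATA_TYPE` strings below follow the GCS v5 identity-mapping documentation, and the mapping rule (strip the domain from `user@example.edu`) is purely illustrative; verify both against your deployed version.

```python
#!/usr/bin/env python3
# Sketch of a GCS custom identity-mapping program. GCS writes a JSON input
# document to stdin and reads the mapping result from stdout. The DATA_TYPE
# strings follow the GCS v5 docs; the domain-stripping rule is illustrative.
import json
import sys

ALLOWED_DOMAIN = "example.edu"  # placeholder: your campus identity domain


def map_identities(doc):
    """Map each federated identity to a local account name; skip the rest."""
    result = []
    for ident in doc.get("identities", []):
        user, _, domain = ident.get("username", "").partition("@")
        if user and domain == ALLOWED_DOMAIN:
            result.append({"id": ident["id"], "output": user})
    return {"DATA_TYPE": "identity_mapping_output#1.0.0", "result": result}


def main():
    print(json.dumps(map_identities(json.load(sys.stdin))))

# When configured on a storage gateway, GCS invokes this file directly,
# so a production version would end with: if __name__ == "__main__": main()
```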
Tuesday, April 22, 2025
Sessions will be held at The Gwen Hotel in The Gallery Ballroom
8:00—17:00 | registration desk open 6th Floor Prefunction |
8:00—9:00 | breakfast The Grand Salon |
9:00—10:30 | Josh Bryan, Globus. Abstract coming soon. |
10:30—10:45 | break 6th Floor Prefunction |
10:45—12:00 | Ada Nikolaidis, Globus. Abstract coming soon. |
12:00—13:00 | lunch The Grand Salon |
13:00—14:45 | Ian Foster, Globus Co-Founder; Rachana Ananthakrishnan, Executive Director, Globus. Abstract coming soon. |
14:45—15:00 | break 6th Floor Prefunction |
15:00—16:15 | Tobin Magle, Lead Data Management Specialist, Northwestern University; Scott Friedman, Principal Technical Business Development Manager, AWS. This presentation details Northwestern University's implementation of Globus Connect Server (GCS) on Amazon Web Services (AWS) to facilitate the migration of a research data archive from on-premises storage to AWS S3. We will walk through the complete deployment process, including prerequisites, configuration requirements, and architectural decisions for both the on-premises and AWS environments. The talk examines key aspects of security, networking, and identity management, while sharing practical insights on performance optimization and scalability considerations encountered during the implementation. We'll discuss operational approaches, cost considerations for different usage patterns, and lessons learned from our deployment experience. To help other institutions undertaking similar projects, we will share our AWS CloudFormation template and configuration guidelines for GCS deployment. This technical deep-dive will benefit system administrators and research computing professionals planning to leverage Globus for cloud data transfer and management. |
Scott Friedman, Principal Technical Business Development Manager, AWS. We introduce The S3 Connection Wizard: Globus, an easy-to-use tool that transforms how researchers share and access data across institutions using the AWS S3 Connector for Globus. By simplifying a manual multi-step process into an intuitive wizard interface, this tool enables researchers to more easily leverage the combination of Globus' data sharing capabilities and AWS cloud storage. |
James Dorff, Technical Lead, Duke University We have developed a solution for cold data management using Globus, with a Globus Flow as the primary interface. For large datasets, users can invoke a Globus Flow that calls an open-source tool, suitcasectl. This tool chunks data into TAR or ZIP files and generates an inventory with associated metadata. All data is stored as S3 objects, with inventories being separately searchable. For more information, visit: sites.duke.edu/fastresearchstorage/suitcasectl. |
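A flow like the one described above is typically started with a few lines of the Globus SDK. The sketch below is illustrative only: the flow ID and token are placeholders, and the field names in the run body are hypothetical, since every flow defines its own input schema.

```python
# Sketch of starting a deployed Globus Flow with the Globus SDK
# (pip install globus-sdk). FLOW_ID and the token are placeholders, and the
# field names in the run body are hypothetical.
FLOW_ID = "<your-flow-uuid>"  # placeholder


def make_run_body(src_collection, src_path, dst_collection, dst_path):
    """Assemble the flow's input document (field names are illustrative)."""
    return {
        "source": {"id": src_collection, "path": src_path},
        "destination": {"id": dst_collection, "path": dst_path},
    }


def start_run():
    import globus_sdk

    # SpecificFlowClient targets one flow and its flow-specific scope.
    flow = globus_sdk.SpecificFlowClient(
        FLOW_ID,
        authorizer=globus_sdk.AccessTokenAuthorizer("<flow-scoped-token>"),
    )
    run = flow.run_flow(
        body=make_run_body("<src-uuid>", "/raw/dataset1",
                           "<dst-uuid>", "/archive/dataset1"),
        label="Archive dataset1",
    )
    print("Started run:", run["run_id"])

# Calling start_run() submits one run of the flow.
```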
Dimitrios Bellos, Research Software Engineer, The Rosalind Franklin Institute. The Rosalind Franklin Institute has been using Globus for three years to transfer high-resolution, multimodal biological data from our instruments to our analysis platforms. The Franklin generates approximately 70 terabytes a month, which requires fast and robust transfers. Distributed data transfer solutions such as Globus can facilitate this goal; optimally, however, they should be used in automated pipelines that do not require a human in the loop. We have been working towards this in two ways. First, we developed and released FlowCron [1], a Function-as-a-Service tool that lets users access HPC systems to process their data, using a Globus Flow to accommodate any data transfers. This significantly reduces the time to science. Second, we present our next step: the release of the Rosalind Franklin Institute's GlobusAPI. Our RFI-GlobusAPI container image provides an intuitive interface to the Globus Python SDK, enabling the easy deployment of Globus microservices and facilitating the incorporation of Globus within automated data processing pipelines. RFI-GlobusAPI offers multiple useful high-level commands and has already been tested as part of Argo Workflows. Furthermore, RFI-GlobusAPI is open source and available to all. Our future goals include expanding RFI-GlobusAPI with further utility, allowing us to incorporate it into different parts of our data lifecycle to create a secure, scalable, reproducible, and efficient automated data infrastructure that can provide FAIR [2,3] data for all our science. [1] doi.org/10.12688/wellcomeopenres.23491.1 |
Rick Wagner, CTO, University of California San Diego. Abstract coming soon. |
16:15—17:15 | Office Hours, Grand Salon North & South. The Globus development team will be available to answer all your questions about the Globus service. Table topics include data transfer and sharing, Globus Connect and Premium Connectors, the web app, CLI and SDKs, Automate, Compute, and Authentication. |
17:15—18:45 | Welcome Reception Gallery Terrace |
Wednesday, April 23, 2025
Sessions will be held at The Gwen Hotel in The Gallery Ballroom
8:00—17:00 | registration desk open 6th Floor Prefunction |
8:00—9:00 | breakfast The Grand Salon |
9:00—10:30 | Kyle Chard, Co-lead, Globus Labs; Research Professor, University of Chicago. Abstract coming soon. |
Zhao Zang, Assistant Professor, Rutgers University. Diamond is a service designed to facilitate large model training across GPU clusters for scientists. It exposes a web user interface to build container images, manage training jobs, monitor job progress, and manage data and provenance across clusters. Diamond relies on Globus Auth, Globus Transfer/Search, and Globus Compute for authentication, data management, and container/job management. So far, Diamond has been tested on NCSA Delta, TACC Frontera, and Lonestar6. We aim to release the alpha version of Diamond in early April and to support all NAIRR Pilot GPU resources. |
Ravi Madduri, Senior Scientist, Argonne National Laboratory. Abstract coming soon. |
Ryan Jacobs, Scientist, University of Wisconsin-Madison. This work addresses a critical need in the materials science community: the availability of persistent, easily accessible, and usable machine learning models of materials properties that provide the user with predictions, uncertainties on those predictions (i.e., error bars), and guidance on domain of applicability to inform the user whether the model is reliable. We develop random forest models for 33 materials properties spanning an array of data sources (computational and experimental) and property types (electrical, mechanical, thermodynamic, etc.). All models have calibrated ensemble error bars and domain-of-applicability guidance enabled by kernel-density-estimate-based feature distance measures. All models are hosted on the Garden-AI infrastructure, providing an easy-to-use, persistent interface for model dissemination callable with only a few lines of Python code. We demonstrate the power of our approach by using our models in a fully ML-based materials discovery exercise to search for stable, highly active perovskite catalyst materials. |
10:30—10:45 | break 6th Floor Prefunction |
10:45—11:15 | Preston Smith, Executive Director, Research Computing, Purdue University. Abstract coming soon. |
Geoffrey Lentner, Lead Research Data Scientist, Purdue University. The Rosen Center for Advanced Computing (RCAC) at Purdue University enables and facilitates research computing and data (RCD) workflows both on campus and nationally with our Anvil supercomputer as part of the NSF-funded ACCESS program. Following the general availability of Globus Compute Engine with multi-user support, we deployed GCE on Anvil (and soon all campus clusters), allowing researchers to target Anvil as a compute endpoint and Globus Flows action provider using their ACCESS credentials. Several opportunities are in the pipeline to build robust, scalable solutions on top of Globus Flows for both research projects and core facilities at Purdue. For the past few years our staff have collaborated with Danny Milisavljevic (a noted Purdue astronomer) on his global recommendation system (refitt.org), which forecasts supernova events using a distributed pipeline. Soon, LSST will begin publishing events to which we will subscribe, streaming data into our landing zone where an automated system will trigger our inference steps. Additionally, there is a project to rebuild the data infrastructure surrounding one of Purdue's core facilities, which operates some 40 instruments for the life sciences. This represents a prototypical use case for Globus Flows, including data transfers, transformations, archiving, sharing, and sign-offs. |
11:15—12:00 | Timothy Pasch, Professor of Communication (Endowed), Director, UND ARCTIC Research Lab, and Associate Director, UND AI Research Institute, University of North Dakota; Aaron Bergstrom, Advanced Cyberinfrastructure Manager, Computational Research Center, University of North Dakota. The A-KBS (Arctic Knowledge Based System) and the Defense Resiliency Platform Against Extreme Cold Weather (DRP) are two DoD-funded, Arctic-focused projects in collaboration with CRREL, the Cold Regions Research and Engineering Laboratory (USACE). Over the past three years, our combined academic research teams across five universities have been gathering data in Alaska, using Globus to share data seamlessly via our portal. Now we are expanding our work with Globus Compute, embedding HPC into our Science Gateway by leveraging our new Kubernetes cluster, AI/ML workflows, and supercomputing center collaborations. This presentation will share engaging media from our Alaskan fieldwork, discuss our predictive analytics and decision support tools, and provide an overview of how Globus serves as a key component of our high-performance computational cyberinfrastructure. |
Sandra Gesing, Sr. Researcher, San Diego Supercomputer Center. As AI-driven research grows, so do the challenges of managing, accessing, and sharing data across institutions. The National AI Research Resource (NAIRR) [1] aims to provide equitable access to data, computing, and software resources, but how well do current solutions align with researchers' needs? To answer this question, SGX3, the NSF Center of Excellence for Science Gateways [2], led a comprehensive effort to gather insights into the requirements for a NAIRR Portal. The effort included a large-scale survey with over 1100 participants, six two-hour online focus groups, and a two-day in-person workshop. This presentation will share key findings from this engagement, revealing major challenges researchers face in data management, including barriers to FAIR (Findable, Accessible, Interoperable, Reusable) data implementation, difficulties with complex data transfer workflows, and the ongoing challenge of balancing security with usability. Institutional silos complicate collaboration, making it clear that a more seamless, interoperable approach is needed. This community-driven effort informs how we can support better data management strategies, including the role of platforms like Globus in simplifying secure, high-performance data transfer. Additionally, we will highlight gaps that still need to be addressed, along with potential solutions in policy, infrastructure, and community engagement. Attendees will gain insights into the evolving needs of the AI research community and how a well-designed NAIRR Portal can serve as a critical tool for advancing open, accessible, and efficient research data management. [1] The NAIRR Pilot Portal, nairrpilot.org |
Thomas Cram, Software Engineer, NSF. The Research Data Archive (RDA; rda.ucar.edu) at the NSF National Center for Atmospheric Research (NSF NCAR) contains a large collection of meteorological and oceanographic observations integrated with NSF NCAR High Performance Computing resources to support atmospheric and geosciences research. Containing more than 700 dataset collections, the RDA supports the varying needs of a large and continually growing user community. The RDA launched its integration of Globus data management services into its data portal in late November 2014, beginning with support for Globus transfer. In the ten years since, RDA users have used Globus to transfer more than 64 million files from the RDA, totaling more than 15 petabytes of data. The RDA also leverages Globus Auth to enable user login with ORCIDs, and development is underway to use Globus Search to improve data search and discovery on the RDA web portal. This talk will (1) provide a brief retrospective on how Globus has enhanced the RDA user experience by simplifying data management strategies and enabling scalable workflows, and (2) give an overview of current plans to transition the RDA into an open geoscience research data commons. This vision is guided by a recent workshop held at NSF NCAR to develop community requirements to modernize community-accessible data science infrastructure, to better connect geoscience datasets with geoscience-focused analytics environments, and to support researcher needs in meeting data sharing expectations in alignment with the FAIR (doi.org/10.1038/sdata.2016.18) principles. |
12:00—13:00 | lunch The Grand Salon |
13:00—14:15 | Joe Bottigliero, Globus. Abstract coming soon. |
14:15—15:30 | We will demonstrate how instrument data management and computation tasks can be automated at scale using the Globus platform. This session will combine multiple Globus services into an end-to-end solution that can serve as a blueprint for the most common instrument use cases. |
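To give a flavor of such an end-to-end solution, here is a hedged sketch of a two-step flow definition (transfer the raw data, then run analysis) registered with the Globus SDK. The action provider URLs are the public Globus ones, but the input field names and all IDs are illustrative placeholders, not the blueprint from this session.

```python
# Sketch of a two-step Globus Flow (move instrument data, then run analysis),
# registered via the Globus SDK (pip install globus-sdk). Field names in the
# input document and all IDs are illustrative placeholders.

FLOW_DEFINITION = {
    "StartAt": "MoveRawData",
    "States": {
        "MoveRawData": {
            "Type": "Action",
            "ActionUrl": "https://actions.globus.org/transfer/transfer",
            "Parameters": {
                "source_endpoint_id.$": "$.input.instrument_collection",
                "destination_endpoint_id.$": "$.input.analysis_collection",
                "transfer_items": [
                    {
                        "source_path.$": "$.input.raw_path",
                        "destination_path.$": "$.input.staging_path",
                    }
                ],
            },
            "ResultPath": "$.TransferResult",
            "Next": "ProcessData",
        },
        "ProcessData": {
            "Type": "Action",
            "ActionUrl": "https://compute.actions.globus.org",
            "Parameters": {
                "endpoint.$": "$.input.compute_endpoint",
                "function.$": "$.input.analysis_function",
                "kwargs": {"data_dir.$": "$.input.staging_path"},
            },
            "ResultPath": "$.ComputeResult",
            "End": True,
        },
    },
}


def deploy(flows_client):
    """Register the flow with the Globus Flows service; returns the flow ID."""
    flow = flows_client.create_flow(
        title="Instrument data pipeline (sketch)",
        definition=FLOW_DEFINITION,
        input_schema={},  # accept any input document
    )
    return flow["id"]

# deploy(globus_sdk.FlowsClient(authorizer=...)) registers the flow.
```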
15:30—16:30 | Office Hours, Grand Salon North & South. The Globus development team will be available to answer all your questions about the Globus service. Table topics include data transfer and sharing, Globus Connect and Premium Connectors, the web app, CLI and SDKs, Automate, Compute, and Authentication. |
Thursday, April 24, 2025
Gleacher Center, 450 N. Cityfront Plaza Drive, Lower Level, Room 40
Gleacher Center Parking Information | |
8:00—9:00 | continental breakfast |
9:00—12:00 | The Customer Forum is an opportunity for Globus subscribers to discuss their experiences with the service, learn about our product development plans, and provide input on future product directions. Attendance at the Customer Forum is by invitation only. If you would like to represent your institution or community, please contact us for an invitation. |