Cloud computing is ideal for running flexible, scalable applications on demand, in periodic bursts, or for fixed periods of time. UVA Research Computing works alongside researchers to design research applications for, and migrate datasets into, Amazon Web Services (AWS), the leader among public cloud vendors. This means that server, storage, and database capacity does not have to be estimated or purchased beforehand: it can be scaled up and down with your needs, or programmed to scale dynamically with your application.
Service Oriented Architecture
A key advantage of the cloud is that for many services you do not need to build or maintain the servers that support the service – you simply use it.
Here are some of the building blocks available using cloud infrastructure:
- Containers / Docker
- Analytics / Data Management
- Continuous Integration
- Sensor / IoT Data Streaming
- Message Queues / Brokers
- SMS / Push Integration
- Alexa Skills / Speech Integration
- Serverless Computing
- Code Build / Validation
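To make the serverless building block above concrete, here is a minimal AWS Lambda-style handler in Python. The function name, event fields, and greeting logic are illustrative assumptions, not part of any UVA service; in AWS, the platform invokes `handler(event, context)` once per request, so no server needs to be provisioned or maintained.

```python
# Minimal sketch of a serverless (AWS Lambda-style) handler.
# The event shape ({"name": ...}) and the handler's logic are
# hypothetical examples; AWS calls handler(event, context) per request.
import json

def handler(event, context=None):
    """Return a small JSON response built from the incoming event."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```

A function like this is deployed as code only; the cloud vendor handles scaling, patching, and availability.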
Researchers Using the Cloud
| Use Case | Description |
| --- | --- |
| Serverless Web | UVA faculty and researchers can share data, findings, tools, and other resources as static HTML content published to object storage. This simple publishing method can cost only a few dollars a month and requires no server management. |
| Data Lakes | A new paradigm in data storage and processing, data lakes provide researchers with a central repository for both structured and unstructured data of any type or size. These data can then be siphoned off for processing, either in real-time streams or in queues for later analysis. |
| Services in Support of HPC | Users of HPC usually have more than enough computing power to run their jobs. But what if you need a relational or NoSQL database, a messaging service, or offsite storage? Researchers have begun integrating the cloud into their HPC jobs to create, use, and manage external services like these. |
| HIPAA-Compliant Computing | Researchers working on clinical datasets use Ivy, our private virtualized platform, to perform HIPAA-compliant analytics and compute jobs. The platform offers virtual machines, an R/Python data analytics tool, and Hadoop/Spark for larger analytics projects. Many Ivy users work with EPIC clinical data alongside other highly sensitive datasets for their investigations. |
| Workflows & Pipeline Management | Researchers need flexibility in where they run their data pipelines -- on a personal computer, a lab server, an HPC cluster, or a cloud instance. We are working with faculty to extend some commonly used pipeline tools so that they can create and push jobs to cloud-based resources, regardless of the cloud vendor. |
| Long-term Cold Storage | AWS Glacier and Google Nearline/Coldline offer researchers "cold" offsite storage for long-term backups of infrequently accessed data. Many researchers use Glacier to store terabytes of source data to satisfy grant and federal research project compliance requirements. |
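The "Serverless Web" pattern above typically amounts to an object-storage bucket configured for static website hosting. On AWS, for example, that means enabling website hosting on an S3 bucket and attaching a public-read bucket policy such as the following sketch (the bucket name `example-research-site` is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-research-site/*"
    }
  ]
}
```

With a policy like this in place, the bucket serves HTML files directly to visitors with no web server to manage.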
Other Common Use Cases
- Proofs of concept - Short-lived instances used to verify that a system or design works, and to benchmark processing speeds, before building a production system.
- Test / development environments - For installing test packages, trying new ideas, and testing design patterns.
- Dynamic / flexible / scaling application stacks - When future traffic or load cannot be determined beforehand, deploying into a dynamic environment means the infrastructure is not locked into any set type of CPU/RAM or scale.
- Short-term or fast-deployment projects - For almost-immediate computing needs, existing users can create new instances as needed.
- Container deployments - Run microservices (such as Docker containers) in an environment that can load-balance their traffic and maintain container health.
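As a concrete illustration of the container deployments mentioned above, here is a minimal Dockerfile for packaging a small Python service. The base image tag, port, and `app.py` script are illustrative placeholders, not a prescribed UVA configuration:

```dockerfile
# Minimal illustrative Dockerfile for a containerized service.
# The base image, exposed port, and app.py are placeholder choices.
FROM python:3.11-slim
WORKDIR /app
COPY app.py .
EXPOSE 8080
CMD ["python", "app.py"]
```

An image built from a file like this can be handed to a container service (e.g., on AWS) that handles load balancing and restarts unhealthy containers.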
To get an idea of how public or private cloud resources are used in real-world and research scenarios, visit one of these Solution Architecture References:
- AWS Architecture Center.
- Google Cloud Solutions Architecture Reference | GCP Builder Tutorials.
- Azure Solution Architecture | Azure Reference Architectures.
Some examples from AWS:
- Batch processing - Build auto-scaling batch processing systems, such as video/image/datastream processing pipelines.
- Large-scale processing and huge datasets - Build high-performance computing systems that involve Big Data.
Cloud Services at UVA
As an Internet2 institution, the University of Virginia has access to AWS accounts through a reseller, DLT. This program offers a few key advantages for researchers:
- First, it allows for billing through purchase orders (P.O.’s) rather than credit cards;
- Second, it gives a slight (~3%) discount on services; and
- Third, it removes the required minimum costs for AWS support. Read more about the Internet2/AWS program.
Requesting an Account
Researchers or labs who would like to use AWS for their computing infrastructure should contact us to help set up an account through DLT. In order to set up your account you will need a “standing” annual P.O. from the UVA Procurement office equal to or greater than your estimated annual costs. For example, if you estimate your costs will be $300 per month, you might want to request a $4000 standing P.O. Your monthly AWS bills are then charged against that P.O. for the year. Note that these purchase orders must be renewed each year that you continue to use AWS.
Training & Implementation
With an AWS account in hand, you will need some training. We offer regular, free workshops on cloud computing. In addition, we have weekly availability to answer your questions during our office hours, or we can schedule an in-person, hands-on training with your research group or lab.
If you need help in designing your infrastructure in a cloud environment, or thinking through how to migrate your existing projects, contact us for a consultation.
Sensitive Data in the Cloud
If your cloud-based project involves any sensitive data (HIPAA, PHI, etc.) you must request approval from the Information Security office at UVA. You will be required to verify that your application, infrastructure, and staff can meet all minimum requirements for the secure transfer and handling of sensitive data.
Solution Architecture / Consulting
We have experience designing and delivering solutions to the public cloud using industry best practices. If you have a project and would like to discuss options, pricing, design, or implementation, we are available for consultation. Our staff includes an AWS certified solution architect, and the RC team uses AWS for our own internal systems and development.
We also offer in-person, hands-on workshops and sessions on working with the cloud. Workshops cover a number of topics, from creating object storage buckets and simple compute instances to more complex data-driven workflows and Docker containers. If you have an idea for a workshop or would like to schedule training for your lab or group, please contact us.
| Date | Workshop | Instructor |
| --- | --- | --- |
| 01/30/20 | Using Rivanna from the Command Line | Gladys Andino |
| 01/31/20 | Fundamentals of Matlab | Ed Hall |
| 02/06/20 | Parallel Computing with Matlab | Ed Hall |
| 02/11/20 | Shiny Web Apps in R | Christina Gancayco |
| 02/13/20 | Software Containers for HPC Environments | Ruoshi Sun |
| 02/13/20 | Statistical Methods in Matlab | Ed Hall |
| 02/18/20 | Intro to Image Processing with Fiji/ImageJ | Karsten Siller |
| 02/19/20 | Moving R to HPC | Jackie Huband |
| 02/20/20 | High Performance Python | Karsten Siller |
| 02/20/20 | Optimization Methods in Matlab | Ed Hall |
| 02/25/20 | Automation of Image Processing with Fiji/ImageJ | Karsten Siller |
| 02/26/20 | Optimizing R Code | Jackie Huband |
| 02/27/20 | C/C++ and Fortran on Rivanna | Ruoshi Sun |
| 02/27/20 | Deep Learning in Matlab | Christina Gancayco |
| 03/04/20 | Parallelizing R | Jackie Huband |
| 03/05/20 | Image Processing in Matlab | Christina Gancayco |
| 03/18/20 | Parallel R with MPI | Jackie Huband |
| 03/26/20 | Julia on Rivanna | Ed Hall |