The system is set up to scale automatically as the data deluge grows, providing high-performance services without sacrificing security, Bick says. The volume of research involving big data in genomics is still relatively small, and Bick is not the only one trying to position Terra as a commercial option.

In March 2016, PerkinElmer Inc. was awarded a US$28 million contract to develop a commercial genome informatics platform (CGIP), offering direct access to data from individual research studies. CGIP will provide information on participants and their medical histories, in addition to their genomic data. Its immediate goal, according to PerkinElmer, is to help researchers understand the causes and consequences of illness. It will also be able to perform comparisons of patient groups from different studies.

And then there is Atlas, an early example of what cloud-computing companies hope will be a profitable business model. In the four years since it was founded, Atlas has picked up more than 50 research contracts from major funders, and it uses cloud computing to pull together data sets gathered in the life sciences from individual projects. According to Molly Zielinski, the company's director of public affairs, Atlas looks at public and private data in tandem, and uses open-source software to build analysis tools that can help researchers see the big picture.

There's nothing proprietary about the way that Atlas works, she says. But individual studies were not aware of the company. As researchers are able to directly manipulate data in Atlas, they are more engaged with the project, and the resulting data sets are also better.

The workflow environment can act like a bridge between the local desktop and that massive cloud, Karlsson notes. Because Terra runs easily in the cloud, it can serve as a representative, Docker-based virtual desktop for running a single genomics data-processing job, much as one would run a batch job on a virtual desktop. Its web-based environment also provides security on a larger scale.

Detailed monthly data for the period 1 April 2015 to 30 September 2015, covering nine climate variables and three vegetation indices, were collected from the NOAA CESNPS Climate Data portal and prepared as gridded data. The toolkit processes these data and stores the results in the Terra database for further analysis.

This product provides an imagery time series of the entire lunar surface from December 2005 to the present at a resolution of 90 m per pixel. The date range consists of monthly composites from 2006 to 2015, with two cycles of one year's worth of composites in between. We are currently working on a new set of monthly composites for 2016-2018. Each monthly composite is available in both HEIC and JPEG formats, and pixel is recorded in seconds. The composite images have been resampled to the standard 30 m/pixel Universal Transverse Mercator (UTM) projection and are ready to download and use as is.

Terra lets users access, manage, analyse, share, publish and collaborate on any type of big data in the cloud, Van der Auwera says. Scientists can also create and share their own analysis scripts in the cloud.

Atlas is a version of Terra that uploads data to the Google Cloud Platform. Rather than initialising a whole workspace to upload data, scientists can choose to upload a single file or multiple files, bypassing the need to download the data. Atlas then streams each file to Google Cloud Storage, which allocates as many computing resources as necessary to create a virtual machine instance, where the data is loaded into memory and analysed. The data can be accessed from any web browser or remote desktop connection to the virtual machine.
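The streaming step described above can be pictured with a short sketch. It is not Atlas's or Terra's actual code, and it calls no real Google Cloud API. The function name `stream_upload`, the `CHUNK_SIZE` value and the in-memory `sink` standing in for the remote storage object are all assumptions made for illustration. The idea it shows is only that a file can be sent in fixed-size chunks, with a checksum computed along the way, so the whole file never has to sit in local memory.

```python
import hashlib
import io
from typing import BinaryIO, Tuple

# Hypothetical chunk size, chosen only for illustration.
CHUNK_SIZE = 256 * 1024


def stream_upload(source: BinaryIO, sink: BinaryIO,
                  chunk_size: int = CHUNK_SIZE) -> Tuple[int, str]:
    """Copy source to sink one chunk at a time.

    Returns the number of bytes sent and the SHA-256 hex digest of the data,
    which a receiver could use to verify the transfer.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    digest = hashlib.sha256()
    sent = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes object signals end of stream
            break
        sink.write(chunk)
        digest.update(chunk)
        sent += len(chunk)
    return sent, digest.hexdigest()


if __name__ == "__main__":
    payload = b"ACGT" * 100_000   # stand-in for a sequence file
    remote = io.BytesIO()         # stand-in for the remote storage object
    sent, checksum = stream_upload(io.BytesIO(payload), remote,
                                   chunk_size=64 * 1024)
    print(sent, checksum == hashlib.sha256(payload).hexdigest())
```

In a real deployment the `sink` would be whatever writable upload stream the storage client library provides. Only the chunked loop and the running checksum are the point here.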

