---
title: "Big Data and AI: Collaborative Research and Teaching Initiatives with OpenPOWER"
date: "2020-02-13"
categories:
- "blogs"
tags:
- "ibm"
- "power"
- "hpc"
- "big-data"
- "summit"
- "ai"
- "oak-ridge-national-laboratory"
---
[Arghya Kusum Das](https://www.linkedin.com/in/arghya-kusum-das-567a4761/), Ph.D., Asst. Professor, UW-Platteville
![](images/Blog-Post_2.19.20.png)
In the Department of Computer Science and Software Engineering (CSSE) at the University of Wisconsin-Platteville, I work closely with hardware system designers to improve the quality of the institute's research and teaching.
Recently, I have engaged with the OpenPOWER community to strengthen research efforts and to help build collaborative education platforms.
## **Accelerating Research on POWER**
As a collaborative academic partner with the OpenPOWER Foundation, I have participated in and led sessions at various OpenPOWER Academic workshops. These workshops gave me an opportunity to learn about the features of the OpenPOWER ecosystem and provided great networking opportunities with many research organizations and customers.
As part of this, I submitted a research proposal to [Oak Ridge National Laboratory](https://www.ornl.gov/) for an allocation on the Summit supercomputer to accelerate my research. With this allocation, I focus on accurate, de novo assembly and binning of metagenomic sequences, which can become quite complex when multiple genomes are mixed in the sequence reads. The computation is also challenged by the huge volume of the datasets.
Our assembly pipeline involves two major steps: first, de Bruijn graph-based de novo assembly, and second, binning the assembled genomes into operational taxonomic units using deep learning techniques. In conjunction with large datasets, these deep learning technologies and scientific methods for big data genome analysis demand more compute cycles per processor than ever before. Extreme I/O performance is also required.
The final goal of this project is to accurately assemble terabyte-scale metagenomic datasets by leveraging IBM POWER9 technology along with NVIDIA GPUs and NVLink.
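To give a flavor of the first step of that pipeline, here is a minimal, illustrative sketch of how sequence reads are decomposed into k-mers to form a de Bruijn graph. This is a toy example, not the actual assembly code running on Summit; the reads, the value of k, and the adjacency-list representation are all assumptions for illustration.

```python
# Illustrative sketch only: build a tiny de Bruijn graph from short reads
# by splitting them into k-mers. Real metagenomic assemblers work on
# terabyte-scale read sets and far more sophisticated graph structures.
from collections import defaultdict

def build_de_bruijn_graph(reads, k=4):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])  # edge: prefix -> suffix
    return graph

if __name__ == "__main__":
    # Hypothetical mixed reads standing in for metagenomic input.
    reads = ["ACGTACGA", "CGTACGAT"]
    for node, neighbors in sorted(build_de_bruijn_graph(reads, k=4).items()):
        print(node, "->", neighbors)
```

Traversing such a graph yields candidate contigs, which the second step then bins into operational taxonomic units.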
## **Building a Collaborative Future**
One of our collaborative visions is to spread HPC education to meet the worldwide need for experts in these fields. As part of this vision, I recognized the importance of online education and started working on a pilot project to develop an innovative, online course curriculum for these cutting-edge domains of technology.
To further these visions, I'm also working on a collaborative, online education platform where students can not only attend lectures and deepen their theoretical knowledge, but also get hands-on experience with cutting-edge infrastructure.
I'm interested in collaborating with bright minds, including faculty, students, and professionals, to realize this online education goal.
## **Future Workshops and Hackathons**
As part of this collaborative initiative, I plan to organize big data workshops and hackathons, which will provide a forum for disseminating the latest research as well as a platform for students to get hands-on learning and engage in practical discussion about big data and AI-based technologies.
The first of these planned events is the OpenPOWER Big Data and AI Workshop taking place on April 7, 2020. Attendees will hear about IBM and OpenPOWER partnerships and cutting-edge research on big data, AI, and HPC, including outreach, industry research, and other initiatives.
You can register for the workshop [**here**](https://www.uwplatt.edu/big-data-ai).
Can't wait to see you there!