Summary
I develop cloud-based web applications that help PSC serve its clients’ needs. My work includes building automated data pipelines, wrangling large datasets, creating data warehouses and dashboards, building APIs, and doing full-stack development on cloud infrastructure.
Products
- Meeting Finder: A near-real-time pipeline that lets users find different types of recovery meetings, designed for the Iowa Department of Public Health (IDPH). The technical steps and technologies are:
  - Streaming data 24/7 from the source websites of recovery meetings to a GCP SQL server
  - Writing Python scripts that automatically clean the collected data, including validating addresses with the Google Maps API
  - Visualizing the data in a user-friendly web app
  Available: http://public-science.org/meetingfinder/
  More info: http://public-science.org/meetingfinder/about.php
- Recovery and community well-being resources:
- Transcriber: (Approved demo can be provided in person if desired.)
  It lets users transcribe large audio/video files, then edit the text and export it to different formats. The tool runs on Google Cloud: Compute Engine hosts the application, the Speech-to-Text API handles transcription, and the Cloud Storage API stores the long audio/video files to be transcribed.
- Persona Profile Development Tool: http://public-science.org/persona/ (Session ID: test)
  It is a tool for collaboratively developing personas in PSC’s workshops with stakeholders.
- Text analysis tool: a small automated tool for statistical and sentiment analysis of text.
Other activities
Co-supervised 12 interns for 3 months on a project called Data Science for Public Good (DSPG), building the following:
- DSPG R Package: scraped, cleaned, and harmonized 30+ public datasets on public health and published them in an R package: https://dspg-isu.github.io/DSPG/authors.html
- Scraping tutorials: built a few tutorial videos to teach the interns in the DSPG program: https://www.youtube.com/channel/UCFbTK1kSP0h7QPvbEWdPtNQ