Great Expectations Case Study: Komodo Health | Great Expectations

Komodo Health is a healthcare technology startup that provides data-driven software applications that empower its customers to improve patient care and reduce the burden of disease. The company’s Healthcare Map delivers patient-level insights by dynamically analyzing the broadest array of data across patients, practitioners, and health systems. The engineering team at Komodo Health utilizes a range of data engineering tools, such as Spark, Snowflake, Airflow, Amazon AWS EMR, AWS Glue, etc.


This is a companion discussion topic for the original entry at https://greatexpectations.io/blog/komodo-case-study/

I am curious how the connection from EMR to glue is working in this case.

I also have GE expectations running on EMR but couldn’t use the QueryBatchKwargsGenerator yet. I created a PR for setting up pyspark with Glue:

Hey @Dandandan! It turns out, that was a fair amount of work for @alexsherstinsky to effectively get Great Expectations working with EMR and AWS Glue and it might be a bit much for explaining in the comments here. We’ll write a blogpost at some point to explain how we did it and get into all the nuances. In the meantime, if you have a use case where you need to use Great Expectations with EMR and AWS glue, @alexsherstinsky offered to have a conversation to try to help.