Data Warehouse Advisory Group: 08-10-20


Meeting Details

Meeting Date:

Aug 10, 2020


Data Warehouse Advisory Group

Zoom Recording:


Password: u5@%xK86


Crystal Hernandez, Mark Cohen, Steve Klein, Denice Inciong, Alex Jackl, Craig Hayward, Dulce Delgadillo, Dustin Tamashiro, Jake Kevari, Jenni Allen, Louis Delzompo, Z Reisz, Barney Gomez







Discuss the prioritization rubric: CCC Data: Data Source Prioritization - Draft and the List of current and proposed data sources

  • Barney Gomez asked a clarifying question about the intent of the data source list. The group discussed that the proposed data sources spreadsheet is intended to collect the data sources that the Advisory Group (and other stakeholders) have expressed interest in accessing through the Data Warehouse.

  • Discussed that the prioritization rubric is meant to define which data sources are a priority to the members of the Data Warehouse Advisory Group; those sources would then be subject to CO approval and data governance before we move forward to bring them into the CCC Data Warehouse.

  • Clarification that the sources on the second tab of the spreadsheet are not currently available in the data lake or data warehouse; the tab reflects data that the workgroup would like to make available.

  • Decision: CCC Data: Data Source Prioritization spreadsheet adjusted to include a column for "level of access required", so that this information can be captured in the MOUs necessary to bring this data into the DW and share it with colleges/districts. As we prioritize the data sets, we will capture the level of access that the community is looking for.

  • Discussed the issue of data quality. Decision: adjust the draft prioritization rubric to remove data quality from the scoring; data quality will be addressed as a separate issue from prioritization.

  • Discussed how best to leverage and share the IR community's knowledge of working with data elements that are newer or seen as poor quality, along with the need for a staging area or vetting place where people with expertise can work with the data.

  • Barney Gomez introduced the MDM program as a strategy and tool that would be extremely important for the integrity of these data.

  • Discussed splitting out data sources in the spreadsheet so that raw and processed data appear on separate rows. An example is CASAS TOPS PRO data, which may exist as raw data, is then used by two initiatives, and is then placed in the WestEd Launchboard. Alex Jackl voiced an opinion that the minute you derive values you create a different data set. The IR community is regularly asked to recreate calculations and is looking for raw data as well as calculated data. Alex Jackl spoke to the data harmonization work and the need for dictionaries that document the derived values or metrics.

  • Lou Delzompo described the Data Lake as the place to collect data regardless of quality, with quality to be addressed before data moves to the Data Warehouse or Data Marts, where these data need to be high quality and usable. Lou also spoke of different technologies that may be leveraged, each addressing a different technical need based on the needs identified, including the potential for AI to help address data quality.

  • Barney Gomez discussed additional resources being brought on board to support MIS work, which may provide additional bandwidth to support this work. Barney may discuss these resources at the next meeting.
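
For illustration only, the ranking exercise described in the notes above could be tallied along the following lines. This is a minimal sketch, not the group's actual rubric: the criterion names ("demand", "feasibility"), the access-level values, the source names, and the rating scale are all hypothetical placeholders. It reflects two decisions recorded above: "level of access required" is captured per source, and data quality is kept as a separate note that does not enter the score.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ProposedSource:
    name: str                  # placeholder name, not a real data source
    access_level: str          # "level of access required" column (values are assumed)
    quality_notes: str = ""    # tracked separately; never part of the score
    # criterion name -> list of ratings from advisory group members (hypothetical criteria)
    scores: dict[str, list[int]] = field(default_factory=dict)

    def priority(self) -> float:
        """Average of the per-criterion average ratings."""
        return mean(mean(ratings) for ratings in self.scores.values())


def rank(sources: list[ProposedSource]) -> list[ProposedSource]:
    """Return sources ordered from highest to lowest priority."""
    return sorted(sources, key=lambda s: s.priority(), reverse=True)


sources = [
    ProposedSource("Source A", "aggregate", "newer elements, needs vetting",
                   {"demand": [5, 4], "feasibility": [3, 3]}),
    ProposedSource("Source B", "unit record", "",
                   {"demand": [3, 2], "feasibility": [4, 4]}),
]

for s in rank(sources):
    print(f"{s.name}: priority={s.priority():.2f}, access={s.access_level}")
```

Keeping the quality notes outside `priority()` means a source the group wants badly is not pushed down the list just because its data still needs vetting.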

Proposed agenda for September

  1. Prioritization of proposed data sources

  2. Discussion on how prioritized data sources are to be used

  3. Discuss Change Management


Issues/Questions Resolved



Date Resolved/Answered







Barney Gomez asked a clarifying question about the intent for the prioritization rubric.

Prioritization rubric was developed to define which data sources are a priority to the members of the Data Warehouse advisory group, which would then be subject to CO approval and data governance before we would move forward to bring them into the CCC Data Warehouse.

Aug 10, 2020

Steve Klein, Mark Cohen


Barney Gomez asked a clarifying question about the intent for the proposed data sources spreadsheet.

Clarification that the sources on the spreadsheet are not currently available in the data lake; the spreadsheet reflects data that the advisory group would like to make available.

Aug 10, 2020

Steve Klein, Mark Cohen


Action Items/Next Steps








Identify and prioritize the data sources that should be brought into the DL and DW. This is the exercise we have been working on through the data sources spreadsheet and prioritization rubric.

Mark Cohen is working on a survey format to collect ranking of proposed data sources using the prioritization rubric.

DW Advisory Group


Identify how these data would be used, in order to address issues of data quality and identify the appropriate applications to be used to access these data.

This will be an ongoing conversation to identify how these data will be used and whether there are data issues to be resolved prior to making data available in the data warehouse.

DW Advisory Group


Request to address Change Management as an agenda item on the next call.

Added to the Data Warehouse Advisory Group: 09-14-20 agenda.

Mark Cohen, Crystal Hernandez