Item

Notes

1

Discuss the prioritization rubric: CCC Data: Data Source Prioritization - Draft and the List of current and proposed data sources

  • Barney Gomez asked a clarifying question on the intent of the data source list: “What we believe or what you guys believe to be some sort of rubric around accessing the data?” We discussed that the intent of the proposed data sources spreadsheet is to collect the data sources that the Advisory Group (and other stakeholders) have expressed interest in accessing through the Data Warehouse.

  • Discussed that the prioritization rubric was to define which data sources are a priority to the members of the Data Warehouse advisory group, which would then be subject to CO approval and data governance before we would move forward to bring them into the CCC Data Warehouse.

  • Clarification that the sources on (the second tab of) the spreadsheet are not currently available in the data lake or data warehouse; this reflects data that the workgroup would like to make available.

  • Decision: The CCC Data: Data Source Prioritization spreadsheet was adjusted to include a column for "level of access required", so that this information can be captured in the MOUs necessary to bring this data into the DW and share it with colleges/districts. As we prioritize the data sets, we will capture the level of access that the community is looking for.

  • Discussed the issue of data quality. Decision: adjust the draft prioritization rubric to remove data quality from the scoring for prioritization, instead addressing data quality as a separate issue from prioritization.

  • Discussion on how best to leverage and share the IR community's knowledge of working with data elements that are newer or seen as poor quality, along with conversation on the need for a staging area or vetting place where people with expertise can work with the data.

  • Barney Gomez introduced the MDM program as a strategy and tool that would be extremely important for the integrity of these data.

  • Discussed splitting out data sources in the spreadsheet into separate rows for raw and processed data, such as CASAS TOPS PRO data, which may exist as raw data, is then used by two initiatives, and is then placed in the WestEd Launchboard. Alex Jackl voiced an opinion that the minute you derive values you create a different data set. The IR community is regularly asked to recreate calculations and is looking for raw data as well as calculated data. Alex Jackl spoke to the data harmonization work and the need for dictionaries that document the derived values or metrics.

  • Lou Delzompo spoke of the Data Lake being used to collect data regardless of quality, with quality to be addressed before moving data to the Data Warehouse, or Data Marts, where these data need to be high quality and usable. Lou also spoke of different technologies that may be leveraged, which address different technical needs based on the needs identified, including the potential for AI to help address data quality.

  • Barney Gomez discussed additional resources being brought on board to support MIS work, which may be directed as additional bandwidth to support this work. Barney may discuss these resources at the next meeting.

Proposed agenda for September

  1. Prioritization of proposed data sources

  2. Discussion on how prioritized data sources are to be used

  3. Discuss Change Management

Issue/Question

Resolution/Answer

Date Resolved/Answered

Owner

1

Barney Gomez asked a clarifying question about the intent for the prioritization rubric: “What we believe or what you guys believe to be some sort of rubric around accessing the data?”

The prioritization rubric was developed to define which data sources are a priority to the members of the Data Warehouse advisory group, which would then be subject to CO approval and data governance before we would move forward to bring them into the CCC Data Warehouse.

Steve Klein, Mark Cohen

2

Barney Gomez asked a clarifying question about the intent for the proposed data sources spreadsheet:

“So from a source point of view. Are we saying that this, this information is readily available behind the data warehouse, the data lake or are we saying that we can make this available?”

Clarification that the sources on the spreadsheet are not currently available in the data lake, and reflect data that the advisory group would like to make available.

Steve Klein, Mark Cohen

Action Items/Next Steps

Item

Notes

Owner

1

Identify and prioritize the data sources that should be brought into the DL and DW. This is the exercise we have been working on through the data sources spreadsheet and prioritization rubric.

Mark Cohen is working on a survey format to collect ranking of proposed data sources using the prioritization rubric.

DW Advisory Group

2

Identify how these data would be used, in order to address issues of data quality and identify the appropriate applications to be used to access these data.

This will be an ongoing conversation to identify how these data will be used and whether there are data issues to be resolved prior to making data available in the data warehouse.

DW Advisory Group

3

Request to address Change Management as an agenda item on the next call.

Added to the Data Warehouse Advisory Group: 09-14-20 agenda.

Mark Cohen, Crystal Hernandez