Item

Notes

1

Discuss the prioritization rubric: CCC Data: Data Source Prioritization - Draft

  • Barney Gomez asked a clarifying question about the intent of the data source list: "What we believe or what you guys believe to be some sort of rubric around accessing the data?"

  • Discussed that the prioritization rubric is meant to define which data sources are a priority for the members of the Data Warehouse advisory group. Those sources would then be subject to CO approval and data governance before we move forward to bring them into the CCC Data Warehouse.

  • Clarification that the sources on the spreadsheet are not currently available in the data lake; the list reflects data that the workgroup would like to make available.

  • Decision: The CCC Data: Data Source Prioritization spreadsheet was adjusted to include a column for "level of access required", so that this information can be captured in the MOUs necessary to bring this data into the DW and share it with colleges/districts.

  • As we prioritize the data sets, we will capture the level of access that the community is looking for.

  • Discussed the issue of data quality.

  • Decision: Adjust the draft prioritization rubric to remove data quality from the scoring for prioritization, instead addressing data quality as a separate issue from priority.

  • Discussion on how best to leverage and share the IR community's knowledge of working with data elements that are newer or seen as poor quality, along with conversation on the need for a staging area or vetting place where people with expertise can work with the data.

  • Barney Gomez introduced the MDM program as a strategy and tool that would be extremely important for the integrity of this data.

  • Discussed splitting out data sources in the spreadsheet into separate rows for raw and processed data. For example, CASAS TOPS PRO data may exist as raw data, which is then used by two initiatives and then placed in the WestEd LaunchBoard. Alex Jackl voiced the opinion that the minute you derive values, you create a different data set. The IR community is regularly asked to recreate calculations and is looking for raw data as well as calculated data. Alex Jackl spoke to the data harmonization work and the need for dictionaries that document the derived values or metrics.

Lou spoke of the Data Lake being used to collect data regardless of quality, with quality addressed before moving data to the DW or data marts, where it needs to be high quality and usable. Lou also spoke of different technologies that may be leveraged, which address different technical needs based on the needs identified, including the potential for AI to help address data quality.

We ended the meeting by identifying two needs to address:

  1. Identifying and prioritizing the data sources that should be brought into the DL and DW. This is the exercise we have been working on through the data sources spreadsheet and prioritization rubric; and

  2. Identifying how these data would be used, in order to address issues of data quality and identify the appropriate applications to be used to access these data.

There was a request to address Change Management as an agenda item on the next call.

Barney noted that additional resources are being brought on board to support MIS work, and that these can be directed as additional bandwidth to support this work. Barney may discuss these resources at the next meeting.

Proposed agenda for September

  1. Prioritization of proposed data sources

  2. Discussion on how prioritized data sources are to be used

  3. Discuss Change Management
