Yes, it does.

Item

Notes

1

Discuss the prioritization rubric: CCC Data: Data Source Prioritization - Draft and the List of current and proposed data sources

Issue/Question

Resolution/Answer

Date Resolved/Answered

Owner

1

Dulce Delgadillo:

  •  Does the CCCApply application data include CCCApply Noncredit application data as well?

2

Do we know which data will be included for the Canvas Data?

We do. Canvas has an API that we are leveraging. We will use an API key provided by an administrator at the local college to pull the data set into the data warehouse. It is a very robust data set, and we have a data dictionary for the Canvas data that gives an idea of what is there.

Canvas_Data_Domains 20170801.xls

3

For Canvas Data, our understanding is that all that is needed is a Canvas API key. Is any additional effort needed?

All that is needed is the Canvas API key; no additional effort is needed on the college side. Obtaining the key is straightforward.

4

Steve Klein:

Is the Employment Development Department (EDD) Unemployment Insurance Wage File data the same as what the CO currently collects as part of the MIS data?

Dulce Delgadillo:

  • It is matched by Social Security number and is part of the Adult Ed Launchboard, Strong Workforce, and possibly the Student Success Metrics.

Craig Howard:

  • The CO gets a direct match with EDD, but it is not part of MIS.

  • The EDD data shows up in Salary Surfer.

  • But colleges only see the aggregated data.

  • It's already a CO tool.

  • https://salarysurfer.cccco.edu/SalarySurfer.aspx

  • It contains EDD data but the issue is that the MOU does not allow the CCCCO to distribute that EDD data to colleges at a level that is more granular (i.e., appropriate to include in the DW).

5

Mark Cohen:

Should we be tracking Salary Surfer as a potential data set for CCC Data?

Per Advisory feedback, there is no need for it at this time.

6

Dulce Delgadillo:

Can you please send out that data dictionary? That would be great!

Action Item noted. Data Dictionary will be shared via CCC Data Services Program Public Documentation.

Mark Cohen

7

Dulce Delgadillo:

Are we saying that all of this data (data currently listed under Potential Data Sources) will be at a student level?

That is the goal.

Valerie Lundy-Wagner:

That is the goal, but it is part of the data sharing and governance that will need to be worked out.

8

Valerie Lundy-Wagner:

Mark, will you summarize the questions, concerns, and share them with the CO?

We need some documentation of what's being raised.

Perhaps a Google Doc to list those questions.

Action Item noted. DW Advisory Meeting notes will be documented and shared via CCC Data Services Program Public Documentation; all meeting minutes can be found at DW Advisory Meeting Notes, FY 20/21.

Mark Cohen, Crystal Hernandez

9

The unitary data would be at a district level? Because data is submitted at a district level.

Yes, the Student Centered Funding Formula is allocated at the district level.

For multi-college districts, it would be meaningful to have a way to drill down.

That requires college codes; student-enrollment-level records would need a college code.

  • Barney Gomez asked a clarifying question on the intent of the data source list: "What we believe or what you guys believe to be some sort of rubric around accessing the data?" We discussed that the intent of the proposed data sources spreadsheet is to collect the data sources that the Advisory Group (and other stakeholders) have expressed interest in accessing through the Data Warehouse.

  • Discussed that the prioritization rubric was to define which data sources are a priority to the members of the Data Warehouse advisory group, which would then be subject to CO approval and data governance before we would move forward to bring them into the CCC Data Warehouse.

Issues/Questions Resolved

  • Clarification that the sources on the second tab of the spreadsheet are not currently available in the data lake or data warehouse; they reflect data that the workgroup would like to make available.

  • Decision: CCC Data: Data Source Prioritization spreadsheet adjusted to include a column for "level of access required", so that this information can be captured in the MOUs necessary to bring this data into the DW and share it with colleges/districts. As we prioritize the data sets, we will capture the level of access that the community is looking for.

  • Discussed the issue of data quality. Decision: adjust the draft prioritization rubric to remove data quality from the prioritization scoring and instead address data quality as a separate issue.

  • Discussed how best to leverage and share the IR community's knowledge of working with data elements that are newer or seen as poor quality, along with the need for a staging area or vetting place where people with expertise can work with the data.

  • Barney Gomez introduced the MDM program as a strategy and tool that would be extremely important for the integrity of these data.

  • Discussed splitting out data sources in the spreadsheet so that raw and processed data have separate rows. For example, CASAS TOPS PRO data may exist as raw data, which is then used by two initiatives and then placed in the WestEd Launchboard. Alex Jackl voiced an opinion that the minute you derive values you create a different data set. The IR community is regularly asked to recreate calculations and is looking for raw data as well as calculated data. Alex Jackl spoke to the data harmonization work and the need for dictionaries that document the derived values or metrics.

  • Lou Delzompo described the Data Lake as a place to collect data regardless of quality, with quality to be addressed before data moves to the Data Warehouse or Data Marts, where the data needs to be high quality and usable. Lou also spoke of different technologies that may be leveraged to address different technical needs as they are identified, including the potential for AI to help address data quality.

  • Barney Gomez discussed additional resources being brought on board to support MIS work, which may provide additional bandwidth to support this work. Barney may discuss these resources at the next meeting.

Proposed agenda for September

  1. Prioritization of proposed data sources

  2. Discussion on how prioritized data sources are to be used

  3. Discuss Change Management

Issues/Questions Resolved

Issue/Question

Resolution/Answer

Date Resolved/Answered

Owner

1

Barney Gomez asked a clarifying question about the intent for the prioritization rubric.

Prioritization rubric was developed to define which data sources are a priority to the members of the Data Warehouse advisory group, which would then be subject to CO approval and data governance before we would move forward to bring them into the CCC Data Warehouse.

Steve Klein, Mark Cohen

2

Barney Gomez asked a clarifying question about the intent for the proposed data sources spreadsheet.

Clarification that the sources on the spreadsheet are not currently available in the data lake and reflect data that the advisory group would like to make available.

Steve Klein, Mark Cohen

Action Items/Next Steps

Item

Notes

Owner

  •  Send out Data Dictionary to Advisory Group

Action Item noted. DW Advisory Meeting notes will be documented and shared via CCC Data Services Program Public Documentation; all meeting minutes can be found at DW Advisory Meeting Notes, FY 20/21.

Mark Cohen

  •  Create Spreadsheet to monitor questions brought up during Advisory meetings

This is documented above.

1

Identify and prioritize the data sources that should be brought into the DL and DW. This is the exercise we have been working on through the data sources spreadsheet and prioritization rubric.

Mark Cohen is working on a survey format to collect ranking of proposed data sources using the prioritization rubric.

DW Advisory Group

2

Identifying how these data would be used in order to address issues of data quality and identify the appropriate applications to be used to access these data. 

This will be an ongoing conversation to identify how these data will be used and whether there are data issues to be resolved prior to making data available in the data warehouse.

DW Advisory Group

3

Request to address Change Management as an agenda item on the next call.

Added to the Data Warehouse Advisory Group: 09-14-20 agenda.

Mark Cohen, Crystal Hernandez