|
Intersegmental need: build a module into CCCApply that interfaces with the California Department of Education's (CDE) SSID database to pull SSID data from the K-12 world and populate it into ours (SB1298 requires this). SSID data would be stored in the Apply database, and various data from the application would be passed back to CDE.
Scope of Data Sharing
The data sharing exchange contemplated in this Scope of Data Sharing shall be referred to as the “SSID Data Lookup.” The CCCCO will use its online system “CCCApply” to connect to the CDE’s master look-up service in the CDE’s statewide data management system, the California Longitudinal Pupil Achievement Data System (CALPADS). When students input certain personally identifiable data elements (detailed below) into CCCApply, CALPADS will then attempt to match the information with the K-12 data and, if a match is found, the CDE will share back with the CCCCO the student’s K-12 Statewide Student Identification Number (SSID). Once the student completes enrollment with CCCApply, the CCCCO will share back with the CDE verification of the student’s completed enrollment.
This exchange of data will allow the CDE and the CCCCO to be able to create a way to link information on students that are or were enrolled in the CDE’s CALPADS system with the information on students enrolled in the California Community College system. While the actual data elements to be shared for now are only those data elements set forth in Section III, it is contemplated that the parties will amend this Scope of Data Sharing in the future to allow the CDE and the CCCCO to share additional data elements between the agencies.
Justification for Data Sharing
The exchange of the data is the first step in being able to conduct cross-sector data sharing between the CDE and the CCCCO. This will allow both agencies to be able to audit or evaluate particular federal and/or state-supported education programs and/or to enforce or comply with federal legal requirements related to those programs.
Specifically, the CDE requires the linkage between the two data systems in order to prepare and conduct the following work:
Similarly, the CCCCO requires the linkage between the two data systems in order to prepare to conduct the following work: (set forth the specific reporting requirements that it needs to meet pursuant to meet federal legal requirements or audits or evaluations that it needs to conduct, etc. citing specific statutory requirements where applicable[JI1] )
# | Title | Importance | Notes |
---|---|---|---|
1 | Interface with State API | Must Have | |
2 | New SSID field will be stored in Apply Submitted apps database | | |
Data Elements to be Shared
For now, the CCCCO will share the following information with the CDE, as provided by students applying through the CCCCO’s common electronic application, CCCApply:
The CDE will then share back with the CCCCO the following matched information, if available:
The CCCCO will then share the following information with the CDE.
[JI1]To CCCCO: please provide a definition of your version of student enrollment data. Our joint technical solution will be forthcoming soon.
Table 1. API Input Parameters
These are the fields that CCCApply will pass to the "SSID" API.
Data Field | Description | Data Type | Length | Format | Required |
SSID | SSID | String | 10 | | |
CCCID | CCC Identification Number | String | 7 | | X |
First Name | Student’s First Name | String | 30 | | X |
Last Name | Student’s Last Name | String | 50 | | X |
Birth Date | Student’s Birthdate | String | 10 | YYYY-MM-DD | X |
CDS Code | High School CDS Code of Attendance | String | 14 | | X |
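As a rough illustration of Table 1, the sketch below checks a set of input values against the listed lengths, the YYYY-MM-DD birth date format, and the Required column before anything is sent to the API. The parameter names follow those used in the business rules (FirstName, LastName, BirthDate, and so on). The function name `validate_ssid_lookup_input`, the key `CDSCode`, and the reading of Length as a maximum are assumptions made for illustration, not part of the agreed specification.

```python
import re
from datetime import datetime

# (maximum length, required) per Table 1.
# Treating Length as a maximum rather than an exact length is an assumption.
FIELD_SPECS = {
    "SSID": (10, False),
    "CCCID": (7, True),
    "FirstName": (30, True),
    "LastName": (50, True),
    "BirthDate": (10, True),
    "CDSCode": (14, True),
}


def _is_iso_date(text: str) -> bool:
    """True only for a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def validate_ssid_lookup_input(params: dict) -> list:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    for name, (max_len, required) in FIELD_SPECS.items():
        value = params.get(name) or ""
        if not value:
            if required:
                problems.append(f"{name} is required")
            continue
        if len(value) > max_len:
            problems.append(f"{name} exceeds {max_len} characters")
    birth_date = params.get("BirthDate") or ""
    if birth_date and not _is_iso_date(birth_date):
        problems.append("BirthDate must be YYYY-MM-DD")
    return problems


# Made-up sample values, for illustration only.
sample = {
    "CCCID": "ABC1234",
    "FirstName": "Maria",
    "LastName": "Lopez",
    "BirthDate": "1998-04-17",
    "CDSCode": "01234567890123",
}
print(validate_ssid_lookup_input(sample))
```

Because SSID is the only optional field, omitting it from the input does not produce a problem entry.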
Requirement 1: Develop two new downloadable data fields to store in Apply submitted applications database:
<SSID> - State Student Identification Code
<> = Full HS CDS Code
Requirement 2: Develop data specifications for new data fields. Add to Standard Application Data Dictionary.
Requirement 3: Create JIRAs for adding new fields to Administrator & Report Center.
Requirement 4: Add fields to Standard Download Client
Table 2. API Output Parameters
These are the fields that will be passed back from the "SSID" API and stored in Apply database.
Data Field | Description | Data Type | Length | Format |
SSID | SSID | String | 10 | |
CCCID | CCC Identification Number | String | 7 | |
Requirement 5: Ensure <SSID> can be stored in Submitted Applications database.
Linking SSID and CCCApply Through Web Look-up Service
Business Rules for Technical Solution
1. Student submits an application for admission.
a) Application is implied consent to permit CDE to share student data with CCCCO in order to evaluate the efficacy of educational programs.
2. CCCApply calls a CDE RESTful web service, passing FirstName, LastName, BirthDate, HighSchoolCDScode, CCCID, and (optionally) SSID. NOTE: The process uses an exact match on all required fields.
Input Definitions
Data Field | Description | Data Type | Length | Format |
SSID | Statewide Student Identification Number | String | 10 | |
CCCID | California Community College Identification Number | String | 7 | |
FirstName | Student’s First Name | String | 30 | |
LastName | Student’s Last Name | String | 50 | |
BirthDate | Student’s Birthdate | String (see note a) | 10 | yyyy-mm-dd |
CDSCode | Student’s County-District-School Code | String | 14 | |
a) BirthDate: SQL Server Date datatype, but for the purposes of an HTTP input, defined as a string.
3. CDE uses the data to match the student.
a) CDE stores the CCCID with their student record (if match is found).
b) If a match is found, CDE returns the SSID and CCCID to CCCApply via the web service. Otherwise, an HTTP 400 response is returned with a “No student found” message.
Output Definitions
Data Field | Description | Data Type | Length | Format |
SSID | Statewide Student Identification Number | String | 10 | |
CCCID | California Community College Identifier | String | 7 | |
4. CCCApply stores the SSID along with the CCCApply application data as a field that is downloadable by the colleges.
5. At a later date, CCCCO provides enrollment data to CDE and CDE can use CCCID and SSID for matching. |
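The business rules above can be sketched as a small lookup flow. This is only an outline under stated assumptions: the function names, the JSON-like response shape, the use of HTTP 200 for a successful match, and the `ssid` storage key are hypothetical, and the CDE service is replaced by an in-memory stand-in so the example runs without a network connection. The sample student values are made up.

```python
from typing import Callable, Optional, Tuple

# A transport takes the request parameters and returns (HTTP status, parsed body).
Transport = Callable[[dict], Tuple[int, dict]]


def lookup_ssid(params: dict, send: Transport) -> Optional[dict]:
    """Rules 2-3: call the lookup service; return SSID and CCCID, or None if unmatched."""
    status, body = send(params)
    if status == 200:  # assumed success status
        return {"SSID": body["SSID"], "CCCID": body["CCCID"]}
    if status == 400:
        # Rule 3b: HTTP 400 with a "No student found" message means no match.
        return None
    raise RuntimeError(f"unexpected status {status} from SSID lookup")


def apply_lookup_result(application: dict, result: Optional[dict]) -> dict:
    """Rule 4: store the SSID with the application; leave it empty when unmatched."""
    updated = dict(application)
    updated["ssid"] = result["SSID"] if result else None
    return updated


def fake_cde_service(params: dict) -> Tuple[int, dict]:
    """Stand-in for the CDE web service: exact match on all required fields."""
    known = {("Maria", "Lopez", "1998-04-17", "01234567890123"): "1234567890"}
    key = (
        params["FirstName"],
        params["LastName"],
        params["BirthDate"],
        params["HighSchoolCDScode"],
    )
    if key in known:
        return 200, {"SSID": known[key], "CCCID": params["CCCID"]}
    return 400, {"message": "No student found"}


request = {
    "FirstName": "Maria",
    "LastName": "Lopez",
    "BirthDate": "1998-04-17",
    "HighSchoolCDScode": "01234567890123",
    "CCCID": "ABC1234",
}
matched = apply_lookup_result({"app_id": 1}, lookup_ssid(request, fake_cde_service))
unmatched = apply_lookup_result(
    {"app_id": 2}, lookup_ssid(dict(request, FirstName="Mari"), fake_cde_service)
)
print(matched, unmatched)
```

Passing the transport in as a parameter keeps the matching and storage logic testable independently of whatever HTTP client and endpoint are eventually chosen.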
FAQs
ANSWER: In our initial discussions with Tim and his staff, we decided that we will do an exact match using the data provided. After we have gone through a cycle, we would take a look at the match rates and see if anything should be modified. FYI, I had one of my staff do a quick analysis of CALPADS data. We found that with the 12M+ student records that currently exist in CALPADS, 98.081% of the records have a unique combination of first name, last name and DOB. If you change the combination to first initial, last name and DOB, 93.789% of the records are unique. Of course, this does not take into account slight variations in spelling of names or misspellings, which leads us to the next question.
ANSWER: As of now, we will only be looking at a perfect match. We haven’t had any discussion with Tim and his staff on looking for partial matches or variations in names. With our first cut, we are not using any built-in or third-party phonetic algorithms (e.g. Soundex), but we can build one into the solution. Per our initial discussions, we will go through an exact match process first, review its accuracy, and then make adjustments. Making modifications to the matching process will be very straightforward.
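For readers unfamiliar with phonetic matching, here is a minimal sketch of the standard American Soundex rules, showing how differently spelled names can collapse to the same four-character code. It is illustrative only: the CDE solution does not currently use Soundex, and a database's built-in implementation (such as SQL Server's SOUNDEX function) may differ from this sketch in edge cases.

```python
# Letter-to-digit groups used by standard American Soundex.
_CODES = {}
for _letters, _digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code, or "" if the name has no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W are skipped and do not separate duplicate codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code after a vowel is kept
        if len(result) == 4:
            break
    return result.ljust(4, "0")


print(soundex("Smith"), soundex("Smyth"))  # S530 S530
```

Two spellings that share a code would be treated as a candidate match, which is also why looser criteria raise the chance of false positives.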
ANSWER: I don’t know if we would get a better match, especially if we get multiple matches. Below is the result for the first initial match. If we want to look at increasing the potential match rate, CALPADS has alias fields that can be used to match against. Even with exact matches, this should increase the percentage. We should probably have a discussion on the details of the match and what changes we want to introduce. If we use Soundex (SQL Server), then we will be able to introduce phonetic matching to the names, which should increase the match rate. Keep in mind that as we get a little “fuzzier” on our criteria, we will be increasing the odds of getting false positives.
First Initial + Last Name + Birthdate
Result Count | Name Count | % of total | Running % |
1 | 11080627 | 93.789% | 93.789% |
2 | 603303 | 5.106% | 98.896% |
3 | 92121 | 0.780% | 99.675% |
4 | 25139 | 0.213% | 99.888% |
5 | 8504 | 0.072% | 99.960% |
6 | 3072 | 0.026% | 99.986% |
7 | 1112 | 0.009% | 99.995% |
8 | 363 | 0.003% | 99.999% |
9 | 119 | 0.001% | 100.000% |
10 | 44 | 0.000% | 100.000% |
11 | 6 | 0.000% | 100.000% |
12 | 5 | 0.000% | 100.000% |
13 | 1 | 0.000% | 100.000% |
14 | 1 | 0.000% | 100.000% |
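The percentage columns in the table above can be reproduced from the Name Count column alone. The short sketch below does that arithmetic, reading each row as "this many first-initial, last-name, and birthdate combinations returned this many records" (an interpretation of the column headings, not something the analysis states explicitly).

```python
# Name Count column from the table, indexed by Result Count 1..14.
NAME_COUNTS = [11080627, 603303, 92121, 25139, 8504, 3072, 1112, 363, 119, 44, 6, 5, 1, 1]


def distribution(counts):
    """Return (total, rows), where each row is (result_count, n, pct, running_pct)."""
    total = sum(counts)
    running = 0
    rows = []
    for result_count, n in enumerate(counts, start=1):
        running += n
        rows.append((result_count, n, 100 * n / total, 100 * running / total))
    return total, rows


total, rows = distribution(NAME_COUNTS)
for result_count, n, pct, running_pct in rows[:3]:
    print(f"{result_count} | {n} | {pct:.3f}% | {running_pct:.3f}%")
```

The counts sum to 11,814,417, and the first row works out to 93.789%, consistent with the figure quoted in the first answer.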
The new fields being created for this work are:
<ssid>
<hs_cds_full>
<col[0]_cds_full>
The <ssid> is a RESTRICTED field and will NOT be added to the College Download Client; the other two CDS fields will be downloadable.
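One way to express the restricted/downloadable split is a per-field flag that the download layer consults. The sketch below is hypothetical: the `restricted` flag and the `downloadable_fields` helper are illustrative names, not part of the existing Download Client.

```python
# New fields and whether each is restricted, per the note above.
NEW_FIELDS = {
    "ssid": {"restricted": True},
    "hs_cds_full": {"restricted": False},
    "col[0]_cds_full": {"restricted": False},
}


def downloadable_fields(fields: dict) -> list:
    """Return the names of fields that may be exposed in the college download."""
    return [name for name, spec in fields.items() if not spec["restricted"]]


print(downloadable_fields(NEW_FIELDS))
```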
2/25/15: Santa Rosa JC has inquired about the status of this via email. Per Tim, add to the next Steering Meeting (April 2015) agenda to revisit.
<< Per Tim's email response to SRJC: This has come up. The SSID has been assigned to students who attended HS since about 2006.
CSU requests this field as an option in CSU Mentor. The problem they experience is bad or missing data, because there is no verification mechanism and students don't know what the number is or where to find it.
CCCCO has been in negotiations with CDE for them to stand up a web service that we could use to match students against their CALPADS database and return the SSID to us, but so far they have not been able to get this to be a priority.
Since bad data may be worse than no data, the last time it came up at CCCApply Steering, it was tabled.>>