CCC Data Public Documentation

CCC Data Program Overview

As part of the Data Management Grant from the California Community Colleges Chancellor's Office, CCC Data provides the infrastructure the California Community College system needs to aggregate data from disparate systems into an enterprise data warehouse.

Access to these data is provided to institutional researchers, college and district administrators, and other decision makers at the 116 California Community Colleges, district offices, and the Chancellor’s Office, where these critical data may be used to support instructional and institutional decision-making aligned with the Chancellor's Vision for Success.


Data Warehouse Resources

Data Available in the CCC Data Warehouse

 

CCC Data Overview

Infographic showing the Data Pipelines, CCC Data Warehouse, CCC Data Lake, and Report Server, with bullet points on what each component provides. The bullet points are reproduced below.

Data Pipelines

  • Several proven methods are available to integrate data with CCC Data

  • Extract, transform, and load (ETL) data

  • Runs nightly (or more frequently)

CCC Data Lake

  • Collects data, preserving changes

  • Used to power Data Warehouse, Data-Marts, and analytics

  • Built on Amazon S3

CCC Data Warehouse

  • Structured source of master data segmented by MIS code

  • Connects data to generate reports and analytics for end users

  • Built on Amazon Redshift

Report Server (Optional)

  • Researchers can connect their own tools or use the Report Server

  • Business intelligence tool

  • Provides access to data in the Data Warehouse

  • Built on TIBCO Jaspersoft

 

Architecture Overview

The CCC Data project is an AWS-centric, cloud-based solution that stores and structures data sets from the CCC data sources. Once stored, the data can be quickly discovered, retrieved, and used for analytics, reporting, or data mining.

Diagram showing the source databases feeding into the Data Warehouse, which outputs to the CCC Report Server and to Researcher DW Direct Connect. A full text description follows in the steps below.

Step 1: Data Source Integration

The process begins with the ingestion of data from multiple primary sources. These sources include:

  • CCC Apply: Applications, international students, and Promise Grant data.

  • MMPS: Multiple Measures Placement Service.

  • My Path: Student portal data.

  • COCI: Curriculum Inventory.

  • C-ID: Course Identification numbering system.

  • MIS: Management Information Systems.

  • Canvas: Learning Management System data (one per college).

Step 2: Data Processing (Data Pipelines)

All source data is fed into the Data Pipelines, powered by SuperGlue. This stage involves the extraction, transformation, and loading (ETL) processes necessary to move data from the sources into the cloud environment.
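The extract, transform, and load stages described above can be pictured with a minimal sketch. This is not SuperGlue's implementation, which is not documented here; the function names, the sample records, and the in-memory list standing in for cloud storage are all hypothetical and chosen only to illustrate the ETL pattern.

```python
from datetime import datetime, timezone


def extract(source_rows):
    """Extract: pull raw records from a source system (here, an in-memory list)."""
    return list(source_rows)


def transform(rows, source_name):
    """Transform: normalize field names and tag each record with its origin."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    return [
        {
            **{key.strip().lower(): value for key, value in row.items()},
            "source": source_name,
            "loaded_at": loaded_at,
        }
        for row in rows
    ]


def load(rows, destination):
    """Load: append records to the destination and report how many were written."""
    destination.extend(rows)
    return len(rows)


# Hypothetical sample records; the real source schemas are not shown in this document.
sample_rows = [{" App_ID ": 1, "College": "A"}, {" App_ID ": 2, "College": "B"}]
landing_zone = []  # stands in for cloud storage
loaded_count = load(transform(extract(sample_rows), "ccc_apply"), landing_zone)
print(loaded_count, sorted(landing_zone[0]))
```

Keeping the three stages as separate functions mirrors how a pipeline of this kind can be rerun or rescheduled one stage at a time, for example on a nightly cadence.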

Step 3: Storage (CCC Data Lake)

The processed data first lands in the CCC Data Lake, which is built on Amazon S3. The Data Lake acts as a centralized repository for raw or lightly processed data at scale.

Step 4: Refinement and Warehouse Loading

From the Data Lake, the data follows two paths:

  • Primary Path: Data is loaded into the CCC Data Warehouse, powered by Amazon Redshift, for high-performance structured querying.

  • Virtual Path (Redshift Spectrum): A secondary path lets users query data directly in the S3 Data Lake with Redshift Spectrum, without first loading it into the warehouse tables.
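From the analyst's side, the two paths above differ mainly in which schema a query names: Redshift Spectrum exposes data in S3 as external tables under an external schema, and those tables are queried with the same SELECT syntax as regular warehouse tables. The sketch below only builds the SQL text. The schema, table, and column names are hypothetical placeholders, not the actual CCC Data Warehouse objects.

```python
def build_query(schema, table, columns, limit=10):
    """Return a simple SELECT statement against schema.table."""
    column_list = ", ".join(columns)
    return f"SELECT {column_list} FROM {schema}.{table} LIMIT {limit};"


# Hypothetical names, used only for illustration.
WAREHOUSE_SCHEMA = "warehouse"     # regular tables loaded into Redshift
LAKE_EXTERNAL_SCHEMA = "lake_ext"  # external schema mapped to the S3 Data Lake

columns = ["app_id", "college"]
primary_sql = build_query(WAREHOUSE_SCHEMA, "applications", columns)
spectrum_sql = build_query(LAKE_EXTERNAL_SCHEMA, "applications", columns)

print(primary_sql)
print(spectrum_sql)
```

Because only the schema name changes, a query written against the warehouse can usually be pointed at the Data Lake with little rework, provided a matching external table has been defined.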

Step 5: Data Consumption and Analytics

The final stage involves delivering the data to the end-users through two main avenues:

  • Reporting: An optional CCC Report Server (Jasper) generates formatted reports and dashboards.

  • Direct Access: Researchers and analysts can use Researcher DW Direct Connect via ODBC/JDBC protocols to access the Data Warehouse or Spectrum directly for custom analysis.
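For the direct-connect option, most SQL clients ask for a JDBC connection URL. Amazon Redshift JDBC URLs follow the form jdbc:redshift://host:port/database, with 5439 as the default Redshift port. The helper below is a small sketch of that format; the host and database names are placeholders, and the actual endpoint and credentials come from CCC Data support rather than from this page.

```python
def redshift_jdbc_url(host, database, port=5439):
    """Build a Redshift JDBC URL of the form jdbc:redshift://host:port/database."""
    if not host or not database:
        raise ValueError("host and database are both required")
    return f"jdbc:redshift://{host}:{port}/{database}"


# Placeholder values; substitute the endpoint and database issued to you.
example_url = redshift_jdbc_url("example-cluster.example.com", "exampledb")
print(example_url)
```

Credentials are deliberately left out of the URL here; they are typically entered separately in the client's connection dialog.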


 

Support

Need help? 

  • Contact CCCTC Support to submit a support request

  • Or visit the CCC Data support site for questions and answers about the CCC Data Warehouse, the DW Report Server, and DW Direct Connect.

News and Information

CCC Data awarded the 2020 Best of California Award for “Best Application Serving an Agency’s Business Needs.”