Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 45 Next »

This document provides information regarding establishing direct access (ODBC/JDBC) connection to the CCC Data Warehouse for authorized CCC colleges and districts.

Contents


Overview

CCC Data Warehouse is developed by the CCC Technology Center in coordination with and at the direction of the CCC Chancellor's Office.  

A part of the Data Services Program initiative from the California Community Colleges Chancellor's Office, CCC Data Warehouse provides the necessary infrastructure to the California Community College System to aggregate data across disparate systems to an enterprise data warehouse. 

Secure access to the data is provided to institutional researchers, college and district administrators, and other decision makers across the California Community Colleges, district offices, and the Chancellor’s Office, where these critical data may be used to support instructional and institutional decision-making aligned with the Chancellor's Vision for Success

CCC Data Warehouse Direct Access

For authorized Researchers at CCC institutions who want to connect their preferred analytic/business intelligence tools, the CCC (Redshift) Data Warehouse now supports direct connection as an alternate or additional option to the Data Warehouse Report Server.  Once connected to the CCC Data Warehouse via direct access connection, your college will be able to:

  • Connect to the CCC Data Warehouse to accesses available data sources:

    • Connection to occur through a site-to-site VPN.

    • Access through service account login credentials provided.

  • Run SQL queries against available data sources.

  • Connect local analytics/BI application through ODBC or JDBC access.

  • Connect CCC Data to a local data warehouse by setting up these data as external tables within your local data warehouse.

Local applications may include an analytics or business intelligence application such as Tableau or Power BI. 

Note that the CCC Data Warehouse is not to be used as a staging ground for data distribution to your local data warehouse; this usage will not be permitted as it places an undue load and cost to the CCC Data infrastructure

Available Data Sources

Direct connection supports access to the following data sources within the CCC Data Warehouse:

  • CCCApply: Application (Credit and Non-Credit App)

  • CCCApply LGBTQ (AB620)

  • CCCApply: International Application

  • CCCApply: College Promise

  • Multiple Measures Placement Service (MMPS)

  • Canvas Data (Opt-in required) - with 2.1 release

  • Chancellors Office Curriculum Inventory (COCI) - with 2.1 release

  • Course Identification Numbering System (C-ID) - with 2.1 release

  • NOVA (CCCCO Only) - with 2.1 release

  • Launchboard (CCCCO Only) - with 2.1 release

  • MIS (CCCCO Only) - with 2.1 release

Data dictionaries for the above data sources are available on the CCC Data Warehouse documentation site

Learn more about the CCC Data Warehouse Report Server for authorized CCC institutional researchers.

Source Data Set

CCCCO Access

College Access

Requires Opt-In for inclusion

Introduced in Release Version

DW Direct
Connection

DW

Report Server

DW Direct
Connection

DW

Report Server

CCCApply Standard Application

(tick)

(tick)

(tick)

(tick)

CCC Data 2.0

CCCApply Noncredit Application

(tick)

(tick)

(tick)

(tick)

CCC Data 2.0

CCCApply International Application

(tick)

(tick)

(tick)

(tick)

CCC Data 2.0

California College Promise Grant Application

(tick)

(tick)

(tick)

(tick)

CCC Data 2.0

LGBTQ Data (and Report in DW Report Server)

(tick)

(tick)

(tick)

(tick)

CCC Data 2.0

Multiple Measures Placement Service (MMPS)

(tick)

(tick)

(tick)

(tick)

CCC Data 2.0

COCI: Chancellor's Office Curriculum Inventory

(tick)

(tick)

(tick)

(tick)

CCC Data 2.1

C-ID: Course ID Numbering System

(tick)

(tick)

(tick)

(tick)

CCC Data 2.1

CANVAS (Opt-in)

(tick)

(minus)

(tick)

(minus)

(tick)

CCC Data 2.1

MIS

(tick)

(minus)

(minus)

(minus)

CCC Data 2.1

NOVA

(tick)

(minus)

(minus)

(minus)

CCC Data 2.1

Launchboard

(tick)

(minus)

(minus)

(minus)

CCC Data 2.1


Implementation Process

Upon the request of an authorized individual at a college or district office (often the Dean overseeing Institutional Research), the CCC Technology Center (CCCTC) Enabling Services team will facilitate the process of working with the college IT group to establish and confirm a site-to-site VPN connection between the CCC Data Warehouse and the college (or district office).

Requesting Direct Access

The request is to be made of the colleges College Relationship Manager, if you do not know who your CRM is, send an email to crms@ccctechcenter.org.

Process Overview

  • The authorized college or district Researcher will contact their College Relationship Manager (CRM) to request direct access to the CCC Data Warehouse, and

    • College will identify a primary Research contact who will be responsible for the log in credentials

    • College will identify a primary IT/Network Admin contact who will work with Enabling Services to establish the site-to-site connection

  • Enabling Services will facilitate an implementation project

  • Enabling Services will work with the identified IT/Network contact at the college to configure a site-to-site VPN and configure the custom VPN tunnel and provide detailed instructions based on college’s system configuration

    • College will provide required system configuration information, including firewall make, model and version, and IP information

    • Primary contacts will receive account credentials and VPN configuration information via secure process.

  • Enabling Services will provide login credentials to authorized user in secure manner

  • Enabling Services will schedule an Implementation Checklist meeting via Zoom for college IT and CCCTC to confirm successful setup and connection

    • College will confirm successful VPN setup and connection tasks

    • Primary contacts will participate in an implementation checklist meeting with ES and CCCTC (InterVision)

    • College will confirm local connection and successful login, as well as perform optional data validation activities

  • CCCTC will provide online documentation to support data access validation activities for Researchers

Colleges and districts using their own analytics/BI applications are responsible for their own support in the use of these tools.

Account & VPN Configuration Requirements

To facilitate setup of a CCC Data Warehouse Direct Access account and VPN tunnel, the following information must be provided to your Enabling Services engineer during the implementation process:

  • Identify a local IT/Network Admin resource to serve as the primary contact

  • Work with the CCCTC to configure a site-to-site VPN connection  

  • Support your local Researchers working remotely to connect to the VPN 

Data Warehouse Account Credentials

CCCTC Enabling Services will configure the login credentials to the CCC Data Warehouse, which will be provided to the primary IT contact at your college through a secure transmission. 

Note that the login credentials are a Service Account, while they are issued to an individual they are intended to be used to connect applications (such as Tableau) to the CCC Data Warehouse. Protecting these credentials is the responsibility of the college.

Data Access Validation & Testing

Once connected to the CCC Data Warehouse with your own tool, colleges are encouraged to conduct a series of data access tests and activities, which may include:

Query Data Sources

Through local resources, the participant is able to connect to, and run at least one query against each of the available datasets (CCCApply application, CCCApply international application, CCCApply Promise Grant, and Multiple Measures Placement Service).

Connect Data to Local Data Warehouse

If a local data warehouse is available, configure a CCC Data Warehouse table as an “external table” in the local database source. This supports data within the CCC Data Warehouse to be connected to your local data warehouse without the need to copy data to the local machine or district server.

CCCTC Enabling Services Support is available to assist with issues connecting to the CCC Data Warehouse.  


Canvas DW Direct Connect Service

Beginning spring 2021, colleges may request to access their Canvas data through the CCC Data Warehouse via direct access connection (ODBC/JDBC). The ability to access Canvas data will require the college Canvas Administrator to generate the Canvas API credentials and securely pass the values to the CCCTC Enabling Services Implementation Engineer as part of the configuration process. Once received and implemented, the data pipeline will be configured to pass your Canvas data to the CCC Data Warehouse. Following that initial pass, the data will be updated nightly. 

To get started, please send an email requesting Canvas DW Direct Connect access for your college to your College Relationship Manager (CRM) at the CCCTC Enabling Services.

Required: Colleges must be “live” with the basic DW Direct Connect service in order to implement Canvas DW Direct Connect access. If your college has not yet configured site-to-site VPN access with the CCC Data Warehouse, please contact Enabling Services to help you get started. 

Learn more about the Canvas DW Direct Connect service, including the college preparatory requirements before and during the implementation process: How to Obtain & Submit Your Canvas Data API Access Credentials for the CCC Data Warehouse.


Connecting to Redshift Cluster with SQL Workbench

Overview

This document is designed for those looking to gain access to the CCCTC Datawarehouse Redshift cluster databases.

SQL Workbench is a database GUI for logging into many different DBs. This guide will cover logging into the Redshift Database cluster using SQL Workbench.

Prerequisites

You will need a username and password to login to the database itself, as well as the DB Endpoint (DNS) which will be provided in this document. For you to be able to connect to the RedShift DB, you first need to be logged on into the Colleges’ VPN.

Installing SQL Workbench

We will be following similar instructions to the ones on AWS's website to install SQL Workbench, so feel free to refer to those.

https://docs.aws.amazon.com/redshift/latest/mgmt/connecting-using-workbench.html

The following is a bit more in depth on the actual installation steps.

Install Java

Java install documentation can easily be found for all major platforms simply by looking online. Java 8 is compatible with the currently build of SQL Workbench (Build 127), but future builds will require Java 11, so it might be a good idea to install the latter.

Installers for the latest Java can be found at the link below:

https://www.sql-workbench.eu/downloads.html

Download the installer for your platform and run it to initiate the setup/installation wizard.

Download SQL Workbench

You can find the download page for SQL Workbench below:

https://www.sql-workbench.eu/downloads.html

Select the generic package for all systems with the optional libraries. It is probably a good idea to put this somewhere with an informative name that is easy to access on your machine.

Figure 1 Download with Optional Libraries

Unzip the download folder and open the `.jar` file that is inside to open SQL Workbench. If you are on Mac, follow the instructions below to generate an SQL Workbench executable:

https://www.sql-workbench.eu/macos-binary.html

Figure 2 - Select the Jar

Figure 3 - First Time Logging In 

Figure 4 - Make sure Autocommit is checked.

Download Redshift Drivers and Test Connection

You will need to download Redshift DB Drivers to enable SQL Workbench to communicate with Redshift. You can find the drivers for this here:

https://docs.aws.amazon.com/redshift/latest/mgmt/configure-jdbc-connection.html#download-jdbc-driver

If you did not install Java 1.11.118 or later above (Java 11), then you will need to download the drivers that included the dependent libraries for AWS SDK. It doesn’t not hurt to have these, so it might be a good idea to just download that version if you are unsure.

Once you have downloaded and unzipped the Redshift drivers on your machine, go back to your SQL Workbench install and select `Amazon Redshift` from the DB Drivers dropdown. You should see a prompt asking you to configure the driver. Select ‘Yes’.

The ‘Manage Drivers’ menu will come up. Click the folder icon and navigate to the drivers you installed earlier. 

Select the Redshift driver, which will be named like ‘Redshift*.jar’. 

With the drive selected, a sample URL will autofill and several connection fields will need to be filled out. The important fields are:

  • Name: The name for this connector that you will reference later when you want to access this database again. This is for your use only, so make it descriptive enough to where you understand clearly which DB you are accessing. For example, if you are going to query the COCI DB in Pilot, something like `DWH-COCI-Pilot` might be appropriate. Signifying the environment in this name is highly recommended.

  • URL: Where to reach the database. Edit the auto filled string here to match this:

Environment

Endpoint

Pilot

jdbc:redshift://dwh-pilot.ccctechcenter.org:5439/DATABASE

Production

jdbc:redshift://dwh-prod.ccctechcenter.org:5439/DATABASE

* Where DATABASE is the name of the database to which you are trying to connect.

  • Username: the PostgreSQL username you are using to login.

  • Password: the password for the username above. 

Click `Test` in the lower right to verify you have connected successfully. Then click OK.

Application (Data Source)

DB Name

CCCApply Application

application

CC College Promise Grant

bogfw_application

CCC International Application

intl_application

Multiple Measures

mmps

Troubleshooting: If the test fails, the two most common fixes are,

1.Verify you have entered the URL, your username, and password correctly.

2. Make sure you are connected to the Colleges’ VPN.

Making Queries

After the database connection is validated, you will be given the following prompt that will allow you to entry SQL Queries.

A tab labeled “Database Explorer” will allow you view more information about the database that your user has access to. Table names can be found here, as well as some basic additional information about the DB.

Figure 4 - Database Explorer Tab Provides More DB Information

Finally, you can run SQL against objects your DB user has access to.

Figure 5 - SQL Allowed by Your User Can be Run in the Statement Tab

 

Example Query to run on Data Warehouse Data

To get a row count of a table you can run the following query

select count(*) from dw_apply_read_523.contact;

You will need to replace the dw_apply_read_523 with the appropriate schema your user has access do. You can also use the database explorer in SQL workbench to navigate the data.

To Run Queries on Data Warehouse Data

To view the application database tables, run the following query once logged in to the (Redshift) Data Warehouse: 

  1. select schemaname,tablename from SVV_EXTERNAL_TABLES order by tablename;

 All of the read-only table names for your selected application will display, for example:


CCC Data Warehouse Direct Access FAQs

Q: Can I have more than one VPN connection for my district?

A. Not at this time. If this is needed, please discuss with your CRM.

Q: My IT department does not want to create a site-to-site VPN, can I just have an individual VPN connection? 

A. No, the CCC Tech Center is not staffed to handle individual user VPN requests.

Q: Can I have more than one individual at the college (or district office) connect to the VPN at a time?

A: The login credentials are provided as a Service Account and intended to be used by authorized users to connect local resources to the CCC Data Warehouse.

More: See the CCC Data Warehouse Frequently Asked Questions for more information and frequently asked questions.

  • No labels