Of the several million applications submitted through CCCApply each year, the vast majority are valid, submitted by legitimate applicants who want to attend a California community college. These applications contain personally identifiable data and other critical information that needs to reach the college as quickly and safely as possible. However, for the percentage of applications that are submitted through CCCApply for nefarious purposes with the intent to commit fraud, we've developed a system that analyzes, flags, suspends, and ultimately blocks the fraud attempt through a spam filter web service and user interface.

Development of the spam filter web service and user interface began in early 2017 to help colleges make accurate, informed decisions about whether an application is fraudulent. The tool consists of three main components: the post-submission web service, the machine-learning model and prediction service, and the user interface to review and confirm identified fraud.

This page describes the development project: what it includes and how it operates.

...

With the development of the Spam Filter Web Service, every application is intercepted after submission and routed to the spam filter machine learning model and prediction service to determine whether its data meets the criteria for spam or fraud.

The applications that are legitimate and do not meet the criteria for spam are quickly passed through to the college via their selected data delivery method.

For applications that may be fraudulent, however, the model extracts the data and looks for "identifiers," which are then fed into the machine learning algorithm for full analysis. The prediction service then calculates the probability that the application is bad; in other words, it "suggests a level of confidence" between 1 and 100. The closer the number is to 100, the more likely the application is fraudulent. This is called the Confidence Threshold.
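As a rough illustration of the 1 to 100 scale described above, the sketch below converts a fraud probability into a whole-number score. The function name `confidence_score` and the assumption that the prediction service produces a probability between 0.0 and 1.0 are hypothetical; the actual service may compute and report its score differently.

```python
def confidence_score(probability: float) -> int:
    """Map a fraud probability in [0.0, 1.0] to a whole-number score from 1 to 100.

    Hypothetical helper for illustration only; not the actual prediction service.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0.0 and 1.0")
    # Scale to a percentage, round to the nearest whole number,
    # and clamp to the 1-100 range described in the text.
    return max(1, min(100, round(probability * 100)))


# A higher score means the application is more likely to be fraudulent.
likely_fraud = confidence_score(0.93)
likely_valid = confidence_score(0.04)
```

Under these assumptions, a score near 100 would mark an application as a strong candidate for review, while a score near 1 would let it pass through to the college.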

...

  1. Application is submitted to CCCApply
  2. Application is stored with a fraud status flag set to PENDING
  3. Application is posted to the prediction service, where the model is applied
  4. Prediction service returns the probability rating that the application is fraudulent
  5. Based on the probability rating, the fraud status flag is updated to “Checked Fraud” or “Not Checked Fraud”
  6. Applications set to “Checked Fraud” are sent to the Suspension folder (User Interface) to await confirmation by college staff
  7. College staff confirm fraud labels via User Interface
  8. Application fraud label confirmation trains the machine learning model
  9. Model is refined over time to better identify and filter fraudulent applications
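The workflow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual CCCApply implementation: the class and field names (`SpamFilter`, `Application`, `suspension_folder`, `training_labels`), the `cutoff` value of 0.5, and the `toy_predict` function are all hypothetical. Only the three status values come from the steps listed above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Tuple


class FraudStatus(Enum):
    PENDING = "PENDING"
    CHECKED_FRAUD = "Checked Fraud"
    NOT_CHECKED_FRAUD = "Not Checked Fraud"


@dataclass
class Application:
    app_id: str
    data: Dict[str, str]
    status: FraudStatus = FraudStatus.PENDING  # step 2: stored as PENDING


@dataclass
class SpamFilter:
    # Hypothetical stand-in for the prediction service (steps 3-4).
    predict: Callable[[Dict[str, str]], float]
    cutoff: float = 0.5  # assumed cutoff; the real value is not given in the source
    suspension_folder: List[Application] = field(default_factory=list)
    delivered: List[Application] = field(default_factory=list)
    training_labels: List[Tuple[Dict[str, str], bool]] = field(default_factory=list)

    def submit(self, app: Application) -> FraudStatus:
        probability = self.predict(app.data)  # steps 3-4
        if probability >= self.cutoff:  # step 5: update the status flag
            app.status = FraudStatus.CHECKED_FRAUD
            self.suspension_folder.append(app)  # step 6: hold for staff review
        else:
            app.status = FraudStatus.NOT_CHECKED_FRAUD
            self.delivered.append(app)  # passed through to the college
        return app.status

    def confirm(self, app: Application, is_fraud: bool) -> None:
        # Steps 7-8: staff confirm the label, which becomes training data.
        # Removing the application from the folder here is an assumption.
        self.suspension_folder.remove(app)
        self.training_labels.append((app.data, is_fraud))


def toy_predict(data: Dict[str, str]) -> float:
    # Placeholder rule for demonstration; the real model is machine-learned.
    return 0.9 if data.get("email", "").endswith("@example.invalid") else 0.1


spam_filter = SpamFilter(predict=toy_predict)
suspect = Application("A1", {"email": "x@example.invalid"})
spam_filter.submit(suspect)
spam_filter.confirm(suspect, is_fraud=True)
```

Step 9, refining the model over time, would correspond to periodically retraining on the accumulated `training_labels`; that part is left out of the sketch because the source does not describe how retraining is performed.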


Post-submission Development

...