...

Soon after the first wave of fraudulent applications was identified in June 2016, the CCC Technology Center took immediate steps to strengthen the security of the CCCApply system and protect our students' personally identifiable data (read more about all the ways we are addressing fraud in CCCApply). Meanwhile, we contracted with a machine learning data research team to analyze several thousand examples of fraudulent applications collected from the colleges that initially reported the spam.

Research Objectives

The objectives for the research project were simple:

  • Understand why we are seeing an influx of fraudulent applications across the CCC system
  • Understand the motivations behind these fraudulent attacks
  • Identify trends, commonalities and patterns in the data
  • Identify the tools and techniques being used by spammers
  • Determine what CCCApply can do to prevent fraud now and in the future
  • Determine what the colleges can do to prevent fraud now and in the future

Additional objectives were added based on the recommendations and outcomes of the research. These included starting a small pilot with four colleges to gather feedback and understand their workflow processes, and developing a process for collecting data throughout the design and development phase of the project.

...

Data Analysis

Based on what they learned in the initial review, the research team conducted a multi-part data analysis of all submitted applications (without using any student personal information). The first review focused on one college that provided a large number of bad applications between June 1, 2016 and August 15, 2017. The second review looked at all other colleges that provided examples of bad applications in the same time period, and the third review looked at all remaining submitted application data. It was important to compare the bad applications to good applications in order to start detecting trends and patterns in the fraudulent "formula". After reviewing all three data pulls, even without including personally identifiable information, we learned a great deal.

...


After the initial review, the data analysts recommended developing a spam filter service based on a continuous learning/training model: a custom algorithm that gets smarter each time an application is flagged as "spam". This filter service is being built for the CCCApply Standard application, with a back-end user interface that will be accessible in the new CCCApply Administrator (deploying in June).
Both the spam filter service and the admin interface are under development now, with an expected release date of June 2018. This is a huge project and will require the cooperation and participation of all colleges, not just the colleges being targeted with spam, in order to "train" the algorithm with accurate data: both good, legitimate applications and bad, fraudulent applications.
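
To illustrate what a "continuous learning" filter means in practice, here is a minimal sketch in Python. It is not the CCCApply algorithm, whose details are not described here. It assumes a simple naive Bayes model, and the class name, the token format, and the example values are all hypothetical. The point it shows is the one made above: every reviewed application, good or bad, updates the model, so scores improve as colleges contribute more labeled examples.

```python
import math
from collections import Counter


class IncrementalSpamFilter:
    """Toy online naive Bayes filter (hypothetical; not the CCCApply algorithm)."""

    LABELS = ("spam", "ok")

    def __init__(self):
        self.label_counts = Counter()  # reviewed applications seen per label
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.vocabulary = set()        # every token seen during training

    def train(self, tokens, label):
        """Update the model from one reviewed application."""
        if label not in self.LABELS:
            raise ValueError(f"unknown label: {label}")
        self.label_counts[label] += 1
        self.token_counts[label].update(tokens)
        self.vocabulary.update(tokens)

    def spam_probability(self, tokens):
        """Return P(spam | tokens) with Laplace smoothing."""
        total = sum(self.label_counts.values())
        if total == 0:
            return 0.5  # no training data yet, so no opinion either way
        vocab_size = max(len(self.vocabulary), 1)
        log_scores = {}
        for label in self.LABELS:
            prior = (self.label_counts[label] + 1) / (total + 2)
            denom = sum(self.token_counts[label].values()) + vocab_size
            score = math.log(prior)
            for token in tokens:
                score += math.log((self.token_counts[label][token] + 1) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long token lists.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ok"])


# Hypothetical, non-personal features derived from an application.
spam_filter = IncrementalSpamFilter()
spam_filter.train(["email_domain:example.invalid", "submitted_hour:03"], "spam")
spam_filter.train(["email_domain:college.example", "submitted_hour:14"], "ok")
score = spam_filter.spam_probability(
    ["email_domain:example.invalid", "submitted_hour:03"]
)
```

Because the model is only a set of counts, flagging one more application as spam (or confirming one as legitimate) is a single call to `train`, with no full retraining step. This is also why labeled examples of good applications matter as much as bad ones: without them the filter has nothing to compare the spam against.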

...