CCCApply Sub-Committee: Spam Filter Web Service

Meeting Information

Date & Time:  Monday, December 9, 2019   3:00PM

Zoom Recording:  CCCApply Spam Filter User Group Meeting - Dec 9, 2019

Next meeting:  Monday, January 20, 2020 at 3:00 pm 


IMPORTANT:  Please help CCCApply help you! Please complete the CCCApply College Spam SURVEY


Take-Aways from the Meeting

Colleges continue to be victims of a wide-scale, ongoing cyber attack from multiple bad actors who are committing identity theft and fraud by submitting fake applications for admission through CCCApply.

The underlying purposes of these attacks include access to free .edu email addresses, free software and technology discounts and benefits afforded to students, and legitimacy of identity information for residency and financial aid benefits.

Unfortunately, colleges are unwittingly contributing to these illegal acts by continuing to provide the incentives these criminals are seeking. (This adds hardship to our colleges and our system overall.)

Some colleges are still unaware of the reasons behind these identity attacks and/or are seeking confirmation from the Chancellor's Office to enact procedural changes in their onboarding process.

The CCCApply spam filter service was implemented in September 2018 and continues to evolve. 

The Tech Center needs feedback and support from the colleges to:

  • Remove incentives that are encouraging cyber criminals to continue this behavior
  • Make procedural changes to remove those incentives
  • Support all colleges by diligently monitoring and tagging suspended applications identified by the model/prediction service
  • Support the overall functionality of the spam model and prediction service by consistently tagging both the fraud that is caught by the spam filter and the false negatives that the service misses
  • Participate in a User Group to provide feedback and promote continuous improvements to the spam filter service


SEE THE SPAM FILTER FAQ - Fall 2019


Our goal for the CCCApply Spam Filter Web Service is continuous improvement and enhancement: a model that keeps retraining itself on a cadence that is sustainable for our colleges.

A CCCApply Spam Filter User Group was formed and colleges are invited to participate. The group will:

  • Meet monthly (third Monday of each month at 3pm - through June 30, 2020)
  • Share communication and information in a secure manner
  • Share best practices to prevent fraud and security threats


Next meeting:  Monday, January 27, 2020 at 3:00 pm.  


College Survey

Help us better understand the fraud coming in at your college. Please complete the survey:

CCCApply Spam Filter College Survey


Spam Filter Project Update 

Speakers:  Machine-Learning Data Analysts: Harsha Gopianandan, Ananth Gopalakrishnan


Despite spikes in spam over the past six weeks or so, the spam filter service is achieving 95-98% accuracy.  Below are the stats from Nov 23 - 27, 2019.

November 23 - 27

Total applications: 19,943
True Positive: 938
True Negative: 18,965
False Positive: 13
False Negative: 22
Accuracy: 99.7994283709%
Precision: 98.6330178759%
Recall: 97.7083333333%
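For readers who want to reproduce these percentages, the short Python sketch below recomputes them from the four counts using the standard definitions of precision, recall, and accuracy. The function name is illustrative only. Note that the four counts sum to 19,938, five fewer than the stated total of 19,943; the reported accuracy matches dividing by the stated total, so the sketch does the same. The notes do not say how the remaining five applications were classified.

```python
def confusion_metrics(tp, tn, fp, fn, total=None):
    """Return precision, recall, and accuracy as percentages.

    `total` defaults to the sum of the four counts; pass the reported
    total explicitly to reproduce the figures in these notes.
    """
    if total is None:
        total = tp + tn + fp + fn
    return {
        # Of the applications flagged as spam, the share that really were spam
        "precision": 100 * tp / (tp + fp),
        # Of the applications that really were spam, the share that was caught
        "recall": 100 * tp / (tp + fn),
        # Share of all applications that were classified correctly
        "accuracy": 100 * (tp + tn) / total,
        # Applications in the total that are not in any of the four counts
        "unaccounted": total - (tp + tn + fp + fn),
    }


# Counts reported for Nov 23 - 27, 2019
stats = confusion_metrics(tp=938, tn=18_965, fp=13, fn=22, total=19_943)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Because spam is a small share of all applications, accuracy alone is dominated by the true negatives; precision and recall say more about how the filter treats the fraudulent applications themselves.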



Notes from Nov 12 meeting:

Harsha Gopianandan from the Spam Filter Machine Learning team explained false positives and false negatives and why we are seeing both across the system.

He explained how the model works and encouraged colleges to keep tagging diligently so that the model can learn from all new signatures.

The team is getting ready to update the model again, this time with the model that includes PII (December 2019).

We talked about colleges sending false negatives to the Tech Center. We will bulk tag them and upload them to the model so that it can benefit from the identified false negatives.

Roadmap

  • August 24 - Last Model Update
  • November 22  - PII Data Added to Model and Other Enhancements (IP Region, Email Domain)
  • December 12 - Manual Retraining Model with Updated Data


Ongoing

Update the model every 2-3 weeks.  

Update with the latest data.

Update the model with new features. 

*Note: Data derived from PII is used. 
That information is taken and transformed into a form that is useful to the model.
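As a rough illustration of what "derived data" could mean, the Python sketch below turns raw application fields into model-friendly values. It is not the Tech Center's actual transformation, which these notes do not describe: the field names (`email`, `phone`, `name`, `dob`), the function name, and the choice of derived features are all assumptions made for the example.

```python
import hashlib


def derive_features(application: dict) -> dict:
    """Hypothetical example: derive features from raw PII fields."""
    email = application["email"].strip().lower()
    # Keep only the digits of the phone number, e.g. "(530) 555-0100"
    digits = "".join(ch for ch in application["phone"] if ch.isdigit())
    # Normalize name + DOB so the same person yields the same key
    identity = f'{application["name"].strip().lower()}|{application["dob"]}'
    return {
        # Everything after the last "@" in the email address
        "email_domain": email.rsplit("@", 1)[-1],
        # First three of the last ten digits (ignores a leading country code)
        "phone_area_code": digits[-10:-7] if len(digits) >= 10 else "",
        # One-way digest used only as a matching key in this sketch
        "name_dob_hash": hashlib.sha256(identity.encode("utf-8")).hexdigest(),
    }


example = {
    "email": "Student@Example.edu",
    "phone": "(530) 555-0100",
    "name": "Pat Example",
    "dob": "2000-01-31",
}
print(derive_features(example))
```

A plain hash like this is only a way to compare values without storing the raw text; it should not be read as a statement about how the service protects PII.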


Top Criteria 

College ID

Email domain

Major Code

Address

Phone number

Name + DOB

Are you seeing a large number of False Negatives?

If your college is seeing blocks of false negatives (fraud applications that were NOT caught and suspended by the spam filter), we can help you tag them so that all colleges benefit from those signatures.

Please send quantities of 20 or more to Support Services using the instructions below.

SPAM Drop File Information

Please provide bulk fraudulent applications in the format specified below:

  • File Format = .TXT
  • File Naming Convention = CollegeMISCode_Fraud_mmddyy.txt
  • Confirmation #  (only 1 confirmation # per line)

Send to:  staffsupportccctc@openccc.zendesk.com


We ask that all colleges follow the file format above. If you would like to include any information other than confirmation #, please provide that information in a separate file.
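For colleges that assemble the drop file from a report or script, the Python sketch below writes a file that follows the format above: a .txt file named CollegeMISCode_Fraud_mmddyy.txt with one confirmation number per line. The function name, the MIS code "123", and the confirmation values are placeholders invented for the example; substitute your own college's values.

```python
from datetime import date
from pathlib import Path


def write_drop_file(mis_code, confirmation_numbers, out_dir=".", on=None):
    """Write a spam drop file and return its path.

    mis_code: your college MIS code (placeholder value in the example)
    confirmation_numbers: iterable of confirmation numbers as strings
    on: date used in the file name; defaults to today
    """
    on = on or date.today()
    # File naming convention: CollegeMISCode_Fraud_mmddyy.txt
    path = Path(out_dir) / f"{mis_code}_Fraud_{on:%m%d%y}.txt"
    # Trim whitespace, skip blanks, and drop repeats while keeping order
    cleaned = list(dict.fromkeys(c.strip() for c in confirmation_numbers if c.strip()))
    # Only one confirmation number per line, nothing else in the file
    path.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return path


# Placeholder values for illustration only
created = write_drop_file("123", ["CONF-0001", "CONF-0002"], on=date(2019, 12, 9))
print(created.name)
```

Any additional information about the applications would still go in a separate file, as requested above.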

For other assistance, please contact:

Merrie Wales mwales@ccctechcenter.org

Patty Donohue pdonohue@ccctechcenter.org



Feedback / Issues

Colleges want specific email address domains blocked.  Patty will confirm whether the college can do this using an Error Message Rule in the Administrator.

We currently block (new) IP Regions if they appear to be used in fraud signatures.

Mitch from Santa Rosa College says they no longer give out .edu addresses until the student is fully matriculated and has registered for classes or paid.

Spammers are committing IDENTITY THEFT - stealing real personally identifiable information in order to seek financial aid (fraud).

Becky from Shasta built an internal process that produces a report for reviewing apps quickly (the report includes the IP Address).

San Bernardino Valley & Crafton Hills College have restricted the Gmail account until after the student registers. The accounts were originally set up to email staff and other students, but the colleges found that the accounts were being used to phish other students, so they no longer allow them to email other students. (Look into Gmail registration settings - different set of policies.) This cut their spam in half.

A question was asked about whether we could do model updates more frequently than every 2-3 weeks. We would like to and will see what's possible.

Address Validation > Why are we allowing students to bypass the CASS address validation? This could be looked at, since spammers are putting in bad addresses.


What are spammers going for?

Spammers are specifically going for the FREE Google Drive storage 



Change Enhancements

  • Add a way to search by other data (whatever the columns are): CCCID, Email Address, DOB, etc.
  • Add the IP Address to the spam filter summary table
  • Add the Country that matches the IP Address or Region to the spam filter summary table
  • Look into a Zip Code / Area Code validation check
  • Create the ability to define a rule based on any data that puts the application directly into the Spam Filter (or prevents it/whitelist) - ask Josh. (Don't use the model if the app meets the rule logic; this would have to bypass the prediction service.)
  • What's the situation with ReCAPTCHA (is our current version so outdated that it's ineffective?) Is OpenCCC redesign planning to include an updated version?  Which tool and which version?  Why do we not have one in the Application?  Get history on this and inform the group.
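To make the rule-based request above more concrete, the Python sketch below shows one possible shape for it: college-defined rules are checked first, and the prediction service is only consulted when no rule matches. Nothing here reflects how the spam filter is actually built. The `Rule` class, the "suspend"/"allow" verdicts, the field names, and the `predict` callable standing in for the prediction service are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # True when the rule applies to the app
    verdict: str                     # "suspend" (send to filter) or "allow"


def classify(application: dict, rules: list, predict: Callable[[dict], str]):
    """Return (verdict, source). Rules win; the model is the fallback."""
    for rule in rules:
        if rule.matches(application):
            # A matching rule bypasses the prediction service entirely
            return rule.verdict, rule.name
    return predict(application), "model"


# Hypothetical rules a college might define
rules = [
    Rule("blocked-domain",
         lambda app: app["email"].lower().endswith("@blocked.example"),
         "suspend"),
    Rule("trusted-partner",
         lambda app: app.get("partner_program") is True,
         "allow"),
]


def stand_in_predict(app: dict) -> str:
    # Placeholder for the real prediction service
    return "allow"


print(classify({"email": "someone@blocked.example"}, rules, stand_in_predict))
print(classify({"email": "someone@college.example"}, rules, stand_in_predict))
```

Rules are evaluated in list order, so a college would need to decide whether blocking rules or whitelist rules take priority when both match.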