Online Meeting Information
Date & Time: Tuesday, December 9, 2019 3:00PM
Location: Zoom, https://cccconfer.zoom.us/j/6770513851 (Meeting ID: 677-051-3851)
Audio: Use VoIP or call by telephone: +1 408 638 0968 (US Toll) or +1 646 876 9923 (US Toll)
Agenda
TIME | DESCRIPTION |
3:00 pm | Introductions (attendees, please add your name, college, and title in the chat window when you enter the meeting) |
 | Status update (stats on incoming fraud and model overview) |
 | Roadmap (upcoming changes to model and UI) |
 | Communication |
 | Gather feedback on open issues and concerns |
 | Review survey questions & FAQ |
4:00 pm | Schedule F/U call? Close meeting |
Upcoming Meetings: 2019-2020 CCCApply Sub-Committee Meeting Schedule
Key Messages:
Share updates and stats on performance
Encourage colleges to continue to monitor and tag spam
Survey
Discuss concerns
Next meeting: Schedule for end of January 2020.
College Survey
Help us better understand the fraud coming in at your college. Please complete the survey.
Spam Filter Project Update
Speakers: Machine-Learning Data Analysts: Harsha Gopianandan, Ananth Gopalakrishnan
Despite spikes in spam over the past six weeks or so, the spam filter service is achieving 95-98% accuracy. Below are the stats from Nov 23 - 27, 2019.
November 23 - 27
Total applications: 19,943
True Positive 938
True Negative 18,965
False Positive 13
False Negative 22
Accuracy 99.7994283709 %
Precision 98.6330178759 %
Recall 97.7083333333 %
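For readers who want to reproduce the percentages above, here is a minimal sketch using the standard definitions of accuracy, precision, and recall. The function names are illustrative and not part of the spam filter service. Note that the four counts sum to 19,938, five fewer than the stated total of 19,943; the reported accuracy figure matches dividing by the stated total, so the sketch does the same.

```python
# Counts copied from the Nov 23 - 27 stats above.
TOTAL_APPLICATIONS = 19_943
TRUE_POSITIVE = 938
TRUE_NEGATIVE = 18_965
FALSE_POSITIVE = 13
FALSE_NEGATIVE = 22


def accuracy(tp: int, tn: int, total: int) -> float:
    """Share of all applications that were classified correctly."""
    return (tp + tn) / total


def precision(tp: int, fp: int) -> float:
    """Share of applications flagged as spam that really were spam."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of actual spam applications that the filter caught."""
    return tp / (tp + fn)


if __name__ == "__main__":
    print(f"Accuracy:  {accuracy(TRUE_POSITIVE, TRUE_NEGATIVE, TOTAL_APPLICATIONS):.4%}")
    print(f"Precision: {precision(TRUE_POSITIVE, FALSE_POSITIVE):.4%}")
    print(f"Recall:    {recall(TRUE_POSITIVE, FALSE_NEGATIVE):.4%}")
```

Precision drops when legitimate applications are suspended (false positives), and recall drops when fraudulent applications get through (false negatives), which is why both are tracked alongside accuracy.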
Notes from Nov 12 meeting:
Harsha Gopianandan from the Spam Filter Machine Learning team explained false positives and false negatives and why we are seeing both across the system.
He explained how the model works and encouraged colleges to keep tagging diligently so that the model can learn from all new signatures.
The team is getting ready to update the model again, this time with the model that includes PII data (December 2019).
We discussed colleges sending false negatives to the Tech Center, which will bulk tag them and upload them so the model can learn from the identified false negatives.
Roadmap
- August 24 - Last Model Update
- November 22 - PII Data Added to Model and Other Enhancements (IP Region, Email Domain)
- December 12 - Manual Retraining Model with Updated Data
Ongoing
Update the model every 2-3 weeks.
Update with the latest data.
Update the model with new features.
*Note: Derived data from PII data is used. That information is transformed in a way that is useful to the model.
We can't automatically block domains, because the spammers keep changing their tactics.
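As an illustration of what "derived data" can mean, the sketch below turns an email address into an email-domain feature, one of the enhancements listed on the roadmap. This is an assumption about the general approach, not the team's actual pipeline; the function names and the `email` field name are hypothetical.

```python
def email_domain(email: str) -> str:
    """Return the lowercased domain of an email address, or "" if malformed."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return ""
    return domain.lower()


def derive_features(application: dict) -> dict:
    """Build model inputs from an application record.

    Only the derived value is returned; the raw email address is left out.
    The "email" key is a hypothetical field name used for illustration.
    """
    return {"email_domain": email_domain(application.get("email", ""))}


if __name__ == "__main__":
    print(derive_features({"email": "Student@Example.EDU"}))
```

A derived domain is a signal the model can weigh alongside other features rather than a hard block, which fits the point above that spammers keep changing their tactics.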
Top 10 Criteria
College ID
Email domain
Major Code
Communication Issues
FAQ is finished and will be sent to your contact list.
Send Us False Negatives
If your college identifies 20 or more fraudulent applications that were NOT caught and suspended by the spam filter, please send them to us using the instructions below.
SPAM Drop File Information
Please provide bulk fraudulent applications in the format specified below:
- File Format = .TXT
- File Naming Convention = CollegeMISCode_Fraud_mmddyy.txt
- Confirmation # (only 1 confirmation # per line)
Merrie Wales mwales@ccctechcenter.org
Patty Donohue pdonohue@ccctechcenter.org
staffsupportccctc@openccc.zendesk.com
We ask that all colleges follow the file format above, for ease of input into the model. If you would like to include any information other than confirmation #, please provide that information in a separate file.
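For colleges that generate the drop file from a report, here is a minimal sketch that follows the naming convention and one-confirmation-number-per-line format described above. The function names, the MIS code "123", and the confirmation numbers are made-up placeholders, not real values.

```python
from datetime import date
from pathlib import Path


def drop_file_name(college_mis_code: str, on: date) -> str:
    """Build CollegeMISCode_Fraud_mmddyy.txt for the given date."""
    return f"{college_mis_code}_Fraud_{on.strftime('%m%d%y')}.txt"


def write_drop_file(directory, college_mis_code, confirmation_numbers, on):
    """Write one confirmation number per line and return the file path."""
    path = Path(directory) / drop_file_name(college_mis_code, on)
    # Strip whitespace and skip blank entries so each line holds exactly one number.
    lines = [str(n).strip() for n in confirmation_numbers if str(n).strip()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # "123" and the confirmation numbers below are placeholders.
    written = write_drop_file(".", "123", ["CONF-0001", "CONF-0002"], date.today())
    print(f"Wrote {written}")
```

Keeping the file to confirmation numbers only matches the request to put any additional information in a separate file.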
Feedback / Issues
Colleges want specific email address domains blocked. Patty will confirm that colleges can do this themselves using an Error Message Rule in the Administrator.
We currently block (new) IP Regions if they appear to be used in fraud signatures.
Mitch from Santa Rosa College says they no longer give out .edu addresses until the student is fully matriculated and has registered for classes or paid.
Spammers are committing IDENTITY THEFT: stealing real personally identifiable information to seek financial aid (fraud).
Becky from Shasta built an internal process to generate a report for reviewing apps quickly (the report includes the IP address).
San Bernardino Valley & Crafton Hills College have restricted the Gmail account until after students register. Accounts were originally set up to email staff and other students, but the colleges found that account holders were phishing other students, so they no longer allow them to email other students (look into Gmail registration settings, which have a different set of policies). This cut their spam in half.
A question was asked about whether we could do model updates more frequently than every 2-3 weeks. We would like to and will see what's possible.
Address validation: Why are we allowing students to bypass the CASS address validation? This could be looked at, since spammers are entering bad addresses.
What are spammers going for?
Spammers are specifically going for the FREE Google Drive storage
Change Enhancements
- Add a way to search by other data (whatever the columns are): CCCID, Email Address, DOB, etc.
- Add the IP Address to the spam filter summary table
- Add the Country that matches the IP Address or Region to the spam filter summary table
- Look into a Zip Code / Area Code validation check
- Add the ability to create a rule, based on any data, that puts the application directly into the Spam Filter (or prevents it, i.e. a whitelist). Ask Josh. The model would not be used if the app meets the rule logic, so this would have to bypass the prediction service.
- What's the situation with reCAPTCHA? Is our current version so outdated that it's ineffective? Is the OpenCCC redesign planning to include an updated version? Which tool and which version? Why do we not have one in the Application? Get history on this and inform the group.