Credit Cards and Deep Feature Synthesis

MIT researchers have developed a system that reduces false positives in credit card fraud detection. It relies on what the researchers call automated feature engineering, which lets them monitor an individual's spending and derive features from that person's spending habits.[1]  To do this, the system extracts 200 detailed features for each individual transaction, such as whether the user was present during the purchase and the average amount spent on certain days and at certain vendors.  This establishes a general sense of each user's norms and patterns.  Tested on 1.8 million transactions from a large bank, the method reduced false positive predictions by 54% compared with traditional methods.  The MIT system was estimated to have saved the bank thousands in revenue that would otherwise have been lost.
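The idea of a per-user spending norm can be sketched in a few lines of Python. This is only an illustration, not the MIT system: the record layout, the sample data, and the function names build_norms and ratio_to_norm are all assumptions made for this example. It averages what a card usually spends at a given vendor on a given weekday, then measures how far a new transaction departs from that average.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical transaction records: (card_id, timestamp, vendor, amount).
transactions = [
    ("card_1", datetime(2018, 9, 3, 12, 0), "grocer", 40.0),    # a Monday
    ("card_1", datetime(2018, 9, 10, 12, 30), "grocer", 60.0),  # a Monday
    ("card_1", datetime(2018, 9, 4, 9, 0), "cafe", 5.0),        # a Tuesday
]


def build_norms(rows):
    """Average amount spent per (card, weekday, vendor); weekday 0 is Monday."""
    groups = defaultdict(list)
    for card, ts, vendor, amount in rows:
        groups[(card, ts.weekday(), vendor)].append(amount)
    return {key: mean(amounts) for key, amounts in groups.items()}


def ratio_to_norm(norms, card, ts, vendor, amount):
    """Return amount divided by the card's usual spend, or None without history."""
    usual = norms.get((card, ts.weekday(), vendor))
    return None if usual is None else amount / usual


norms = build_norms(transactions)

# A new Monday purchase at the same grocer, ten times the usual amount.
ratio = ratio_to_norm(norms, "card_1", datetime(2018, 9, 17, 8, 0), "grocer", 500.0)
```

A ratio far above 1 is only one signal among many. A real model would weigh it together with the other features rather than treat it as a verdict.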

What made the reduction in false positives possible is Deep Feature Synthesis (DFS), an automated approach that extracts highly detailed features from the data.  DFS was a critical element in developing the system.  Using it, the researchers trained and tested their model on 900 million transactions from around 7 million individual cards, of which 120,000 were confirmed as fraudulent.

Deep Feature Synthesis works on the multi-table and transactional datasets found in databases or log files.[2]  DFS uses these tables to pull specific information.  An example of this would be airline flight information.  Going through transactions and log files, the system can pull all the data needed to fill each table.  It then takes the information gathered and derives specific facts, like the most expensive flight.  This is where primitives[3] are identified and applied for the user.  Primitives are simple functions that take one or more inputs and generate one output.  The next step is stacking primitives.  Stacking combines multiple primitives, feeding the output of one into another to build more complex features.

Specifically, in the MIT system, researchers took a function that calculates the distance between transactions and cross-checked it against whether or not a mobile phone was used.  For example, if two transactions take place 200 miles apart within an hour, there is a high probability that fraudulent credit card transactions are taking place.  Mobile phone use reduces that probability, and GPS coordinates would pinpoint where the transaction took place.
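The distance example can be sketched as stacked primitives in plain Python. This is a hedged illustration, not the researchers' code or any library's API: the sample coordinates, the function names, and the 200 mph threshold (taken from the "200 miles within an hour" example) are assumptions. Two simple primitives, distance and elapsed time, are combined into an implied travel speed, and a MAX aggregation is then stacked on top to give one feature per card. The mobile phone cross-check is left out to keep the sketch short.

```python
import math
from datetime import datetime

# Hypothetical history for one card: (timestamp, latitude, longitude).
card_history = [
    (datetime(2018, 9, 20, 10, 0), 42.36, -71.06),   # Boston
    (datetime(2018, 9, 20, 10, 45), 40.71, -74.01),  # New York
]


def miles_between(lat1, lon1, lat2, lon2):
    """Primitive: great-circle (haversine) distance in miles."""
    radius = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def hours_between(t1, t2):
    """Primitive: elapsed time in hours."""
    return (t2 - t1).total_seconds() / 3600


def implied_speeds(history):
    """Stacked primitives: distance divided by time for consecutive transactions."""
    rows = sorted(history)
    speeds = []
    for (t1, la1, lo1), (t2, la2, lo2) in zip(rows, rows[1:]):
        hours = hours_between(t1, t2)
        if hours > 0:  # skip simultaneous records to avoid dividing by zero
            speeds.append(miles_between(la1, lo1, la2, lo2) / hours)
    return speeds


def max_implied_speed(history):
    """Aggregation stacked on top: the fastest implied trip for the card."""
    return max(implied_speeds(history), default=0.0)


SPEED_LIMIT_MPH = 200  # assumed threshold, from the article's example
suspicious = max_implied_speed(card_history) > SPEED_LIMIT_MPH
```

Here the two purchases are roughly 190 miles apart and 45 minutes apart, so the implied speed is well above the threshold and the card is flagged for a closer look.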

This is a clear example of cyber security working in tandem with mathematics, IT and traditional security to help curb the mounting tide of financial fraud.

For questions, comments or assistance regarding this report, please contact Wapack Labs at 603-606-1246, or feedback@wapacklabs.com

[1]http://news.mit.edu/2018/machine-learning-financial-credit-card-fraud-0920

[2]https://www.featurelabs.com/blog/deep-feature-synthesis/

[3]In Deep Feature Synthesis, primitives are the basic operations, such as sum, mean, max, or the time since a previous event, that are applied to columns of data and serve as the building blocks of features.  They should not be confused with the primitive data types of a programming language such as Java (boolean, byte, char, short, int, long, float and double), which only hold simple values.
