Okay, let's get going. So next up we
have Paul Smith, but before I introduce
him I would just like to say that we are
doing GOTO Nights in Stockholm four
times this year.
The next one is going to be in April, where
we're going to talk about modern
front-end development, and it's titled
"Post JavaScript". The site for
registration is going to be up tonight,
so please go and register if you're
interested, and tell all your friends.
Also, we have GOTO Copenhagen coming up,
a big, cool conference I urge you all to
attend. It's coming up in October, and for
those of you who fill out the survey
that we're going to send out in the coming
days, we will raffle off a ticket for a
three-day pass, so go do that. Once you
leave tonight you can pick up a goodie
bag outside. And now let's continue with
a story about fraud and how it could be
avoided using modern technology. Working?
Okay. Hi everyone, welcome back. I hope
you're suitably refreshed. Yeah, I'm going
to tell you about a fraud case I
investigated around 10 years ago,
a European-based fraud, and go into
the two technologies that we
prefer. We do use multiple, because no
one piece of software will solve all
your data problems. Just a bit about
myself: I spent 24 years in financial
services, all around risk. I did
originally start out as a programmer
back in the early 90s, but then very
quickly moved into the fraud area.
I was lucky enough to be a fraud
investigator for eight years,
investigating frauds throughout Europe
as an independent expert. I now work for
Trifork Leeds Limited, which is part of the
Trifork group. We specialize in
business consultancy, and we've got a whole
bunch of techies worldwide who are
specialists in all the products that we
use. First of all, anti-money laundering
is a big area now. The new EU directive,
if you're not aware, is coming in on
July the first, and it's quite a scare
story really, as not many businesses are
ready for it. The guidance they give
from the legislation is very fluffy;
it gives very simple examples. There's
lots of increase in the penalties, and now
there are prison sentences for chief risk
officers, as of July, of up to 14 years. The
payment threshold has been brought down from
15,000 euro to a thousand, which just
means a huge volume of data and an
increase in that data over that period.
And they've introduced connected
payments, but they don't say how long the
connections are, so that could be a
hundred euro for the next ten months, and
that's a connected payment as far as they
interpret it. So we use technologies to
try and prove due diligence as part of that
anti-money-laundering activity. This
particular case I'm talking about, which
is back in the past when I was an
investigator, was committed in Europe.
It's quite a significant value for the
region, and so that's why I got brought
in to review the case. Initially there were
10 loans identified that were involved
with a fraud, and I arrived and thought, well, this
is going to be really boring, but within a
few days it just blew out of all
proportion, and it's actually a money
laundering case and made it very
So this particular
company, who I'm not going to name
for obvious reasons, had this
process of doing welcome calls when you
have new customers and new loans. When
staff had downtime they were making these
calls, rather than it being a dedicated
call center, so there was just a handful
of staff doing these calls, and the
fraud was actually found when one of
these admin staff realized the same
numbers were coming up again and again
and they were always going to voicemail.
So she started reporting that, and then
when an investigation happened
they found the ten, the first-time
fraud, which was still significant. So I
went out and did lots of interviews with
lots of internal staff. We had a whole
raft of procedures in place to stop
fraud happening, but there were lots of
breaches in those procedures by staff,
because it was a very manual process in
terms of detection and prevention. We had a
separate security team; they missed all
the signs and didn't actually find the
fraud, so it was actually found by
operational staff doing these calls. And
after a few days I actually used graph
theory, but in a SQL environment. I had a
team working for me back in the
day who were all data scientists, and I
got them, in SQL, to join all these
potential customers from the fraud
together, which took them a number of
days to do, but it was invaluable, because
it helped me understand that 90% of the
loans taken by these brokers were
involved with this fraud, not just 10
loans. And then there was a revelation
after a week: actually, there were two
corrupt police officers involved. And
what was happening, allegedly (because we
never proved it, because there was an
internal police investigation going on,
so they never confirmed it to us), but
there were allegations of running
racketeering and unauthorized gambling, and
they were legitimizing those funds by
using these brokers to take out loans,
and then what they would do is those
loans would be passed to their accounts
under the different personas,
so it looked like they were getting
money from different people
and payments from different people. In
reality all that money went back into
making the loan payments, so nobody
visited because of bad debt. Everything
was 100% right, nobody made any phone
calls, and security as a consequence never
did a visit. So once we interviewed the
brokers, they verified that pretty much
everything they'd done for the last 12
months was part of this deception, this
fraud. Oh yeah, it went on for 12 months
because it was the brokers: they knew the
checks and balances in the systems, and
that enabled them to bypass all those
checks and balances and those procedures,
which allowed the fraud to happen. They
got found out because they started
getting lazy. They were buying multiple
SIM cards and individual numbers to
begin with, for the first six or seven
months, but then they started reusing
those SIMs. But they didn't have a phone
for every single SIM that they bought
and every loan that was taken, so they
were putting the SIM into the phone
every few days. There'd be a voicemail left
for that welcome call, and then they'd
ring back and say "I'm so-and-so", under
the false details that loan was taken
out on. And that's why the admin staff
started to realize it was the same
numbers, because it was constantly going
to voicemail and you had to dial it a
number of times before somebody would
ring back and say it was person X who
took out this fraudulent loan. At the time that
particular company had some quite legacy
systems in place. One of the main ones
was a logistic regression, but logistic
regression's not great for fraud, because
you don't always know all your outcomes.
In my personal experience I've never seen
a logistic regression algorithm
that's good at tracking fraud or risk. We
also had some high-risk models that were
tracking basically the high growth and
value. Now these models actually flagged
it up, but the security guys weren't
doing the job; they had such
aggressive targets that they weren't
able to do the job. But these reports
were highlighting
that there was a potential problem here.
And we had some experts based centrally
in each of the countries, but there
were only two of them on this occasion in
this country, and they were just overloaded
with work. But they were the real specialists;
they would have found it, but they just
had too much work checking
the rest of the country. So these are the
technologies that we typically use for
combating AML and fraud.
We're also using them for recommendation
engines as well. So we have the data
element, which is Elastic, Humio and
Riak, all NoSQL technologies,
schema-less if you want it to be, so it's
great for adding things in when you
don't know what you want in terms of
data, because you only find things out
when you start analyzing it and doing
champion/challenger tests. We also use Neo4j
and Spark, again data stores in their
own right but with different functions:
Spark we use for deep
learning algorithms, Neo4j we use for
graph databases, and then we use
TensorFlow, Python, scikit and Theano for
building deep learning algorithms. We've
got guys in the company a lot cleverer than me
building those algorithms. The first talk
we had, I actually agreed with everything
that was being said there in terms of
doing those algorithms; it's a very, very
unique skill, making those algorithms
work. And everybody at the moment loves
talking about machine learning and deep
learning. So I'm going to focus on two today,
and that's Humio and Neo, and the next
talk's actually on Humio,
but it's more of a deeper dive than I'm
doing; I'm just keeping it very
high-level on how we use Humio. So
Humio: reporting and analytics, analytical
databases. It's great for DevOps, data
management, performance management; it does a
multitude of things. It's great at fast
ingestion, particularly of logs. The two
that I'm always interested in, because
it's my background and it's where my
expertise is, are reporting and
analytics and analytical databases. And
what we use Humio for is as the
backbone for running our deep learning
algorithms, because it can ingest data
so quickly,
and you've got to have fast technologies
behind these algorithms, otherwise your
online process will just grind to a halt
and you're then going to lose
customers as a consequence. And you know
the term "big data" that everybody
loves or hates: the best description of
big data I ever heard is that big data's like
teenage sex: everybody says they're doing
it but nobody actually is. So the number
of companies that say, yeah, we do big data...
What's big? More data than yesterday. The
great thing here is that it can ingest
these IT logs. Most people think
IT logs are just for IT people and for
monitoring performance, but there's real
value in the data in there, and that's
what I class as big data: new,
innovative data. And because we can get
these IT logs in very quickly, we
can also pinpoint and extract the data
that's important from a risk perspective,
or it could be recommendation, fraud,
AML: a multitude of uses for the business,
not necessarily just for IT. And it can
hold a hell of a lot of information. So
every single data scientist I've ever
worked with, they always want all the
data forever, and they hate deleting data.
Well, with technologies like Humio
you can do that, and you can still search
on it really fast as well. So in terms of
deep learning, you've already had a talk
on deep learning, and the real
game-changer was in 2008, when there was
a new way of traversing the trees in
deep learning. Some very clever academics
changed the way it works in terms of
speed. Then all these great community
open-source products came along and got
released into the general environment, so
you don't have to write them from
scratch yourself: so, you know, Spark in
2013, TensorFlow in January last year. So
it's great news that all these products
are there, so you can actually make deep
learning more of a commodity in
terms of your business. And
commodity hardware as well is really
important, because it does take a lot of
computing power, so the fact that hardware's
coming down in cost really helps in
achieving those deep learning algorithms
and the implementations. So we use Humio
as the data store pretty much for those
deep learning algorithms with many clients,
and like I say, you can access a huge
amount of history, because you never know
what you want tomorrow. You've got to
choose your features carefully, as has
been said before; you don't know if those
features are right or not. You need
access to the whole history, or the whole
data set, so you can pick and choose your
features once you've done your A/B
testing on it. If I'd have had this
when I investigated that fraud,
I guarantee, if we'd have run these
algorithms, they would have found that
fraud, absolutely guarantee it. But at the
time we were using SQL.
It was slow, horrific; we were going to lose
customers if we put algorithms
in, and to be honest the actual
methodology hadn't come on that far. I was
using AI back in 1995 and we ditched
that project because we just couldn't
get it into that process fast enough
while we were doing the conversions for
customers and selling those loans. So
it's not really anything new, it's just
got better, a lot better. So moving on to
Neo: again, if we'd have had Neo it would have
found this fraud. Neo's a fantastic tool
for analyzing connections in the
background. It's a specific graph
database, so if any of you have
ever seen or heard about the network
databases of old, two or three decades ago,
it is almost a network database, and because it
has all its connections stored, it's
really, really fast in terms of searching
that data and finding those complex
connections. When I had the team do it
in SQL it took days; with Neo it would
have taken minutes to do exactly the same
analysis. And the great thing is, Neo you
can use in an online environment as well as
offline, for investigation use or for
your analysts to look for
new things to plug in to
your online process. It's got great
visualization behind it to do that. So, a very
quick explanation of graph theory. Graph
theory itself, which is what graph databases
are built on, is nearly 300 years old, so
it's not a new methodology. A
Swiss mathematician originally created
it for a problem about crossing
all the bridges in Königsberg in
Prussia. It's easy for me to say.
And basically the problem was to go
around and cross every bridge only once, and
the mathematician created the theory
to work out if it was possible, and it
turned out it wasn't. But it's all about
edges and connections; that's what graph
theory is. So in this very, very simple
example you've got a number of customers,
for instance person A, B and C. They're
not obviously linked, but somehow we're
all linked; we're linked from tonight. We
have connections, you have connections
everywhere, and in the case of this
simple example there is a connection
with a shared address and a shared phone
number, so that just means that very
simply you can make a link between
person A and B. On a slightly more complex example
you've got connections between phone
numbers and addresses. This is actually what
happened on the fraud case I just talked
about. So we started using the
phone numbers for the SIMs, given as the
nearest relative because you had to give
almost a reference, and that's why we saw
all the connections and we realized that
the fraud ring had grown to 90% of all the loans taken by these brokers.
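
The linking described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the SQL the team wrote or anything Neo-specific: the loan records, field names and values are invented, and the approach (treat each loan and each phone number or address as a node, then collect everything that is reachable through shared values) is an assumption about how such a ring could be surfaced.

```python
from collections import defaultdict, deque

# Hypothetical loan applications: field names and values are made up
# for illustration and are not taken from the case.
loans = [
    {"id": "loan-1", "phone": "0700-111", "address": "1 High St"},
    {"id": "loan-2", "phone": "0700-111", "address": "9 Low Rd"},
    {"id": "loan-3", "phone": "0700-222", "address": "9 Low Rd"},
    {"id": "loan-4", "phone": "0700-333", "address": "5 Mill Ln"},
]


def build_graph(loans, keys=("phone", "address")):
    """Link each loan to the attribute values it uses (a bipartite graph)."""
    graph = defaultdict(set)
    for loan in loans:
        loan_node = ("loan", loan["id"])
        graph[loan_node]  # touch the key so unlinked loans still appear
        for key in keys:
            attr_node = (key, loan[key])
            graph[loan_node].add(attr_node)
            graph[attr_node].add(loan_node)
    return graph


def rings(graph):
    """Return groups of loan ids connected through shared attributes."""
    seen, groups = set(), []
    for start in list(graph):
        if start in seen or start[0] != "loan":
            continue
        queue, members = deque([start]), []
        seen.add(start)
        while queue:
            node = queue.popleft()
            if node[0] == "loan":
                members.append(node[1])
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(sorted(members))
    return groups


print(rings(build_graph(loans)))
```

With this sample data, loans 1 to 3 end up in one group because loan 1 and loan 2 share a phone number and loan 2 and loan 3 share an address, while loan 4 stays on its own.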
You can very easily, in an online environment, run
queries to work out the shortest
connection between people or things or
products or whatever thing you want to
search for. So once you've found that first
fraud on person A, as an investigator,
with the visualization tool, which looks
like this only a bit nicer, with bubbles, I
can drill in and move things
around; I can dig into the data, find new
connections and find new things, so you
prioritize your investigations on that.
In the same way it can be used for recommendations.
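
As a rough illustration of the "shortest connection" query mentioned above, here is a breadth-first search over a small hand-made adjacency list. It is a sketch in plain Python rather than a Neo query (Neo4j's Cypher language has its own `shortestPath` function for this), and every node name in it is invented.

```python
from collections import deque

# Hypothetical adjacency list: people linked through shared phone
# numbers and addresses. All names are invented for illustration.
links = {
    "person A": ["phone 0700-111"],
    "phone 0700-111": ["person A", "person B"],
    "person B": ["phone 0700-111", "address 9 Low Rd"],
    "address 9 Low Rd": ["person B", "person C"],
    "person C": ["address 9 Low Rd"],
    "person D": [],
}


def shortest_connection(links, start, goal):
    """Breadth-first search; returns the shortest chain of nodes or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the recorded predecessors to build the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in links.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


print(shortest_connection(links, "person A", "person C"))
print(shortest_connection(links, "person A", "person D"))
```

Starting from a known fraud on person A, the first call returns the chain through the shared phone number and the shared address to person C; the second returns `None` because person D has no links at all.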
So there are several online retailers
who use Neo and graph theory to make
recommendations instantly when customers
are buying things. We have an online
gambling client who uses it for making
in-game recommendations within 20
milliseconds. So as soon as you start
playing a game, instantly you've got:
who else has played the game, what
demographics do they share with me, what
have I played that they've played. And
you can look at the entire user base,
work out what everybody played and recommend it.
You're more likely to make more money
because you're going to recommend that
game, and Neo does that really quickly. So
it is our recommended tool, certainly to
start with, for fraud and AML, because
it's black and white, it's easy to
understand. A lot of senior people in a
lot of the companies, particularly in
financial services, don't like deep
learning because they can't understand
why it's making the decisions. With Neo
you can see it in black and white,
simple as that. As I say, brilliant
visualization; it's for the business user,
not the IT department, and it's great for
online real-time detection or recommendations.
But as others have said,
these things can't always be solved by
one product. You've got to have a
toolkit, particularly around fraud and
AML, so this will find an element of it, but
you should be doing other things and
the basics as well. Oh, that was
interesting. Thank you for your time. Okay,
so do we have any questions? Let me just...
there you go. Thank you. I have
one doubt about your presentation, on the Neo4j
case of network analysis. One
problem is that if one node is more
dominant, like it dominates the
structure, then it will give
biased information, and then when you go to
the second fraud analysis it could
be more biased. So the challenge is how you
would do the first fraud analysis and how
to overcome it. Yeah, absolutely right.
As part of the query, though, it's not
just about making the connections. As part of
the query language you can say what not
to look for as well, as part of a complex
query, so you can rule out some of those
elements. But there's also a
concept called the magic number in Neo,
so if things are too common it will
ignore them. There's a great
example from Neo where they analyzed
public data on music tastes and
whether you like pop music or whether
you like classical music. Predominantly
everybody likes Coldplay, so you have to
rule out Coldplay, and the way you do
that is a magic number: there are just too
many connections into that from
everybody in your data set, and that's
how you deal with it. There was a
question up here as well? No? Okay, good.
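
The "too common to be useful" idea from the answer above can be sketched as a simple degree cut-off. This is an assumption about the general technique, not a description of how Neo implements its magic number: the `max_share` threshold, the function name and the sample data are all hypothetical.

```python
from collections import Counter

# Hypothetical (listener, artist) pairs, invented for illustration.
likes = [
    ("ann", "Coldplay"), ("bob", "Coldplay"), ("cat", "Coldplay"),
    ("dan", "Coldplay"), ("ann", "Mozart"), ("bob", "Mozart"),
    ("cat", "Abba"), ("dan", "Abba"),
]


def drop_common_nodes(edges, max_share=0.75):
    """Remove edges into any node linked to more than max_share of sources."""
    sources = {source for source, _ in edges}
    # Count distinct sources per target so duplicate edges are not double counted.
    degree = Counter(target for _, target in set(edges))
    limit = max_share * len(sources)
    return [(s, t) for s, t in edges if degree[t] <= limit]


print(drop_common_nodes(likes))
```

Here every listener likes Coldplay, so that node carries no information about who resembles whom and its edges are dropped; the Mozart and Abba edges survive and still separate the listeners into two groups.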
Here, yes. Thank you. So you have shown a
great example of how these tools that we
have today could have been a game
changer with your case ten years ago. Do
you see any problem areas that are
similar today, that might be difficult to
solve now with fraud detection and that?
So the biggest problem today is, with the
advancement of online financial services,
you've got a real neat way as a fraudster
to hide things and commit money
laundering as a consequence, because you
can send things out of the country really
easily and back in again between
organizations, within minutes in most
cases. As part of the new EU directive
they were talking about a public database
for all transactions across the EU.
That's just not going to happen, because
it's too big and it's got to be
government funded, and it just won't
happen by July, so they're still talking
about it. But that's
probably the biggest challenge in terms
of fraud and money laundering that
financial services are facing today. Any
further questions? Okay, good.