1
00:00:03,240 --> 00:00:09,370
so hello good evening my name is Wu and
2
00:00:06,970 --> 00:00:11,739
I'll try to keep this short because I
3
00:00:09,370 --> 00:00:14,219
guess everyone just wants to go and have
4
00:00:11,740 --> 00:00:18,550
dinner and a beer I sure do
5
00:00:14,220 --> 00:00:20,349
so actually some of the issues that were
6
00:00:18,550 --> 00:00:22,689
discussed during the Q&As from
7
00:00:20,349 --> 00:00:24,810
previous presentations were the topic of
8
00:00:22,689 --> 00:00:28,329
my thesis which is the enrichment and
9
00:00:24,810 --> 00:00:31,179
creation of IoCs of quality out of
10
00:00:28,329 --> 00:00:34,030
OSINT since it was my master's thesis
11
00:00:31,179 --> 00:00:37,570
those are my advisers and I did it in
12
00:00:34,030 --> 00:00:40,060
the University of Lisbon why do
13
00:00:37,570 --> 00:00:42,190
we care about this because the same
14
00:00:40,060 --> 00:00:45,280
reason why we know we will not be out of a
15
00:00:42,190 --> 00:00:48,339
job in the near future it costs a lot of
16
00:00:45,280 --> 00:00:50,680
money 4 percent of global GDP next year
17
00:00:48,340 --> 00:00:55,660
is expected to be lost to cybercrime and
18
00:00:50,680 --> 00:00:57,820
why does this happen two reasons we're
19
00:00:55,660 --> 00:00:59,919
working against people that are highly
20
00:00:57,820 --> 00:01:02,290
focused and highly dedicated to what
21
00:00:59,920 --> 00:01:06,040
they're doing they see that the return
22
00:01:02,290 --> 00:01:10,030
on investment has a lot of potential
23
00:01:06,040 --> 00:01:12,340
and they also are using new ways of
24
00:01:10,030 --> 00:01:14,290
attacking which also increase the
25
00:01:12,340 --> 00:01:19,479
difficulty that we have in combating
26
00:01:14,290 --> 00:01:22,600
this threat so where are we we started
27
00:01:19,479 --> 00:01:24,700
by having no defenses whatsoever in the
28
00:01:22,600 --> 00:01:26,649
beginning the internet was something
29
00:01:24,700 --> 00:01:29,290
that was supposed to be for sharing and
30
00:01:26,650 --> 00:01:33,810
no one even considered the hypothesis of
31
00:01:29,290 --> 00:01:37,000
its being misused quickly we
32
00:01:33,810 --> 00:01:39,100
understood the error of our ways and we
33
00:01:37,000 --> 00:01:40,930
started installing perimeter defenses
34
00:01:39,100 --> 00:01:43,780
the problem with the perimeter defense
35
00:01:40,930 --> 00:01:47,560
is if you build a wall if someone
36
00:01:43,780 --> 00:01:50,200
manages to cross that wall you don't
37
00:01:47,560 --> 00:01:53,500
have anything else so we moved to in-depth
38
00:01:50,200 --> 00:01:55,360
defenses which work fine until you find
39
00:01:53,500 --> 00:01:57,850
an adversary that is capable of adapting
40
00:01:55,360 --> 00:02:01,600
to what you have within your network
41
00:01:57,850 --> 00:02:05,380
and he will change the way he
42
00:02:01,600 --> 00:02:07,750
is attacking to adapt to
43
00:02:05,380 --> 00:02:11,519
what you're monitoring and so we moved
44
00:02:07,750 --> 00:02:16,140
to dynamic response defense which is
45
00:02:11,520 --> 00:02:18,970
three things advanced malware detection
46
00:02:16,140 --> 00:02:21,279
event anomaly detection and what brings
47
00:02:18,970 --> 00:02:25,300
us all here intelligence driven defense
48
00:02:21,280 --> 00:02:29,080
and that's where we decided to work so
49
00:02:25,300 --> 00:02:31,650
where are we now in this field we moved
50
00:02:29,080 --> 00:02:34,420
from manually sharing knowledge
51
00:02:31,650 --> 00:02:37,120
to creating platforms to help us share
52
00:02:34,420 --> 00:02:40,899
the knowledge however and most of you
53
00:02:37,120 --> 00:02:43,630
are surely aware there was a report by
54
00:02:40,900 --> 00:02:47,170
ENISA at the beginning of this year
55
00:02:43,630 --> 00:02:48,790
which criticized the initiatives not in
56
00:02:47,170 --> 00:02:50,619
the sense that they weren't needed but
57
00:02:48,790 --> 00:02:52,810
in the sense that we still have a lot to
58
00:02:50,620 --> 00:02:55,000
work until we get to a place where we
59
00:02:52,810 --> 00:02:57,190
can be comfortable with the efforts
60
00:02:55,000 --> 00:03:00,280
we're putting into this field and so
61
00:02:57,190 --> 00:03:02,590
they indicated a lot of issues that
62
00:03:00,280 --> 00:03:04,540
appear first of all we have a high
63
00:03:02,590 --> 00:03:06,940
volume of information that is shared
64
00:03:04,540 --> 00:03:08,980
which makes it hard for people to
65
00:03:06,940 --> 00:03:10,560
understand what they're seeing and to
66
00:03:08,980 --> 00:03:13,660
get the information that they need
67
00:03:10,560 --> 00:03:16,090
furthermore we don't have common sharing
68
00:03:13,660 --> 00:03:18,640
standards we have certain standards that
69
00:03:16,090 --> 00:03:22,450
are more used than others but we should
70
00:03:18,640 --> 00:03:25,779
move towards something that everyone
71
00:03:22,450 --> 00:03:27,609
agrees is the way to go and we all use
72
00:03:25,780 --> 00:03:29,290
the same standards instead of having in
73
00:03:27,610 --> 00:03:30,970
my platform I have to have a converter
74
00:03:29,290 --> 00:03:33,519
at the end I have to have other
75
00:03:30,970 --> 00:03:35,650
converters so that I can receive input
76
00:03:33,519 --> 00:03:38,470
and then output the information for
77
00:03:35,650 --> 00:03:42,819
something that other people can use
78
00:03:38,470 --> 00:03:46,720
and we are collecting data we're not
79
00:03:42,819 --> 00:03:49,510
doing intelligence this is a very
80
00:03:46,720 --> 00:03:52,030
serious issue because collecting data is
81
00:03:49,510 --> 00:03:54,250
not the same as creating intelligence
82
00:03:52,030 --> 00:03:57,599
creating intelligence is answering a
83
00:03:54,250 --> 00:04:01,810
question that you need to have answered
84
00:03:57,599 --> 00:04:03,488
collecting data is just hoarding and that's
85
00:04:01,810 --> 00:04:06,069
what most of the platforms are currently
86
00:04:03,489 --> 00:04:08,560
doing we're hoarding data but then we
87
00:04:06,069 --> 00:04:11,768
have a hard time making it into
88
00:04:08,560 --> 00:04:14,080
something useful so out of these issues
89
00:04:11,769 --> 00:04:16,539
for the thesis we had to choose something
90
00:04:14,080 --> 00:04:19,840
to focus on and we decided to focus on
91
00:04:16,539 --> 00:04:21,760
three challenges first we wanted to
92
00:04:19,839 --> 00:04:24,460
reduce the quantity of information that
93
00:04:21,760 --> 00:04:26,349
reaches the analysts we don't want the
94
00:04:24,460 --> 00:04:27,960
analysts to have to sort through all the
95
00:04:26,350 --> 00:04:30,210
information that he is receiving on
96
00:04:27,960 --> 00:04:32,430
a daily basis if we consider that there
97
00:04:30,210 --> 00:04:35,068
are tens of thousands of new malware
98
00:04:32,430 --> 00:04:37,169
samples that are detected daily it's
99
00:04:35,069 --> 00:04:40,199
impossible for a human being to deal
100
00:04:37,169 --> 00:04:41,758
with that kind of information second we
101
00:04:40,199 --> 00:04:44,310
want to increase the quality of the
102
00:04:41,759 --> 00:04:45,210
information that is being shared or of
103
00:04:44,310 --> 00:04:48,419
the intelligence
104
00:04:45,210 --> 00:04:50,400
this means four things we have to reduce
105
00:04:48,419 --> 00:04:52,229
the timeliness the time between
106
00:04:50,400 --> 00:04:54,989
detection and the shared information actually
107
00:04:52,229 --> 00:04:57,090
reaching its goal so the
108
00:04:54,990 --> 00:04:59,190
information analysts the security
109
00:04:57,090 --> 00:05:01,619
analysts or the systems that are
110
00:04:59,190 --> 00:05:03,479
defending our network we have to
111
00:05:01,620 --> 00:05:05,460
guarantee that it's both accurate and
112
00:05:03,479 --> 00:05:07,400
relevant so we have to guarantee that
113
00:05:05,460 --> 00:05:10,799
the information that we are using
114
00:05:07,400 --> 00:05:13,560
actually answers our needs and
115
00:05:10,800 --> 00:05:16,259
it's not if I work in a bank and I
116
00:05:13,560 --> 00:05:16,740
actually do I don't care about the
117
00:05:16,259 --> 00:05:19,199
threats
118
00:05:16,740 --> 00:05:21,930
that affect things like the navigation
119
00:05:19,199 --> 00:05:26,159
systems of an airplane I just care about
120
00:05:21,930 --> 00:05:28,289
threats that will impact what I have
121
00:05:26,159 --> 00:05:31,199
within my network and finally
122
00:05:28,289 --> 00:05:34,710
completeness and here it's something
123
00:05:31,199 --> 00:05:37,650
that was approached in the last talk
124
00:05:34,710 --> 00:05:40,530
which is if we use
125
00:05:37,650 --> 00:05:44,698
different names when we're analyzing
126
00:05:40,530 --> 00:05:46,830
the same sample when we reach the
127
00:05:44,699 --> 00:05:49,500
end and if I'm analyzing the network
128
00:05:46,830 --> 00:05:52,620
part of the malware and someone is
129
00:05:49,500 --> 00:05:55,800
analyzing the system part we'll reach
130
00:05:52,620 --> 00:06:00,599
two different IoCs that won't be
131
00:05:55,800 --> 00:06:03,180
related and in the end maybe they are in
132
00:06:00,599 --> 00:06:05,520
my database both of them but I'll lose
133
00:06:03,180 --> 00:06:06,240
part of that information because they
134
00:06:05,520 --> 00:06:11,159
aren't connected
135
00:06:06,240 --> 00:06:14,400
and finally automation is key we don't
136
00:06:11,159 --> 00:06:16,500
want to lose time with someone having
137
00:06:14,400 --> 00:06:19,530
to work on that information we want to
138
00:06:16,500 --> 00:06:21,719
have something that we let go we
139
00:06:19,530 --> 00:06:23,758
configure it and after that it's just
140
00:06:21,719 --> 00:06:27,599
running on our system and completing it
141
00:06:23,759 --> 00:06:29,490
so we looked at MISP as a solution to
142
00:06:27,599 --> 00:06:32,460
start implementing our solution and what
143
00:06:29,490 --> 00:06:35,759
we found is new events can be duplicates
144
00:06:32,460 --> 00:06:37,979
which means it increases the storage
145
00:06:35,759 --> 00:06:40,960
requirements and it's information that's
146
00:06:37,979 --> 00:06:45,070
not useful to anyone that
147
00:06:40,960 --> 00:06:45,070
we're storing there second MISP creates
148
00:06:45,070 --> 00:06:49,930
direct connections which means and we've
149
00:06:47,770 --> 00:06:52,120
seen multiple representations during the
150
00:06:49,930 --> 00:06:53,740
day you have an event you see all the
151
00:06:52,120 --> 00:06:56,710
events that share attributes with that
152
00:06:53,740 --> 00:07:00,100
one what if the next one on that level
153
00:06:56,710 --> 00:07:03,120
is something useful to us you would have
154
00:07:00,100 --> 00:07:05,560
to go manually or lose the information
155
00:07:03,120 --> 00:07:08,880
which means that you're either losing
156
00:07:05,560 --> 00:07:13,180
time or losing information so we tried to
157
00:07:08,880 --> 00:07:16,240
resolve this situation how first of all
158
00:07:13,180 --> 00:07:18,280
we considered clustering and aggregating
159
00:07:16,240 --> 00:07:20,979
information that is related to one
160
00:07:18,280 --> 00:07:23,948
another so as to create a new enriched
161
00:07:20,979 --> 00:07:26,710
IOC which brings all the information
162
00:07:23,949 --> 00:07:31,270
that is connected into a single report
163
00:07:26,710 --> 00:07:33,520
that will reach the analyst this means
164
00:07:31,270 --> 00:07:35,320
working at two levels of the
165
00:07:33,520 --> 00:07:36,758
architecture we have to work in the
166
00:07:35,320 --> 00:07:39,009
configuration of the threat intelligence
167
00:07:36,759 --> 00:07:41,560
platform which means we have to know
168
00:07:39,009 --> 00:07:43,360
what we want to answer so that we make
169
00:07:41,560 --> 00:07:46,659
the correct choices when preparing our
170
00:07:43,360 --> 00:07:48,930
platform to answer them and second we
171
00:07:46,659 --> 00:07:51,520
have to work in the internal processing
172
00:07:48,930 --> 00:07:54,580
capabilities which is the platform needs
173
00:07:51,520 --> 00:07:59,229
to be able to do this operation by
174
00:07:54,580 --> 00:08:03,760
itself instead of doing it you instead
175
00:07:59,229 --> 00:08:06,669
of the analyst doing it so
176
00:08:03,760 --> 00:08:09,400
then lastly automation by design
177
00:08:06,669 --> 00:08:11,620
which is basically something that is
178
00:08:09,400 --> 00:08:16,239
required a prerequisite for anything
179
00:08:11,620 --> 00:08:20,020
we do so we designed this solution we
180
00:08:16,240 --> 00:08:24,400
have the sources which should be focused
181
00:08:20,020 --> 00:08:27,609
on threat feeds that matter to us we
182
00:08:24,400 --> 00:08:30,130
have a layer of other threat
183
00:08:27,610 --> 00:08:33,669
intelligence platforms to try to use
184
00:08:30,130 --> 00:08:35,799
what they bring to the to the what they
185
00:08:33,669 --> 00:08:37,838
bring to the product and by this I mean
186
00:08:35,799 --> 00:08:41,828
for instance using IntelMQ to
187
00:08:37,839 --> 00:08:43,870
enrich IPs and DNS so that when we have
188
00:08:41,828 --> 00:08:46,390
the information reaching what we
189
00:08:43,870 --> 00:08:48,700
developed we have hooks and hooks here
190
00:08:46,390 --> 00:08:52,300
are information that will allow us to
191
00:08:48,700 --> 00:08:54,820
create connections to other events that
192
00:08:52,300 --> 00:08:57,400
we have in our database and we created
193
00:08:54,820 --> 00:09:00,160
two modules the deduplicator module
194
00:08:57,400 --> 00:09:03,040
which as the name indicates allows us to
195
00:09:00,160 --> 00:09:05,260
eliminate information that serves no
196
00:09:03,040 --> 00:09:07,150
purpose because it's a duplicate and a
197
00:09:05,260 --> 00:09:09,910
correlator module which has an
198
00:09:07,150 --> 00:09:13,480
aggregation part and a representation
199
00:09:09,910 --> 00:09:17,140
part to create the enriched IoC so how does
200
00:09:13,480 --> 00:09:20,020
the deduplication work we considered an
201
00:09:17,140 --> 00:09:22,569
event as a set and if we consider an
202
00:09:20,020 --> 00:09:26,530
event as a set of attributes you can use
203
00:09:22,570 --> 00:09:30,670
set theory to compare two events and to
204
00:09:26,530 --> 00:09:32,589
decide which one should be kept this means
205
00:09:30,670 --> 00:09:35,050
that you have to have criteria and the
206
00:09:32,590 --> 00:09:37,840
criteria are found within the metadata
207
00:09:35,050 --> 00:09:40,390
so you'll see for instance if you have
208
00:09:37,840 --> 00:09:42,910
two events that have the same
209
00:09:40,390 --> 00:09:45,069
information you see if one of them has
210
00:09:42,910 --> 00:09:47,680
already been validated by a human so if
211
00:09:45,070 --> 00:09:50,020
the trust level is higher you can
212
00:09:47,680 --> 00:09:52,770
consider that that one is more valuable
213
00:09:50,020 --> 00:09:57,600
than the previous one which still
214
00:09:57,600 --> 00:10:01,780
hadn't been analyzed and so forth
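The deduplication idea just described (events as sets of attributes, metadata as the tie-breaker) could be sketched roughly like this. It is a minimal illustration and not the thesis code: the `Event` class, its field names, and the exact rule (an event is dropped when another event covers all of its attributes with an equal or higher trust level) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event: an id, a set of attribute values, and metadata."""
    event_id: int
    attributes: frozenset
    trust_level: int = 0  # assumed: a higher value means more validation


def deduplicate(events):
    """Keep only events that are not covered by a better event.

    An event is treated as a duplicate when some kept event contains all of
    its attributes (set inclusion) and has an equal or higher trust level.
    """
    kept = []
    # Visit larger attribute sets first; among equal sizes, more trusted first.
    ordered = sorted(
        events,
        key=lambda e: (-len(e.attributes), -e.trust_level, e.event_id),
    )
    for candidate in ordered:
        covered = any(
            candidate.attributes <= k.attributes
            and candidate.trust_level <= k.trust_level
            for k in kept
        )
        if not covered:
            kept.append(candidate)
    return kept
```

With two events carrying the same attributes, only the one with the higher trust level survives, which matches the example given in the talk. Keeping a smaller but more trusted event is a design choice of this sketch, not something stated by the speaker.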
215
00:09:57,600 --> 00:10:04,030
regarding aggregation methods we
216
00:10:01,780 --> 00:10:05,949
defined two methods one of them is
217
00:10:04,030 --> 00:10:08,560
closer to what MISP has which is the
218
00:10:05,950 --> 00:10:11,770
naive method and we basically focus on
219
00:10:08,560 --> 00:10:15,189
the naive method on direct connections
220
00:10:11,770 --> 00:10:17,170
so we look at an event we see if it
221
00:10:15,190 --> 00:10:20,380
shares attributes with other events and
222
00:10:17,170 --> 00:10:22,449
then we take that group of events as a
223
00:10:20,380 --> 00:10:26,470
cluster and a potential new enriched
224
00:10:22,450 --> 00:10:29,770
yolk this has a problem an event can
225
00:10:26,470 --> 00:10:34,090
appear on multiple clusters which is
226
00:10:29,770 --> 00:10:36,880
logical we have another alternative
227
00:10:34,090 --> 00:10:38,680
which is the n-level aggregation in the
228
00:10:36,880 --> 00:10:42,640
n-level aggregation what we do
229
00:10:38,680 --> 00:10:45,910
and similarly what you were doing is we
230
00:10:42,640 --> 00:10:49,210
create a graph where the nodes are
231
00:10:45,910 --> 00:10:52,150
new events or all the events in our
232
00:10:49,210 --> 00:10:55,150
database we then look at shared
233
00:10:52,150 --> 00:10:58,150
attributes to create edges and we set
234
00:10:55,150 --> 00:11:00,730
filters to allow only certain edges to
235
00:10:58,150 --> 00:11:03,310
be created and then we identify all
236
00:11:00,730 --> 00:11:06,130
the sub graphs and those sub graphs will
237
00:11:03,310 --> 00:11:08,380
form a new enriched IoC or a new
238
00:11:06,130 --> 00:11:12,670
possible enriched IoC
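To make the difference between the two aggregation methods concrete, here is a small sketch under stated assumptions. Events are modelled as sets of `(type, value)` attribute pairs, the function names and the `allowed_types` filter parameter are hypothetical, and the sample data is invented. The naive method builds one cluster per event from its direct neighbours, while the n-level method takes the connected subgraphs of the event graph.

```python
from collections import defaultdict


def build_edges(events, allowed_types=None):
    """Link two events when they share an attribute of an allowed type."""
    by_attribute = defaultdict(set)
    for event_id, attributes in events.items():
        for attr_type, value in attributes:
            if allowed_types is None or attr_type in allowed_types:
                by_attribute[(attr_type, value)].add(event_id)
    neighbours = {event_id: set() for event_id in events}
    for ids in by_attribute.values():
        for event_id in ids:
            neighbours[event_id] |= ids - {event_id}
    return neighbours


def naive_clusters(neighbours):
    """Naive method: each event plus the events it is directly linked to."""
    return [{e} | linked for e, linked in neighbours.items() if linked]


def n_level_clusters(neighbours):
    """N-level method: connected subgraphs containing at least two events."""
    seen, clusters = set(), []
    for start in neighbours:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over the event graph
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        if len(component) > 1:
            clusters.append(component)
    return clusters


# Invented sample data: five events, attribute values are placeholders.
events = {
    1: {("ip", "192.0.2.1"), ("vulnerability", "CVE-EXAMPLE")},
    2: {("vulnerability", "CVE-EXAMPLE"), ("domain", "x.example")},
    3: {("domain", "x.example")},
    4: {("ip", "198.51.100.7")},
    5: {("ip", "198.51.100.7"), ("threat-actor", "example-group")},
}
neighbours = build_edges(events)
```

On this data the naive method yields five overlapping clusters (event 2 appears in three of them), while the n-level method yields two subgraphs, events 1 to 3 and events 4 and 5. Passing `allowed_types={"vulnerability"}` to `build_edges` keeps only the link between events 1 and 2, which illustrates how a stricter filter reduces the number of candidate enriched IoCs.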
239
00:11:08,380 --> 00:11:14,650
just an image to give an example if we
240
00:11:12,670 --> 00:11:16,990
were using the naive approach consider
241
00:11:14,650 --> 00:11:19,510
that we have these events in the
242
00:11:16,990 --> 00:11:21,880
database when we look at the first
243
00:11:19,510 --> 00:11:24,520
event it shares one attribute with the
244
00:11:21,880 --> 00:11:27,400
second one an enriched IoC we move to the
245
00:11:24,520 --> 00:11:31,660
second another enriched IoC another
246
00:11:27,400 --> 00:11:34,590
enriched IoC and so forth if we move to
247
00:11:31,660 --> 00:11:37,990
an n-level aggregation with the same
248
00:11:34,590 --> 00:11:41,560
using exactly the same database we will
249
00:11:37,990 --> 00:11:46,060
first create the nodes represent the
250
00:11:41,560 --> 00:11:50,439
nodes we would then for each relation
251
00:11:46,060 --> 00:11:54,310
create an edge and finally we would
252
00:11:50,440 --> 00:11:56,560
identify all the sub graphs that exist
253
00:11:54,310 --> 00:11:59,310
in this case we have two and these two
254
00:11:56,560 --> 00:12:04,599
would be possible
255
00:11:59,310 --> 00:12:06,400
enriched IoCs so we needed to make a
256
00:12:04,600 --> 00:12:09,820
proof of concept so we developed it in
257
00:12:06,400 --> 00:12:12,520
Python 3 over a MISP installation we
258
00:12:09,820 --> 00:12:15,010
focused on these two modules so we
259
00:12:12,520 --> 00:12:17,290
basically used everything that we
260
00:12:15,010 --> 00:12:19,270
could use out of MISP so as not to lose
261
00:12:17,290 --> 00:12:22,000
time I didn't have that much time to
262
00:12:19,270 --> 00:12:24,250
develop so it was basically get the most
263
00:12:22,000 --> 00:12:27,580
out of MISP and we made the
264
00:12:24,250 --> 00:12:30,790
implementation to allow two choices
265
00:12:27,580 --> 00:12:32,950
the first one is we could choose subsets
266
00:12:30,790 --> 00:12:37,300
of IoCs that are in our database and
267
00:12:32,950 --> 00:12:40,180
use them as do the selection that way
268
00:12:37,300 --> 00:12:42,069
well at this time we were just working
269
00:12:40,180 --> 00:12:45,069
in a proof-of-concept so we didn't have
270
00:12:42,070 --> 00:12:47,200
a specific target in the future this
271
00:12:45,070 --> 00:12:50,170
would allow if we want to focus on
272
00:12:47,200 --> 00:12:52,540
specific sectors or in specific threats
273
00:12:50,170 --> 00:12:54,099
we can select only those IoCs that are
274
00:12:52,540 --> 00:12:56,380
in our database that relate to that
275
00:12:54,100 --> 00:13:01,210
issue instead of losing time analyzing
276
00:12:56,380 --> 00:13:04,420
all the database then we also we also
277
00:13:01,210 --> 00:13:06,400
made valid relationships so we set
278
00:13:04,420 --> 00:13:09,430
different filters that could be used to
279
00:13:06,400 --> 00:13:11,590
make the creation of the enriched IoCs
280
00:13:09,430 --> 00:13:14,349
and we made two important assumptions
281
00:13:11,590 --> 00:13:16,870
the first one is trust level correlates
282
00:13:14,350 --> 00:13:20,020
to quality which means that we are
283
00:13:16,870 --> 00:13:22,020
trusting the other participants that
284
00:13:20,020 --> 00:13:24,930
contribute to our database that
285
00:13:22,020 --> 00:13:26,970
if they set a trust level at two they
286
00:13:24,930 --> 00:13:28,979
did their job and that the quality of
287
00:13:26,970 --> 00:13:31,740
the information of that event is
288
00:13:28,980 --> 00:13:33,630
actually better than another event that
289
00:13:31,740 --> 00:13:36,180
is in the network without having been
290
00:13:33,630 --> 00:13:38,820
certified and that black lists are
291
00:13:36,180 --> 00:13:41,279
correctly tagged this is extremely
292
00:13:38,820 --> 00:13:43,950
important and black lists aren't the
293
00:13:41,279 --> 00:13:46,260
only case there are other types of
294
00:13:43,950 --> 00:13:49,140
events that appear that can mess up the
295
00:13:46,260 --> 00:13:50,580
way we're doing things because they if
296
00:13:49,140 --> 00:13:53,580
you have an event that creates
297
00:13:50,580 --> 00:13:56,100
relationships that aren't indeed useful
298
00:13:53,580 --> 00:13:56,670
it will create an enriched IoC that has
299
00:13:56,100 --> 00:13:58,920
no value
300
00:13:56,670 --> 00:14:01,740
so if you have a blacklist it will
301
00:13:58,920 --> 00:14:04,199
bring events from different incidents
302
00:14:01,740 --> 00:14:06,690
because some of them only list IPs
303
00:14:04,200 --> 00:14:09,779
without caring if they are related to a
304
00:14:06,690 --> 00:14:13,070
single threat and they will correlate
305
00:14:09,779 --> 00:14:16,830
everything around them and it's a mess
306
00:14:13,070 --> 00:14:20,880
so we selected an experimental set we
307
00:14:16,830 --> 00:14:23,339
had like we opened 34 feeds from
308
00:14:20,880 --> 00:14:25,770
organizations collected eleven hundred
309
00:14:23,339 --> 00:14:29,310
and seventy-four events most of them as
310
00:14:25,770 --> 00:14:33,060
you can see are of a high trust level so
311
00:14:29,310 --> 00:14:36,989
in our vision of the world they actually
312
00:14:33,060 --> 00:14:39,630
have high quality and we ran our
313
00:14:36,990 --> 00:14:40,200
platform on that data set to see how
314
00:14:39,630 --> 00:14:44,220
it would work
315
00:14:40,200 --> 00:14:46,620
and so we did we selected the subset
316
00:14:44,220 --> 00:14:48,839
only those with a trust level of two
317
00:14:46,620 --> 00:14:52,800
were selected and we eliminated all
318
00:14:48,839 --> 00:14:55,529
those that had a tag of blacklist and we
319
00:14:52,800 --> 00:14:58,290
said okay let's see with all the filters
320
00:14:55,529 --> 00:15:00,510
that are currently connected how does it
321
00:14:58,290 --> 00:15:02,969
work and as you can see there are two
322
00:15:00,510 --> 00:15:07,020
factors here that are important or that
323
00:15:02,970 --> 00:15:10,079
are interesting as the filter goes
324
00:15:07,020 --> 00:15:12,360
deeper in detail you reduce the number
325
00:15:10,079 --> 00:15:14,430
of enriched of potentially enriched
326
00:15:12,360 --> 00:15:17,399
IoCs that you have which makes sense
327
00:15:14,430 --> 00:15:19,439
because you allow fewer connections and
328
00:15:17,399 --> 00:15:22,410
the connections that you allow are those
329
00:15:19,440 --> 00:15:24,149
that interest you the most and if you
330
00:15:22,410 --> 00:15:25,860
use the naive approach you get a lot
331
00:15:24,149 --> 00:15:28,680
more potential enriched IoCs
332
00:15:25,860 --> 00:15:31,050
than if you use the cluster approach so
333
00:15:28,680 --> 00:15:33,959
after doing these first experiments we
334
00:15:31,050 --> 00:15:35,819
focused on the cluster approach with the
335
00:15:33,959 --> 00:15:40,880
most restrictive filter
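The subset selection used in this experiment (only events with a trust level of two, and nothing tagged as a blacklist) could look roughly like the sketch below. The dictionary keys `trust_level` and `tags`, the function name, and the tag string are assumptions for illustration; they are not taken from the actual platform.

```python
def select_subset(events, trust_level=2, excluded_tags=("blacklist",)):
    """Keep events at the given trust level that carry no excluded tag."""
    excluded = set(excluded_tags)
    return [
        event
        for event in events
        if event["trust_level"] == trust_level
        and not excluded & set(event["tags"])
    ]
```

Running the aggregation only on the result of such a selection is what keeps blacklist events from pulling unrelated incidents into the same cluster.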
336
00:15:35,820 --> 00:15:44,130
and what we got was this we found 11
337
00:15:40,880 --> 00:15:46,920
potentially enriched IoCs which we
338
00:15:44,130 --> 00:15:49,920
then manually went through their
339
00:15:46,920 --> 00:15:53,790
components and they make sense in here
340
00:15:49,920 --> 00:15:56,099
it's make sense because it's it's the
341
00:15:53,790 --> 00:15:58,800
data set is still not we needed a
342
00:15:56,100 --> 00:16:01,280
bigger data set we needed to
343
00:15:58,800 --> 00:16:05,810
evaluate this in a different way we have
344
00:16:01,280 --> 00:16:09,120
developed metrics like you did to try to
345
00:16:05,810 --> 00:16:11,579
to make sense of the information and try
346
00:16:09,120 --> 00:16:15,030
to validate what we have but it's still
347
00:16:11,580 --> 00:16:17,490
an ongoing process to be able to relate
348
00:16:15,030 --> 00:16:20,250
for certain the relevance of these
349
00:16:17,490 --> 00:16:22,140
enriched IoCs and actually you'll
350
00:16:20,250 --> 00:16:26,130
see that that's something at work that
351
00:16:22,140 --> 00:16:28,890
is currently being done so this is a
352
00:16:26,130 --> 00:16:31,740
representation of what we got when we
353
00:16:28,890 --> 00:16:33,569
represented the graph and we're going to
354
00:16:31,740 --> 00:16:37,170
look just at the details of one of them
355
00:16:33,570 --> 00:16:40,200
so this is the enriched IoC 9 it's
356
00:16:37,170 --> 00:16:43,380
composed of several different
357
00:16:40,200 --> 00:16:46,190
events that are all related or mostly
358
00:16:43,380 --> 00:16:48,870
related through this vulnerability and
359
00:16:46,190 --> 00:16:51,330
another one appears related through this
360
00:16:48,870 --> 00:16:54,210
vulnerability the filter we were using
361
00:16:51,330 --> 00:16:59,430
was if the events shared vulnerability
362
00:16:54,210 --> 00:17:01,650
or attackers so as to bring forth
363
00:16:59,430 --> 00:17:04,800
something that made more sense and would
364
00:17:01,650 --> 00:17:08,760
be more useful and so here we have the
365
00:17:04,800 --> 00:17:11,760
the list of events that compose it not
366
00:17:08,760 --> 00:17:13,709
very interesting one interesting factor
367
00:17:11,760 --> 00:17:16,589
is and that's something that has
368
00:17:13,709 --> 00:17:19,140
appeared in the literature is that you can
369
00:17:16,589 --> 00:17:21,179
see if you look at this kind of data you
370
00:17:19,140 --> 00:17:22,770
can see the evolution of an attack or in
371
00:17:21,180 --> 00:17:27,180
this case for instance the evolution
372
00:17:22,770 --> 00:17:31,530
from 2014 to 2016 of a vulnerability
373
00:17:27,180 --> 00:17:34,380
over time who used it when why to attack
374
00:17:31,530 --> 00:17:36,750
who so this sort of information at the
375
00:17:34,380 --> 00:17:39,300
strategic level we are already getting
376
00:17:36,750 --> 00:17:41,820
something that is useful but now at the
377
00:17:39,300 --> 00:17:44,070
tactical level where we're going to
378
00:17:41,820 --> 00:17:47,399
use these IoCs to inject into a defense
379
00:17:44,070 --> 00:17:49,200
network so it needs some more work just
380
00:17:47,400 --> 00:17:49,380
to be sure that when we create rules we
381
00:17:49,200 --> 00:17:51,060
are
382
00:17:49,380 --> 00:17:53,840
actually creating rules that are useful
383
00:17:51,060 --> 00:17:58,379
and not just cluttering the system so
384
00:17:53,840 --> 00:18:01,280
conclusion we created a new system to
385
00:17:58,380 --> 00:18:03,660
create intelligence out of OSINT we
386
00:18:01,280 --> 00:18:05,760
defined two methods to correlate and
387
00:18:03,660 --> 00:18:07,410
aggregate threat intelligence we
388
00:18:05,760 --> 00:18:11,400
developed the platform that proves that
389
00:18:07,410 --> 00:18:14,700
the methods sort of work and we did an
390
00:18:11,400 --> 00:18:16,860
experiment that shows that our proof of
391
00:18:14,700 --> 00:18:19,650
concept was working and as I was saying
392
00:18:16,860 --> 00:18:21,990
currently we are this was work within
393
00:18:19,650 --> 00:18:24,660
the DiSIEM project which is a European
394
00:18:21,990 --> 00:18:27,530
sponsored project and we're working with
395
00:18:24,660 --> 00:18:32,810
a partner to evaluate the risk score of
396
00:18:27,530 --> 00:18:32,810
the detected IoCs and thank you
397
00:18:33,110 --> 00:18:41,908
[Applause]
398
00:19:55,700 --> 00:20:01,950
so repeating the question if I
399
00:19:58,770 --> 00:20:05,100
understood it correctly is if the
400
00:20:01,950 --> 00:20:07,140
approach we're using is not
401
00:20:05,100 --> 00:20:08,969
counterproductive in the long run because we're
402
00:20:07,140 --> 00:20:11,970
limiting the access to the information
403
00:20:08,970 --> 00:20:15,150
and it's a good point the one you
404
00:20:11,970 --> 00:20:19,490
make it's a good point because indeed it
405
00:20:15,150 --> 00:20:22,890
can happen and even if it doesn't and
406
00:20:19,490 --> 00:20:25,919
when it happens like you could have had
407
00:20:22,890 --> 00:20:28,620
the information in the past but this is
408
00:20:25,919 --> 00:20:31,770
still like this is the beginning of the
409
00:20:28,620 --> 00:20:33,870
beginning of doing this one part of the
410
00:20:31,770 --> 00:20:37,168
problem with this issue is for instance
411
00:20:33,870 --> 00:20:39,899
for threat intelligence quality there is
412
00:20:37,169 --> 00:20:41,700
nothing to quantify it I've looked for
413
00:20:39,900 --> 00:20:44,090
it because it should be something that
414
00:20:41,700 --> 00:20:46,679
you should be able to quantify but
415
00:20:44,090 --> 00:20:48,480
it's something that's on a
416
00:20:46,679 --> 00:20:52,559
case-by-case basis and what you're
417
00:20:48,480 --> 00:20:55,350
saying makes sense but what I can say is
418
00:20:52,559 --> 00:20:57,149
it all depends on the configurations and
419
00:20:55,350 --> 00:20:59,189
that's why it's so important at the
420
00:20:57,150 --> 00:21:01,679
beginning I'm not saying that you're
421
00:20:59,190 --> 00:21:04,320
wrong you're absolutely correct that you
422
00:21:01,679 --> 00:21:07,020
should have the best database possible
423
00:21:04,320 --> 00:21:09,120
the idea is to only get the information
424
00:21:07,020 --> 00:21:11,070
that matters at that moment to the
425
00:21:09,120 --> 00:21:13,979
people that are working because you can
426
00:21:11,070 --> 00:21:17,040
always have that in storage you can
427
00:21:13,980 --> 00:21:19,050
accumulate whatever knowledge you think
428
00:21:17,040 --> 00:21:21,418
will be useful in the future in your
429
00:21:19,050 --> 00:21:24,780
storage and then process it
430
00:21:21,419 --> 00:21:27,240
later on because the idea is
431
00:21:24,780 --> 00:21:31,770
that this should work automatically
432
00:21:27,240 --> 00:21:35,250
so currently it's on a I need to run the
433
00:21:31,770 --> 00:21:37,260
platform basis but it was created
434
00:21:35,250 --> 00:21:39,179
and I designed it so that you could put
435
00:21:37,260 --> 00:21:42,540
it as a thread and it will basically be
436
00:21:39,179 --> 00:21:44,700
running over and over on your database
437
00:21:42,540 --> 00:21:47,070
and checking for the creation of new
438
00:21:44,700 --> 00:21:50,070
enriched IoCs or potentially enriched
439
00:21:47,070 --> 00:21:53,939
IoCs and yes currently I have a concern
440
00:21:50,070 --> 00:21:57,450
about false positives because we don't
441
00:21:53,940 --> 00:22:01,410
know what a false positive is so if you
442
00:21:57,450 --> 00:22:04,200
don't we got eleven potential enriched
443
00:22:01,410 --> 00:22:06,660
IoCs and that's for me and it's an
444
00:22:04,200 --> 00:22:08,940
issue that I have for instance with
445
00:22:06,660 --> 00:22:11,370
my advisors because for them it's yeah
446
00:22:08,940 --> 00:22:15,929
you created these enriched IoCs
447
00:22:11,370 --> 00:22:19,080
no I created eleven events that have the
448
00:22:15,929 --> 00:22:22,080
information from other events that
449
00:22:19,080 --> 00:22:24,600
should be correlated but there is no way
450
00:22:22,080 --> 00:22:27,059
to be certain at this point that they
451
00:22:24,600 --> 00:22:29,750
are actually what we've created is
452
00:22:27,059 --> 00:22:32,908
better than what we had in the past and
453
00:22:29,750 --> 00:22:34,950
so it's right now we have the proof of
454
00:22:32,909 --> 00:22:37,320
concept and we created something the
455
00:22:34,950 --> 00:22:39,600
future is with the risk score seeing if
456
00:22:37,320 --> 00:22:43,980
that something that we're creating has
457
00:22:39,600 --> 00:22:45,840
added value and we can see with the
458
00:22:43,980 --> 00:22:48,300
human eye some added value because you
459
00:22:45,840 --> 00:22:50,928
can see for instance difference on a
460
00:22:48,300 --> 00:22:55,168
vulnerability that reappears over time
461
00:22:50,929 --> 00:22:58,980
which would probably not be that
462
00:22:55,169 --> 00:23:02,070
easy to detect you can create other like
463
00:22:58,980 --> 00:23:04,740
we created measures to see how the how
464
00:23:02,070 --> 00:23:07,710
different events and event evolves over
465
00:23:04,740 --> 00:23:11,600
time so you can get those metrics but
466
00:23:07,710 --> 00:23:19,140
this is just the beginning of a new age
467
00:23:11,600 --> 00:23:21,199
more questions no thank you