00:00:07
Today I want to talk about Frigate, and about using Double Take, a unified API that lets you train and process images for use in facial recognition. I've been using it for a couple of days, and this is going to be a slightly different type of video from what I normally do: I'm going to talk more about my experience, and maybe later on we'll do more of a how-to on how I set it up. I've been playing with it for the last couple of days, and it hasn't been something I'm really excited about. I was on the fence about making a video at all, because I'm having issues with it, or it's just not living up to my expectations of what I want it to do. With AI and facial recognition that's not uncommon: there's going to be some learning involved, and it's not going to be perfect. So let's first look at what Double Take is.
00:01:00
This is the Double Take GitHub page, so let's go through a few of the features. As I mentioned, it's a unified UI and API for processing and training images for facial recognition. It doesn't actually do the recognition work itself; it relies on detectors for that, but it gives you a very nice, convenient way to train images and work with them. So the question is: why would you use this? There's a lot of great software that performs the actual facial recognition, but it all behaves differently. Double Take was created to take the complexities of those systems and combine them into an easy-to-use UI. Basically, Double Take is a UI for detection services. The page lists a lot of the features, and one you'll notice is a Home Assistant add-on.
00:01:51
I tried and tried to use the Home Assistant add-on on my Odroid N2+, and I had an issue where it caused my Odroid to slow down to a crawl, so much so that I had to log in over SSH and kill the process because I couldn't get back into the box. I don't know that that's everybody's experience, and I don't really understand what's going on, because a lot of the forums I've read describe Double Take as a very low-consumption application that shouldn't do that. So either something's going on with my box, or I've simply got too many things running on my Odroid already. So I've detached from the add-on and run Double Take in a Docker container instead. I'm running it in an Ubuntu virtual machine in VirtualBox on my Windows desktop, with Docker inside that VM. You can also run Docker under Windows directly if you want, or use a number of other approaches, and you may well have success with the add-on.
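For anyone who wants to go the same route, here's a rough sketch of a docker-compose service for Double Take on its own. The image name and port come from the project's Docker documentation; the volume path is just a placeholder you'd adjust:

```yaml
version: "3"
services:
  double-take:
    image: jakowenko/double-take   # Double Take's published image
    container_name: double-take
    restart: unless-stopped
    volumes:
      - ./double-take:/.storage    # config, database, and trained images live here
    ports:
      - "3000:3000"                # web UI and API
```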
00:02:54
The project page lists the architecture support. The detectors it can use are DeepStack, CompreFace, and Facebox, and it supports the Frigate NVR, which is what I'm running. I have a number of videos on Frigate and how I've got it set up; Frigate has been running well for a while and does object detection very well, so I don't have to do anything for that. What I wanted to try was layering some face detection, or person recognition, on top of what Frigate is already doing.
00:03:24
The documentation walks through how you set things up. Double Take can send notifications through Home Assistant for various events, and it publishes its results to sensors via MQTT, so you need to have MQTT up and running. Finally, you can use the Gotify service, which I also have up and running. I would pronounce it "go-tih-fy," and someone will correct me, as they have in my other videos. This gives you the ability to send notifications to a Gotify server. I'm running Gotify in a Docker container on the VM too, so DeepStack, CompreFace, Gotify, and Double Take are all running in the VM on my Windows desktop under VirtualBox.
00:04:18
documentation here on all of this um but
00:04:21
it's very easy to get up and running
00:04:23
again we're not going to talk so much
00:04:24
about getting it up and running this
00:04:26
video on a follow-up video if there's
00:04:28
enough interest in this I'll go through
00:04:29
through how I set up my Docker
00:04:30
containers and everything else I just
00:04:32
want to kind of talk through briefly how
00:04:35
uh I perceive this to be working for me
00:04:38
This is the interface for Double Take, and this is what you see as images are detected. It only shows an entry if a face is picked up in the image. I could have Frigate notify me whenever there's activity or an object is detected, and it will do that, but if no face is detected by one of the detectors, CompreFace or, in my case, DeepStack, nothing shows up here. These events did pick up faces, and you can see them here.
00:05:11
One of the first things I want to talk about is the size of the image. If I walk into the room, my face is really tiny in the frame, and you'll notice this box is pink. What that tells me is that even though it might have detected me, the confidence is too low for a specific reason: one of the checks Double Take performs is on the size of the box area. A teeny tiny box just isn't going to be as accurate as a box of a decent size. So here it says the box area is too low, and it doesn't pass the event through as a detection, even though it shows it here.
00:05:59
When you have a green box with a name and a percentage, you get an actual detection. For these two faces here, because they were up close, I'm getting myself detected.
00:06:14
What this means is that for most of my cameras in Frigate, these are the image sizes I'm going to get, and even though I have this particular camera set to crop these images at a size of 500, the picture just isn't good enough for the system to reliably do anything with. And this is face detection while walking into a room: if you're doing presence detection in a room, you'd have to get up close and personal to the camera, or adjust the camera so it only sees a small area and the face is bigger. That's my first issue with this kind of face detection.
00:06:54
If we look at my cameras in Frigate, you'll see that all of them cover a wide area, and even though Double Take pulls in snapshots of those areas, so far it just isn't picking faces up well enough to be useful. I'm going to continue playing with it and see if I can train it to do a little better. There are also ways to change the required box size: the default box area is 10,000, and I could set it to 1,000 and have it pick up smaller faces, but then you're going to get a lot of false positives. So that's part one.
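For reference, the box-area check lives under the detect section of the Double Take config. This is a sketch of how I understand that knob; 10,000 is the documented default, and 1,000 is the looser value I experimented with:

```yaml
detect:
  match:
    min_area: 10000   # default minimum box area (width x height in pixels) for a match to count
    # min_area: 1000  # looser value I tried; picks up smaller faces but invites false positives
```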
00:07:33
I brought this up a bit out of order because I wanted to show you this screen while I'm already in the interface, which is really nice; it's a very nice interface for training. If I want to train this image, or any image for that matter, all I have to do is click on this box, pull down one of these names, for instance my own, and click Train. It asks whether I want to train this file, and if I click Yes, it goes in and starts training on my files. This is a super easy way to tell your detectors which images to use for which person. You create a folder for each person or thing you want to train. For example, if I had a delivery person who was always the same and I knew their name, I would create a delivery folder and then just pick these images and train them into that folder.
00:08:30
Another way to train images in the Double Take interface is to upload files. I pick a folder, click Upload, and then select some images from this list. I'll just pick a couple of my own images and click Open, and it starts training automatically. I picked four images, but you see eight here because I have two detectors configured right now: a DeepStack detector and a CompreFace detector. I'm using both at the same time because I want to see which one is better. There's been a lot of discussion saying that CompreFace is actually better than DeepStack, and so far that's what I'm seeing; I think CompreFace does a better job of detecting faces.
00:09:28
Now, the training time will depend on how many images you put in and how busy your system is, so we'll let that train for a few minutes and come back to it. CompreFace is a free and open-source face recognition system; it's also a Docker container, and there's an official website for it as well. It can easily be integrated into any system, and Double Take lets you use it alongside Facebox and DeepStack. Once the images are trained, you'll see them in your trained folder, and Double Take will start using them to match against what's coming in. These are the images I chose for that.
00:10:21
Now, DeepStack has been having issues; even now it has some timeout problems, so you'll notice these images are trained for CompreFace but not for DeepStack. Having played with this over the last couple of days, I'm pretty sure I'm just going to get rid of DeepStack and let CompreFace do all the work, because it seems to be the best. If you have photos that come up as unknown, you can train them and then reprocess the images, and the reprocessed images will hopefully come back matched to the person you just trained, or to someone already in your trained folder.
00:10:59
So we have the main interface for matches, and it does live updates, which you can enable or disable. You can also enable or disable filtering, and choose what to filter on: matches or misses, which detector was used, which camera the events came from, and all these other settings.
00:11:23
te is done through the UI as well
00:11:25
typically you'll do that first I'm kind
00:11:27
of doing everything backwards cuz I'm
00:11:28
showing you what's happening as we go
00:11:30
along in the wrong order but you would
00:11:32
come to config first and in the
00:11:34
configuration once you're in the
00:11:35
configuration you would set your mqtt
00:11:38
host username and password you would set
00:11:40
your frig URL with its Port make sure
00:11:43
you're using the right port and there
00:11:45
are some settings that you can set
00:11:47
within here that are talked about within
00:11:49
this configuration page both on GitHub
00:11:51
and the docker
00:11:53
page you can leave everything as default
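As a sketch, the top of my config looks roughly like this; the IPs are placeholders standing in for my internal addresses:

```yaml
mqtt:
  host: 192.168.1.10        # placeholder for my broker's internal IP
  username: mqtt-user
  password: mqtt-pass

frigate:
  url: http://192.168.1.10:5000   # double-check this matches your Frigate port
```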
00:11:55
to start with and then come back later
00:11:57
on and do stuff with it if you need to
00:11:59
change it I added a couple of things
00:12:01
here under events I wanted
00:12:04
attempts uh to be no latest snapshot
00:12:08
because I was getting multiple notices
00:12:09
for the snapshot I wanted I mean the the
00:12:12
latest image so you get latest image
00:12:14
snapshots and mqtt if you have it
00:12:16
enabled the latest image is what shows
00:12:20
up as the frig is sending it in and then
00:12:24
you have the snapshot which is the
00:12:26
snapshot that you get from frig and then
00:12:28
you have m qtt which is then sent once
00:12:31
everything is processed the thing about
00:12:33
mqtt is there's quite a delay because it
00:12:35
has to finish the event before it sends
00:12:37
it in so if you want to do snapshots um
00:12:40
or if you don't want to do a latest at
00:12:42
least do snapshots and I've commented it
00:12:44
out because it does a default of 10 and
00:12:46
again all this is set in here and it
00:12:48
talks about these types of changes that
00:12:51
you can make and these changes actually
00:12:53
allow you to set things like the latest
00:12:55
so this the number of times devil te
00:12:57
will request a frigate image for facial
00:12:59
recogition number of times it will
00:13:01
request a frig snapshot and number of
00:13:03
times or um whether it processes images
00:13:06
from the mqtt topic and then you can add
00:13:09
delay Express in seconds between each
00:13:11
detection Loop right now it's set the
00:13:12
default of one these are all defaults so
00:13:15
mqtt is false by default and then it'll
00:13:17
do the last five snapshots the last five
00:13:20
uh latest photos from frig to try to
00:13:22
process the image change those as you
00:13:25
need to but it's a lot of tweaking and a
00:13:27
lot of of that kind of thing you need to
00:13:28
do to make sure that you get it just the
00:13:30
way you want it and then you set up your
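Here's roughly what my events tweak looks like; treat the values as illustrative, since the exact defaults may differ between Double Take versions:

```yaml
frigate:
  url: http://192.168.1.10:5000   # placeholder IP
  attempts:
    latest: 0      # stop requesting the "latest" image (this is what cut my duplicate notices)
    snapshot: 10   # keep requesting the Frigate snapshot
    mqtt: false    # don't also process the image from the Frigate MQTT topic
    delay: 1       # seconds between each detection loop
```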
00:13:32
Then you set up your detectors. I've got my CompreFace detector; again, I'm running these on a VM, so there's my local IP with the port number, and then you generate a key for CompreFace. Since these services are only accessible inside my network, I'm not worried about the keys and tokens being exposed in this video; I can change them at will and probably will after filming. DeepStack is the same: you can set a key and a timeout, but keys aren't used by default, so I'd leave that blank unless you really need it. And then I've got my Gotify URL, running with a token, which is what sends my notifications to the Gotify service I'm listening to in a browser and on my phone. So that's my configuration; it's super simple to set up.
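Put together, my detectors and notify sections look roughly like this; the IPs, keys, and tokens are all placeholders:

```yaml
detectors:
  compreface:
    url: http://192.168.1.20:8000   # CompreFace container on my VM (placeholder)
    key: xxxxxxxx                   # recognition API key generated in the CompreFace UI
  deepstack:
    url: http://192.168.1.20:5000   # DeepStack container (placeholder)
    timeout: 15                     # seconds; no key, since DeepStack doesn't use one by default

notify:
  gotify:
    url: http://192.168.1.20:8080   # Gotify server (placeholder)
    token: xxxxxxxx                 # application token created in Gotify
```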
00:14:20
a Secrets file here this Secrets file
00:14:24
will allow you to set Secrets within
00:14:26
this configuration file and it also Al
00:14:29
pulls your secrets file if you do this
00:14:30
as a home assistant add-on now there's a
00:14:33
home assistant add-on for Double Take
00:14:36
and comper face and deep stack and of
00:14:38
course mqtt and frig both have add-ons
00:14:41
you can run all of those on your home
00:14:42
assistant device if you have enough
00:14:44
horsepower I just don't have enough
00:14:46
horsepower to do everything and by the
00:14:49
By the way, while I'm thinking about it: you can see the instant status of all your connections up at the top, so everything you have configured shows up there. My Double Take is running, my MQTT is running, but DeepStack keeps getting a timeout, which is what we talked about. I've seen some discussion in recent comments that there's a timeout issue with DeepStack for some reason. I'm running both detectors right now, but I'll probably get rid of DeepStack and just run CompreFace.
00:15:15
So, as images come in from Frigate, Double Take grabs them, sends them to the detectors for processing, and then comes back with the display you see here: the images it picked up with faces in them, and an indication of whether or not each one actually matched. You can set up notifications in Home Assistant for this as well; I'm still playing with that and haven't quite got it tuned, so stay tuned for another video.
00:15:45
I keep saying I'm going to show you all the configuration later, and I probably will, but first I want to give you an idea of this and hear your comments. Is this even useful to you? Do you have uses for facial recognition? Does your environment let you set up cameras close enough to get a good image for facial recognition? If I'm looking at people walking down the street on my Frigate cameras, for example, and someone is walking down this street right here, it's not going to pick them up; there's no way for my camera to zoom in and pick up people on the street. They'd have to walk right up into this area for it to be useful, and even out here may not be enough. It really depends on your angle and what the camera is looking at; this camera here may not work.
00:16:38
So again, I want to hear your thoughts on whether this is useful. Do you want to see me show my config and how I set up the Docker containers? I'll run through all of that from scratch if you'd like. Let me know what you think in the comments below: whether you're using Double Take, DeepStack, or CompreFace, whether you're using the Home Assistant add-ons or running standalone, and in what kind of environment. Throw all that in the comments and let's have a discussion, because if I'm doing this wrong, and it's likely I might be, I want to know how I can tweak and tune it to make it useful.
00:17:13
I think one very simple use case for me: I don't want to be alerted when my family is walking around in any of these images; I want to be alerted when an unknown face pops up. If, with reliable results, it picks up my family, or whoever I choose, as people I know, and everyone else comes back as unknown, that would be fine; I know how to alert on that. But I just can't get it to detect reliably, because the image sizes are too small, and maybe I haven't trained it enough. I'm training with high-quality images from selfies or photos, not camera captures, because if you use camera images and they're too small, you're going to get unreliable results as well. So, tell me about it.
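The alert I have in mind would look something like this in Home Assistant. It's only a sketch, since I haven't got it working yet; it assumes Double Take publishes unmatched faces on an MQTT topic like double-take/unknown, so check your broker for the actual topic names your version uses:

```yaml
automation:
  - alias: "Alert on unknown face"
    trigger:
      - platform: mqtt
        topic: double-take/unknown        # assumed topic; verify against your broker
    action:
      - service: notify.mobile_app_my_phone   # placeholder notify service
        data:
          message: "Unknown face on {{ trigger.payload_json['camera'] }}"
```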
00:18:00
know this wasn't a big how-to video it's
00:18:02
just a quick talk about what frig uh and
00:18:05
Double Take and deep stack on com forace
00:18:08
have done for me in the last couple of
00:18:09
days I love the work that's going on I
00:18:12
don't want to minimize the amount of
00:18:13
work that's been going on I think this
00:18:15
is an excellent uh start for this I just
00:18:18
need to know how to make it better for
00:18:20
my use cases and maybe I have to have a
00:18:23
zoom uh zoomable camera maybe I have to
00:18:25
have a different type of camera uh all
00:18:28
that kind of stuff just to see if I get
00:18:29
reliable results or maybe my use case
00:18:32
isn't valid for what this is used for so
00:18:34
with that let me go and end the video
00:18:36
here again it's a little bit different
00:18:38
than what I'm used to to filming for you
00:18:40
it's not a big how-to but if there's
00:18:42
enough interest I will build the entire
00:18:45
configuration of what I'm using from
00:18:46
start to finish on a separate video If
00:18:49
you're not a subscriber uh make sure you
00:18:50
hit that subscribe button uh thank you
00:18:52
for watching if you like the video hit
00:18:54
that thumbs up and uh Channel
00:18:56
memberships are available for you as
00:18:57
well if you would like to contribute and
00:18:59
support the channel and we'll see you on
00:19:00
the next one