|
`find-yttranscript-links' (subtitles)
The main page about this video is here.
Its index is here.
Its subtitles in Lua are here.
The rest of this page contains a conversion of the subtitles in Lua
to a slightly more readable format.
00:01 Hi! My name is Eduardo Ochs,
00:03 I'm the author of an Emacs package called eev,
00:06 and this video is about a function that I added
00:09 to eev in March... it's called
00:11 `find-yttranscript-links'...
00:16 and... well, this page is about this video
00:18 and that function, with all the details...
00:22 and that function is inspired by a blog post
00:25 that Andrea Giugliano wrote in...
00:28 actually, two blog posts -
00:31 at the end of March...
00:33 the first blog post was called
00:35 "An Elisp snippet to download
00:37 YouTube video transcripts"...
00:40 and the second one was about rewriting
00:42 his function of the first blog post
00:45 in a slightly different style.
00:51 When I read those blog posts
00:53 I immediately thought that WOW,
00:56 I _need_ something like that in eev,
00:58 but I would like to rewrite it
01:01 in the style that I prefer to put the
01:04 things in eev in...
01:06 and that's the reason for the dog
01:10 with the square haircut here -
01:13 I'm going to call him the "square dog"...
01:17 when people say "dog" no one expects
01:24 a square dog, so this is not the dog
01:28 that people would expect by default...
01:31 and the interface that this function uses
01:35 is as unexpected as a square dog.
01:41 So let me show how this thing works.
01:46 I prepared two examples here.
01:49 The first example uses a video that I
01:53 have not downloaded with youtube-dl yet,
01:56 and the second example uses a video that
01:59 I have downloaded.
02:00 A few months ago I wanted to convince
02:05 a friend to read this manga here,
02:08 "Lone Wolf and Cub"...
02:11 and I found this video about Lone Wolf
02:13 and Cub, and I wanted to send him a
02:17 link to a specific part of the video...
02:20 so I first watched the whole video, and
02:23 then I wanted to find a specific part,
02:25 and I only remembered a few keywords of
02:28 it... so, what did I do?
02:32 First of all... uh, no, one second.
02:34 This thing here, this sequence of 11
02:37 characters, is what I call the "hash"
02:39 of the video, and YouTube videos
02:42 are identified by their hashes.
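
A minimal Python sketch of what "detecting the hash" can mean, assuming that a hash is an 11-character run of letters, digits, "-" and "_", and that it appears either alone or inside a YouTube URL. The function name `video_hash` is hypothetical; this is not the code that eev uses.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumption: a video hash is exactly 11 characters from this alphabet.
HASH_RE = re.compile(r"[A-Za-z0-9_-]{11}")

def video_hash(url_or_hash):
    """Return the 11-character video hash, or None if none is found."""
    if HASH_RE.fullmatch(url_or_hash):
        return url_or_hash
    parsed = urlparse(url_or_hash)
    # Long form: https://www.youtube.com/watch?v=HASH
    candidates = parse_qs(parsed.query).get("v", [])
    # Short form: https://youtu.be/HASH
    if parsed.netloc.endswith("youtu.be"):
        candidates.append(parsed.path.lstrip("/"))
    for candidate in candidates:
        if HASH_RE.fullmatch(candidate):
            return candidate
    return None

print(video_hash("https://www.youtube.com/watch?v=aZWq6CEzhuQ"))
```
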
02:45 When I run `M-x find-youtubedl-links' -
02:50 this is my standard function for
02:52 downloading videos from YouTube and for
02:54 doing things related to downloaded
02:58 videos...
03:00 and when I run that function it detects
03:02 the hash around point and it
03:06 does some things with that hash...
03:08 in this case the hash is going to be
03:10 this one, "aZWq6CEzhuQ"...
03:13 and `find-youtubedl-links' is going to
03:16 generate a temporary buffer
03:19 in which several parts of it
03:21 were replaced by that hash.
03:24 so this thing here is the hash,
03:30 and these other... stubs
03:35 have not been replaced yet...
03:38 and one thing that I'm going to do is
03:39 that I'm going to replace the stem
03:42 by "lonewolf".
03:45 Let me duplicate this line and in this
03:49 copy I'm going to replace the "{stem}"
03:52 by "lonewolf".
03:54 if I execute this sexp here it will
03:58 regenerate this buffer with these values
04:01 for some variables of the template, and
04:04 the "{stem}" that appears here, here,
04:08 here, and in lots of other places is
04:09 going to be replaced by "lonewolf".
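
The substitution described here can be sketched in Python as plain string replacement. The template text and the helper name `fill_template` are hypothetical, for illustration only; the real template lives in eev and is written in Emacs Lisp.

```python
def fill_template(template, **values):
    """Replace each "{name}" stub whose value was given; leave the rest."""
    for name, value in values.items():
        template = template.replace("{" + name + "}", value)
    return template

# Hypothetical two-line template with three kinds of stubs.
template = (
    "# {stem}: https://www.youtube.com/watch?v={hash}\n"
    "# local copy: {dir}{stem}.{ext}\n"
)

# Only the hash and the stem are known, so the other stubs stay as they are.
print(fill_template(template, hash="aZWq6CEzhuQ", stem="lonewolf"))
```

Stubs that have no value yet, like "{dir}" and "{ext}" in this sketch, are left untouched, which matches the behavior shown in the temporary buffer.
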
04:12 Let me do that now...
04:14 Here - the buffer was slightly changed...
04:17 here we see "lonewolf" in several
04:20 places... and let me delete the parts that
04:24 I'm not going to use. I'm not interested
04:26 in this part here...
04:28 by the way, this sexp here goes to a tutorial
04:31 that explains how to use youtube-dl...
04:34 sorry, how to use youtube-dl from eev.
04:39 If we follow this we go
04:41 to this tutorial here, to this section,
04:43 that starts by explaining what is the hash
04:46 of the video, and so on... but let me go back.
04:49 I also do not want this section here,
04:52 I said that I do not want to download this
04:55 video, so I also do not want these things
04:58 here... remember that this is a
05:00 temporary buffer, so I can do any mess
05:02 that I want with it...
05:07 this section is about playing the local
05:08 copy - I do not have a local copy...
05:13 I'm interested in these parts here.
05:19 If I execute this then Emacs
05:22 is going to call Google Chrome
05:27 with a certain URL to play the video
05:30 starting from a certain position. For
05:32 example, if I replace this by 2:34, if
05:36 I execute this it will play the video
05:39 with this hash starting from this
05:41 position...
05:42 and if I execute this sexp here
05:48 it will define a function with this "code"
05:51 in its name, so it's "lonewolf"...
05:55 the code is here... and this function
05:58 here, `find-lonewolfvideo', it will play
06:02 that video from a certain point...
06:05 and it was defined as a function
06:09 that plays that video by opening a
06:12 browser on a certain URL that plays
06:15 the video from a certain position.
06:19 So, for example, if I put 2:34 here
06:22 it will play the video starting from
06:25 2:34... but I do not want to play the
06:28 video now.
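
A rough Python sketch of how a hash and a position such as "2:34" can be turned into a URL that starts playback at that position. It assumes the usual `watch?v=...&t=SECONDS` form of YouTube URLs; the function names are hypothetical and this is not the eev implementation.

```python
def to_seconds(timestamp):
    """Convert "SS", "MM:SS" or "HH:MM:SS" to a number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def youtube_url(video_hash, timestamp="0:00"):
    """Build a URL that plays the video starting at the given position."""
    return (
        "https://www.youtube.com/watch"
        f"?v={video_hash}&t={to_seconds(timestamp)}"
    )

url = youtube_url("aZWq6CEzhuQ", "2:34")
print(url)
# To open it in the default browser one could call:
#   import webbrowser; webbrowser.open(url)
```
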
06:34 I want to run this. If I run this sexp
06:37 here it will open another temporary buffer
06:39 with another name, so it will not delete
06:42 this temporary buffer... note that this
06:44 temporary buffer is called
06:46 "*find-youtubedl-links*"...
06:48 let me execute this... it opens another
06:52 temporary buffer, called
06:55 "*find-yttranscript-links*"
07:03 let me run this with a smaller font...
07:09 remember that when I use eepitch the
07:13 main key is f8, and f8 acts in one way
07:17 on lines that start with a red star,
07:19 like these four lines here, and in
07:23 another way on lines that do not start
07:26 with red stars...
07:28 and I usually use lines with red stars
07:31 to set up a target buffer, but I can also
07:34 use them to do other kinds of setups.
07:37 These three lines here will
07:40 prepare a target buffer that is running
07:43 python, but in this case I will usually
07:47 execute these four lines here,
07:49 starting from this one...
07:51 and the action of this first one -
07:54 remember that when we execute
07:57 lines that start with red stars with f8
08:01 the rest of the line after the red star
08:05 is executed as Lisp...
08:07 So... if I start by typing an f8 here
08:11 the action of this first line is to put
08:15 this buffer here, that is in fundamental
08:17 mode, in Python mode...
08:21 so now the colors are more expressive.
08:24 And now I'm going to run these
08:27 three lines again, they will kill this
08:29 target buffer and will regenerate it
08:31 again...
08:33 and now I'm going to use f8 again to
08:36 send these lines to the target buffer.
08:38 Remember that when I type an f8 on a
08:40 line that does not start with a red star
08:42 the f8 sends the line to the target
08:46 buffer...
08:48 usually the target buffer is running
08:50 an interactive program, and the program
08:56 thinks that the user had typed the
08:59 contents of that line.
09:01 So I'm going to send these commands here
09:04 to this Python interpreter here,
09:06 starting by this one.
09:09 This one here, that calls this
09:13 function with a very long name here,
09:15 takes a few seconds to run, so I'm going
09:18 to type f8 here and wait until it's
09:22 done... it was quick.
09:25 And now I'm going to run these things
09:27 here, and this print here.
09:36 And this print here prints the
09:38 transcript of the video in a certain
09:40 format. Sometimes I want other formats so
09:44 it's easy to switch to another format...
09:47 in this case when I ran these three lines here
09:52 I defined two variables, trits0 and trits1,
09:58 for a transcript... and the second definition
10:06 for trits1 over... oh god, how do I say
10:11 that in English? Has overridden
10:14 the previous one... so I executed this,
10:18 and then this, and the current definition
10:23 of trits1 is the one that corresponds to
10:28 this expression here,
10:30 that gives me this format...
10:33 if I execute this line here I'm going
10:37 to get another format... another
10:39 definition for trits1 and another
10:41 format. Let me do that now, and let me
10:44 print the current value of trits1...
10:48 it is in this much shorter format.
10:53 I do not want to explain how to use this
10:56 shorter format now... the other
10:58 format is easier to understand.
11:01 So let me use the previous format again...
11:05 and now I have a sequence of sexps, and
11:09 each one of these sexps is a link that...
11:12 I mean, when I execute it it is going to
11:17 act as a hyperlink that does something,
11:19 and this something is to
11:22 invoke Google Chrome to play
11:25 this video here
11:27 starting from a certain position.
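
The "one sexp per transcript line" format can be sketched in Python as follows. This assumes that each transcript entry is a dict with a "start" time in seconds and a "text" string; that shape, and the names `mmss` and `as_sexp`, are assumptions made for illustration, and the sample entry is built from a line of this very transcript.

```python
def mmss(seconds):
    """Format a number of seconds as "MM:SS"."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

def as_sexp(stem, entry):
    """Format one transcript entry as a (find-{stem}video ...) sexp."""
    text = entry["text"].replace("\n", " ")
    # Escape backslashes and double quotes so the result is a valid string.
    text = text.replace("\\", "\\\\").replace('"', '\\"')
    stamp = mmss(entry["start"])
    return f'(find-{stem}video "{stamp}" "{text}")'

# Hypothetical entry: 1433.2 seconds is 23 minutes and 53 seconds.
entry = {"start": 1433.2, "text": "added to this..."}
print(as_sexp("2022yttranscript", entry))
```
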
11:30 And when I wanted to send a part
11:34 of this video to my friend
11:39 I wanted to show him the part that
11:41 mentioned Akira Kurosawa...
11:44 so let me search for Kurosawa here...
11:47 found it! Let me see...
11:51 I want to start a few seconds before
11:53 that, let me see where... ah, here:
11:59 "...but Lone Wolf and Cub is above all
12:02 cinematic, blah blah blah blah... it
12:04 uses the vocabulary of movies..."
12:07 I want to start from this position, so
12:10 let me check if this is what
12:12 I'm looking for... let me execute this.
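
The same search can be sketched outside Emacs. The Python below looks for a keyword in a list of transcript entries and backs up a few seconds before each hit. The entry shape (dicts with "start" and "text"), the start times, and the text of the third entry are hypothetical sample data, not taken from the real transcript.

```python
def find_keyword(entries, keyword, lead_in=5):
    """Return (start, text) pairs for entries that mention the keyword.

    Each start time is moved back by `lead_in` seconds, but never below 0.
    """
    keyword = keyword.lower()
    hits = []
    for entry in entries:
        if keyword in entry["text"].lower():
            hits.append((max(0.0, entry["start"] - lead_in), entry["text"]))
    return hits

# Hypothetical sample data.
entries = [
    {"start": 719.0, "text": "but Lone Wolf and Cub is above all cinematic"},
    {"start": 724.0, "text": "it uses the vocabulary of movies"},
    {"start": 731.0, "text": "the way that Akira Kurosawa did"},
]

for start, text in find_keyword(entries, "kurosawa", lead_in=10):
    print(start, text)
```
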
12:16 It will open another tab in the browser...
12:20 it takes a few seconds...
12:22 no, sorry, many seconds...
12:27 and it plays an ad...
12:34 that's the part that I'm interested in.
12:36 I'm not interested in the ads.
12:39 Let me close the YouTube window.
12:43 Anyway, so this is the part that I want,
12:50 and I can copy that to my buffer with
12:55 the annotations about this video...
13:03 and let me delete this and go back to
13:11 this thing here, where I saved some
13:14 hyperlinks... so I was saving these
13:17 hyperlinks here, and of course
13:20 I'm not going to use all of them,
13:24 but the ones that can be useful, that
13:28 deserve to be saved if I want to replay
13:30 this action at some point, are
13:37 these ones. So, these ones create
13:40 a kind of setup in which this function
13:43 is defined, and now that this function
13:45 is defined I can use this sexp here to
13:50 play the video from a certain position.
13:53 But let me go back. I said a long time
13:58 ago that I was going to show two
14:00 examples. In the first one I was going to
14:02 use a video that I do not have a local
14:05 copy of, that is the first one... and the
14:07 second one is an example in which I use
14:10 a video that I have downloaded a local
14:13 copy of: this one.
14:16 Remember that I can either type
14:19 `find-youtubedl-links' here...
14:27 but I can also type
14:29 `find-yttranscript-links' here...
14:31 it also detects the hash around point.
14:40 Let me see... no, I want the other
14:42 one, here... and I want to define
14:53 short names for the hyperlinks to this
14:55 video. I'm going to use this stem here,
14:58 "haskelltocore".
15:01 Let me use a smaller font...
15:07 this title here was derived from...
15:11 I mean, Emacs looked at my directory
15:15 with local copies of videos that I
15:16 have downloaded using
15:19 `find-youtubedl-links', and it found
15:22 that I have a video with this hash here,
15:23 with this title here, and with this
15:27 extension. So this is the title,
15:31 this is the hash, this is the
15:35 extension, and so on. And I have just
15:39 adjusted this by hand, and I'm going
15:43 to run this sexp here to regenerate
15:45 this buffer, so these "{stem}"s
15:48 here are going to be replaced by
15:49 "haskelltocore".
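
The way a title, a hash and an extension can be recovered from the name of a downloaded file can be sketched like this in Python. It assumes that the file was saved with youtube-dl's default naming scheme, "TITLE-HASH.EXT"; the file name below and the function name `split_filename` are hypothetical.

```python
import re

# Assumption: "TITLE-HASH.EXT", where HASH is an 11-character video hash.
FILENAME_RE = re.compile(
    r"(?P<title>.+)-(?P<hash>[A-Za-z0-9_-]{11})\.(?P<ext>[^.]+)"
)

def split_filename(filename):
    """Return a dict with "title", "hash" and "ext", or None on failure."""
    match = FILENAME_RE.fullmatch(filename)
    return match.groupdict() if match else None

# Hypothetical file name.
print(split_filename("A Talk About Haskell-abcDEF12_-9.webm"))
```

Because the hash is anchored to the 11 characters right before the extension, a "-" inside the title or inside the hash does not confuse the split.
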
15:53 Again, I'm going to delete the things
15:55 that I do not want to use...
15:58 I have already downloaded the video, so I
16:01 do not need this, and I'm going to
16:05 delete this part that plays the video
16:07 on YouTube because I just want the
16:09 local copy.
16:12 And now let me copy this to my notes.
16:24 I want to execute this sexp here,
16:27 and this sexp here is going to define
16:30 the function `find-haskelltocorevideo',
16:37 this function here...
16:39 and it's going to define
16:41 that function in a way that is very
16:43 different from the way that I defined
16:45 this function here. When I ran
16:48 `code-youtubevideo' with these arguments
16:51 here it defined `find-lonewolfvideo'
16:54 as a function that plays a video on
16:56 YouTube by telling a browser to open
17:00 a certain URL,
17:03 and now what I'm going to do is that I'm
17:07 going to define the function... I mean,
17:10 this very long sexp here defines this
17:12 function here as a function that plays
17:15 a local copy of a video - a local video.
17:20 The function that plays local copies does
17:22 not know if the local copy
17:24 was downloaded from YouTube
17:27 or if it is something else.
17:32 So, this sexp here plays the local
17:36 copy of the video -
17:41 it's a very good talk about Haskell -
17:44 and suppose that I want to find
17:46 something that I remember
17:48 that I saw in the video
17:50 when I watched it, and I remember a few
17:53 keywords about it.
17:55 So, I can use this to access the
17:57 transcript...
18:02 note that
18:05 this sexp here was invoked with
18:09 the hash of the video,
18:11 and with this short name here, that can
18:14 be used to generate the name of this
18:17 function here...
18:20 If I run this I get this
18:24 temporary buffer, and I want to execute
18:27 this block here - let me do that now...
18:35 this line here takes several seconds
18:38 to run, so I'm going to press f8 once and
18:42 then wait for the next prompt here...
18:48 got it. And now I'm going to generate
18:54 these variables here, and I'm going to
18:56 print the contents of the variable
18:57 trits1... it is this thing here...
19:04 let me choose a keyword...
19:08 let me search for "monad", for example -
19:13 just a random keyword... I don't remember
19:16 what is the part that I was
19:18 looking for, sorry...
19:24 let's see what happens when I run
19:26 this thing here...
19:31 it plays a local copy.
19:39 So, yeah, that's it. So: I can use that
19:43 to find things in videos...
19:47 if I want to download the
19:51 transcript of a video I need something
19:53 that is a bit tricky...
19:56 if I just run this sexp here,
20:01 this sexp with `find-youtubedl-links',
20:10 this block here, that has the commands
20:12 for downloading the video...
20:16 in this block youtube-dl is called with
20:18 several different command line options...
20:23 I sometimes choose one that downloads the
20:26 video with low resolution, and
20:28 sometimes I run this one to download the
20:33 best possible version...
20:35 and note that here I have an option
20:39 that says that I want youtube-dl to
20:43 download the subtitles of the video, and
20:46 for YouTube subtitles and transcripts
20:49 are different things... subtitles are
20:52 edited by hand, and transcripts are
20:55 generated automatically by a
20:59 program that does speech recognition...
21:04 and to download the transcript I need
21:08 other options... they are not here...
21:12 I need another trick, so I need to
21:14 change them by hand...
21:16 so when I downloaded the copy of this
21:19 video I did not download the
21:21 transcript. The local copy does not have
21:24 subtitles...
21:27 I'm not going to show how to download
21:30 this transcript now, and to
21:33 convert it to a format that makes it
21:36 work as subtitles for the video,
21:41 but it is possible to do that.
21:43 It's not the focus of the video
21:45 and I did not prepare that now...
21:51 anyway, let me go back. This is what I
21:54 wanted to show... no, there's one
21:57 thing missing, but the main idea
21:59 is that I have enhanced my
22:04 `find-youtubedl-links'
22:07 with a way to download the transcript
22:09 of a video.
22:11 This way can be accessed by this last
22:15 block here in the temporary buffer...
22:18 and so this is new - this is something
22:21 that I added to `find-youtubedl-links'
22:25 in March... and also,
22:28 there are several
22:30 new tricks here. One trick, that is not
22:34 very obvious and not everyone has
22:36 figured out that red star lines can do
22:38 that, is that we can use a red star line
22:41 to do other kinds of setups, like
22:44 changing the mode of a buffer - even
22:48 a temporary buffer...
22:49 so this one puts it in Python mode.
22:53 The rest is more familiar...
22:56 and note that this first line here says
23:01 to Python to import a library that
23:05 downloads transcripts from YouTube...
23:07 either transcripts or subtitles, it
23:10 works for both cases...
23:13 when people are running this for the
23:15 first time they probably do not have
23:17 that library... so what can they do?
23:21 This sexp here calls pip3, that is the
23:26 package manager for Python 3, with this
23:29 argument here, that is the name of our
23:31 package...
23:33 and this thing here opens a temporary
23:35 buffer that lets me
23:38 install that package.
23:40 In this case I already have that package
23:43 installed, so I do not need to run
23:45 these things...
23:50 and let me show something that I have
23:53 added to this...
23:55 let's say, this module of eev, just a
23:59 few days ago. I've added these two
24:02 functions here. This one here
24:06 can be used to
24:10 show information about a package
24:12 that is already installed... let me
24:14 copy these two lines to my notes,
24:17 one second...
24:21 my notes are here...
24:27 so this sexp here
24:32 is also like a square dog, it's an
24:36 unusual kind of interface...
24:41 and it has this small Python program
24:44 here, that mentions the name of a package
24:46 in a certain point...
24:49 and in this small program here I import
24:53 a library called "importlib", that
24:59 handles "import" and the files that
25:02 import uses, and
25:04 its sublibrary metadata,
25:09 and pretty-print, and so on...
25:12 let me do that now...
25:16 here is a new Python
25:19 interpreter and I'm going to run
25:22 these things...
25:23 if I do not have this package then this
25:27 import will fail, but I have this
25:29 package.
25:30 And now I'm going to run these things
25:33 here to get information about this local
25:35 package. Its init file is here...
25:42 now I'm going to run importlib with
25:44 a certain submethod to get information
25:47 about this package...
25:49 and I'm going to print some information
25:53 that is in this data structure
25:58 here... and ta-da!
26:00 This is quite primitive, but basically
26:02 it shows how to obtain
26:05 information about a Python package
26:08 that is already installed
26:11 in a very low-level way...
26:15 so I think that people who are learning
26:17 Python are going to find this interesting.
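
A minimal sketch of this kind of low-level inspection, using `importlib.metadata` and `pprint` from the Python standard library (Python 3.8 or later). The helper name `package_summary` is hypothetical, and this is not the small program shown in the video; "pip" is used only as an example of a package that is often installed.

```python
import importlib.metadata
from pprint import pprint

def package_summary(name):
    """Return a few metadata fields of an installed package, or None."""
    try:
        meta = importlib.metadata.metadata(name)
    except importlib.metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return None
    return {
        "name": meta.get("Name"),
        "version": meta.get("Version"),
        "summary": meta.get("Summary"),
    }

# Prints a dict if "pip" is installed here, and None otherwise.
pprint(package_summary("pip"))
```
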
26:20 And there's something similar, but that
26:25 checks the remote repository of Python
26:28 packages...
26:30 if I execute this I get another
26:33 temporary buffer here...
26:41 and here I have
26:43 some information about the
26:45 functions that I'm going to use now...
26:51 basically I'm going to
26:54 ask this site here for information
26:57 about this package...
27:00 and well, it's...
27:04 sorry I'm not going to give
27:06 the details now... let me just show what
27:08 this does.
27:11 Let me run this again...
27:13 this takes a second because it has to
27:18 talk to the internet,
27:20 and now I'm going to
27:22 explore this data structure a bit.
27:26 For example, this subfield of a subfield
27:29 here
27:31 has information about the home page of
27:35 the upstream version of this package...
27:38 I can visit this URL here...
27:45 I have just asked Emacs to open
27:52 a new tab in the browser with
27:54 this URL here... so
27:58 this is all the information about this
28:00 package, and
28:03 I think that not all these files are
28:06 downloaded when I install this package...
28:08 some of these files may be
28:10 documentation, examples...
28:12 I do not know this very well yet...
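
A hedged sketch of the remote query. The video does not name the site, so this assumes that it is PyPI and uses PyPI's JSON endpoint, `https://pypi.org/pypi/NAME/json`. The function names and the sample data are hypothetical, and `fetch` needs network access, so it is defined here but not called.

```python
import json
from urllib.request import urlopen

def pypi_json_url(package):
    """Build the URL of the JSON description of a package (assumed: PyPI)."""
    return f"https://pypi.org/pypi/{package}/json"

def fetch(package):
    """Download and parse the JSON description. Needs network access."""
    with urlopen(pypi_json_url(package)) as response:
        return json.load(response)

def home_page(data):
    """Read the home page from a subfield of the "info" subfield."""
    info = data.get("info") or {}
    project_urls = info.get("project_urls") or {}
    return info.get("home_page") or project_urls.get("Homepage")

# Hypothetical sample with the same nesting as the real answer.
sample = {"info": {"home_page": "", "project_urls": {"Homepage": "https://example.org/"}}}
print(home_page(sample))
```
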
28:20 Yeah, so: that's it.
28:22 Let me finish the video now. So:
28:26 yeah, bye! =)
|