## Building a Browser-Based Video Editor in Vanilla JS

I made a video editor over the long weekend and wanted to share my experience
of diving into JavaScript in 2021.

The code can be found here:
[https://github.com/bwasti/mebm](https://github.com/bwasti/mebm)
and a demo can be found
[here](https://bwasti.github.io/mebm/#https%3A%2F%2Fthumbs.gfycat.com%2FFrequentDevotedKingsnake-mobile.mp4).

<video controls="controls" width="100%" height="auto" autoplay>
  <source type="video/mp4" src="https://github.com/bwasti/mebm/raw/main/README_assets/usage.mp4"></source>
  <p>Your browser does not support the video element.</p>
</video>

Here's a quick rundown of how and why I built it.

Why:

* Final Cut Pro is hundreds of dollars, I just want to make memes
* Other folks have gotten [full freaking neural nets](https://storage.googleapis.com/tfjs-models/demos/face-landmarks-detection/index.html) running, I should be able to get a video editor working
* I'm not a web dev professionally, so it'd be good to brush up

How:

Turns out modern browsers (Safari included) have a shocking number of features.
After browsing around for a bit, I conclude that this whole project
is basically just going to be patching together the impressive
work of browser developers.  I can probably get it done in a weekend.

I start in familiar territory:
a `window.requestAnimationFrame(loop)` where I paint
a video element to a canvas. I hide the video element with a
low `z-index`, a large offset, and `overflow: none`.
A user can move a cursor around ("scrub") to control the time of the video
(which I easily set with `video.currentTime = t`).  Perfect.
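
In rough strokes it looks something like this (the element ids and the mousemove-to-time mapping are my own illustration, not the repo's exact code):

```js
// Paint the hidden <video> onto the visible canvas every frame.
const video = document.getElementById('hidden_video');   // assumed id
const canvas = document.getElementById('player_canvas'); // assumed id
const ctx = canvas.getContext('2d');

function loop(timestamp) {
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  window.requestAnimationFrame(loop);
}
window.requestAnimationFrame(loop);

// Scrubbing: map the cursor's x position to a time in the video.
canvas.addEventListener('mousemove', (e) => {
  const frac = e.offsetX / canvas.width;
  video.currentTime = frac * video.duration;
});
```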

My sample hardcoded `.mov` works well out of the box.
If the user scrubs, I pause the video, set a different time,
and then resume the video.  But then I try to get fancy and reactive.
My Player class has an attribute `time`, so I should be able to just set
`player.time` to whatever, whenever, 
wherever in the code and the video will JUMP to that time.
That would certainly make life a lot easier.
So in the `requestAnimationFrame` loop 
(which conveniently gives the callback a time parameter) 
I set `video.currentTime = this.time`.  
Now user interaction (and anything else) could just
set `player.time` and immediately see the video at the right time.
Great.
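
A minimal sketch of that reactive pattern, with made-up names standing in for the real class:

```js
class Player {
  constructor(video, canvas) {
    this.video = video;
    this.canvas = canvas;
    this.ctx = canvas.getContext('2d');
    this.time = 0;                      // anything can set player.time
    this.loop = this.loop.bind(this);
    window.requestAnimationFrame(this.loop);
  }

  loop(timestamp) {
    // Pull the video to wherever player.time currently points, then paint it.
    this.video.currentTime = this.time;
    this.ctx.drawImage(this.video, 0, 0, this.canvas.width, this.canvas.height);
    window.requestAnimationFrame(this.loop);
  }
}

// Anywhere in the code:
//   player.time = 3.2;   // and the next frame shows the video at 3.2s
```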

Then I tried an mp4 and ... crap.
I could scrub to certain times really easily,
but ran into a jittery mess every so often.
It seemed to freeze about every 2 seconds.
Turns out that rapidly setting `video.currentTime`
is effectively useless with mp4s.
The browser isn't smart enough to cache the partial decode,
so seeking to a frame that isn't a keyframe means decoding
many of the frames before it.  Slow.

So I came up with a dumb idea.
Decode every frame up front and save them all into an array of
`ImageData` objects.
I initially thought there was no way a browser could
1) decode quickly and
2) store all those massive decoded frames for a full video.
I was wrong!
The little % that shows when you drag a video in is actually the
decode process, and it turns out browsers are given plenty of RAM,
so it's quite easy to work with thousands of images.
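
One way to do the up-front decode with nothing but a video element and a scratch canvas (a sketch of the general technique, not necessarily mebm's exact code; the `fps` value is an assumption):

```js
// Step through the video, wait for each seek, and grab the pixels.
async function decodeAllFrames(video, fps = 30) {
  const scratch = document.createElement('canvas');
  scratch.width = video.videoWidth;
  scratch.height = video.videoHeight;
  const ctx = scratch.getContext('2d');

  const seekTo = (t) => new Promise((resolve) => {
    if (Math.abs(video.currentTime - t) < 1e-6) return resolve();
    video.addEventListener('seeked', resolve, { once: true });
    video.currentTime = t;
  });

  const frames = [];
  const total = Math.floor(video.duration * fps);
  for (let i = 0; i < total; i++) {
    await seekTo(i / fps);
    ctx.drawImage(video, 0, 0);
    frames.push(ctx.getImageData(0, 0, scratch.width, scratch.height));
    // This is where the little % indicator would get updated.
  }
  return frames;
}
```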

Cool, videos now play and I can scrub around easily.
Audio doesn't work, but oh well, I'm just making gifs.

So what do I actually want in a meme?
Just pasting text and pictures, right?
I quickly realize that a bunch of stationary images and text on video 
is boring.  REALLY boring.

So I decide to add key-frame animation.
If you do anything to an image or text, like scale it up,
move it around, etc. (jk there is no "etc", you can only do those two things),
I want to capture the moment in video time
(checking that trusty `player.time` attribute), record it in a
set of frames associated with that image or text, and label it a "key" frame.
Then, as you play the video,
the image or text will slowly morph into the next "key" frame.
This is exactly the same behavior as CSS keyframes.
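
Conceptually, each image or text layer carries a sorted list of keyframes and playback interpolates between the two nearest ones. A sketch of that idea with hypothetical names; as described next, the actual implementation takes a shortcut:

```js
// A layer's keyframes: { time, x, y, scale }, kept sorted by time.
function recordKeyframe(layer, player, transform) {
  layer.keyframes.push({ time: player.time, ...transform });
  layer.keyframes.sort((a, b) => a.time - b.time);
}

// Linear interpolation between the two surrounding keyframes, CSS-style.
function transformAt(layer, t) {
  const ks = layer.keyframes;
  if (t <= ks[0].time) return ks[0];
  if (t >= ks[ks.length - 1].time) return ks[ks.length - 1];
  const i = ks.findIndex((k) => k.time > t);
  const a = ks[i - 1];
  const b = ks[i];
  const w = (t - a.time) / (b.time - a.time);
  return {
    x: a.x + w * (b.x - a.x),
    y: a.y + w * (b.y - a.y),
    scale: a.scale + w * (b.scale - a.scale),
  };
}
```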

I eventually find the easiest way to implement this is about as 
dumb as the video implementation.
Record a datapoint for every frame in the video and then 
at each time step look up the datapoint and transform the image or 
text accordingly (scale + x + y).
Since we can easily store and draw full images at like 60fps,
I figure this won't hurt.
However, it does mean I need to keep every datapoint updated with every user 
interaction.
That's not exactly ideal because the user might have a transform at 
the beginning of the video and one transform at the end, 
so I'd need to update *every single* point in-between.  Seems dumb.

But dumb is best, so I do exactly that and call it a day.
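
Roughly, the per-frame version looks like this (assuming a fixed `fps` and the same hypothetical transform shape as above):

```js
// One datapoint per frame; a user edit re-fills everything between
// the two neighboring keyframe indices.
function fillBetween(points, startIdx, endIdx) {
  const a = points[startIdx];
  const b = points[endIdx];
  for (let i = startIdx + 1; i < endIdx; i++) {
    const w = (i - startIdx) / (endIdx - startIdx);
    points[i] = {
      x: a.x + w * (b.x - a.x),
      y: a.y + w * (b.y - a.y),
      scale: a.scale + w * (b.scale - a.scale),
    };
  }
}

// Playback becomes a plain array lookup:
//   const { x, y, scale } = layer.points[Math.round(player.time * fps)];
```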

Now on to importing and exporting!  [StackOverflow](https://stackoverflow.com/a/50683349) 
to the rescue.
There's an incredible MediaRecorder API in modern browsers that makes 
exporting super easy.  You just record the canvas as if it's a webcam
and then write the output to a file.
I wasn't super happy with a popup blocker ruining the download, 
so I instead inject a link (as is done in the SO post) 
and make the user click "download" afterward.
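
The export path ends up looking roughly like the SO answer; the function name, duration handling, and file name here are assumptions, and MediaRecorder typically hands you WebM rather than mp4:

```js
// Record the canvas as if it were a webcam and offer the result for download.
function exportVideo(canvas, durationMs) {
  const stream = canvas.captureStream(30);        // 30 fps capture
  const recorder = new MediaRecorder(stream);     // browser picks the container
  const chunks = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    const blob = new Blob(chunks, { type: recorder.mimeType });
    // Inject a link instead of auto-opening a window, so popup blockers stay quiet.
    const a = document.createElement('a');
    a.href = URL.createObjectURL(blob);
    a.download = 'export.webm';                   // assumed file name
    a.textContent = 'download';
    document.body.appendChild(a);
  };
  recorder.start();
  setTimeout(() => recorder.stop(), durationMs);
}
```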
Uploading is also straightforward using
`FileReader` and the [file drag and drop API](https://developer.mozilla.org/en-US/docs/Web/API/HTML_Drag_and_Drop_API/File_drag_and_drop).
(Not to be confused with the HTML drag and drop API.  Which I do. Repeatedly.)
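
The import path, sketched with a drop handler plus `FileReader` (the `drop_zone` id is assumed, and `video` is the same hidden element from earlier):

```js
// Accept a dropped video file and hand it to the hidden <video> element.
const dropZone = document.getElementById('drop_zone'); // assumed id

dropZone.addEventListener('dragover', (e) => e.preventDefault());
dropZone.addEventListener('drop', (e) => {
  e.preventDefault();
  const file = e.dataTransfer.files[0];
  if (!file || !file.type.startsWith('video/')) return;

  const reader = new FileReader();
  reader.onload = () => {
    video.src = reader.result; // data: URL; the up-front decode starts from here
  };
  reader.readAsDataURL(file);
});
```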
And voilà, the editor is usable!

With a functional edit and export flow, I'm pretty happy with the results.
Time to start making memes!

Thanks for reading :^)