The result is pretty nice. The only problem is the slow inference speed, so I'm now refactoring the project structure and switching to a much faster YOLO model.
It's ready for a quick demo. However, there are still some small improvements to make. I'll also build a web app on top of it so people can use it online.
Thanks for asking! The only model I used in this project is a YOLOv4 object detection model, which detects the ball in each frame. I collected about 200 images to train it.
For the other parts, like the tracking and the overlay timing, I programmed them myself.
I implemented the SORT algorithm to track the ball, plus some programming logic to capture the overlay timing from each clip.
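To give a feel for the tracking step: real SORT uses a Kalman filter for motion prediction and the Hungarian algorithm for assignment, but the core idea of matching detections to tracks by box overlap can be sketched much more simply. Below is a minimal greedy IoU matcher (my own simplified stand-in, not the project's actual code; the `GreedyTracker` name and threshold are made up for illustration).

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

class GreedyTracker:
    """Toy SORT-like tracker: greedily assigns each detection to the
    best-overlapping existing track, or starts a new track id."""
    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}   # track id -> last known box
        self.next_id = 0

    def update(self, detections):
        assigned = {}
        unmatched = dict(self.tracks)
        for det in detections:
            best_id, best_iou = None, self.iou_threshold
            for tid, box in unmatched.items():
                score = iou(det, box)
                if score > best_iou:
                    best_id, best_iou = tid, score
            if best_id is None:            # no track overlaps enough
                best_id = self.next_id     # -> new ball, new id
                self.next_id += 1
            else:
                del unmatched[best_id]
            assigned[best_id] = det
        self.tracks = assigned
        return assigned
```

A real implementation would also keep lost tracks alive for a few frames and predict their motion, which is exactly what SORT's Kalman filter adds on top of this.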
I would say a similar workflow could be applied to any ball-related sport. The object detection and the tracking algorithm are basically the same. Then, you could add any sport-specific feature!
For example, I have used a similar method to build AI Basketball Analysis.
1. What method did you use to get the summary out of all the subtitles?
I measured the similarity between words in each sentence. If the words in two sentences are not very semantically similar, the sentences are divided into two different chapters. As for how I measure semantic similarity, I used word2vec (it would be more accurate with something like BERT, but this is just a prototype).
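The split-on-low-similarity idea can be sketched like this. Note this is a hedged toy version: the `EMBEDDINGS` table below is a tiny hand-made stand-in for real word2vec vectors, and the 0.8 threshold is an arbitrary example, not the value the project uses.

```python
import math

# Hypothetical 2-D embeddings standing in for real word2vec vectors.
EMBEDDINGS = {
    "dog": (1.0, 0.1), "cat": (0.9, 0.2), "pet": (0.95, 0.15),
    "stock": (0.1, 1.0), "market": (0.2, 0.9), "price": (0.15, 0.95),
}

def sentence_vector(sentence):
    """Average the vectors of the known words in a sentence."""
    vecs = [EMBEDDINGS[w] for w in sentence.lower().split() if w in EMBEDDINGS]
    n = len(vecs)
    return tuple(sum(v[i] for v in vecs) / n for i in range(2))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def split_chapters(sentences, threshold=0.8):
    """Start a new chapter whenever two consecutive sentences
    fall below the similarity threshold."""
    chapters = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(sentence_vector(prev), sentence_vector(cur)) < threshold:
            chapters.append([cur])
        else:
            chapters[-1].append(cur)
    return chapters
```

With real word2vec you would load pretrained vectors (e.g. via gensim) instead of the toy table, but the segmentation loop stays the same.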
2. How to get the subtitles of the video (Youtube API)?
Subtitles are available in the YouTube video's HTML, so you can write a crawler to get them. The YouTube API might also be an option.
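Once you have the caption payload, parsing it into timed lines is straightforward. The snippet below parses a simplified timedtext-style XML payload (the `SAMPLE` content and exact attribute names here are illustrative; what YouTube actually serves may differ, so treat this as a sketch of the parsing step only).

```python
import xml.etree.ElementTree as ET

# A simplified caption payload in a timedtext-style XML format
# (hypothetical sample data, not real YouTube output).
SAMPLE = """<transcript>
  <text start="0.0" dur="2.5">the pitcher winds up</text>
  <text start="2.5" dur="3.0">and throws a fastball</text>
</transcript>"""

def parse_captions(xml_payload):
    """Turn a timedtext-style payload into (start, duration, text) tuples."""
    root = ET.fromstring(xml_payload)
    return [(float(node.get("start")), float(node.get("dur")), node.text)
            for node in root.iter("text")]
```

There are also third-party packages that wrap this whole fetch-and-parse step, which may be easier than writing your own crawler.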
3. How to get the timestamp of the specific word in the subtitle?
I would really like to build something similar! Thanks a lot!
As timestamps are sentence-level only, there is no perfect way to get them for each word. You will need to approximate them. I didn't do that in my case.
Hope the answers are helpful. Let me know if you have more questions!
This project takes your baseball pitching clips and automatically generates the overlay. A fine-tuned YOLOv4 model is used to get the location of the ball. Then, I implemented the SORT tracking algorithm to keep track of each individual ball. Lastly, I will apply some image registration techniques to deal with the slight camera shift in each clip.
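For the camera-shift step, the core idea is estimating how far one frame has translated relative to a reference frame. Here is a crude brute-force sketch of that (my own illustration with a made-up `estimate_shift` helper; a real pipeline would use proper registration methods, e.g. feature matching or ECC in OpenCV, which also handle sub-pixel and non-translational motion):

```python
import numpy as np

def estimate_shift(ref, moved, max_shift=5):
    """Brute-force search for the integer (dy, dx) translation that
    best aligns `moved` onto `ref`, by minimizing the sum of squared
    differences. A toy stand-in for real image registration."""
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(moved, dy, axis=0), dx, axis=1)
            err = np.sum((ref.astype(float) - shifted.astype(float)) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best
```

Once you know the shift, you translate each clip back before compositing, so the overlaid pitches line up on the same mound.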
I'm still trying to improve it! Feel free to follow this project, and check out the Todo list.
BTW, did anyone notice that the pitcher throws the ball with the exact same posture, but it ends up flying on a completely different path? It's just amazing!
Is there a pure notebook version stripped of the application? I'd like to play with this on iko.ai. We already have the COCO dataset, GPU support, and most of the dependencies in our image.
Hi, the link provided is the pure ML part of the project, without any of the Flask stuff.
Unfortunately, I haven't written a notebook for it. Maybe you could figure it out from the source code? The code itself is actually pretty simple and short.
https://github.com/chonyy/AI-basketball-analysis