The term "DeepFake" combines "Deep Learning" and "Fake": it refers to taking a person in an image or video and replacing them with someone else's likeness using techniques such as deep artificial neural networks [1].
The Kaggle Deepfake Detection Challenge dataset is used: https://www.kaggle.com/c/deepfake-detection-challenge/data
GitHub reference: https://github.com/ageitgey/face_recognition
!pip install face_recognition
Data
- The data consists of .mp4 files, split into compressed sets of ~10GB apiece. A metadata.json accompanies each set of .mp4 files and contains the filename, label (REAL/FAKE), original and split columns, listed below under Metadata Columns.
- The full training set is just over 470 GB.
References: https://deepfakedetectionchallenge.ai/faqs
Data exploration
import os

DATA_FOLDER = '../input/deepfake-detection-challenge'
TRAIN_SAMPLE_FOLDER = 'train_sample_videos'
TEST_FOLDER = 'test_videos'

print(f"Train samples: {len(os.listdir(os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER)))}")
print(f"Test samples: {len(os.listdir(os.path.join(DATA_FOLDER, TEST_FOLDER)))}")
Files
- train_sample_videos.zip
- sample_submission.csv
- test_videos.zip
Metadata Columns
- filename - the filename of the video
- label - whether the video is REAL or FAKE
- original - in the case that a train set video is FAKE, the original video is listed here
- split - this is always equal to “train”.
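As a sketch of what this metadata looks like once loaded, here is a tiny hand-built example (the filenames below are invented for illustration); on Kaggle, metadata.json maps filename to columns, so you would read it with pandas and transpose it so that filenames become the index:

```python
import pandas as pd

# Hypothetical records mimicking the structure of metadata.json (filenames invented)
records = {
    "aaaaaaaaaa.mp4": {"label": "FAKE", "original": "bbbbbbbbbb.mp4", "split": "train"},
    "bbbbbbbbbb.mp4": {"label": "REAL", "original": None, "split": "train"},
}

# Equivalent to pd.read_json('metadata.json').T: one row per video file
meta_train_df = pd.DataFrame(records).T
print(meta_train_df["label"].value_counts())
```

A FAKE row points back to its source clip through the original column, which is what lets you pair each fake with the real video it was generated from.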
Check file types
train_list = list(os.listdir(os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER)))
ext_list = []
for file in train_list:
    # Take the last dot-separated token so filenames containing dots are handled correctly
    file_ext = file.split('.')[-1]
    if file_ext not in ext_list:
        ext_list.append(file_ext)
print(f"Extensions: {ext_list}")
Video data exploration
We first check whether the list of files in the metadata and the list of files in the folder match.

import numpy as np
import pandas as pd

# metadata.json maps filename -> columns; transpose so filenames become the index
meta_train_df = pd.read_json(os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER, 'metadata.json')).T

meta = np.array(list(meta_train_df.index))
storage = np.array([file for file in train_list if file.endswith('mp4')])
print(f"Metadata: {meta.shape[0]}, Folder: {storage.shape[0]}")
print(f"Files in metadata and not in folder: {np.setdiff1d(meta, storage, assume_unique=False).shape[0]}")
print(f"Files in folder and not in metadata: {np.setdiff1d(storage, meta, assume_unique=False).shape[0]}")
Example of a fake video: aagfhgtpmv.mp4
Let's use the face_recognition package to detect faces in the video.
See this kernel, https://www.kaggle.com/brassmonkey381/a-quick-look-at-the-first-frame-of-each-video, which shows how to capture a frame from a video file.
import os
import cv2 as cv
import matplotlib.pyplot as plt

train_dir = '/kaggle/input/deepfake-detection-challenge/train_sample_videos/'
train_video_files = [train_dir + x for x in os.listdir(train_dir)]
# video_file = train_video_files[30]
video_file = '/kaggle/input/deepfake-detection-challenge/train_sample_videos/akxoopqjqz.mp4'

# Grab the first frame of the video
cap = cv.VideoCapture(video_file)
success, image = cap.read()
cap.release()

fig, ax = plt.subplots(1, 1, figsize=(15, 15))
image = cv.cvtColor(image, cv.COLOR_BGR2RGB)  # OpenCV reads BGR; matplotlib expects RGB
ax.imshow(image)
ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
ax.title.set_text(f"FRAME 0: {video_file.split('/')[-1]}")
ax.grid(False)
Now let's detect faces using the face_recognition package! First, we need to pip install it; make sure internet access is turned on in your kernel.
Reference: https://github.com/ageitgey/face_recognition
Frame by Frame Face Detection
- First, we loop through the frames of the video file and append each one to a list called frames.
import cv2

video_file = '/kaggle/input/deepfake-detection-challenge/train_sample_videos/akxoopqjqz.mp4'
cap = cv2.VideoCapture(video_file)
frames = []
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break  # no more frames to read
    frames.append(frame)
cap.release()
print('The number of frames saved: ', len(frames))